Let’s consider the following Action:
```csharp
using System;

namespace HelloWorld
{
    public class Program
    {
        public static void Main()
        {
            int i = 0;
            Action a = () => i++;

            a();
            a();
            a();

            Console.WriteLine(i);
            Console.Read();
        }
    }
}
```
From the lesson Func and Action, you remember that an Action is just a delegate that returns void and takes between 0 and 16 parameters of any type. In my example above, the empty parentheses () indicate that the Action takes no parameters. If you look a little closer at this code: () => i++;, you recognize it as a lambda expression, which fundamentally is a method. That piece of code is the entire scope of that method, and there is no i defined in that scope; i is defined outside, within the body of Main(). Yet, I am still able to read and write the i variable inside the lambda expression.
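To make that last point concrete, here is a minimal sketch showing that the lambda works on the variable itself rather than on a copy of its value. The class name CaptureDemo and the helper delegates read and bump are my own illustration, not part of the example above.

```csharp
using System;

public static class CaptureDemo
{
    // Returns i as seen from the enclosing method and as seen from inside a lambda.
    public static (int Outside, int Inside) Run()
    {
        int i = 0;
        Func<int> read = () => i;   // reads the captured variable
        Action bump = () => i++;    // writes the captured variable

        bump();     // i is now 1
        i += 10;    // a write from the enclosing method: i is now 11
        bump();     // i is now 12

        return (i, read());
    }

    public static void Main()
    {
        var (outside, inside) = Run();
        Console.WriteLine($"{outside} {inside}"); // prints "12 12"
    }
}
```

Both numbers agree because the method body and the lambdas all refer to one and the same i.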
Let’s take a more advanced example:
```csharp
using System;

namespace HelloWorld
{
    public class Program
    {
        public static void Main()
        {
            Action a = ReturnALambda();

            a();
            a();
            a();

            Console.Read();
        }

        static Action ReturnALambda()
        {
            int i = 0;
            return () => i++;
        }
    }
}
```
I’ve created a static method, ReturnALambda, which returns a lambda expression from its body, as an Action. Then, I call this method once and assign the returned result to the Action a. So, when I invoke a three times, I am not calling ReturnALambda three times; I am calling, three times, the lambda that ReturnALambda returned and that a now references.
Also, the lambda that a invokes uses i, a local variable of ReturnALambda, so it captures that variable. This is called a closure. If I ran my program in debugging mode, I would get to the closing curly bracket of ReturnALambda, which technically would mean the end of i’s existence.
Now, if I advance my code execution, I will get to the point where a gets invoked again, and the debugger steps back into the lambda written inside ReturnALambda.
And you can notice that when I invoke a a second time, i has the value 1, even though it is declared with the value 0 inside ReturnALambda. So, i’s lifetime continues, so to say, even after its existence should have ended. At first glance, any sane programmer would object: i is initialized with 0 inside ReturnALambda, so even if it got incremented to 1 and somehow survived until the next invocation, shouldn’t it be re-initialized to 0? It is not, because ReturnALambda itself runs only once, when a is assigned; each invocation of a runs only the body of the lambda, i++, against the same surviving i.
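A small sketch can confirm which code runs when. It assumes a hypothetical variant of ReturnALambda, here called MakeCounter, that returns a Func<int> instead of an Action so the value of i can be observed, plus a FactoryCalls counter that I added purely for illustration.

```csharp
using System;

public static class CounterDemo
{
    // Counts how many times MakeCounter itself has run (illustration only).
    public static int FactoryCalls;

    // Hypothetical variant of ReturnALambda that exposes the incremented value.
    public static Func<int> MakeCounter()
    {
        FactoryCalls++;
        int i = 0;          // runs once per MakeCounter call
        return () => ++i;   // runs once per invocation of the returned delegate
    }

    public static void Main()
    {
        Func<int> next = MakeCounter();

        Console.WriteLine(next());        // prints 1
        Console.WriteLine(next());        // prints 2
        Console.WriteLine(next());        // prints 3
        Console.WriteLine(FactoryCalls);  // prints 1
    }
}
```

The initialization to 0 happens a single time, while the increment happens on every invocation of the delegate.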
If you remember from the delegates lesson, I explained there that a delegate gets converted by the compiler into a class, and I showed you that in MSIL (Microsoft Intermediate Language), using a tool called ILSpy. Based on that, you would think that in the case of closures (since Action is also a delegate type), the compiler would also generate standalone classes for them, and you would be right.
Let’s inspect our executable once again using ILSpy:
For now, I am only interested in the ReturnALambda function. If we analyze it, we see that the compiler created a class named <>c__DisplayClass1_0, instantiated it in a local variable named <>c__DisplayClass1_, set i to 0 on that instance, and then returned a new Action that points at that instance’s method, the one the compiler generated from the lambda in ReturnALambda.
So, this starts to explain how the compiler keeps i alive after the closing curly bracket of ReturnALambda is hit: i lives in an instance of a class, and if we look at that class, we can see that it is a nested class of the Program class.
The compiler could have done this directly inside the Program class, to unwrap the lambdas:
```csharp
static int i;

static void SomeRandomName()
{
    i++;
}
```
but the problem is that this wouldn’t keep i unique for each Action. For example, let’s modify our code a bit to show this aspect:
```csharp
using System;

namespace HelloWorld
{
    public class Program
    {
        public static void Main()
        {
            Action a = ReturnALambda();
            Action b = ReturnALambda();

            b();
            a();
            a();
            a();
            b();

            Console.Read();
        }

        static Action ReturnALambda()
        {
            int i = 0;
            return new Action(() => i++);
        }

        static int i;

        static void SomeRandomName()
        {
            i++;
        }
    }
}
```
The first thing you gather from the above example is that b is different from a. Remember, every time we call ReturnALambda, the compiler converts the returned lambda expression into an Action: return new Action(() => i++);. This means that every time I call ReturnALambda, I get a new Action, so a references a different Action than b, and each of them has its own, unique i. If I run my code now and debug it, I can see that a’s i ends up with the value 3 (displayed as 2 in the image below, because the return statement where i gets incremented hasn’t executed yet):
while b’s i ends up with the value 2 (again, displayed as 1 in the image below, because the return statement where i gets incremented had not yet run):
So, this is nice: closures keep them separate. But if the compiler unwrapped our lambda expression into a static i inside the Program class, both a and b would reference the same i, so, regardless of whether I invoked a or b, they would increment the same i. And that’s definitely not the effect we want with closures.
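The difference can be sketched side by side. The following is an illustration under my own assumptions: the names SharedVersusSeparate, MakeSharedCounter, MakeClosureCounter and Exercise are hypothetical, and the delegates are Func<int> rather than Action so that the counters can be read back.

```csharp
using System;

public static class SharedVersusSeparate
{
    static int shared; // stand-in for the "static int i" alternative

    static Func<int> MakeSharedCounter()
    {
        return () => ++shared;   // every delegate increments the same field
    }

    static Func<int> MakeClosureCounter()
    {
        int i = 0;
        return () => ++i;        // every call captures a fresh i
    }

    // Invokes b, a, a, a, b and returns the last value each delegate produced.
    static (int LastA, int LastB) Exercise(Func<int> a, Func<int> b)
    {
        b();
        a();
        a();
        int lastA = a();
        int lastB = b();
        return (lastA, lastB);
    }

    public static (int LastA, int LastB) RunShared()
    {
        shared = 0;
        return Exercise(MakeSharedCounter(), MakeSharedCounter());
    }

    public static (int LastA, int LastB) RunClosures()
    {
        return Exercise(MakeClosureCounter(), MakeClosureCounter());
    }

    public static void Main()
    {
        Console.WriteLine(RunShared());   // prints "(4, 5)"
        Console.WriteLine(RunClosures()); // prints "(3, 2)"
    }
}
```

With the static field, the five invocations all land on one counter; with closures, a counts to 3 and b counts to 2 independently.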
This is why the compiler, instead of putting i as a static variable inside the Program class, creates a nested class for it (<>c__DisplayClass1_0). Remember, a delegate references both the method that is going to be invoked and the object on which that method will be invoked. It is only logical, then, that every time ReturnALambda creates an Action, such as a or b, this nested class gets instantiated and the new instance becomes the target of our Action, so each Action is bound to the lambda’s method on a separate object, with a separate i.
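The method-plus-object pairing can be inspected at run time through the Method and Target properties that every delegate exposes. The sketch below is my own illustration (the class name TargetDemo and the Compare helper are hypothetical); the exact name of the generated class is a compiler implementation detail, so the code prints it without assuming what it will be.

```csharp
using System;

public static class TargetDemo
{
    static Action ReturnALambda()
    {
        int i = 0;
        return () => i++;
    }

    // Compares the method and the target object behind two closures.
    public static (bool SameMethod, bool SameTarget) Compare()
    {
        Action a = ReturnALambda();
        Action b = ReturnALambda();

        return (a.Method.Equals(b.Method), ReferenceEquals(a.Target, b.Target));
    }

    public static void Main()
    {
        var (sameMethod, sameTarget) = Compare();
        Console.WriteLine($"{sameMethod} {sameTarget}"); // prints "True False"

        // The target is an instance of the compiler-generated nested class;
        // its name depends on the compiler, so no expected output is given.
        Console.WriteLine(ReturnALambda().Target.GetType().Name);
    }
}
```

Both delegates run the same generated method, but each one runs it on its own object, which is where its own i is stored.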
As a last step, let’s look at what <>c__DisplayClass1_0 looks like:
All it contains is the field i (notice it is not a static member) and a method that increments i. This means that the lambda expression does not get unwrapped into a static member of our Program class; it gets unwrapped inside a nested class of its own, for the single reason of being a closure.
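For readers without ILSpy at hand, here is a hand-written approximation of that arrangement for the single-lambda case. It is only a sketch: the names HandWrittenClosure, DisplayClass, Lambda and CountAfter are mine, whereas the real compiler uses unspeakable names such as <>c__DisplayClass1_0.

```csharp
using System;

public class HandWrittenClosure
{
    // Stand-in for the compiler-generated nested class.
    private sealed class DisplayClass
    {
        public int i;                  // instance field: one i per closure object

        public void Lambda() { i++; }  // the body of () => i++
    }

    static Action ReturnALambda()
    {
        var closure = new DisplayClass();   // a fresh object on every call
        closure.i = 0;                      // the original "int i = 0;"
        return new Action(closure.Lambda);  // the delegate's target is closure
    }

    // Invokes one closure the given number of times and reads its i back.
    public static int CountAfter(int invocations)
    {
        Action a = ReturnALambda();
        for (int n = 0; n < invocations; n++)
        {
            a();
        }
        return ((DisplayClass)a.Target).i;
    }

    public static void Main()
    {
        Console.WriteLine(CountAfter(3)); // prints 3
    }
}
```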
Finally, let’s modify my example a bit, to create a more complex environment for studying closures:
```csharp
using System;

namespace HelloWorld
{
    public class Program
    {
        public static void Main()
        {
            Action a = ReturnALambda();
            Action b = ReturnALambda();

            b();
            a();
            a();
            a();
            b();

            Console.Read();
        }

        static Action ReturnALambda()
        {
            Action ret = null;
            int i = 5;

            ret += () => i++;
            ret += () => i += 2;

            return ret;
        }
    }
}
```
In the first example we had a single lambda expression, but now I am using two of them, () => i++; and () => i += 2;. Both of these lambdas capture the same variable i, so each one is a closure.
In order to better visualize what will happen in this case, I will manually do everything that the compiler does automatically, and I will unwrap my method. First of all, I need to declare an inner class, so that the closure instances are separated:
```csharp
using System;

namespace HelloWorld
{
    public class Program
    {
        public static void Main()
        {
            Action a = ReturnALambda();
            Action b = ReturnALambda();

            b();
            a();
            a();
            a();
            b();

            Console.Read();
        }

        static Action ReturnALambda()
        {
            Action ret = null;

            ret +=
            ret +=

            return ret;
        }

        class AnyCompilerGeneratedClassName
        {
            private int i = 5;

            () => i++;
            () => i += 2;
        }
    }
}
```
Of course, at this stage, the above code is not valid, but I just wanted to show you how the compiler creates a nested class and then moves the declaration of i and the two lambdas inside that class.
Then, inside the class, the lambda expressions get converted to normal methods:
```csharp
using System;

namespace HelloWorld
{
    public class Program
    {
        public static void Main()
        {
            Action a = ReturnALambda();
            Action b = ReturnALambda();

            b();
            a();
            a();
            a();
            b();

            Console.Read();
        }

        static Action ReturnALambda()
        {
            Action ret = null;

            ret +=
            ret +=

            return ret;
        }

        class AnyCompilerGeneratedClassName
        {
            private int i = 5;

            public void SomeCompilerGeneratedLambdaName()
            {
                i++;
            }

            public void AnotherCompilerGeneratedLambdaName()
            {
                i += 2;
            }
        }
    }
}
```
And now, inside ReturnALambda, I need to create a single instance of this class (I couldn’t use two instances, because each of them would be a completely different object, with its own i variable) and then assign these two methods to the Action (be careful, we assign methods to delegates, we do not invoke them!), effectively creating a delegate chain:
```csharp
using System;

namespace HelloWorld
{
    public class Program
    {
        public static void Main()
        {
            Action a = ReturnALambda();
            Action b = ReturnALambda();

            b();
            a();
            a();
            a();
            b();

            Console.Read();
        }

        static Action ReturnALambda()
        {
            Action ret = null;
            var temp = new AnyCompilerGeneratedClassName();

            ret += temp.SomeCompilerGeneratedLambdaName;
            ret += temp.AnotherCompilerGeneratedLambdaName;

            return ret;
        }

        class AnyCompilerGeneratedClassName
        {
            private int i = 5;

            public void SomeCompilerGeneratedLambdaName()
            {
                i++;
            }

            public void AnotherCompilerGeneratedLambdaName()
            {
                i += 2;
            }
        }
    }
}
```
If we debug our program at this point, we will see the same effect as with the closures in the first example: the variable i belonging to a will end up with the value 14, while the variable i belonging to b will only reach 11. This time i starts at 5, and each invocation runs both methods in the chain, incrementing i by 1 and then adding 2, so a’s three invocations give 5 + 3 × 3 = 14 and b’s two invocations give 5 + 2 × 3 = 11.
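Those two numbers can also be checked without a debugger. The sketch below uses the lambda version of the example and assumes a hypothetical variant of ReturnALambda, named ReturnALambdaWithReader, that additionally hands back a Func<int> for reading i; that reader and the ChainDemo class are my additions, not part of the original program.

```csharp
using System;

public static class ChainDemo
{
    // Hypothetical variant of ReturnALambda: also returns a reader for i.
    static (Action Chain, Func<int> Read) ReturnALambdaWithReader()
    {
        Action ret = null;
        int i = 5;

        ret += () => i++;
        ret += () => i += 2;

        return (ret, () => i);
    }

    public static (int A, int B, int ChainLength) Run()
    {
        var (a, readA) = ReturnALambdaWithReader();
        var (b, readB) = ReturnALambdaWithReader();

        b();
        a();
        a();
        a();
        b();

        // GetInvocationList returns the individual delegates in the chain.
        return (readA(), readB(), a.GetInvocationList().Length);
    }

    public static void Main()
    {
        var (a, b, length) = Run();
        Console.WriteLine($"{a} {b} {length}"); // prints "14 11 2"
    }
}
```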
The conclusion of this example is that whenever two lambdas capture the same variable, those lambdas are added to the same compiler-generated class, and they both operate on the same variable.
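One way to observe this, sketched here with names of my own choosing (SameTargetDemo, Run), is to compare the Target of each delegate in the chain: the two lambdas created by one call should be bound to the same object, while chains from different calls should not share one.

```csharp
using System;

public static class SameTargetDemo
{
    static Action ReturnALambda()
    {
        Action ret = null;
        int i = 5;

        ret += () => i++;
        ret += () => i += 2;

        return ret;
    }

    // WithinChain: do the two lambdas of one chain share a target object?
    // AcrossChains: do lambdas from two separate calls share a target object?
    public static (bool WithinChain, bool AcrossChains) Run()
    {
        Delegate[] partsOfA = ReturnALambda().GetInvocationList();
        Delegate[] partsOfB = ReturnALambda().GetInvocationList();

        bool withinChain = ReferenceEquals(partsOfA[0].Target, partsOfA[1].Target);
        bool acrossChains = ReferenceEquals(partsOfA[0].Target, partsOfB[0].Target);

        return (withinChain, acrossChains);
    }

    public static void Main()
    {
        var (withinChain, acrossChains) = Run();
        Console.WriteLine($"{withinChain} {acrossChains}"); // prints "True False"
    }
}
```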