Tuesday, May 11, 2021 13:23

Extension methods

Sometimes, programmers need to add new functionality to existing code in order to improve or complete it. If the source code is available, the task is simple: they only need to add the required functionality and recompile. However, they will often encounter situations where the code itself is not available, such as when they are using a referenced assembly (a .dll or .exe file). In this case, they have a few options. One of them is inheritance: they can derive from the class they need to extend and add the required functionality in the derived class. The major disadvantage is that this is difficult to apply, because they would have to replace every instance of the base class with an instance of the derived class. Aside from this, there is always the danger that the class they want to inherit from is marked as sealed, which means it cannot be inherited.

The second option is extension methods. They allow us to extend existing types (classes, structures or interfaces) without changing their code or using inheritance, even when the existing types are marked as sealed.

Extension methods must always be declared as static, and they can only be declared inside static classes. To better visualize them, let’s take a simple example where they might be useful. Let’s say we have this code:
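A minimal sketch of such a starting point could look like the one below. The concrete date and time values, and the way time is constructed, are assumptions chosen only for illustration.

```csharp
using System;

public class Program
{
    public static void Main()
    {
        // Illustrative values (assumptions): any date and any time of day will do.
        DateTime date = new DateTime(2021, 5, 11);         // no time part given, so it defaults to midnight
        DateTime time = new DateTime(1, 1, 1, 13, 23, 0);  // only the time part matters to us

        // The exact text printed depends on the culture settings of the machine.
        Console.WriteLine(date);
        Console.WriteLine(time);
    }
}
```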

We have two simple DateTime instances, date and time, and we print them to the console. Notice that in the case of date, since we are not specifying a time part, it gets the default value of 12:00 AM, like this:

[Figure: Printing DateTime to console]

Now, suppose we want to combine the two objects, so that we end up with a DateTime instance that contains the date part of date and the time part of time. One way to do this is to create a new DateTime instance and build it from the members of date and time:
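As a hedged sketch of that idea, we can pass the date members of date and the time members of time to the DateTime constructor. The sample values are again assumptions, and the fixed format string is used here only so that the printed text does not depend on the machine’s culture.

```csharp
using System;
using System.Globalization;

public class Program
{
    public static void Main()
    {
        DateTime date = new DateTime(2021, 5, 11);         // illustrative value
        DateTime time = new DateTime(1, 1, 1, 13, 23, 0);  // illustrative value

        // Date members come from date, time members come from time.
        DateTime combined = new DateTime(
            date.Year, date.Month, date.Day,
            time.Hour, time.Minute, time.Second);

        // A fixed format and the invariant culture keep the output stable.
        Console.WriteLine(combined.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture));
        // prints 2021-05-11 13:23:00
    }
}
```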

This is what the result looks like:

[Figure: Combined DateTime objects]

If we need this functionality in a lot of places, we can obviously create a separate method that provides it:
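Such a helper might be sketched as a regular static method, as below. The parameter names and sample values are assumptions; only the name Combine() and its two DateTime parameters come from the text.

```csharp
using System;

public class Program
{
    // Returns a DateTime with the date part of 'date' and the time part of 'time'.
    public static DateTime Combine(DateTime date, DateTime time)
    {
        return new DateTime(
            date.Year, date.Month, date.Day,
            time.Hour, time.Minute, time.Second);
    }

    public static void Main()
    {
        DateTime date = new DateTime(2021, 5, 11);         // illustrative value
        DateTime time = new DateTime(1, 1, 1, 13, 23, 0);  // illustrative value

        DateTime combined = Combine(date, time);
        Console.WriteLine(combined);
    }
}
```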

We now have a method named Combine(), which takes two DateTime objects as parameters and returns a new DateTime value that combines the date of the first argument with the time of the second. It is even somewhat syntactically pleasing: Combine(date, time), “combine date and time”.

I don’t know about you, but if you are a little bit like me, you’d find it even more logical to write it in this form: date.Combine(time).

Now, obviously, this gives us an error, since DateTime does not have a Combine() method that we can use:

[Figure: Missing Combine method on DateTime type]

Again, syntactically, this is just as pleasing: “hey, date, why don’t you combine with time?”, so the two forms are equivalent. However, I personally prefer the latter.

At this point, in order to get rid of that error, we have the two options I already described: inheritance or extension methods. But if we go to the definition of DateTime, we notice that it is a struct:

[Figure: Definition of DateTime structure]

You may not know this, but structures are implicitly sealed, which rules out inheritance from the start. Therefore, let’s implement the only remaining variant, an extension method.

As I said, extension methods can only live inside static classes, so we need to mark our Program class as static. I also said that extension methods are always static, so we need to mark our Combine() method as static too. Finally, the last thing we need to do in order to inform the compiler that our Combine() method is an extension method, and not just any regular method, is to use the this keyword in front of its first parameter:
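The three changes described above (a static class, a static method, and this on the first parameter) can be sketched as follows. The sample values are assumptions; the class and method names follow the text.

```csharp
using System;

// The containing class must be static.
public static class Program
{
    // 'this' on the first parameter turns Combine() into an extension method on DateTime.
    public static DateTime Combine(this DateTime date, DateTime time)
    {
        return new DateTime(
            date.Year, date.Month, date.Day,
            time.Hour, time.Minute, time.Second);
    }

    public static void Main()
    {
        DateTime date = new DateTime(2021, 5, 11);         // illustrative value
        DateTime time = new DateTime(1, 1, 1, 13, 23, 0);  // illustrative value

        // Called as if Combine() were an instance method of DateTime.
        DateTime combined = date.Combine(time);
        Console.WriteLine(combined);
    }
}
```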

The reason we are using this in front of the first parameter is to tell the compiler which type we want to extend. In other words, when we use this DateTime, the compiler understands that our extension method will be added to the DateTime type. If we used this string, we would have created an extension method for the string class, and so on. Note that the this keyword must always be used in front of the first parameter of the extension method, and not on subsequent parameters.

At this point, you will notice that the error in Visual Studio is gone. This means two things. First, it appears that our date instance of DateTime DOES contain a method named Combine(), which we can call by supplying a single argument, time, as in date.Combine(time).

But, wait! Didn’t we declare our Combine() extension method with TWO parameters? Why isn’t the compiler complaining that we pass only one argument? The reason is that, because we used the this keyword, the compiler knows this is an extension method: when we call it on a DateTime object (date, in our example), that object is passed as the first parameter, the this DateTime one. In other words, if we write date.Combine(time), the compiler knows that the this DateTime parameter is date, because we are calling Combine() on the date object.

The second thing we get from this is that we can also call extension methods explicitly, as ordinary static methods, as we do here:
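To make the comparison concrete, the sketch below calls the same Combine() extension method in both ways and checks that the results match. The sample values and the local variable names are assumptions.

```csharp
using System;

public static class Program
{
    public static DateTime Combine(this DateTime date, DateTime time)
    {
        return new DateTime(
            date.Year, date.Month, date.Day,
            time.Hour, time.Minute, time.Second);
    }

    public static void Main()
    {
        DateTime date = new DateTime(2021, 5, 11);         // illustrative value
        DateTime time = new DateTime(1, 1, 1, 13, 23, 0);  // illustrative value

        // Instance-style call: date becomes the 'this DateTime' argument.
        DateTime viaInstance = date.Combine(time);

        // Explicit static call: both arguments must be supplied.
        DateTime viaStatic = Program.Combine(date, time);

        Console.WriteLine(viaInstance == viaStatic); // prints True
    }
}
```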

In this case, since we are not calling the Combine() method on a DateTime instance, we are forced to specify both the first and the second arguments, so that the compiler knows which DateTime object the time should be combined with. However, since this is an extension method, it is unnatural to use it this way, and it is not considered good practice.

The thing to take from here is that extension methods are just normal static methods, and can be used as static methods, but they have the advantage of also allowing us to use them as instance methods on the types that they extend.

In fact, at the MSIL (Microsoft Intermediate Language) level, there is no such thing as an extension method call. The compiler simply “cuts” the instance on which the extension method is used and pastes it as the first argument of the static method call, effectively converting it to a normal static method call.

Notice that through extension methods we can add “implemented methods” even to interfaces. Of course, by this time we all know that interfaces traditionally do not contain functionality; they only define the signatures of members such as properties or methods. That is not the whole story. Extension methods can also extend interfaces, in which case they “add” functionality to an interface in the same way they do for concrete types. So, if any wiseguy employer asks you whether interfaces can contain functionality, offer a wiseguy response, and reply that they can, through extension methods (default interface members, added in C# 8, are a separate feature).
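As an illustration, here is a small sketch of an extension method on an interface. IShape, Square, ShapeExtensions and IsLargerThan() are hypothetical names invented for this example; they do not come from any library.

```csharp
using System;

// Hypothetical interface: it only declares a member.
public interface IShape
{
    double Area { get; }
}

// Hypothetical implementation of the interface.
public class Square : IShape
{
    private readonly double side;

    public Square(double side)
    {
        this.side = side;
    }

    public double Area => side * side;
}

public static class ShapeExtensions
{
    // Available on every type that implements IShape.
    public static bool IsLargerThan(this IShape shape, IShape other)
    {
        return shape.Area > other.Area;
    }
}

public static class Program
{
    public static void Main()
    {
        IShape small = new Square(2);
        IShape large = new Square(3);

        Console.WriteLine(large.IsLargerThan(small)); // prints True
    }
}
```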

Extension methods also have a few caveats. One of them is that, obviously, they cannot access the private members of the types they extend. Another is that programmers can end up with a ton of extension methods just to avoid inheritance. Personally, as a professional programmer, I like to use extension methods here and there, keeping them as a nice tool to have in the toolbox. But the place where extension methods really shine is LINQ (Language Integrated Query), which we will learn about in the following lessons.

This lesson ends here for regular users. For those of you who want to go deeper, into the nitty-gritty stuff, I will also explain why we have to use the this keyword when declaring extension methods. Let’s take the following example:
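A minimal sketch of such an example might look like this. The field and method names follow the text, while the book title, the way bookName is assigned, and the output format are assumptions.

```csharp
using System;

public class Book
{
    private int copiesSold;        // one copy per Book instance
    public string bookName = "";   // one copy per Book instance

    public void SellBook()
    {
        copiesSold++;
        Console.WriteLine($"{bookName}: {copiesSold} sold");
    }
}

public class Program
{
    public static void Main()
    {
        Book firstBook = new Book();
        firstBook.bookName = "First book"; // illustrative title

        firstBook.SellBook(); // prints First book: 1 sold
        firstBook.SellBook(); // prints First book: 2 sold
        firstBook.SellBook(); // prints First book: 3 sold
    }
}
```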

We have a class Book, which contains two field members, copiesSold and bookName, and a public method, SellBook(), in which we just increment copiesSold and display the result on the console. Inside our Main() method, I declare an instance of the Book class, firstBook, and I call the SellBook() method three times. The result looks like this:

[Figure: Calling instance methods in C#]

Nothing out of the ordinary so far. But take a look at this piece of code: Book firstBook = new Book(). Think a little about what happens when the new keyword is encountered. We know that room for a new Book must be allocated on the heap. In this case, how big is a Book? Well, we can see that each Book has its own copy of copiesSold and bookName, since they are instance fields. If they were marked as static, they would be shared among all instances of Book, but they are not, so each instance of type Book stores its own. That means that the data of a Book is just an int plus a reference to a string (the characters of the string live in a separate object), on top of the small fixed header that the runtime adds to every object. If we didn’t name our books and declared just the copiesSold variable, the only data in a Book would be a single int, 4 bytes.

This is because the code of the SellBook() method is NOT copied into each instance. It exists only once in memory, because all Books can SHARE this SellBook() method. I emphasized the word “SHARE” because it should hint at another thing: what do we call things that are shared among instances in .NET? Of course, the answer is static. And this is the bonus lesson for the day, a thing that few programmers, even professional ones, know or realize: in .NET, ALL instance methods behave, under the hood, like STATIC methods. Even when we declare a private method, even when we call it on an instance, its code exists only once in memory and is shared by every instance of the type that contains it.

One question that might arise now is: well, if the SellBook() method is effectively static, how does it know which Book’s copiesSold variable to increment? We all know that static methods do not belong to a particular instance, so, when we write copiesSold++, which copiesSold do we mean?

Few of you may know that when we use instance members inside methods, the compiler actually adds the this keyword in front of them, so copiesSold++; is treated as this.copiesSold++;.

When we type them directly, without the this keyword, this is assumed: it is implicit, added by the compiler in the background. If I were to declare another book and sell it, each book would report its own number of copies sold, and that makes perfect sense: when we incremented copiesSold for both books, we incremented a different copy of that variable for each Book instance. In that case, this.copiesSold++; meant that we were incrementing the copiesSold variable of THAT instance alone.
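The sketch below spells out this explicitly and sells two different books, so that we can see each instance keeping its own counter. The titles and the output format are assumptions.

```csharp
using System;

public class Book
{
    private int copiesSold;
    public string bookName = "";

    public void SellBook()
    {
        // Same as copiesSold++; 'this' is the instance the method was called on.
        this.copiesSold++;
        Console.WriteLine($"{this.bookName}: {this.copiesSold} sold");
    }
}

public class Program
{
    public static void Main()
    {
        Book firstBook = new Book();
        firstBook.bookName = "First book";   // illustrative title
        Book secondBook = new Book();
        secondBook.bookName = "Second book"; // illustrative title

        firstBook.SellBook();  // prints First book: 1 sold
        firstBook.SellBook();  // prints First book: 2 sold
        firstBook.SellBook();  // prints First book: 3 sold
        secondBook.SellBook(); // prints Second book: 1 sold
    }
}
```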

To show that instance methods really work like static ones, let’s mark our SellBook() method as static, just like the compiler would. A static method cannot use instance members directly, so to solve that error we also need to take control and provide the static method with our own version of this, passed as an ordinary Book parameter.

But in this case we cannot call the static method on instances of Book; we have to call it explicitly, like any static method, as in Book.SellBook(firstBook).
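A sketch of this static version is shown below. Since this is a keyword and cannot be used as a parameter name, the parameter is called thisBook here, which is an assumption, as are the titles and the output format.

```csharp
using System;

public class Book
{
    private int copiesSold;
    public string bookName = "";

    // Static version: the instance is passed in explicitly instead of the implicit 'this'.
    public static void SellBook(Book thisBook)
    {
        thisBook.copiesSold++;
        Console.WriteLine($"{thisBook.bookName}: {thisBook.copiesSold} sold");
    }
}

public class Program
{
    public static void Main()
    {
        Book firstBook = new Book();
        firstBook.bookName = "First book";   // illustrative title
        Book secondBook = new Book();
        secondBook.bookName = "Second book"; // illustrative title

        Book.SellBook(firstBook);  // prints First book: 1 sold
        Book.SellBook(firstBook);  // prints First book: 2 sold
        Book.SellBook(secondBook); // prints Second book: 1 sold
    }
}
```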

In this case we took control: instead of letting the compiler implicitly add the this keyword, we explicitly supplied the instance we wanted to use as this. If I called the SellBook() method with secondBook, the result would be the same as the first time: we would still obtain different values for the number of copies sold for each book instance.

True, it is much nicer that we are able to call SellBook() on Book instances, like this: firstBook.SellBook();, instead of calling it explicitly, like this: Book.SellBook(firstBook);.

And maybe you are already starting to see some similarities with the extension methods from the start of this lesson. I said that we can call extension methods explicitly, as normal static methods, in which case we need to explicitly supply the this parameter (the instance of the type that the extension method applies to), or we can call them as instance methods of the type we are extending, in which case the compiler takes the instance on which we are calling the extension method as the this parameter.

To see this similarity even better, we can convert SellBook() to an extension method that works just as well:
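One way this conversion might look is sketched below. BookExtensions is a hypothetical class name, and the title and output format are assumptions.

```csharp
using System;

public class Book
{
    public int copiesSold;        // public, so that the extension method can reach it
    public string bookName = "";
}

// Extension methods must live in a static class.
public static class BookExtensions
{
    public static void SellBook(this Book book)
    {
        book.copiesSold++;
        Console.WriteLine($"{book.bookName}: {book.copiesSold} sold");
    }
}

public static class Program
{
    public static void Main()
    {
        Book firstBook = new Book();
        firstBook.bookName = "First book"; // illustrative title

        firstBook.SellBook();               // instance-style call, prints First book: 1 sold
        BookExtensions.SellBook(firstBook); // explicit static call, prints First book: 2 sold
    }
}
```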

Of course, I had to move the SellBook() method into a separate static class, because extension methods must live inside static classes and static classes can’t be instantiated, so Book itself has to stay non-static if we still want to create books. I also had to make copiesSold public, because extension methods cannot access the private members of the classes they extend. But you can see that I am able to call SellBook() both as a static method and as an instance method, since it is an extension method.
