Tuesday, June 25, 2024 22:35


Splitting a string by a separator

There are many cases where a string contains several elements separated by some separator, and we need to extract those elements. For this, we can use the Split() function, which returns an array of strings. The alternative would be to search for the separator character manually with the IndexOf() function, extract each substring ourselves, and take on all the extra horrendous work that comes with it.

Let's take an example:

The result will be this:

[Image: C# Split]
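As a minimal sketch of such an example: the input string "Jane, John, Marry, Bruce" is an assumption reconstructed from the results discussed further down, and the class and method names (SplitBySpaceDemo, SplitBySpace) are purely illustrative.

```csharp
using System;

public static class SplitBySpaceDemo
{
    // Splits the text at every space character.
    // Note the single quotes: ' ' is a char, not a string.
    public static string[] SplitBySpace(string text)
    {
        return text.Split(' ');
    }

    public static void Main()
    {
        // Assumed sample input; the commas are deliberately left in.
        string names = "Jane, John, Marry, Bruce";

        foreach (string element in SplitBySpace(names))
        {
            Console.WriteLine(element);
        }
        // Prints four lines: Jane, / John, / Marry, / Bruce
        // (the commas stay attached to the first three names)
    }
}
```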

What the Split() function did was look inside our string for every occurrence of the character ' ' (space). Then it divided our string into multiple strings, breaking it wherever a space was found. Finally, it stored all these pieces in an array, which we can iterate with a foreach loop to display its values.

You will notice that some of the names still have a comma after them. You will also notice that I placed the argument for the Split() function between single quotation marks, not double. This specifies that we are providing a char, not a string. A char literal can only contain a single character, so putting more than one character between the single quotes is a compile error. Supplying a plain string in double quotes does not help on .NET Framework either, because there Split() has no overload that takes a single string; that overload only arrived with .NET Core 2.0. (Every version can split on whole strings through the overload that takes a string[] together with a StringSplitOptions value.) So, how can we split a string by more than a single separator character?
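As a side note on the string case, here is a small sketch of the string[] overload, which also exists in older .NET Framework versions. The input string and the names SplitByStringDemo and SplitByCommaSpace are assumptions made for illustration.

```csharp
using System;

public static class SplitByStringDemo
{
    // Splits on the two-character separator ", " as a whole.
    // The string[] overload requires a StringSplitOptions argument.
    public static string[] SplitByCommaSpace(string text)
    {
        return text.Split(new[] { ", " }, StringSplitOptions.None);
    }

    public static void Main()
    {
        // Assumed sample input.
        string names = "Jane, John, Marry, Bruce";

        foreach (string element in SplitByCommaSpace(names))
        {
            Console.WriteLine(element);
        }
        // Prints four lines: Jane / John / Marry / Bruce
    }
}
```

This only works cleanly when every separator is exactly ", ". For a mix of different separator characters, a char array is the more natural fit.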

By passing it an array of separator characters.
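A minimal sketch of that call, assuming the same sample string as before and the separators space, comma and dot; the names SplitByCharsDemo and SplitBySeparators are illustrative only.

```csharp
using System;

public static class SplitByCharsDemo
{
    // Splits the text at every space, comma or dot.
    public static string[] SplitBySeparators(string text)
    {
        char[] separators = { ' ', ',', '.' };
        return text.Split(separators);
    }

    public static void Main()
    {
        // Assumed sample input.
        string names = "Jane, John, Marry, Bruce";
        string[] elements = SplitBySeparators(names);

        // Brackets make the empty entries visible in the output.
        foreach (string element in elements)
        {
            Console.WriteLine("[" + element + "]");
        }
        // Prints seven lines:
        // [Jane] [] [John] [] [Marry] [] [Bruce]
    }
}
```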

This time, we declared an array of type char containing several separators: space, comma and dot. The Split() function has an overload that accepts a char array, and it splits our string at every character found in that array. The separators themselves are discarded, and the substrings between them are stored in the elements array.

If we iterate the array and print its elements one by one, the result will be: “Jane”, “”, “John”, “”, “Marry”, “” and “Bruce”. We get 7 results instead of the expected 4. The reason is that the text contains three places where two separator characters stand next to each other (a comma followed by a space). In this case, the empty string between the two separators is also part of the returned result. If we want to ignore the empty strings in the splitting results, one possible solution is to check each element before printing it and skip the empty ones.
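Such a check could be sketched as follows; the sample string and the names PrintNonEmptyDemo and PrintNonEmpty are assumptions for illustration.

```csharp
using System;

public static class PrintNonEmptyDemo
{
    // Prints only the non-empty elements; the array itself is not modified.
    public static void PrintNonEmpty(string[] elements)
    {
        foreach (string element in elements)
        {
            if (element != "")
            {
                Console.WriteLine(element);
            }
        }
    }

    public static void Main()
    {
        // Assumed sample input and separators.
        string names = "Jane, John, Marry, Bruce";
        char[] separators = { ' ', ',', '.' };
        string[] elements = names.Split(separators);

        PrintNonEmpty(elements);
        // Prints four lines: Jane / John / Marry / Bruce
        // elements.Length is still 7 at this point.
    }
}
```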

And now only the four names are displayed.

But this approach does not remove the empty strings from the array; it just does not print them. To remove them, we can pass a special option, StringSplitOptions.RemoveEmptyEntries, as a second argument to the Split() function.

This will ensure that the resulting array will not contain any empty elements.
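A minimal sketch of the call with that option, again assuming the same sample string and separators; RemoveEmptyDemo and SplitWithoutEmpties are illustrative names.

```csharp
using System;

public static class RemoveEmptyDemo
{
    // Splits at space, comma or dot and drops empty entries from the result.
    public static string[] SplitWithoutEmpties(string text)
    {
        char[] separators = { ' ', ',', '.' };
        return text.Split(separators, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        // Assumed sample input.
        string names = "Jane, John, Marry, Bruce";
        string[] elements = SplitWithoutEmpties(names);

        Console.WriteLine(elements.Length); // 4

        foreach (string element in elements)
        {
            Console.WriteLine(element);
        }
        // Prints: Jane / John / Marry / Bruce
    }
}
```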
