Thursday, October 18, 2018 20:11

Dictionary

The data structure Dictionary suggests storing key-value pairs and provides a quick search by key. In common language, this means that instead of elements receiving a numerical index, as in the array case, they receive a specific type as a key (this is what the K stands for in the Dictionary<K, T> concept; T stands for the type of the values stored in the dictionary) for accessing elements. Most of the times, this key is a string – and this allows us to give names to our elements. So, instead of accessing the fifth element of an array by using the integer 4 as the “key” (remember, arrays are 0 index based!), we can access the fifth element by the value we gave to its key, which could be a word, for instance. Just like in a real dictionary, we search for a specific word (which is the “key”), and when we find it, we read the definition of that word (which is the stored “value” for that element). Just like in a real dictionary, we are not allowed to have more than one element with the same key (in a real dictionary, a word is not defined twice).

Dictionary, along with the List, are the most important and used data structures (except maybe for the array). The implementation with a hash table (the class Dictionary<K, T>) has a very fast add, search and remove of elements – constant complexity at the average case. The operation access through index is not available, because the elements in the hash-table have no order, i.e. an almost random order. Dictionary<K, T> keeps internally the elements in an array and puts every element at the position calculated by the hash-function. Thus the array is partially filled – in some cells there is a value, others are empty. If more than one element should be placed in a single cell, elements are stored in a linked list. It is called chaining. This is one of the few ways to resolve the collision problem. When the dictionary number of elements exceeds 75% of its capacity, the size is doubled and all the elements occupy new positions. This operation has a linear complexity, but it is executed so rarely, that the amortized complexity remains a constant. Hash-table has one peculiarity: if we choose a bad hash-function causing many collisions, the basic operations can become very inefficient and reach linear complexity. In practice, however, this hardly happens. Hash-table is considered to be the fastest data structure, which provides adding and searching by key. Hash-table in .NET Framework permits each key to be put only once. If we add two elements with the same key consecutively, the last will replace the first and we will eventually lose an element. This important feature should be considered.

From time to time one key will have to keep multiple values. This is not standardly supported, but we can store the values matching this key in a List as a sequence of elements. For example if we need a hash-table Dictionary<int, string>, in which to accumulate pairs {integer, string} with duplicates, we can use Dictionary<int, List<string>>. Some external
libraries have ready to use data structure called MultiDictionary<K, V>. Hash-table is recommended to be used every time we need fast addition and fast search by key. For example if we have to count how many times each word is encountered in a set of words in a text file, we can use Dictionary<string, int> – the key will be a particular word, the value – how many times we have seen it. Use a dictionary when you want to add and search by key very fast.

Since dictionaries are generic, the types of the key and value must be specified when creating a dictionary, for example:

The above line of code will create a new dictionary in which the keys will be strings, and the stored values will be integers.

Adding data to a dictionary is similar to adding values to a List, except that it takes a key and value argument:

Once data is stored inside a C# dictionary, you use the key values as indices to retrieve values. To put it in context, a List uses numbers as keys (indices) to get its values, while a Dictionary uses the defined keys:

Traversing a dictionary (going through each element) is tricky because the keys can be anything within its data type. The only way to access all the values in a C# dictionary is to know all the keys. That is when the Keys property comes in, which holds a collection of any key added. Here is an example on iterating through a C# dictionary:

Two things to consider: insertion time and searching time. Because elements are not automatically sorted, inserting elements is faster. Similarly because elements are not sorted, it makes searching more complicated, meaning searching is slower.

To better understand how a dictionary uses the concepts of keys and values, let’s take the following example:

So, we declared a new Dictionary of type string-int, and we added few physics constants, like the boiling and freezing temperature of water. We could add those values in an array too, but it would become extremely difficult to remember what all the values represent, specially when they get many. And not to mention duplicate values.. who can tell what they each represent? So, in this case, by using a dictionary and giving significant names to the keys, we can easily access the values by their corresponding key, and also know what those values represent, at the same time. The only rule we have to obey is not having duplicate keys. If we try to add a key that already exists, Visual Studio will generate an exception of type System.ArgumentException: ‘An item with the same key has already been added.’.

The most important methods of the Dictionary class are:

ContainsKey() – will check if a given key string is present in the dictionary. It will return True if the key was found, or False otherwise.

TryGetValue() – This is actually the recommended way of checking whether a key is present in a dictionary or not. If the key exists, it will return its value.

ContainsValue() – This method lacks the constant-time look up speed of ContainsKey. It instead searches the entire collection. It is linear in complexity. It will loop through all elements in the Dictionary until it finds a match, or there are no more elements to check.

Clear() – This will clear all the keys and value pair in the dictionary. The same result can be achieved by assigning the dictionary to null.

Remove() – It will remove a specific element that we pass as an argument to this method. In case the key doesn’t exist, it will do nothing.

Count() – will return an int representing the number of elements in the dictionary.

Equals() – We can use this method to check whether two dictionaries are identical.

The property that is most used WITH the Dictionary class is:

Keys – The Keys property returns a collection of type KeyCollection, not an actual List. We can convert it into a List.

Be aware that if we try to directly access a key that does not exist, we will get an exception called KeyNotFoundException. With Dictionary we must test keys for existence first, with ContainsKey or TryGetValue.

Comments

comments

Tags: ,

2 Responses to “Dictionary”

  1. harga epoxy lantai says:

    First off I want to say great blog! I had a quick question which I’d
    like to ask if you do not mind. I was interested to find
    out how you center yourself and clear your thoughts prior to writing.

    I have had a hard time clearing my thoughts in getting my ideas out.
    I truly do enjoy writing however it just seems like the
    first 10 to 15 minutes tend to be wasted simply just
    trying to figure out how to begin. Any ideas
    or hints? Many thanks!

    • rusoaica says:

      When I write, I usually just plan beforehand what I want to express, so that it comes more natural. Also, listening to Mozart or Bach on a low volume helps focusing too 🙂

Leave a Reply