Open Mind Tree: An Overview Of System.Collections.Generic

System.Collections.Generic.SortedSet<T>

SortedSet is new to .NET 4.0. It is actually the backing store for the SortedDictionary data type, but unlike the SortedDictionary there is only values and no keys. Also, the SortedSet exposes some set operations unlike the SortedDictionary. Since it is the backing store for the SortedDictionary it has the same performance characteristics.

The SortedSet represents a set abstract data structure. A set is a collection that contains unordered unique values. It is an abstract data structure because a set does not imply any particular implementation. This implementation uses a Red-black tree, which is a sorted data structure, in order to maintain the rules of the set. While technically a set does not have any order, this is a sorted set, and therefore if you iterate over all of the items within the set they will be in sorted order.

The SortedSet is best used when you are storing a unique set of values which need to be sorted and which might require ranged lookups, or other set based operations (intersects, unions, etc…).

Example of most common operations:

var sortedSet1 = new SortedSet<string>();

sortedSet1.Add("one");

sortedSet1.Add("two");

sortedSet1.Add("three");

sortedSet1.Add("four");

var sortedSet2 = new SortedSet<string>();

sortedSet2.Add("two");

sortedSet2.Add("four");

sortedSet1.Contains("three");

// sortedSet1 now contains one, three

sortedSet1.ExceptWith(sortedSet2);

// sortedSet1 now contains two, four

sortedSet1.IntersectWith(sortedSet2);

// a subset is a set whose elements are all in

// another set

sortedSet1.IsSubsetOf(sortedSet2);

// a superset is a set which contains all elements

// in another set

sortedSet1.IsSupersetOf(sortedSet2);

// a proper subset is a subset which is not equal

sortedSet1.IsProperSubsetOf(sortedSet2);

// a proper superset is a superset which is not equal

sortedSet1.IsProperSupersetOf(sortedSet2);            

// returns bool if both sets contain any of the same items

sortedSet1.Overlaps(sortedSet2);

// removes the element with value "two"

sortedSet1.Remove("two");

// removes all elements that match the predicate

sortedSet1.RemoveWhere(v => v == "three");

// returns true if both sets are identical

sortedSet1.SetEquals(sortedSet2);

// sortedSet1 now contains one, three

// only items which are in one or the other, not both

sortedSet1.SymmetricExceptWith(sortedSet2);

// sortedSet1 now contains one, two, three, four

// all items in both sets

sortedSet1.UnionWith(sortedSet2);

// returns all items between the given values

SortedSet<string> result = sortedSet1.GetViewBetween("two", "four");

// elements are in sorted order

foreach (string value in sortedSet1)

{

}

// remove all items in both sets

sortedSet1.Clear();

Bottom Line:
Add	Fast – O(log N)
Inserts	Fast – O(log N)
Lookup	Not Applicable
Sorted Lookup	Fast – O(log N)
Delete	Fast – O(log N)
Versatility	Medium

System.Collections.Generic.HashSet<T>

HashSet is another set data structure that uses hashes in order to maintain a unique set of items. Internally the items are stored in an array, which requires expansion when the array fills up. The hash table size is increased to each subsequence prime number in size as the buckets fill up (this is done because some hash functions will only evenly hash values if the number of available buckets is prime).

Because hashes are used as a backing store, the data is unsorted, which means that finding ranges is an O(N) operation since all data would need to be searched through. At the same time, hashes are very fast with insertions and lookups and so performing set operations such as unions, intersects, etc… could be much faster on a HashSet than a SortedSet. Keep in mind that inserting can cause the array to be expanded, which would cause the operation to be much more expensive.

Since HashSet and SortedSet both implement ISet, the only real reason to use SortedSet is if you have data that you might need to look up in ranges or if you have data which cannot be easily hashed. If neither of those applies to your situation, then the HashSet will probably be the faster of the two sets to use, and it will use less memory.

Example of most common operations:

var hashSet1 = new HashSet<string>();

hashSet1.Add("one");

hashSet1.Add("two");

hashSet1.Add("three");

hashSet1.Add("four");

var hashSet2 = new HashSet<string>();

hashSet2.Add("two");

hashSet2.Add("four");

hashSet1.Contains("three");

// sortedSet1 now contains one, three

hashSet1.ExceptWith(hashSet2);

// sortedSet1 now contains two, four

hashSet1.IntersectWith(hashSet2);

// a subset is a set whose elements are all in

// another set

hashSet1.IsSubsetOf(hashSet2);

// a superset is a set which contains all elements

// in another set

hashSet1.IsSupersetOf(hashSet2);

// a proper subset is a subset which is not equal

hashSet1.IsProperSubsetOf(hashSet2);

// a proper superset is a superset which is not equal

hashSet1.IsProperSupersetOf(hashSet2);            

// returns bool if both sets contain any of the same items

hashSet1.Overlaps(hashSet2);

// removes the element with value "two"

hashSet1.Remove("two");

// removes all elements that match the predicate

hashSet1.RemoveWhere(v => v == "three");

// returns true if both sets are identical

hashSet1.SetEquals(hashSet2);

// sortedSet1 now contains one, three

// only items which are in one or the other, not both

hashSet1.SymmetricExceptWith(hashSet2);

// sortedSet1 now contains one, two, three, four

// all items in both sets

hashSet1.UnionWith(hashSet2);

// elements are no in any guaranteed order

foreach (string value in hashSet1)

{

}

// remove all items in both sets

hashSet1.Clear();

Bottom Line:
Add	Very Fast – O(1)
Inserts	Not Applicable
Lookup	Average – O(N)
Indexed Lookup	Very Fast – O(1)
Delete	Very Fast – O(1)
Versatility	Low

System.Collections.Generic.Queue<T>

The Queue is a special purpose data structure which implements a queue data structure. A queue is a FIFO (First In First Out) collection with two main operations: Enqueue and Dequeue. Items are placed into one end of the queue, and then are taken off the other end:

The Queue is really only useful if you need the exact semantics that the queue data structure provides. One reason might be to place work items into the queue and then be able to take the off and process them in order. If you need this kind of functionality, then the Queue is for you. In .NET 4.0 a ConcurrentQueue was also implemented which allows for items to be enqueued and dequeued on multiple threads.

Example of most common operations:

var queue = new Queue<string>();

// pushes an item on the end of the queue

queue.Enqueue("one");

queue.Enqueue("two");

queue.Enqueue("three");

queue.Enqueue("four");

// returns true

queue.Contains("four");

// takes the item off the front of the queue

string item = queue.Dequeue();

// gets the item off the front of the queue

// without removing it

string peekItem = queue.Peek();

// remove all items from the queue

queue.Clear();

Bottom Line:
Add	Very Fast – O(1)
Inserts	Not Applicable
Lookup	Average – O(N)
Sorted Lookup	Not Applicable
Delete	Very Fast – O(1) – Only from end.
Versatility	Very Low

System.Collections.Generic.Stack<T>

The Stack is another special purpose data structure which implements a stack data structure. A stack is a LIFO (Last In First Out) collection with two operations: Push and Pop. Items are placed on the top of the queue and then are taken off the top:

Again, the Stack is a special purpose data structure that has applicability only within a subset of problems. In .NET 4.0 a ConcurrentStack was implemented which allows you to push and pop items on multiple threads without locking.

Example of most common operations:

var stack = new Stack<string>();

// puts an item on the top of the stack

stack.Push("one");

stack.Push("two");

stack.Push("three");

stack.Push("four");

// removes an item from the top of the stack

string item = stack.Pop();

// gets the item on the top of the stack

// without removing it

string peekedItem = stack.Peek();

// returns true if item is in stack

stack.Contains("two");

// removes all items from the stack

stack.Clear();

Bottom Line:
Add	Very Fast – O(1)
Inserts	Not Applicable
Lookup	Average – O(N)
Sorted Lookup	Not Applicable
Delete	Very Fast – O(1) – Only from end.
Versatility	Very Low

Open Mind Tree

An Overview Of System.Collections.Generic

System.Collections.Generic.SortedSet<T>

System.Collections.Generic.HashSet<T>

System.Collections.Generic.Queue<T>

System.Collections.Generic.Stack<T>

0 comments:

Popular Posts

Visitors

Archives

Infolinks In Text Ads

Featured Posts

Blogger Tips