<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Brad Heap &#187; Sorting</title>
	<atom:link href="http://www.bradheap.id.au/blog/tag/sorting/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.bradheap.id.au/blog</link>
	<description>One kiwi&#039;s news and views on politics, science, computers, god, religion, and other ramblings from Sydney, Australia</description>
	<lastBuildDate>Thu, 02 Feb 2012 08:11:01 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>The C# battle of the SortedList, SortedDictionary, and List</title>
		<link>http://www.bradheap.id.au/blog/2009/01/the-c-battle-of-the-sortedlist-sorteddictionary-and-list/</link>
		<comments>http://www.bradheap.id.au/blog/2009/01/the-c-battle-of-the-sortedlist-sorteddictionary-and-list/#comments</comments>
		<pubDate>Fri, 16 Jan 2009 09:28:23 +0000</pubDate>
		<dc:creator>Brad Heap</dc:creator>
				<category><![CDATA[Computers]]></category>
		<category><![CDATA[C]]></category>
		<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Sorting]]></category>

		<guid isPermaLink="false">http://www.brad.net.nz/blog/?p=1010</guid>
		<description><![CDATA[Okay, as part of my internship I have to deal with a huge number of text strings. These strings come into the program unsorted; they must be sorted, and each one must be unique (i.e. no duplicates). Now there &#8230; <a href="http://www.bradheap.id.au/blog/2009/01/the-c-battle-of-the-sortedlist-sorteddictionary-and-list/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Okay, as part of my internship I have to deal with a huge number of text strings.</p>
<p>These strings come into the program unsorted; they must be sorted, and each one must be unique (i.e. no duplicates).</p>
<p>Now there are a few different ways to store this data.</p>
<p>A SortedList, a SortedDictionary, or two different forms of List: the first where you check before each add that the data doesn&#8217;t already exist, and the second where you just add everything, then sort, then remove the duplicates afterwards.</p>
<p>To work out which one would be best, I wrote a program to measure which form of storage was the fastest on input. The results follow, and then the code for how I did it.</p>
<p>As the results show, the SortedDictionary worked the best overall; however, at the smaller input sizes both the SortedList and the Duplicate List gave it a run for its money, and at 5,000 inputs the Duplicate List was actually the fastest of the four.</p>
<p>Results (the two Input columns are the loopSize and multiplier arguments passed to each test)</p>
<pre>Sorted List Test:        00:00:00.1169883 Input:    5000    4000 List Size:    2838
Sorted Dictionary Test:  00:00:00.1339866 Input:    5000    4000 List Size:    2858
Unique List Test:        00:00:00.1119888 Input:    5000    4000 List Size:    2862
Duplicate List Test:     00:00:00.0239976 Input:    5000    4000 List Size:    2832</pre>
<pre>Sorted List Test:        00:00:01.3768623 Input:   50000   40000 List Size:   28516
Sorted Dictionary Test:  00:00:01.0678932 Input:   50000   40000 List Size:   28466
Unique List Test:        00:00:12.3097689 Input:   50000   40000 List Size:   28384
Duplicate List Test:     00:00:01.2058794 Input:   50000   40000 List Size:   28549</pre>
<pre>Sorted List Test:        00:01:36.9733017 Input:  500000  400000 List Size:  285367
Sorted Dictionary Test:  00:00:12.3307668 Input:  500000  400000 List Size:  285422
Duplicate List Test:     00:02:32.7467238 Input:  500000  400000 List Size:  285506</pre>
<pre>Sorted Dictionary Test:  00:02:37.8040000 Input: 5000000 4000000 List Size: 2854095</pre>
<p><span id="more-1010"></span></p>
<p>Code for Program.cs</p>
<p><code>using System;<br />
using System.Collections.Generic;<br />
using System.Text;</code></p>
<p><code>namespace SortTest<br />
{<br />
    class Program<br />
    {<br />
        public static void Main()<br />
        {<br />
            // 5,000 item list<br />
            SortedListTest.run(5000, 4000);<br />
            SortedDictionaryTest.run(5000, 4000);<br />
            UniqueListTest.run(5000, 4000);<br />
            DuplicateListTest.run(5000, 4000);</code></p>
<p><code>            // 50,000 item list<br />
            SortedListTest.run(50000, 40000);<br />
            SortedDictionaryTest.run(50000, 40000);<br />
            UniqueListTest.run(50000, 40000);<br />
            DuplicateListTest.run(50000, 40000);</code></p>
<p><code>            // 500,000 item list<br />
            SortedListTest.run(500000, 400000);<br />
            SortedDictionaryTest.run(500000, 400000);<br />
            //UniqueListTest.run(500000, 400000);<br />
            DuplicateListTest.run(500000, 400000);</code></p>
<p><code>            // 5,000,000 item list<br />
            //SortedListTest.run(5000000, 4000000);<br />
            SortedDictionaryTest.run(5000000, 4000000);<br />
            //UniqueListTest.run(5000000, 4000000);<br />
            //DuplicateListTest.run(5000000, 4000000);</code></p>
<p><code>        }<br />
    }<br />
}</code></p>
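<p>Each test class below times itself with DateTime.Now, whose resolution depends on the system timer and can be several milliseconds, which matters most for the sub-second 5,000 item runs. As a rough sketch only (this is not part of the original program, and the names TimingSketch and Time are made up for illustration), the same measurement could be taken with System.Diagnostics.Stopwatch:</p>

```csharp
using System;
using System.Diagnostics;

static class TimingSketch
{
    // Runs the given work exactly once and returns the elapsed wall-clock time.
    // Hypothetical helper: the original tests measure with DateTime.Now instead.
    public static TimeSpan Time(Action work)
    {
        Stopwatch watch = Stopwatch.StartNew();
        work();
        watch.Stop();
        return watch.Elapsed;
    }
}
```

<p>A call such as <code>TimingSketch.Time(delegate { SortedListTest.run(5000, 4000); })</code> would wrap one of the existing tests. The results above were not produced this way, so the numbers are not directly comparable.</p>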
<p>Code for SortedListTest.cs</p>
<p><code>using System;<br />
using System.Collections.Generic;<br />
using System.Text;</code></p>
<p><code>namespace SortTest<br />
{<br />
    class SortedListTest<br />
    {<br />
        private static SortedList&lt;string, char&gt; words = null;</code></p>
<p><code>        public static void run(int loopSize, int multiplier)<br />
        {<br />
            words = new SortedList&lt;string, char&gt;();<br />
            DateTime startTime = DateTime.Now;<br />
            Random rand = new Random();<br />
            int i;<br />
            for (int j = 0; j != loopSize; j++)<br />
            {<br />
                // random integer in [0, multiplier), stored as its string form<br />
                i = (int)(rand.NextDouble() * multiplier);<br />
                try<br />
                {<br />
                    string s = i.ToString();<br />
                    words.Add(s, s[0]);<br />
                }<br />
                catch (ArgumentException)<br />
                {<br />
                    // Add throws ArgumentException when the key already exists: skip the duplicate<br />
                    continue;<br />
                }<br />
            }<br />
            DateTime stopTime = DateTime.Now;<br />
            TimeSpan duration = stopTime - startTime;<br />
            Console.WriteLine("Sorted List Test: " + duration.ToString() + " Input: " + loopSize + " " + multiplier + " List Size: " + words.Count);<br />
        }<br />
    }<br />
}</code></p>
<p>Code for SortedDictionaryTest.cs</p>
<p><code>using System;<br />
using System.Collections.Generic;<br />
using System.Text;</code></p>
<p><code>namespace SortTest<br />
{<br />
    class SortedDictionaryTest<br />
    {<br />
        private static SortedDictionary&lt;string, char&gt; words = null;</code></p>
<p><code>        public static void run(int loopSize, int multiplier)<br />
        {<br />
            words = new SortedDictionary&lt;string, char&gt;();<br />
            DateTime startTime = DateTime.Now;<br />
            Random rand = new Random();<br />
            int i;<br />
            for (int j = 0; j != loopSize; j++)<br />
            {<br />
                // random integer in [0, multiplier), stored as its string form<br />
                i = (int)(rand.NextDouble() * multiplier);<br />
                try<br />
                {<br />
                    string s = i.ToString();<br />
                    words.Add(s, s[0]);<br />
                }<br />
                catch (ArgumentException)<br />
                {<br />
                    // Add throws ArgumentException when the key already exists: skip the duplicate<br />
                    continue;<br />
                }<br />
            }<br />
            DateTime stopTime = DateTime.Now;<br />
            TimeSpan duration = stopTime - startTime;<br />
            Console.WriteLine("Sorted Dictionary Test: " + duration.ToString() + " Input: " + loopSize + " " + multiplier + " List Size: " + words.Count);<br />
        }<br />
    }<br />
}</code></p>
<p>Code for UniqueListTest.cs</p>
<p><code>using System;<br />
using System.Collections.Generic;<br />
using System.Text;</code></p>
<p><code>namespace SortTest<br />
{<br />
    class UniqueListTest<br />
    {<br />
        private static List&lt;string&gt; words = null;</code></p>
<p><code>        public static void run(int loopSize, int multiplier)<br />
        {<br />
            words = new List&lt;string&gt;();<br />
            DateTime startTime = DateTime.Now;<br />
            Random rand = new Random();<br />
            int i;<br />
            for (int j = 0; j != loopSize; j++)<br />
            {<br />
                // random integer in [0, multiplier), stored as its string form<br />
                i = (int)(rand.NextDouble() * multiplier);<br />
                try<br />
                {<br />
                    string s = i.ToString();<br />
                    bool matchfound = false;<br />
                    foreach (string w in words) // linear scan of everything added so far<br />
                    {<br />
                        if (w == s)<br />
                        {<br />
                            matchfound = true;<br />
                            break;<br />
                        }<br />
                    }<br />
                    if (!matchfound) words.Add(s); // only add strings not seen before<br />
                }<br />
                catch (Exception)<br />
                {<br />
                    // nothing in this block is expected to throw; the try/catch mirrors the other tests<br />
                    continue;<br />
                }<br />
            }<br />
            words.Sort(); // the list is already unique, so it only needs sorting once at the end<br />
            DateTime stopTime = DateTime.Now;<br />
            TimeSpan duration = stopTime - startTime;<br />
            Console.WriteLine("Unique List Test: " + duration.ToString() + " Input: " + loopSize + " " + multiplier + " List Size: " + words.Count);<br />
        }<br />
    }<br />
}</code></p>
<p>Code for DuplicateListTest.cs</p>
<p><code>using System;<br />
using System.Collections.Generic;<br />
using System.Text;</code></p>
<p><code>namespace SortTest<br />
{<br />
    class DuplicateListTest<br />
    {</code></p>
<p><code>        private static List&lt;string&gt; words = null;</code></p>
<p><code>        public static void run(int loopSize, int multiplier)<br />
        {<br />
            words = new List&lt;string&gt;();<br />
            DateTime startTime = DateTime.Now;<br />
            Random rand = new Random();<br />
            int i;<br />
            for (int j = 0; j != loopSize; j++)<br />
            {<br />
                // random integer in [0, multiplier), stored as its string form<br />
                i = (int)(rand.NextDouble() * multiplier);<br />
                try<br />
                {<br />
                    string s = i.ToString();<br />
                    words.Add(s); // duplicates are allowed at this stage<br />
                }<br />
                catch (Exception)<br />
                {<br />
                    // nothing in this block is expected to throw; the try/catch mirrors the other tests<br />
                    continue;<br />
                }<br />
            }<br />
            words.Sort(); // sorting puts equal strings next to each other<br />
            string prev = null;<br />
            for (int counter = 0; counter &lt; words.Count; counter++)<br />
            {<br />
                if (prev == words[counter])<br />
                {<br />
                    // same as the previous entry: remove it and step back so the next entry is not skipped<br />
                    words.RemoveAt(counter);<br />
                    counter--;<br />
                }<br />
                prev = words[counter];<br />
            }<br />
            DateTime stopTime = DateTime.Now;<br />
            TimeSpan duration = stopTime - startTime;<br />
            Console.WriteLine("Duplicate List Test: " + duration.ToString() + " Input: " + loopSize + " " + multiplier + " List Size: " + words.Count);<br />
        }<br />
    }<br />
}</code></p>
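<p>One thing worth noting about DuplicateListTest: RemoveAt has to shift every later element down by one, so removing duplicates one at a time from a large list is expensive. That may be part of why the Duplicate List falls behind at 500,000 inputs, though that is a guess rather than something measured here. As a minimal sketch only (not part of the original benchmark, with the names SortedDedupeSketch and SortAndDedupe made up for illustration, and working on an array rather than a List), the duplicates can instead be squeezed out in a single pass after sorting:</p>

```csharp
using System;

static class SortedDedupeSketch
{
    // Returns a sorted copy of the input with duplicate strings removed.
    // Hypothetical helper: the original test removes duplicates with RemoveAt.
    public static string[] SortAndDedupe(string[] input)
    {
        string[] sorted = (string[])input.Clone();
        Array.Sort(sorted); // equal strings end up next to each other
        if (sorted.Length == 0) return sorted;

        int write = 1; // next free slot; sorted[0] is always kept
        for (int read = 1; read != sorted.Length; read++)
        {
            // keep an entry only if it differs from the last one kept
            if (sorted[read] != sorted[write - 1])
            {
                sorted[write] = sorted[read];
                write++;
            }
        }
        Array.Resize(ref sorted, write); // drop the unused tail
        return sorted;
    }
}
```

<p>Each element is copied at most once here, so the clean-up pass is linear in the list size; the sort itself still dominates. As in the tests above, the numbers are compared as strings, so "12" sorts before "3".</p>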
]]></content:encoded>
			<wfw:commentRss>http://www.bradheap.id.au/blog/2009/01/the-c-battle-of-the-sortedlist-sorteddictionary-and-list/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

