Code Structure, Random Number Generation and Conditional Probability

I had an interesting discussion with a group of students this afternoon which highlighted how important it is to understand conditional probability and how code structure can produce unexpected results.

The students’ code contained three lists and a random number generator. The random number generator was meant to randomly choose which list to take something from with equal probability (that is, a 1/3 chance of selecting from each of the lists). However, somehow one of their lists was being selected far more often than the others.

Their code looked similar to the following:

if (random.nextInt(3) == 0) {
  // select from list 1
} else if (random.nextInt(3) == 1) {
  // select from list 2
} else {
  // select from list 3
}

Can you spot the bug? Why would this not generate equal probabilities of selecting from each list?

I wrote some code of my own to run this structure almost 100 million times:

Random r = new Random(); // java.util.Random
int count0 = 0;
int count1 = 0;
int count2 = 0;

// loop almost 100 million times
for (int i = 0; i < 99999999; i++) {
  if (r.nextInt(3) == 0) {
    count0++;
  } else if (r.nextInt(3) == 1) {
    count1++;
  } else {
    count2++;
  }
}

Which generated the results:
Count0 = 33333271
Count1 = 22221788
Count2 = 44444940

Which demonstrates this code doesn’t generate equal probabilities.

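The problem is the second call to nextInt(3). It draws a brand new number instead of reusing the first one, and it only runs when the first test has already failed: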

if (r.nextInt(3) == 0) {
  count0++; // runs with probability 1/3 (0.3333...)
} else if (r.nextInt(3) == 1) { // this test is reached 2/3 of the time,
                                // and the fresh draw makes it true 1/3 of those times
  count1++; // runs with probability 2/3 * 1/3 = 2/9 (0.2222...)
} else {
  count2++; // runs with probability 2/3 * 2/3 = 4/9 (0.4444...)
}
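
One way to check the 1/3, 2/9 and 4/9 figures without any randomness is to list all nine equally likely (first draw, second draw) pairs and see which branch each pair would take. The following is only a sketch of that idea; the class and method names are invented for illustration. It enumerates the second draw even when the first is 0, which is harmless because the two draws are independent.

```java
class TwoDrawEnumeration {
    // Counts how many of the 9 equally likely (first, second) pairs
    // end up in each branch of the two-draw if/else structure.
    static int[] branchCounts() {
        int[] counts = new int[3];
        for (int first = 0; first < 3; first++) {
            for (int second = 0; second < 3; second++) {
                if (first == 0) {
                    counts[0]++; // the second draw is never looked at here
                } else if (second == 1) {
                    counts[1]++;
                } else {
                    counts[2]++;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = branchCounts();
        for (int branch = 0; branch < 3; branch++) {
            System.out.println("count" + branch + " branch: " + counts[branch] + "/9");
        }
    }
}
```

Running it prints 3/9, 2/9 and 4/9 for the three branches, which matches the fractions in the comments above.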

Restructuring the code to generate the random number only once, and storing it in a temporary variable, makes a huge difference:

int rInt = r.nextInt(3); // only generate a random number once.
if (rInt == 0) {
  count0++; // this will run 1/3 of the time.
} else if (rInt == 1) {
  count1++; // this will run 1/3 of the time.
} else {
  count2++; // this will run 1/3 of the time.
}

And the results of this are:
Count0 = 33333907
Count1 = 33332586
Count2 = 33333506

Which looks much better.
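
For anyone who wants to repeat the comparison, here is a minimal, self-contained sketch that runs both structures side by side. It is not the exact test program used above: the class name, method names, trial count and seed are all assumptions chosen for illustration, and a fixed seed is used only so that repeated runs give the same counts.

```java
import java.util.Random;

class BranchSimulation {
    // Buggy structure: each condition draws its own random number.
    static int[] drawInEachCondition(Random r, int trials) {
        int[] counts = new int[3];
        for (int i = 0; i < trials; i++) {
            if (r.nextInt(3) == 0) {
                counts[0]++;
            } else if (r.nextInt(3) == 1) {
                counts[1]++;
            } else {
                counts[2]++;
            }
        }
        return counts;
    }

    // Fixed structure: one draw per trial, reused by every condition.
    static int[] drawOnce(Random r, int trials) {
        int[] counts = new int[3];
        for (int i = 0; i < trials; i++) {
            int rInt = r.nextInt(3);
            if (rInt == 0) {
                counts[0]++;
            } else if (rInt == 1) {
                counts[1]++;
            } else {
                counts[2]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int trials = 9_000_000; // arbitrary; divisible by 9 for easy comparison
        long seed = 42L;        // arbitrary; only there for repeatability
        int[] buggy = drawInEachCondition(new Random(seed), trials);
        int[] fixed = drawOnce(new Random(seed), trials);
        System.out.printf("two draws: %d %d %d%n", buggy[0], buggy[1], buggy[2]);
        System.out.printf("one draw:  %d %d %d%n", fixed[0], fixed[1], fixed[2]);
    }
}
```

With 9,000,000 trials the two-draw counts should sit near 3,000,000, 2,000,000 and 4,000,000, while the one-draw counts should all sit near 3,000,000.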

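(Note: alternatively, you could make the second call r.nextInt(2) rather than r.nextInt(3). The second branch would then run 2/3 * 1/2 = 1/3 of the time, as would the third. However, this would be slower, because the random number generator is called a second time on two-thirds of the iterations and each call takes a few CPU cycles.)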
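
The alternative mentioned in the note might look something like the sketch below. The class and method names are hypothetical, and the trial count and seed in main are arbitrary. The second draw chooses between just the two remaining lists, so each of them ends up with 2/3 * 1/2 = 1/3.

```java
import java.util.Random;

class SecondDrawOutOfTwo {
    // Returns 0, 1 or 2, each with probability 1/3, using up to two draws.
    static int pick(Random r) {
        if (r.nextInt(3) == 0) {
            return 0; // probability 1/3
        } else if (r.nextInt(2) == 0) {
            return 1; // probability 2/3 * 1/2 = 1/3
        } else {
            return 2; // probability 2/3 * 1/2 = 1/3
        }
    }

    public static void main(String[] args) {
        Random r = new Random(7L); // arbitrary seed, for repeatability only
        int[] counts = new int[3];
        for (int i = 0; i < 3_000_000; i++) {
            counts[pick(r)]++;
        }
        System.out.printf("%d %d %d%n", counts[0], counts[1], counts[2]);
    }
}
```

This gives equal probabilities, but the single-draw version is still the simpler choice: it has one call to the generator per selection and nothing to get wrong when a fourth list is added.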

2014 Annual Blog Statistics

These are some statistics from Google Analytics; the figures from 2013 are in brackets.

Overall in 2014 there was a very slight increase in the number of unique visitors (2.2%), but falls in the total number of visits (1.0%) and page views (3.6%).

Total

  • Visits: 9,363 (9,458)
  • Unique Visitors: 8,818 (8,626)
  • Page Views: 12,157 (12,609)

Visitors came from 146 countries. The United States replaced Australia as the largest source of visitors, with Australia falling to fourth place behind New Zealand and the United Kingdom. 67.9% of all visits came from AUS/USA/NZ/UK, up almost 4 percentage points from 2013.

2013 Annual Blog Statistics

These are some stats from Google Analytics; the figures from 2012 are in brackets.

Overall in 2013 there were about 5,000 fewer visits, from 4,700 fewer visitors, than in 2012. This is the third year in a row of declining visitor numbers. However, this continues to be correlated with the reduction in posts to this blog, and with the change in focus towards personal adventures rather than just random commentary.

Total

  • Visits: 9,458 (14,714)
  • Unique Visitors: 8,626 (13,371)
  • Page Views: 12,609 (24,504)

Visitors came from 165 countries. Australia, the United States and New Zealand continue to hold the top three visitor locations. 64.3% of all visits came from Aus/USA/NZ/UK, down 1% from 2012.

2012 Annual Blog Statistics

These are some stats from Google Analytics; the figures from 2011 are in brackets.

Overall in 2012 there were about 2,500 fewer visits, from 1,900 fewer visitors, than in 2011. This follows a similar decline the year before. However, during the last two years there has been a large reduction in the number of posts to the site.

Total

  • Visits: 14,714 (17,175)
  • Unique Visitors: 13,371 (15,313)
  • Page Views: 24,504 (26,609)


Wolfram Alpha’s Facebook Report

A few weeks back Lance Wiggs blogged about Wolfram Alpha’s Facebook Report tool.

Running the tool on my profile brings up some interesting results.

Firstly, posted statuses, links, and photos:

One of my goals this year has been to reduce the amount I post on Facebook, and for the first half of this year that has been achieved. However, in recent weeks the numbers have started to increase again.

There is analysis of post frequency, word frequency, and comment frequencies. The word cloud based on this is interesting:


2011 Annual Blog Statistics

These are some stats from Google Analytics; the figures from 2010 are in brackets.

Overall there were about 2,600 fewer visits, from about 1,800 fewer visitors. This is probably a result of me blogging less, and of the posts I did write being focused more on personal adventures than in previous years.

I am still very pleased that a website that was started in 2005 as a bit of computer geek vanity is 7 years later generating visits from all corners of the globe.

Most Popular Visitor Origins

Total
Visits: 17,175 (19,774)
Unique Visitors: 15,313 (17,081)
Page Views: 26,609 (28,822)

Visitors from 174 Countries
New Zealand 21.6%
United States 15.9%
Australia 15.5%
United Kingdom 4.3%
India 2.9%
Canada 2.7%
Germany 2.5%
France 2.5%
Spain 2.0%
Brazil 1.8%

Visitors from 4,576 Cities
Auckland 12.5%
Sydney 9.3%
Wellington 2.4%
Melbourne 2.4%
Christchurch 1.9%
Brisbane 1.4%
Perth 0.9%
Hamilton 0.9%

Browsers
Firefox 43.5%
Internet Explorer 23.9%
Chrome 22.1%
Safari 7.2%
Opera 1.5%

Operating System
Windows 66.0%
Linux 22.0%
Mac 9.5%
iPad 1.3%
Android 0.1%

N.B. I suspect that the mobile version of the site is not being represented in these statistics.