Wikipedia Reaches 3 Million Articles


The English Wikipedia reached the 3 million article mark today. This is a sign of its continued expansion and an opportunity to examine some numbers and talk about Wikipedia's future growth. In this post I look at how the growth of Wikipedia has progressed.

First, watch Jimmy Wales, its co-founder, talk about the 3 million article milestone.



Slashdot recently discussed whether Wikipedia is approaching the limit of its growth. Slashdot members have been connected to the early growth of Wikipedia, and I think it's safe to say that many of them have contributed to Wikipedia at least once. Many comments in the discussion complained about disruptive behavior by other Wikipedia contributors and the increasingly closed nature of its community.

Has Wikipedia reached its limits? Let's look at the numbers.

I took data from the Wikipedia pools page, where people bet on when certain milestones will be reached, and plotted it. The abscissa gives the number of days since Wikipedia's inception (taken as January 16th, 2001), the ordinate the number of articles.


I admit that this is a very crude statistic consisting of only 5 points, but it was fast to plot. I remember that back in 2005 or early 2006 there was a discussion on the Wikipedia statistics page about whether Wikipedia's growth was exponential, with more credibility given to the exponential growth hypothesis; most of the graphs there are now outdated, however. To get more data I could have downloaded the whole Wikipedia history (several GBs) or taken data from the milestone page, but I think the conclusion is warranted from this plot: the growth has not been exponential, at least since early 2006.
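The plot can be backed up with a quick numeric check. The following is only a minimal sketch, assuming the same five milestones as in statcalc.m below; the variable names are my own, and it uses MATLAB's datenum instead of the shell-based time_passed function. Under exponential growth the relative growth rate (the change in log(articles) per day) would stay about the same from one interval to the next.

```matlab
% Milestone dates (year, month, day) and article counts, as in statcalc.m.
milestones = [2005 3 17; 2005 8 4; 2006 3 1; 2007 9 9; 2009 8 17];
articles   = [500000, 666666, 1000000, 2000000, 3000000];

% Days since 16 January 2001 (the start date assumed in this post).
start_day = datenum(2001, 1, 16);
day_count = (datenum(milestones(:,1), milestones(:,2), milestones(:,3)) - start_day)';

% Under exponential growth N(t) = N0*exp(r*t), the relative rate
% r = diff(log(N)) ./ diff(t) is the same in every interval.
rel_rate = diff(log(articles)) ./ diff(day_count);

% Absolute rate in articles per day, for comparison with linear growth.
abs_rate = diff(articles) ./ diff(day_count);

% First row: relative rate per day; second row: articles per day.
disp([rel_rate; abs_rate]);
```

With these five points the relative rate falls in every successive interval, which is what slower-than-exponential growth looks like. Four intervals are of course still too few to say much about the exact shape of the curve.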

This probably means that there are bottlenecks to its growth. Growth itself, measured here as the accumulation of new articles, need not be seen as positive. Within the Wikipedia community there are essentially two factions when it comes to adding new articles:

  • the inclusionists, who want to include more articles (and relax the policies on notability), and
  • the exclusionists, who want stricter controls on which articles are included in Wikipedia. Their position is to favor quality over quantity.
Whether quality is actually improving is another question that I will not look at in this post.  

MATLAB scripts used to calculate and plot
statcalc.m
% data taken from http://en.wikipedia.org/wiki/Wikipedia:Pools
pools={'March 17, 2005','August 4, 2005','March 1, 2006','September 9, 2007','August 17, 2009'};
number_of_articles=[500000,666666,1000000,2000000,3000000];
% days since the project started, one entry per milestone
days=zeros(1,numel(pools));
for i=1:numel(pools)
    days(i)=since_project(pools{i});
end
figure; plot(days,number_of_articles,'*-');
ylabel('number of articles reached'); xlabel('days passed');


since_project.m
function d=since_project(date_str)
% calculates the time in days since wikipedia has existed
% uses the time_passed function.
% (the argument is called date_str so it does not shadow MATLAB's built-in datestr)
% wikipedia has existed since 16 January 2001, see http://en.wikipedia.org/wiki/Wikipedia:Wikipedia_records
d=time_passed('16 January 2001',date_str);


time_passed.m
function d=time_passed(date1str,date2str)
% calculate time passed between two dates in days
% dates as understood by GNU date, e.g. '16 January 2001' or 'March 17, 2005'.
% used for wikipedia statistics
% epoch seconds (+%s) are used so that leap days are counted correctly
[status1,s1]=unix(['date -d "' date1str '" +%s']);
[status2,s2]=unix(['date -d "' date2str '" +%s']);
% round to absorb a possible one-hour daylight saving shift
d=round((str2double(strtrim(s2))-str2double(strtrim(s1)))/86400);

1 Response to "Wikipedia Reaches 3 Million Articles"

Wikipedia is very useful for me, I often search and get information from it. And best of all, it is free
