Wikipedia Hypocrisy

June 17, 2008

I wrote this article last week, at the height of the edits to the Comet article that stripped it of all useful content. I decided not to post it, but after seeing the interest on reddit I reconsidered. So here it is.

Wikipedia rejects any self-published sources as being inherently biased, inconsequential, and ultimately non-notable. This is undoubtedly a product of Wikipedia’s early days, when the site struggled (and to some extent is still struggling) to gain legitimacy in the hard-copy world of authors, journals, and publishers. I actually know little about the founding of Wikipedia, or even much about Wikipedia culture beyond what I’ve witnessed in a recent disagreement about the validity of the term Comet on Wikipedia. I suspect, though, that the founders of Wikipedia figured they could gain the legitimacy they so craved if they adopted the same guidelines and rules as the hard-copy world. They figured they would be as rigorous as paper-based encyclopedias, and eventually everyone would accept WP content as authoritative.

I think that’s a reasonable approach. Indeed, many students reference Wikipedia in their academic endeavors. When I’m at work and I need to know the meaning of some concept, I almost always check Wikipedia. Since August 2007 alone, the site has increased by 50% in daily reach. But is that because I and others have read Wikipedia’s guidelines and trust that the same rigor applied to Britannica has also been applied to each word we read online? Well, no. As I said, I knew very little about that rigor until recently.

The answer is search engines. When I read a tech article somewhere that talked about Bubble Sort, I wondered when the algorithm was first discovered. I googled the term and, lo and behold, the first link was Wikipedia. The same is true for a vast number of search results on Google. *People didn’t start using Wikipedia because they were impressed with its high standards in sourcing material; rather, it’s that the site kept coming up first in Google searches.*

A major advantage of the internet, after all, is that it is trivial to disseminate information: there’s no third party you have to convince in order to put your thoughts online. What makes this so ironic is that Wikipedia is itself a self-published source. They didn’t ask Britannica to publish their content; they just published it online. The reason Wikipedia gained so much legitimacy is not their guidelines, it’s their Google PageRank. Because other sources on the internet linked to their website, they gained authority on all manner of subjects. PageRank, after all, is the modern measure of authority.

Now consider that Wikipedia rejects blog entries as non-authoritative. The hypocrisy is clear: Wikipedia is a self-published source, and is successful *only* because of the modern measure of legitimacy known as Google PageRank. Yet, in spite of their success, by policy they stand in stark opposition to this modern process of content dissemination that we call the internet. All content on Wikipedia must be validated through outdated mechanisms, such as academic journals or large publishing conglomerates.

Unfortunately, the world of technology is abandoning these outdated venues because of the difficulty and the extremely drawn-out turnaround time of putting out new information. When a popular Open Source project comes up with a new innovation, they blog about it. Someone else might come along and write a book about it, but much of the time the author is only peripherally related to the topic and isn’t the innovator or even an expert on the subject matter! The worst part, though, is that there is no way to publish a book in less than a year, and generally it takes much longer.

The result is a collision on topics like Comet, as evidenced by the recent controversy. While there are dozens of trade conference presentations about Comet this year, many white papers, and two upcoming books, they are all far less authoritative than, say, Alex Russell’s blog, or Comet Daily, a professional blog about Comet. Consequently, the Wikipedia article is crippled by a) the months-long turnaround for content from a blog to appear in an “authoritative source” and b) the loss of clarity, because we can only refer to those authors who aren’t on the cutting edge but are instead rehashing the blogs from last year in book or conference form.

Is there a solution to this problem? Maybe some combination of editors to summarize sources and PageRank to determine which content has actual authority. I don’t really know, but I do know that wholesale deletions are *not* the answer, and that they are a childish reaction, at best, on the part of Wikipedia editors. Maybe someone else knows the answer?