Wikipedia notes

From LLN

The facts are clear enough:

  • Wikipedia is an enormous resource, with more than seven million articles in its many editions, more than two million in the English edition alone (as of December 26, 2007). (Next largest editions? Deutsch, Français, Polski, Nihongo, Nederlands and Italiano in that order, with many others.)
  • Wikipedia is the largest example of collaborative text on the web.
  • Most people who use the web use Wikipedia--almost inevitably, given the prominence of Wikipedia results in Google and other search engines.
  • "Anyone can edit Wikipedia" and contribute to it--but there's an Animal Farm quality to that statement, since some contributors and editors are most definitely more equal than others. On the other hand, Wikipedia fans are quick to use that assertion as a defense against any criticism of Wikipedia: "If something's wrong, it's up to you to go fix it."

Notes from Leader's Digest

by Leslie Dillon

Wikipedia and the meaning of truth

Leader's Digest October 2008

Wikipedia has redefined the word “truth,” according to Simson L. Garfinkel’s article in the November/December 2008 issue of MIT’s Technology Review. That’s important because “Wikipedia’s articles are the first- or second-ranked results for most Internet searches.”

Studies show that Wikipedia’s articles are quite accurate—in large part because the Wikipedia community has organically evolved policies for removing inaccurate information. Wikipedia’s official policies for inclusion of content are based on “verifiability, not truth,” “no original research” and “neutral point of view.”

Verifiability at Wikipedia is actually authoritativeness: articles must have footnotes and references, and these should be “reliable, third-party published sources.” Garfinkel believes that Wikipedia’s standard for truth makes good sense. Unfortunately, the standard is only as good as the sources it relies on, and many publications (e.g., Dun & Bradstreet) do insufficient fact checking or none at all.

Garfinkel, a computer scientist, uses Wikipedia daily and finds its technical discussions “generally excellent.” When they aren’t excellent and he knows better, he fixes them.

“So what is Truth?… Wikipedia’s standard for inclusion has become its de facto standard for truth, and since Wikipedia is the most widely read online reference on the planet, it’s the standard of truth that most people are implicitly using when they type a search term into Google or Yahoo. On Wikipedia, truth is received truth: the consensus view of a subject.”

(Simson L. Garfinkel, “Wikipedia and the meaning of truth,” Technology Review, Nov./Dec. 2008.)

Wikipedia use grew 8,000% in five years

by Leslie Dillon from Leader's Digest June 2008

Wikipedia traffic has skyrocketed over the last five years, increasing nearly 8,000 percent between April 2003 and April 2008. Most of Wikipedia's visitors came via Google and Yahoo. “The site’s rapid ascent ... demonstrates the success of its collaborative nature...,” according to Nielsen Online.

(OCLC Abstracts, June 9, 2008.)

Wikipedia--the new Information Commons

Leader's Digest July 2007

If you haven’t already read the story on Wikipedia in the July 1 New York Times Magazine, read it now! If you can't, read this.

Whether you hate Wikipedia or love it, you can’t deny its success: 6.8 million registered users worldwide and 1.8 million English-language articles, and it now “accounts for a staggering one out of every 200 page views on the entire Internet.”

One of the more interesting things about Wikipedia is that it’s becoming a news site as well as an encyclopedia. The article describes how the all-volunteer Wikipedia members get it right.

A few of the main points:

  • Wikipedia works because it’s a collaboration.
  • The Wikipedia culture is radically decentralized.
  • Wikipedia’s mistakes are most often the work of deliberate vandalism.
  • Many of those who do the “hard-core editing on a breaking news story” are young--often high school and college students.
  • The only way to get into a “position of authority on Wikipedia is to care about it enough.”
  • Wikipedia’s chain of authority consists of 1,200 “admins”; above them are the “bureaucrats,” who are empowered to appoint admins; at the next level are 30 stewards, appointed by the seven Wikimedia Foundation Board members.
  • “Pride of ownership” is what drives Wikipedia members to get their facts right.
  • Wikipedia’s aim is to function as a “bias-free digest of what others have already reported elsewhere.” There is no original research.
  • Many Wikipedia entries are deleted within days, hours or even minutes by its gatekeepers.
  • Wikipedia is “centered almost entirely on the carefully written word.”
  • Wikipedia is also very much about the community behind it.

(One of the things that puzzles me is why librarians haven’t embraced Wikipedia more. We know how well collaboration can succeed. We believe in and support information commons. I sure would like to see more initiatives like the one at University of Washington’s Digital Initiatives unit, which is inserting links into Wikipedia. Here’s an article in D-Lib Magazine on the project if you want to read more about it: http://www.dlib.org/dlib/may07/lally/05lally.html)

(Jonathan Dee, “All the news that’s fit to print out,” The New York Times Magazine, July 1, 2007.)

36% of online Americans use Wikipedia

Leader's Digest May 2007

In case you haven't already seen it, a Pew Internet and American Life report reveals that 36% of American adult internet users consult Wikipedia. On a typical day in the winter of 2007, 8% of online Americans consulted Wikipedia. It's particularly popular with the well-educated and with college-age students. Wikipedia's popularity stems from the sheer extent and currency of its content and from the fact that the huge number of links gives it very high Google rankings. "In fact, Wikipedia has become the #1 external site visited after Google's search page, receiving over half of its traffic from the search engine." (Pew Internet and American Life, Wikipedia Users, Apr. 24, 2007, via ResourceShelf, http://www.resourceshelf.com/2007/04/25/wikipedia-users-a-new-report-from-the-pew-internet-and-american-life-project/, Apr. 25, 2007.)

Study finds Wikipedia accurate

Leader's Digest December 2006

A study by librarian Thomas Chesney found that information at Wikipedia is generally accurate. More interesting is this unexpected conclusion: Those who are not subject experts are less likely to believe what they read there than those who are subject experts. ("An empirical examination of Wikipedia's credibility", First Monday, via The Virtual Chase, Nov. 27, 2006.)

Knol: Google’s answer to Wikipedia

Leader's Digest January 2008

Google’s Knol is a new experimental website that intends to make online information easier to find and more authoritative. Unlike Wikipedia, “Knol articles will have individual authors, whose pictures and credentials will be prominently displayed alongside their work.” Right now, participation is invitation only, but will eventually be open to the public. Readers will be able to rate articles; the better the rating, “the higher it will rank in Google’s search results.”

The term “knol” is meant to denote a knowledge unit. According to Udi Manber, a Google vice president of engineering, “the key idea behind the Knol project is to highlight authors.”

Because authors are attributed, articles are more likely to have legitimate information. Knol’s features to establish an article’s credibility include references to its sources and information about the author. This may well “attract experts who...prefer the style of attribution common in journalistic and academic publications.”

(Andrew Schrock, “Google’s Answer to Wikipedia,” Technology Review, Jan. 14, 2008.)

Here’s Barbara Quint’s take:

The mainstream press viewed Knol as a Google offensive against Wikipedia. “However, a closer look at the model and a closer read of Manber’s announcement … indicates … a strategy of building Google into a powerhouse publisher, possibly integrating with Google Books, Google Scholar, Google Custom Search Engine, and even Google Base.”

(Barbara Quint, “Google Knol: The ‘Grassy Knoll’ for Publishers or Just Wikipedia?” Information Today NewsBreaks, Jan. 7, 2008.)

Editor's note: Knol may have Google's name behind it, but Citizendium has been building an authoritative resource since March 2007.

Wiki 2.0: the future library?

From Leader's Digest August 2008

Using the library as metaphor, Wikipedia’s Jimmy Wales describes the future of the Web as becoming increasingly collaborative.

He believes that we “are still very much at the beginning of Web 2.0,” which he defines “as a medium marked…by collaborative production…”

In a library the encyclopedias are “only a tiny fraction of all the works in the library.” It’s the same with Wikipedia; while comprehensive, it’s only part of a much greater whole. So Wikia, the Web hosting service Wales established in 2004, is now “building the rest of the library.” The Wikia “search project will allow mass collaboration on the creation of search results. Projects that others are working on will allow mass collaboration on video production, music and more.”

Wales asks us to consider as an example the production of a documentary about attitudes toward global warming in many different countries. It would be very difficult and expensive for a single “film crew to conduct hundreds of…interviews worldwide and edit them together into a compelling narrative.” But people collaborating over the Web could get this done.

Wales sees this model as extending to solve “problems afflicting the Internet itself… Rather than confronting a stark choice between anarchy and top-down control, we may find that communal efforts can yield a reasonable solution.”

(Jimmy Wales, "Wiki 2.0," MIT Technology Review, July/August 2008.)

Editor's note: Wikia is a for-profit company that, by most accounts, hopes to use "crowdsourcing"--free labor by thousands of people--to build heavily ad-supported sites, using Wikipedia as a model for the glories of crowdsourcing.

Notes from Cites & Insights

by Walt Crawford

Wikipedia and worth

This perspective, which appeared in Cites & Insights 4:12, October 2004, considered the "Halavais test" and some commentaries on Wikipedia in 2004. Some extracts:

The Halavais test

Remember the Halavais test? Alex Halavais of the School of Informatics at the University at Buffalo made 13 changes in the English-language Wikipedia, “anticipating that most would remain intact and he’d have to remove them in two weeks.” Presumably, if that had happened, there would have been evidence that the ease of modifying Wikipedia makes it suspect as a resource. Some people attacked Halavais for this "deliberate vandalism," even though it was clear that he fully intended to reverse the changes. As it turns out, all the changes were identified and removed within a couple of hours. Halavais reported this and found himself “impressed.” Vandalism may be less of a problem than some might have thought--if it’s readily detectable vandalism, e.g., simple graffitiesque changes or changing facts that can be readily verified by a Wikipedia contributor or editor. (Another tester made a series of more subtle changes and says none of them were corrected over the test period.)

One Wikipedia technical team member noted “some of the hurdles a vandal has to deal with”: a “Recent Changes Patrol,” personal watchlists that inform contributors of changes made to articles they’ve registered interests in, the ease of tracking all edits from a given IP address when one edit has been identified as vandalism, “the people” and the enormous rate of Wikipedia edits, and tools for dealing with persistent vandals. It’s an interesting list (frassle.rura.org, August 30, 2004). I could take issue with part of one paragraph, following the note that there were almost a million edits in June 2004:

The articles are being improved at a tremendous rate and even obscure changes are likely to be noticed within weeks or months, with the time depending on just how obscure the article is. Obscure is potentially harmful to fewer people and perhaps more likely to be seen by those who have knowledge of the topic sufficient to spot clear mistakes.

“Improved” is a value judgment not automatically implicit in a fast rate of change. Maybe all those edits are improvements; maybe not. My real problem is with the idea that errors (deliberate or otherwise) in obscure topics are less important. I think it’s the other way around. Obscure topics can’t be verified as readily against other sources. If Wikipedia had 1869 as the end of the Civil War, it would be an obvious and readily-verifiable error. If Wikipedia asserted that HTTP GET should never be used for URLs in excess of 256 characters (as opposed to the reality, that a fairly old RFC notes that some old servers may not handle very long HTTP GETs properly), a user might not have an easy way to double-check.

Other commentaries

Ed Felten (who blogs at Freedom to Tinker) noted two sides of an argument that continues: “Critics say that Wikipedia can’t be trusted because any fool can edit it, and because nobody is being paid to do quality control. Advocates say that Wikipedia allows domain experts to write entries, and that quality control is good because anybody who spots an error can correct it.” He added that much of the debate ignores the best evidence: the actual content of Wikipedia.

Felten took a look at its entries on “things I know very well: Princeton University, Princeton Township, myself, virtual memory, public-key cryptography, and the Microsoft antitrust case.” His findings? The first two entries were excellent. The entry on Edward Felten was “accurate, but might be criticized for its choice of what to emphasize.” (They also had his birth date as uncertain, which he corrected.) The technical entries “were certainly accurate, which is a real achievement” and were both backed with the kind of detailed information that wouldn’t be feasible in a traditional encyclopedia—but neither did a great job making the concepts accessible to non-experts. As he notes, that’s a quibble.

Unfortunately, the article on the Microsoft case was “riddled with errors”: factual errors, mischaracterization, terminology errors. His conclusion?

Until I read the Microsoft-case page, I was ready to declare Wikipedia a clear success. Now I’m not so sure. Yes, that page will improve over time; but new pages will be added. If the present state of Wikipedia is any indication, most of them will be very good; but a few will lead high-school report writers astray.

David Mattison (The Ten Thousand Year Blog) addressed the overall issue of whether a wiki is appropriate for scholarly communication. His answer: “Banks are probably not appropriate for keeping money and valuables because they get robbed.” Thus, many banks and wikis have gatekeeping and security protocols to keep the valuable cash and data from being tampered with--but wikis can operate with totally open-door policies. “It’s the very nature...of this ideal type of wiki that makes some of us nervous and thrills others for various reasons, not all of them socially acceptable.”

Mattison goes on to say that a wiki can be highly appropriate for scholarly communication if all the scholars trust one another, are collaborating on something, and use appropriate security and rollback mechanisms. These concluding paragraphs firmly separate Mattison from extreme “the community is always right” advocates:

Wikis are just another tool in what I, borrowing from others, call the Collaborative Web: technology and applications that let individuals work together or independently directly through the Web browser without a gatekeeper (e.g., a Webmaster) standing in the way.
The question of whether what emerges from that collaboration is authoritative or scholarly depends on other factors often above and beyond the collaborative process itself.

My own conclusions at the time--along with noting that I doubt Wikipedia will "eclipse" traditional encyclopedias just as I doubt that blogs will replace newspapers or that econtent will sweep away all print media:

Wikipedia is certainly not worthless. Wikipedia is also not automatically better than a traditional encyclopedia because of the community of writers. I would tend to use Wikipedia entries as starting points, to be used on a “Trust but verify” basis. But isn’t “trust but verify” the base heuristic for almost all resources, traditional or new?
My assumption is that lots of specialists have contributed good work to Wikipedia, particularly in areas related to the web and digital resources. My assumption is also that some Wikipedia content is faulty, biased or wildly incomplete. In the latter case, I’d make the same assumption about a traditional encyclopedia, up to and including Britannica.

Wikipedia and worth [revisited]

This perspective appeared in the February 2005 Cites & Insights. It's a much longer piece than its predecessor, beginning with Robert McHenry's assault on Wikipedia as "The faith-based encyclopedia" in Tech Central Station (the article is no longer available and the site has changed its name to TCS Daily). Note that McHenry is a former editor in chief of Britannica. He reviewed some of the history of Wikipedia, disagreed with its worth and, frankly, came off as a vendor of sour grapes.

Jason Scott, a computer technology historian whose blog (ASCII by Jason Scott) charmingly offers two personal attacks on Scott in its banner, wrote "The great failure of Wikipedia" on November 19, 2004. It's a long post discussing his own experience attempting to work on Wikipedia. He begins:

I have now tried extended interaction with Wikipedia. I consider it a failure. In doing so, I will describe why, instead of just slinking off into the night on my projects. Maybe it will do some good. Maybe it will not. I'm sure, at the end of the day, there must be hundreds like me at this point. Burned, slapped, ejected from the mothership for not following the rules, no matter how intricate and foolish. Let me at least go with some smoke.
The concept of Wikipedia is a very engaging and exciting one, especially to someone like myself who spends an awful lot of time collecting information and then presenting it to people. Normally, the work I do is the work that's done. That is, if I don't give much attention to a specific section of my sites, that section will stay static, even if it's in need of improvement. This is not very enjoyable. In collaboration, you will put your tools down for the night, and when you wake up the next morning, more work is done. This is very exciting, very enjoyable. It's why people work in teams in the first place.

Scott's primary criticism of Wikipedia, which I've seen repeated in different forms many times since then:

This is what the inherent failure of wikipedia is. It's that there's a small set of content generators, a massive amount of wonks and twiddlers, and then a heaping amount of procedural whackjobs. And the mass of twiddlers and procedural whackjobs means that the content generators stop being so and have to become content defenders. Woe be that your take on things is off from the majority. Even if you can prove something, you're now in the situation that anybody can change it. And while that's all great in a happy-go-lucky flower shower sort of way, it's when you realize that the people who are going to change it could have absolutely no experience with the subject whatsoever, then you see where we are.

There's a lot more to Scott's article--but in some ways it was overshadowed by a December 31, 2004 essay by Larry Sanger (cofounder of Wikipedia), "Why Wikipedia must jettison its anti-elitism." While Sanger left Wikipedia (under circumstances that Sanger and Wikipedia's chief honcho, "Jimbo" Wales, disagree about), he still regards it as important--but has issues with it. Quoting from my earlier summary:

“First problem: lack of public perception of credibility, particularly in areas of detail.” He’s not saying Wikipedia is unreliable—but that it’s perceived as inadequately reliable “by many librarians, teachers, and academics.” Saying “but it gets used a lot” does nothing to negate that problem: “people use many sources that they themselves believe to be unreliable, via Google searches, for example.” He goes on to point out the benefits of credibility—and to suggest there’s a real problem with credibility in specialized topics outside the interests of most Wikipedia contributors. (He suggests comparing Wikipedia’s philosophy section to the Stanford Encyclopedia of Philosophy, as one example.)
“Second problem: the dominance of difficult people, trolls, and their enablers.” See Scott’s article, above. “Far too much credence and respect accorded to people who in other Internet contexts would be labeled ‘trolls.’” He thinks this is a generic problem with unmoderated Usenet groups that’s infected Wikipedia—although he notes that Wikipedia takes steps to control the problem in its most extreme cases.
“The root problem: anti-elitism, or lack of respect for expertise.” “[A]s a community, Wikipedia lacks the habit or tradition of respect for expertise.” He comments on his efforts to overcome anti-elitism (including snubs and disrespect of expertise) and the consequences of the current situation. Experts with relatively limited time and patience don’t participate, because they have to defend their positions and are shouted down if they complain about whack jobs.
Sanger believes that Wikipedia’s openness does not require disrespect toward expertise. He believes Wikipedia would be much better if more experts contributed. He anticipates a “more academic fork of the project” at some point.

That article--it's fairly long and definitely worth reading--drew more than 400 comments of all stripes, easily a book's worth. If you have lots of interest and even more time, you might read them--and note that this article may have been the opening salvo in what has since become Citizendium, a relatively new project that's definitely worth watching.

I summarized many other commentaries on Wikipedia--really too many to excerpt here (the essay's 6,000 words). If you want a short stroll down memory lane on library and other comments about Wikipedia, you might read the piece.

What about Wikipedia?

This "Net Media Perspective" appeared in Cites & Insights 6:13, November 2006. I discuss the great "Wikipedia vs. Britannica" debate--the Nature comparison that found a tiny sampling of science articles in the two sources to have roughly similar error rates, Britannica's extended rejoinder to that comparison and a range of comments related to that. There's also a commentary by danah boyd "on being notable" and the problems with Wikipedia entries for living people, Seth Finkelstein's similar difficulties (and those of a former Wikimedia board member!), and notes on a long and quite good article on Wikipedia in the New Yorker and one that I found, shall we say, less convincing in Atlantic Monthly. The latter article seems to conclude that truth is whatever "common knowledge" says it is--that facts are a matter of majority rule. There's also quite a bit about Citizendium, both for and against (it's interesting to note the number of opinion leaders who are ready to kill off a fledgling attempt to do a better version of something they believe in, before it even starts).

Wikipedia revisited

This Net Media Perspective, in the March 2007 Cites & Insights, recounts earlier C&I coverage (including items not noted here, some going back as far as 2002). I also discuss some Wikipedia controversies from 2006 and early 2007, including Jaron Lanier's "Digital Maoism," Brock Read's "Can Wikipedia ever make the grade?" (Chronicle of Higher Education) and several others.

