A new study found that in Wikipedia, word count can be used to predict article quality.

Joshua E. Blumenstock at the University of California at Berkeley analyzed articles to see if he could predict whether an article was “featured” on Wikipedia’s homepage, which would indicate that it had received extra vetting from top editors to verify its exceptional quality. He looked at 100 variables that might correlate with whether an article ended up as a feature, including number of citations, readability metrics, one-syllable words, etc.

He found that using word count alone, he could predict with 97% accuracy whether an article was featured or not. Considering the full “kitchen sink” of all 100 variables improved his accuracy only slightly, to 97.99%. The magic word-count cut-off seemed to be 1,830 words, above which articles were likely to be higher-quality, featured entries. Mr. Blumenstock speculated that the collaborative nature of Wikipedia may force longer articles to be higher quality.
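For readers who want to see how simple such a rule is, here is a minimal Python sketch of a word-count threshold classifier. It is an illustration under stated assumptions, not Mr. Blumenstock's actual code: the function names are hypothetical, counting words by splitting on whitespace is an assumption about how words are tallied, and the tiny corpus is made up. Only the 1,830-word cut-off comes from the study as reported here.

```python
# Cut-off reported in the post; everything else in this sketch is assumed.
WORD_COUNT_CUTOFF = 1830


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def predict_featured(text: str, cutoff: int = WORD_COUNT_CUTOFF) -> bool:
    """Predict 'featured' when the article is longer than the cut-off."""
    return count_words(text) > cutoff


def accuracy(predictions: list[bool], labels: list[bool]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


if __name__ == "__main__":
    # Made-up toy corpus: (article text, was it actually featured?)
    articles = [
        ("word " * 2500, True),
        ("word " * 400, False),
        ("word " * 1900, False),  # a long article that was not featured
    ]
    predictions = [predict_featured(text) for text, _ in articles]
    labels = [label for _, label in articles]
    print(f"accuracy on toy data: {accuracy(predictions, labels):.2f}")
```

The whole "model" is a single comparison, which is what makes the reported 97% figure so striking next to the 100-variable version.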

Still, “[f]eatured articles are meant to be ‘the best that Wikipedia has to offer’; these results indicate that they might merely be the longest Wikipedia has to offer,” he wrote. “The high degree to which word count can approximate Wikipedia’s elaborate peer-review process is somewhat unsettling.”

Catherine Rampell

