The strange hierarchy of Wikipedia’s “first link network,” revealed.
By Nathan Collins
Here’s an easy experiment: Pick a Wikipedia page at random, click on the very first hyperlink you see, and repeat. More often than not, you will eventually reach the Philosophy page, an observation that first received widespread attention through the comic xkcd (specifically, its hover text). But the really interesting part, according to new research, is not where you end up, but how you get there: through funnels representing broad, timeless ideas like science, or through rather more ephemeral subjects like health care and fossil fuels.
Intrigued by the xkcd observation and its antecedents, University of Vermont Computational Story Lab researchers Mark Ibrahim, Christopher Danforth, and Peter Sheridan Dodds decided to use the web of “first links” to understand something about the structure of Wikipedia knowledge. To begin their investigation, the researchers followed the first links from all 11 million pages in the English edition of Wikipedia, enabling them to map out a sort of drainage system of ideas, one idea flowing into the next like water from a mountain spring making its way to the sea.
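For readers who want to see the click-and-repeat procedure spelled out, here is a minimal sketch in Python. It is not the researchers' code: the function name `follow_first_links` and the toy `TOY_FIRST_LINK` map are invented for illustration, and the links in that map are made up rather than taken from Wikipedia. Each page is assumed to have at most one first link, and the walk stops at a dead end or as soon as it would revisit a page.

```python
def follow_first_links(start, first_link):
    """Return the pages visited from `start` by repeatedly taking the first link.

    `first_link` maps a page title to the title its first link points to.
    The walk stops at a page with no entry (a dead end) or just before
    it would revisit a page (a loop).
    """
    path = [start]
    seen = {start}
    current = start
    while current in first_link:
        current = first_link[current]
        if current in seen:
            break  # we have entered a loop; stop here
        path.append(current)
        seen.add(current)
    return path


# Hypothetical toy data: these links are invented, not real Wikipedia links.
TOY_FIRST_LINK = {
    "Columbia River": "River",
    "River": "Water",
    "Water": "Science",
    "Science": "Knowledge",
    "Knowledge": "Philosophy",
    "Philosophy": "Knowledge",
}

if __name__ == "__main__":
    print(" -> ".join(follow_first_links("Columbia River", TOY_FIRST_LINK)))
```

In this toy map, the last two pages point at each other, so the walk ends once it reaches that loop.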
Now, there are different ways to think about rivers, but a particularly useful one is the drainage basin—all the different springs and streams whose water eventually flows through one point. For instance, the drainage basin of the Columbia River includes almost all of Idaho, most of Oregon and Washington, and pieces of British Columbia, Montana, and Wyoming.
One can also distinguish the major points of access to downstream waters—for the Columbia, those include the Snake River, which joins the Columbia in eastern Washington, and the Willamette River, which joins up on the outskirts of Portland, Oregon.
Ibrahim, Danforth, and Dodds’ idea was to find those access points—what they call funnels—for Wikipedia pages. As xkcd and others had figured out, the largest funnel is philosophy—some 7.37 million pages eventually flow into the philosophy page. That’s not to say very many pages link directly to philosophy—only 581 do, while more than 80,000 link directly to the United States. Rather than directly connecting many ideas, the authors suggest, philosophy is a major organizing principle for the ideas represented on Wikipedia.
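The gap between pages that link directly to a page and pages that eventually flow into it can be illustrated with a small sketch. This is a simplified stand-in, not the authors' exact funnel measure: the function names and the toy `TOY_LINKS` map are hypothetical, and the counting walks every path one page at a time, which is fine for a toy example but would need a smarter approach for millions of pages.

```python
from collections import Counter


def direct_in_links(first_link):
    """Count how many pages have each page as their first link."""
    return Counter(first_link.values())


def basin_sizes(first_link):
    """Count, for each page, how many other pages eventually flow into it."""
    sizes = Counter()
    for start in first_link:
        seen = {start}
        current = first_link[start]
        while current not in seen:
            sizes[current] += 1  # `start` flows through `current`
            seen.add(current)
            if current not in first_link:
                break  # dead end: no further first link to follow
            current = first_link[current]
    return sizes


# Hypothetical toy data: "hub" is linked to directly by many pages,
# while "root" has fewer direct links but sits downstream of everything.
TOY_LINKS = {
    "a": "hub",
    "b": "hub",
    "c": "hub",
    "hub": "root",
    "d": "mid",
    "mid": "root",
}

if __name__ == "__main__":
    print("direct:", dict(direct_in_links(TOY_LINKS)))
    print("basin: ", dict(basin_sizes(TOY_LINKS)))
```

Here "hub" has more direct links than "root," but "root" has the larger basin, which mirrors in miniature how the United States page can collect far more direct first links than Philosophy while Philosophy sits downstream of far more pages.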
What comes next on the list is much smaller in terms of the number of source pages, and also much weirder. At number two, with about 30,000 source pages funneling into it, is Presentation (as in getting up in front of people and explaining something), followed by Tree of Life (Biology) and Southeast Europe. Hip-Hop Music comes in at number 11.
“More curious is the emergence of recently prominent political and economics topics such as ‘Fossil Fuel’ and ‘Health Care’ within the highest ranking funnels. Wikipedia seems to reflect not only timeless foundations, but also the topical (at least within English speaking society),” the authors conclude.