The Wayback Machine recently captured its trillionth web page – here are 5 surprising facts about the ‘living history of the internet’
The Wayback Machine is a nifty tool, but I didn’t realize just how important, or expansive, this library of internet history actually is – or how much else the Internet Archive, which maintains the collection, gets up to.
In a recent report, CNN revealed a whole lot of interesting facts (via PC Gamer) about the Wayback Machine and, more broadly, the Internet Archive. At its heart, Wayback is a time machine that lets you travel back to previous versions of web pages – or, if you like, it’s a TARDIW (Time And Relative Dimensions In Web).
There’s more to the Wayback Machine than this, though, and indeed to the broader activities of the Internet Archive, a nonprofit run by software engineers and librarians – so here’s my list of five surprising facts.
1. A trillion pages in the book of the web
Just over a month ago, the Wayback Machine logged its trillionth web page, which is mind-boggling. We’re told that the library is currently growing by close to 150TB (that’s 150,000GB) of web pages every day.
2. A sanctuary of servers
The Internet Archive is based in the Richmond District of San Francisco in the US, in a building that was formerly the ‘Fourth Church of Christ, Scientist’ – a seriously striking piece of architecture featuring eight huge columns along the front (the facade resembles the organization’s logo).
The church still has its stained-glass windows, but inside, there are now a bunch of servers storing precious data for the Wayback Machine – although, of course, there aren’t a trillion-plus web pages within the walls of the headquarters. The majority of the Internet Archive’s servers are in a big warehouse outside of San Francisco.
The servers in the ex-church are symbolically located in the building’s main sanctuary.
3. The importance of preserving the web
There’s great value in keeping historical snapshots of web pages, whether those sites are run by governments, corporations, or other organizations, or are simply the blogs of individuals. The ability to see changes can throw light on the motives of said organizations, and preserve pieces of written content that would otherwise (eventually) be erased from our collective memory.
In the case of governments making changes to official websites, it can be vital for journalists to have access to previous versions of web pages to clearly see the impact of any alterations.
Video: Eliza digitizing books at the Internet Archive (YouTube)
4. Not just web pages
The Internet Archive isn’t just about preserving the history of the web, but also digitizing books (see above), and other media such as vintage vinyl records (from as far back as the 1920s), CDs, cassette tapes, VHS, TV shows, and video games too.
I wasn’t remotely aware of the diversity of the historical records that the Archive keeps in this respect.
5. Founded by an internet pioneer
Brewster Kahle, the founder of the Internet Archive and of the Wayback Machine (which launched in 2001), is an internet pioneer who was previously one of the creators of a precursor to the World Wide Web. This was WAIS (Wide Area Information Servers), which was the first distributed-search and document-retrieval system to grace the internet.
