Deep Web. Invisible Web. The Hidden Internet. Every knowledge professional, from librarian to data mining specialist, should know that the Deep Web exists.
The term “Deep Web” refers to data and resources that a typical Google search does not reach. Examples include university-hosted databases, organizational wikis, topic-specific forums, and other volatile locations that are difficult to index with typical bots and spiders. Notably, some of these massive websites are very information-intensive and also nigh-impossible to crawl with current public methods.
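To see why link-following bots miss this material, here is a minimal sketch in Python. It assumes a made-up five-page site (the page names and the LINKS table are hypothetical, not taken from any real website) in which some pages are only generated when a visitor submits a search form. A spider that simply follows hyperlinks never reaches them.

```python
from collections import deque

# Hypothetical toy site: each page maps to the hyperlinks it contains.
# The "/search?q=protein" page exists on the server, but it is only
# generated in response to a form submission, so no page links to it.
LINKS = {
    "/": ["/about", "/search"],
    "/about": ["/"],
    "/search": [],                         # a search form, no outgoing links
    "/search?q=protein": ["/record/17"],   # produced only by a form query
    "/record/17": [],                      # linked only from the query result
}

def crawl(start, links):
    """Breadth-first, link-following crawl; returns the set of pages reached."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

surface = crawl("/", LINKS)    # what a simple spider can index
deep = set(LINKS) - surface    # what it never discovers

print(sorted(surface))
print(sorted(deep))
```

In this toy model the database record and the query page that leads to it stay out of the index, even though both are publicly reachable by anyone who fills in the form. Real crawlers are more sophisticated than this sketch, but the same gap is one reason form-driven databases tend to stay hidden.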
The Invisible Web is the part of the web that provides direct access to data at its source. This hidden information is typically freely available, but it requires more filtering by the user than an ordinary Google or Bing search. Often, working with the results requires a combination of research skills (for locating them) and technical skills (for understanding them).
Corporations and fast-moving organizations can especially benefit from delving into the refined data streams of the larger, less volatile resources. Non-advertised, often scientific, and tucked away in a digital corner, Deep Web resources are highly prized for their high signal-to-noise ratio.
The technical resources that the Invisible Web covers make it vital for hard-core research. Deep Web materials include the National Institutes of Health’s extensive libraries, Google Patents, open-source community blogs and wikis, Metafilter’s rambling yet insightful community, and the millions of data-intensive sites that have never been fully indexed by a search engine. Good library science programs teach this hidden gem intensively, with the caveat that it is fruitful but slow to work with.
The Deep Web is a key foundation of the visible Web that Yahoo!, Bing, and Technorati allow us to access. Assessing and incorporating the information takes some serious spelunking, but the depths yield beautiful (and useful) rewards.
Images from Public Domain Pictures.