Getting at Content

The downfall of the internet is its failure to know what it contains and
provide it back to humans in an organized fashion. Its success is that it
manages to do so to any degree at all. In 1998, only 3% to 34% of the
indexable Web was indexed. In 1999, 16% was being claimed. In 2001, Google was
claiming up to 42%. In February 2003, many serious problems with efficacy
still were not solved. Add to this failure to even reach all internet-based
information on a subject the compounding problem of the inability to
assimilate and evaluate that information.

The December issue of Wired features a great article – on efforts to place
content into the hands of internet users – making the point that we do not
know how little was known:

Kahle hates the idea that when people think of
information, they think only of what’s accessible via Google. “Seventy-one
percent of college students use the Internet as their research tool of first
resort,” he says, citing figures from a 2001 PEW Internet Study. “Personally, I
think this number is low. For most students today, if something is not on the
Net, it doesn’t exist.” And yet most books are not on the Net. This means that
students, among others, are blind to the most important artifacts of human
knowledge. For many students, the Internet actually contracts the universe of
knowledge, because it makes the most casual and ephemeral sources the most
accessible, while ignoring the published books. “It’s shameful,” Kahle
continues, “because we have the tools to make all books available to everybody.”

Amazon has been taking steps to place non-digital text on the web in a manner
which works with copyright while also supporting, rather than corroding, its
core mission to sell text on paper. Amazon’s tool is here. The results of a
search for the word “ale” in a book not primarily about beer can be found
here. The article also refers to the new feature on the Wayback Machine, a
beta search engine of the 11 billion pages of text held there.

What these tools recognize is that irretrievable information is useless. I
would add dangerous, as it is a form of stupefaction. What I would like to
learn about is how information can be organized automatically before it is
presented, so that it is neither useless because inaccessible nor useless
because overwhelming, which is another form of stupefaction.