Wednesday, February 4

The Perseus Digital Library is impressive. It provides classical texts with each word linked to dictionary entries and usage statistics, which, when you're trying to read something like the Ars amatoria, is an improvement on flipping dictionary and grammar pages by hand, at least for speed. It's this sort of resource that makes me yearn for something really lightweight and portable with wireless and a really nice screen that I can treat more or less like a book.

On the other hand, having imposed a degree of separation between much of my life and access to computers, I suppose it's probably all right that it will be a while before I own one again. By the time I'm willing to engage the networked world that much, maybe I will have learned to use the tools instead of being used by them.

Part of that will probably be finding a machine which imposes little or no constraint on my physical location. Tying myself to a single point in space for such long stretches of time is one of the stupider and more self-limiting things I've ever done, and no set of feel-good spatial metaphors for networks or information quite overcomes that. I blame William Gibson.

Anyway, back to Perseus for a moment: one of those web scripting projects I've thought about for a while and probably won't attempt any time soon is really dense automatic hyperlinking, something like the dictionary-entry-per-word method.

It has probably occurred to a lot of us by now that doing a quick Google search for what we assume to be the most widely known or definitive resource on a given referent, then hard coding a link, is in a lot of ways the sort of repetitive task that ought to be automated, and that the value it adds to your text is usually minimal compared to the time you spend. (I should perhaps say "in all ways but one", since picking a single useful resource out of search engine results for a given term is still Way Hard to automate, Google's "I'm Feeling Lucky" button notwithstanding.) Furthermore, the web is full of good resources for which the same pattern holds - some of them much more likely to return valuable information through an automated link than even a system like Google. Examples that spring to mind are dict.org, Wikipedia, and Everything2. You probably have your own set of references, but those would be good places to start.

So here's the idea: why not feed web text through filters on the server side that break it into reasonable particles and mark up the better part of them as links to appropriate resources, so that an entire document becomes kind of a clickable reference without the writer having to wade through and rehash search engine results?

Well, for one thing because it would probably be really hard to do well. Trivial to do badly, I'm guessing, but extraordinarily difficult if you wanted the links to be broken up logically and pointed at the right kinds of resources. For another, it would probably tend to be a disaster in interface terms, and it would violate most of the conventions that have grown up around the uses of the link, which tend to imply things like intent, deliberate reference, and endorsement.

Still, maybe this would be worth taking a few hours' crack at some time. There might be uses for a weak implementation.
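
For what it's worth, the trivial-to-do-badly version might look something like the Python sketch below. It is only a guess at a starting point, not a worked-out design: it takes plain text (not existing HTML), splits out words with a regular expression, and wraps every word of four or more letters that isn't in a tiny stopword list in a link. The names autolink, STOPWORDS, and DEFAULT_TEMPLATE are made up for illustration, and the Wikipedia-style URL pattern is an assumption; any lookup resource with a predictable URL per term could be swapped in.

```python
import html
import re
from urllib.parse import quote

# Assumption: a resource with one predictable URL per term. A Wikipedia-style
# article path is used here purely as an illustration.
DEFAULT_TEMPLATE = "https://en.wikipedia.org/wiki/{term}"

# Hypothetical stopword list; a serious filter would need a far better one.
STOPWORDS = {"that", "this", "with", "from", "have", "would", "which", "there"}

# A "word" here is a letter followed by letters, apostrophes, or hyphens.
WORD = re.compile(r"[A-Za-z][A-Za-z'-]*")


def autolink(text, template=DEFAULT_TEMPLATE, min_length=4):
    """Return HTML in which each long-enough non-stopword is a link."""
    pieces = []
    last = 0
    for match in WORD.finditer(text):
        word = match.group(0)
        # Escape whatever sits between words so it is safe inside HTML.
        pieces.append(html.escape(text[last:match.start()]))
        if len(word) >= min_length and word.lower() not in STOPWORDS:
            url = template.format(term=quote(word))
            pieces.append(
                '<a href="{}">{}</a>'.format(html.escape(url), html.escape(word))
            )
        else:
            pieces.append(html.escape(word))
        last = match.end()
    # Keep any trailing text after the final word.
    pieces.append(html.escape(text[last:]))
    return "".join(pieces)


if __name__ == "__main__":
    print(autolink("the Ars amatoria"))
```

This links words one at a time and has no notion of phrases, names, or which resource suits which term, which is exactly the hard part described above.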