Yahoo! announced today that it will support the Semantic Web and microformats to improve search results for structured data (as reported by ReadWriteWeb in "And Nerds Became Kings: Yahoo! to Announce Semantic Web Support"). The Semantic Web has been a dream of Tim Berners-Lee for a long, long time, and up until now it has been way behind schedule because it just seemed, well, too hard. Things are changing.
They always do.
You know how RSS allows you to get feeds from your favorite blogs and other newsy Websites? That functionality is one example of how we are able today to break the offerings on a Webpage up into small parts and send them zipping around the Web. The text is separated from the formatting on our page, so the way the text is displayed isn't carried around with it. That enables a snippet of our text, maybe the first paragraph for example, to be displayed by anyone who subscribes to our feed.
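To make that concrete, here is a minimal sketch in Python of what a feed reader does with an RSS item. The feed below is made up for illustration (the blog name, URL and text are all hypothetical), and `read_items` is just a name I picked; real feed readers handle far more than this.

```python
import xml.etree.ElementTree as ET

# A hypothetical, hand-written RSS 2.0 feed. Note that it carries only
# text and links: nothing about fonts, colors, or page layout.
FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>First post</title>
      <link>http://example.com/first-post</link>
      <description>The first paragraph of the post.</description>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return each feed item as a plain dict of title, link and summary."""
    root = ET.fromstring(feed_xml)
    return [
        {
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "summary": item.findtext("description"),
        }
        for item in root.iter("item")
    ]


items = read_items(FEED)
for entry in items:
    # The subscriber decides how to display the text, not the publisher.
    print(entry["title"], "|", entry["summary"])
```

The subscriber ends up with the words and the link, free to show them however it likes.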
The Semantic Web potentially micro-bites the content even further -- into little bits that are identified as to precise type: this part is a last name; this part is a first name; this part is a phone number; this part is a set of keywords; this part is an abstract, etc. People might tag text down to this level to enable its extraction and manipulation, its readability by computers, and its reorganization for other purposes (see Michael Jensen's article, The New Metrics of Scholarly Authority, about the importance to Authority 3.0 of being computable). It gets treated like data rather than information or knowledge (don't let's debate what those things are just now).
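Here is a rough sketch of what that kind of tagging makes possible. The snippet uses hCard-style class names (`given-name`, `family-name`, `tel`) in a simplified form; the person and phone number are invented, and `HCardParser` is a hypothetical helper written for this example, not anything Yahoo! has described.

```python
from html.parser import HTMLParser

# Hypothetical markup: ordinary HTML whose class attributes say what
# each bit of text *is*, not how it should look.
SNIPPET = """
<div class="vcard">
  <span class="given-name">Ada</span>
  <span class="family-name">Example</span>
  <span class="tel">555-0100</span>
</div>
"""

# The typed bits this sketch knows how to pull out.
FIELDS = {"given-name", "family-name", "tel"}


class HCardParser(HTMLParser):
    """Collect the text of elements whose class names a known field."""

    def __init__(self):
        super().__init__()
        self.current = None  # field currently being read, if any
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        matches = [c for c in classes if c in FIELDS]
        if matches:
            self.current = matches[0]

    def handle_data(self, data):
        if self.current and data.strip():
            self.fields[self.current] = data.strip()

    def handle_endtag(self, tag):
        self.current = None


parser = HCardParser()
parser.feed(SNIPPET)
print(parser.fields)
```

Once the first name, last name and phone number come out as labeled fields, a program can sort them, merge them, or drop them into an address book without ever seeing the page they came from.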
What might this mean for copyright policy and practice? Wow, it just sends the mind reeling. I can't begin to imagine the implications, but one thing seems clear: a Semantic Web has the potential to reconfigure, even more dramatically, the relationship between copyright owners and those who wish to access and use their copyrighted works. Implicit in marking content up for computer recognition, extraction and manipulation is a license to actually do those things. Atomized text, images, sounds, audio-visuals. Wow. Might a whole new round of fear and loathing be right around the corner? Or will this just add to the steady pressure on copyright owners to open up their works to use and reuse -- if they want attention at all?
