see nerd blog — microdata

I’m adding microdata to this site. It’ll make little direct difference to human readers, but is useful for search engines, data analysis, and artificial intelligence. Well, one has to get one’s audience where one can!

Although there are a number of alternatives, I’ve settled on schema.org using microdata. Together, they

  • have a rich vocabulary.
  • are HTML5 compliant.
  • do not interfere with CSS classes, which is potentially important. Classes and microdata are two quite different things on a page: classes give CSS its hooks for layout and styling, whereas microdata describes the content in a format suitable for automation. Class names shouldn’t be used as microdata labels, although they once were: before the specification of HTML5, the need for microdata had run ahead of HTML.
  • are supported by the big search engines. Indeed, Bing, Google and Yandex, among others, provide validation services (not that I always remember to use them).
  • are imperfect (here, here), although they could be a lot worse; they are out there and are supported by the big names, so I’m running with them.
  • are a source of worry because the schema.org approach is hierarchical. Too often in the history of digital modelling, hierarchical models have been found attractive at the beginning but have become deeply problematic as the model develops: consider hierarchical databases. The real world, despite the constant propaganda of the unsympathetic rich, is not inherently hierarchical; if it were, then, for example, apex predators in the tree of life could never be preyed upon by their most successful predators, viruses.

Schema.org is an ongoing project. It has omissions and things to correct. From the perspective of arts \ ego, the worst omission is poetry. As always, this, the greatest linguistic art, is ignored by the digital world. I hope to follow the schema.org mechanism for correcting that omission, by creating my own external extension, but I’m not there yet.

If you’ve still no clue what I’m talking about, but, bizarrely, want to find out, and if you have some grasp of HTML, then have your browser display this web page’s source. You’ll see, in amongst all the usual HTML5 and CSS stuff, instances of itemprop, itemscope and other item… attributes. That’s them, that’s microdata markup.
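For anyone who would rather see the idea than hunt through page source, here is a minimal sketch of how a program might read microdata. It is an illustration only: the HTML fragment is made up rather than copied from this site, the class name ItempropCollector is my own invention, and the logic handles only simple, un-nested properties whose value is their text. Real consumers follow the full microdata rules.

```python
from html.parser import HTMLParser

# Hypothetical fragment for illustration; not this site's actual source.
SAMPLE = """
<article itemscope itemtype="https://schema.org/BlogPosting">
  <h1 itemprop="headline">microdata</h1>
  <p>Tags: <span itemprop="keywords">microdata, schema.org</span></p>
  <p class="intro" itemprop="description">Why this site uses microdata.</p>
</article>
"""


class ItempropCollector(HTMLParser):
    """Collect (property, text) pairs from elements carrying itemprop.

    Simplification: assumes itemprop elements are not nested inside one
    another and that each property's value is its text content.
    """

    def __init__(self):
        super().__init__()
        self.item_types = []   # itemtype values found on itemscope elements
        self.properties = []   # (itemprop name, text) in document order
        self._current = None   # (tag, itemprop name) of the open property
        self._text = []        # text fragments gathered for that property

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # itemscope is a boolean attribute, so its value arrives as None.
        if "itemscope" in attrs and "itemtype" in attrs:
            self.item_types.append(attrs["itemtype"])
        if "itemprop" in attrs:
            self._current = (tag, attrs["itemprop"])
            self._text = []

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._current is not None and tag == self._current[0]:
            name = self._current[1]
            self.properties.append((name, "".join(self._text).strip()))
            self._current = None


collector = ItempropCollector()
collector.feed(SAMPLE)
print(collector.item_types)
for name, value in collector.properties:
    print(f"{name}: {value}")
```

Note that the class attribute on the last paragraph plays no part in what gets collected: the styling hook and the data label sit side by side without touching.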

The more digitally cognisant amongst you might also note some microformat markup (for example, class="p-name"), despite my comments above. That predates the conversion of this site to HTML5, but remains because some automatic web site analysis products use it, or so I understand. However, I no longer maintain it; & if I no longer maintain it, you might ask, why then is it in the source of this, my latest web page (as I write)? That’s simple: it’s in my blog page template, and I see no reason to remove it.
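To show how the two kinds of markup can share an element, here is another rough sketch. The heading is a hypothetical example, not lifted from my template, and VocabularySplitter is a name I made up. It assumes only the microformats2 convention that property class names begin with p-, u-, dt- or e-; every other class is treated as a plain styling hook.

```python
from html.parser import HTMLParser

# Hypothetical heading carrying a microformat class, a styling class,
# and a microdata property all at once.
SAMPLE_HEADING = '<h1 class="p-name title" itemprop="headline">microdata</h1>'

# microformats2 property prefixes (h- marks a root item, not a property).
MICROFORMAT_PREFIXES = ("p-", "u-", "dt-", "e-")


class VocabularySplitter(HTMLParser):
    """Sort each element's labels into microformat, microdata and styling."""

    def __init__(self):
        super().__init__()
        self.microformat_props = []
        self.microdata_props = []
        self.styling_classes = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # The class attribute is a space-separated list of tokens.
        for token in (attrs.get("class") or "").split():
            if token.startswith(MICROFORMAT_PREFIXES):
                self.microformat_props.append(token)
            else:
                self.styling_classes.append(token)
        # itemprop may also hold several space-separated property names.
        self.microdata_props.extend((attrs.get("itemprop") or "").split())


splitter = VocabularySplitter()
splitter.feed(SAMPLE_HEADING)
print("microformat:", splitter.microformat_props)
print("microdata:  ", splitter.microdata_props)
print("styling:    ", splitter.styling_classes)
```

A tool that reads only microformats and a tool that reads only microdata can each take what they want from the same heading, which is why leaving the old markup in the template does no harm.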