Semantics in Markup of Ranked Tag Lists

Some concepts are composed of such obscure scholastics that I find myself right at home among discussions fussing about them. Like ‘Semantics in Markup of Ranked Tag Lists’. Let’s talk about machine intelligence as it relates to a strange graphical representation of social software’s edgy reinvention of keywords!

Joe Clark said that Technorati’s tag box could use semantically better markup than inline style attributes specifying font-size. “Self-evidently, a tag collection of this sort is an unordered list with an assumed default size that is modified by big and small”. I’d cursorily agreed, because I’m sympathetic to the argument that semantics are sometimes forced into inappropriate uses; the <i> tag is not completely replaceable with the <em> tag.

So when Matt linked to Joe Clark’s post I chimed in with a rationale for using <big>, saying that it’s a more appropriate element than <em> in this context. Joe Clark’s update has a good point, that the essence of weighted lists is presentational anyway, hence the core of their meaning is the dimension: big and small.

However, as Tantek says, <big> and <small> are about “fonts and nothing more”. So when he updated Technorati’s tag box in response, he used nested <em>s. But I agree with a commenter over at Matt’s, dave, when he states that nested <em>s benefit neither man nor machine.
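
For concreteness, here is a rough sketch of the three styles of markup under discussion. The tag names, URLs, and sizes are invented for illustration; none of this is copied from Technorati’s actual source.

```html
<!-- Hypothetical tag-box fragments; names, URLs and sizes are made up. -->

<!-- 1. Inline styles: the size lives in a style attribute. -->
<a href="/tag/blogs" style="font-size: 160%">blogs</a>
<a href="/tag/css" style="font-size: 90%">css</a>
<a href="/tag/music" style="font-size: 120%">music</a>

<!-- 2. An unordered list with a default size, adjusted by big and small. -->
<ul>
  <li><big><a href="/tag/blogs">blogs</a></big></li>
  <li><small><a href="/tag/css">css</a></small></li>
  <li><a href="/tag/music">music</a></li>
</ul>

<!-- 3. Nested em: the idea is that deeper nesting signals more weight. -->
<ul>
  <li><em><em><a href="/tag/blogs">blogs</a></em></em></li>
  <li><a href="/tag/css">css</a></li>
  <li><em><a href="/tag/music">music</a></em></li>
</ul>
```

In all three, the source order is alphabetical and the popularity (blogs, then music, then css) has to ride along on something else: a style value, a font element, or a nesting depth.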

Dimitri Glazkov articulates the abstract problem in a comment at Anne Van Kesteren’s:

What you are dealing [with] here is attempting to express two dimensions of relationships (alphabetical + importance) for a set of items using one dimension (serialized presentation of content, aka markup). If we learned anything from the math classes in college, you can’t express N dimensions using N-1 dimension without sacrificing some integrity and having to resort to “assumed” relationships, which are expressed by an additional semantical structure. [...]

HTML/CSS doctrine itself does not provide a clear pattern on how to handle these situations, and thus the interpretations of the Bible begin, with all the semantic sophistry that always accompanies them.

There are two ordered lists here: a list of keywords sorted alphabetically, and a list of keywords sorted by popularity. The on-screen representation shows one as the linearized display of elements (top-to-bottom, left-to-right for English), and the other as the rendered dimension of these elements (larger means more significant.) HTML lists specify just one order (or maybe they should have a sort="alphabetical" attribute—then the tags can be marked up as a list ordered by popularity.)
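
Purely as a thought experiment, the hypothetical attribute might look something like the sketch below. To be clear, sort="alphabetical" is not a real attribute in any HTML specification, and the tag names and their ranking are invented.

```html
<!-- NOTE: sort="alphabetical" is NOT real HTML; it is the imagined
     attribute from the paragraph above. Browsers would simply ignore it. -->
<ol sort="alphabetical">
  <!-- Source order carries popularity, most popular first. -->
  <li><a href="/tag/blogs">blogs</a></li>
  <li><a href="/tag/music">music</a></li>
  <li><a href="/tag/css">css</a></li>
</ol>
<!-- An imagined renderer would display: blogs, css, music. -->
```

The markup would then state one order outright (popularity) and merely name the other (alphabetical), instead of leaving popularity to presentation.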

So, turning off the CSS on Tantek’s implementation loses the tag popularity information, and Joe Clark’s suggestion mixes semantics and presentation. I’d go for the former, but a pox on both houses, I say; dave had the suggestion of using a table with term and popularity columns. It does change the meaning of the weighted list, turning it into a table instead, but that sort of thing degrades excellently. Putting the tags in a table and using class attributes on the rows also allows the popularity to be conveyed in ways other than term size: colour intensity, for example, or a bar chart. Intrigued by his suggestion that table rows can be displayed inline to recreate the tag-box style, I made a demo.

Tags CSS Demo

Behaves well in Firefox (yay for Gecko), kinda works in Opera, and completely fails in IE. This can probably be worked around, though, because IE fails more because of the selectors than because of the properties. A sprinkling of class and id attributes should take care of that.
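
A rough sketch of the general idea follows. This is not the demo’s actual source; the class names (tags, count, rank-1 to rank-3), the counts, and the URLs are all invented for illustration, and it uses only class selectors.

```html
<style>
  /* Flatten the table so each row flows inline, like words in a tag box. */
  .tags, .tags tbody, .tags tr, .tags td { display: inline; }
  .tags td { padding: 0 0.3em; }
  /* Hide the header row and the numeric column while the CSS is on. */
  .tags thead, .tags td.count { display: none; }
  /* Popularity is a class on the row, so it can drive size, colour, etc. */
  .tags tr.rank-1 { font-size: 90%; }
  .tags tr.rank-2 { font-size: 130%; }
  .tags tr.rank-3 { font-size: 170%; }
</style>

<table class="tags">
  <thead>
    <tr><th scope="col">Tag</th><th scope="col">Posts</th></tr>
  </thead>
  <tbody>
    <!-- Rows are in alphabetical order; counts are made-up numbers. -->
    <tr class="rank-3"><td><a href="/tag/blogs">blogs</a></td><td class="count">412</td></tr>
    <tr class="rank-1"><td><a href="/tag/css">css</a></td><td class="count">57</td></tr>
    <tr class="rank-2"><td><a href="/tag/music">music</a></td><td class="count">198</td></tr>
  </tbody>
</table>
```

With the stylesheet off, this falls back to a plain two-column table, so the popularity figures survive as data rather than as font sizes.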

Ultimately, I don’t expect this to be much of a problem, because as soon as one veers into bar graph and pie chart territory, it becomes clear that such representations of data are best displayed in graphic formats like JPEG or SVG. Charts are fundamentally visual, and the only way to make them accessible, or otherwise readable as text, is to summarize the trends they convey.


Incidentally, Dimitri is right about the ‘semantic sophistry’ these discussions entail. Tantek’s aside on the HTML spec’s “abuse of a <dl>/<dd>” reminded me of my being irked by this part of the spec:

Another application of DL, for example, is for marking up dialogues, with each DT naming a speaker, and each DD containing his or her words.

Well, that’s hardly a definition list, now is it? It’s not a set of key-value pairs either, because the keys aren’t unique; each speaker may have hundreds of different interjections in the overall dialogue, as in a play. What’s required is a way to specify a list where one element can be paired with the next, loosely associated for whatever purpose the document author wishes—definition lists are just one example.
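
A small sketch of what that usage looks like in practice; the speakers and lines are invented placeholders, not taken from the spec.

```html
<!-- Invented dialogue marked up the way the spec passage describes. -->
<dl>
  <dt>Alice</dt>
  <dd>Is this a definition list?</dd>
  <dt>Bob</dt>
  <dd>The spec says it can be.</dd>
  <dt>Alice</dt>
  <dd>Then what am I being defined as, exactly?</dd>
</dl>
```

The same DT text turns up twice with different DD content, which is why it reads as neither a set of definitions nor a set of unique key-value pairs.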

But none of this really matters. Neither a ‘definition list’ nor the ‘meaning’ of nofollow matters outside of an English-speaking context. I wonder whether 50 years from now these discussions will be archaic, because markup will have become as democratized as language, where the only ‘proper’ way of saying something is the way everyone else understands it. On the other hand, the 90s HTML mess was begotten of just such a democratization, and the contemporary mindset that followed it holds quite contrary beliefs. There will always be company for those fond of obsessing over slight pedantic vagaries.

Related: Tim Bray on markup and meaning: “I think that this whole area of thought is what over in the W3C TAG we refer to as a ‘rat-hole’. I.e., something you can vanish down never to re-appear, or at least a place where you can waste a lot of time scurrying along twisty little passages. [...] [W]e shouldn’t try to kid ourselves that meaning is inherent in those pointy brackets”.

16 Responses to “Semantics in Markup of Ranked Tag Lists”


  1. 1 Photo Matt Unlucky In Cards Feb 4th, 2005 at 10:26 pm

    Semantics in Markup of Ranked Tag Lists

  2. 2 Photo Matt » Unlucky In Cards Feb 4th, 2005 at 10:26 pm
  3. 3 eclecticism Feb 5th, 2005 at 5:09 pm

    » Semantics in Markup of Ranked Tag Lists Let’s talk about machine intelligence as it relates to a strange graphical representation of social software’s edgy reinvention of keywords!

  4. 4 Seismography Feb 6th, 2005 at 5:17 am

    Link Dump Random links with some cool stuff: Semantics in Markup of Ranked Tag Lists via Photo Matt worldKit via Six Apart Posted on 02/05/05 at 12:00 PM

  5. 5 Tantek's Thoughts Feb 18th, 2005 at 3:57 pm

    independently came up with the idea of using nested <em> elements a week ago. Comments [IMG and other blogs commenting on this post] Joe Clark Niall Kennedy Anne van Kesteren Paul Hammond Ian Irving Chris J. Davis Matt Mullenweg

  6. 6 Trials and Tribulations Mar 10th, 2005 at 12:53 am

    There’s a post on firasd.org that talks about how Technorati’s Tag Box uses some HTML tags. Joe Clark says that the box could use better markup semantics. It’s quite a good read, you should check it out. Posted in

  7. 7 Gir’s Brain Mar 10th, 2005 at 12:53 am

    There’s a post on firasd.org that talks about how Technorati’s Tag Box uses some HTML tags. Joe Clark says that the box could use better markup semantics. It’s quite a good read, you should check it out. Posted in

  8. 8 False Positives Mar 18th, 2005 at 2:55 am

    could work too, but might start the whole Semantic Web fight again. Weighted Tag Lists? Category:tags. Update: 1) Tantek has kindly linked back to me, and some other additional commentary. 2) Firasd pipes in with Semantics in Markup of Ranked Tag Lists, and argues for a two column table of data (Tag Name and Significance), which is correct use of the table element, and then applies various CSS styles to get the desired effects as demonstrated. This has the bonus of being a bit more informed if the

  10. 10 Jens Meiert Apr 25th, 2005 at 10:51 am

    [...] the <i> tag is not completely replaceable with the <em> tag.

    It’s generally not replaceable with the em element, as strong is no replacement for b, either. It’s just visualized the same way.

  11. 11 Firas Apr 25th, 2005 at 2:54 pm

    I think it generally is, Jens. I’d venture that 90% of the time when someone italicizes text or makes it bold, they mean to deliver emphasis.

  12. 12 Jens Meiert Apr 25th, 2005 at 6:33 pm

    Well, I guess you’re right that in practice these elements are simply replaced, but nonetheless they’re not the same, they should not be used except for semantic emphasis.

  13. 13 g-cool May 12th, 2005 at 2:42 pm

    Semantics in Markup of Ranked Tag Lists

  1. 1 Seismography Trackback on Feb 5th, 2005 at 1:11 pm
  2. 2 Trials and Tribulations Trackback on Feb 27th, 2005 at 11:05 pm
