microformats2 & HTML5 - The Evolution of Web Data

microformats2 & HTML5

The Next Evolutionary Step For Web Data

HTML5 microformats



evolution of data on the web

  1. microformats today
  2. challenges & lessons
  3. microformats2 & HTML5

1. microformats today

structured data 2012:
~70% microformats

Web Data Commons pie chart of domains with structured data

simplicity & openness

public domain / CC0

19 translations!

19 translations on microformats wiki home page

why microformats?

  1. site features/UI - Download hCard to Address Book, Download hCalendar to Calendar
  2. search results - Google, Microsoft Bing, Yandex
    Rich Snippet search result of a restaurant with rating
  3. enable others to build on top
  4. cheap DRY API - compare to XML[1]/JSON
  5. apps/sites consuming microformats
    Readability, Spinn3r, Foursquare
  6. simplest solution - least code

which microformats?

  • hAtom - Google, Readability (hNews), numerous Spinn3r customers
  • hCard - Google, Microsoft Bing, Yandex, Readability, H2VX, Foursquare, Firefox Operator, Microformats for Chrome
  • hCalendar - Google, Microsoft Bing, Yandex, H2VX, Firefox Operator, Microformats for Chrome
  • hMedia - Google
  • hProduct - Google, Microsoft Bing, Yandex
  • hRecipe - Google, Microsoft Bing, Yandex, Microformats for Chrome
  • hResume - Guardian Jobs, Madgex Labs library clients
  • hReview - Google, Microsoft Bing, Yandex, Microformats for Chrome
  • hReview-aggregate - Google, Microsoft Bing, Microformats for Chrome
  • rel-me - Google, RelMeAuth/IndieAuth
  • rel-author - Google
  • rel-license - Google advanced search

2. challenges & lessons

8 years of alternative approaches

  • 2005-2009(?): StructuredBlogging
  • 2005-2011: Google Base schema
  • 2007-2011(?): Google Data API/Elements
  • 2009-2009(?): Yahoo et al CommonTag.org
  • 2009-2011(?): Google rdf.data-vocabulary.org
  • 2010-present: Facebook OGP meta tags
  • 2011-present: Google+MS(Y!) Schema.org
  • 2012-present: Twitter Cards meta tags
  • 2012-present: OpenMetadata.org

lessons learned

class collisions and losses

  • class="summary", class="description"
  • site design updates remove & rewrite markup
  • answer: prefixed class names
    • h-*, p-*, u-*, dt-*, e-*
    • avoids collisions: class="p-summary"
    • easier to recognize: class="h-card"
    • enables generic parsing and JSON

too much markup

<span class=vcard><span class=fn>Tantek Çelik</span></span>
<span class="vcard"><span class="fn n">
    <span class="given-name">Glenda</span>
    <span class="additional-name">Watson</span>
    <span class="family-name">Hyatt</span>
</span></span>
even bigger problem: microdata, RDFa (won't fit on a slide)
itemscope itemtype itemprop itemref itemid
vocab typeof property rel

microformats2: less markup

  • flat sets of properties. no subproperties.
  • common markup → common properties
    • <span class="h-card">Tantek Çelik</span>
      → name
    • <a class="h-card" href="http://tantek.com">Tantek Çelik</a>
      → url, name
    • <a class="h-card" href="http://tantek.com">
      <img src="IMG_0123.jpg" alt="Tantek Çelik"/></a>

      → url, photo, name
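
The three cases above can be sketched with Python's standard `html.parser`. This is a simplified illustration that assumes one h-card per snippet and well-formed markup; the `ImpliedHCard` class and `implied_properties` function are hypothetical names for this example, and the real microformats2 parsing rules for implied properties cover more cases than this does.

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they must not change nesting depth.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}


class ImpliedHCard(HTMLParser):
    """Collect implied name, url and photo for the first h-card root found."""

    def __init__(self):
        super().__init__()
        self.depth = 0         # nesting depth inside the root; 0 means outside
        self.finished = False  # True once the root element has closed
        self.text = []
        self.img_alt = None
        self.props = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.finished:
            return
        if self.depth == 0:
            if "h-card" in (attrs.get("class") or "").split():
                self.depth = 1
                if tag == "a" and attrs.get("href"):
                    self.props["url"] = attrs["href"]   # implied url
            return
        if tag == "img":
            if attrs.get("src"):
                self.props.setdefault("photo", attrs["src"])  # implied photo
            if self.img_alt is None:
                self.img_alt = attrs.get("alt")
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if self.depth == 0 or self.finished or tag in VOID_TAGS:
            return
        self.depth -= 1
        if self.depth == 0:
            # Implied name: visible text, falling back to the image alt text.
            text = "".join(self.text).strip()
            self.props["name"] = text or self.img_alt or ""
            self.finished = True

    def handle_data(self, data):
        if self.depth and not self.finished:
            self.text.append(data)


def implied_properties(html):
    parser = ImpliedHCard()
    parser.feed(html)
    parser.close()
    return parser.props


print(implied_properties('<a class="h-card" href="http://tantek.com">Tantek Çelik</a>'))
# → {'url': 'http://tantek.com', 'name': 'Tantek Çelik'}
```

The sketch does not handle a root that is itself a void element (for example an `<img class="h-card">`), nor nested microformats.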

3. HTML5 & microformats2

a. HTML5 data tables

  • <table> <th id> <td headers>
  • Example: Unofficial XOXO Directory
  • One big <table> with 500+ <tr>s
  • One row of <th>s, the rest <td>s
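
As a rough sketch of how a consumer could read such a table, the Python below maps each `<td headers>` cell to the text of the `<th id>` it references. The sample markup, column names, and the `HeaderTable` / `table_rows` names are assumptions made up for this example (they are not taken from the XOXO Directory), and the sketch expects explicit `</th>`, `</td>` and `</tr>` end tags.

```python
from html.parser import HTMLParser


class HeaderTable(HTMLParser):
    """Turn <th id> / <td headers> markup into one dict per data row."""

    def __init__(self):
        super().__init__()
        self.labels = {}   # th id -> header text
        self.rows = []
        self.row = None
        self.cell = None   # ("th", id) or ("td", headers) while inside a cell
        self.buf = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "tr":
            self.row = {}
        elif tag == "th":
            self.cell, self.buf = ("th", attrs.get("id")), []
        elif tag == "td":
            self.cell, self.buf = ("td", attrs.get("headers")), []

    def handle_data(self, data):
        if self.cell:
            self.buf.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self.cell and self.cell[0] == tag:
            kind, key = self.cell
            text = "".join(self.buf).strip()
            if key and kind == "th":
                self.labels[key] = text
            elif key and self.row is not None:
                # headers may list several ids; this sketch uses the first.
                header_id = key.split()[0]
                self.row[self.labels.get(header_id, header_id)] = text
            self.cell = None
        elif tag == "tr":
            if self.row:   # the header row yields an empty dict and is skipped
                self.rows.append(self.row)
            self.row = None


def table_rows(html):
    parser = HeaderTable()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample data, for illustration only.
SAMPLE = """
<table>
  <tr><th id="name">Name</th><th id="url">Website</th></tr>
  <tr><td headers="name">Example Person</td><td headers="url">example.com</td></tr>
</table>
"""
print(table_rows(SAMPLE))
# → [{'Name': 'Example Person', 'Website': 'example.com'}]
```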

HTML5 data tables: result

Google search results for XOXO directory show a table

b. HTML5 <time> & <data>

HTML5 new element: <time>

<time class="dt-start" datetime="2013-07-19 15:15:00">
  19 July 2013 at 15:15</time>
or combine with value class pattern:
<span class="dt-start">
  <time class="value">2013-07-19</time> at
  <time class="value">15:15</time>
</span>
Trade-off: DRY vs. locale-specific datetimes
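
A minimal Python sketch of the value class pattern, assuming the consumer joins the date part and the time part with a space (the same shape as the `datetime` attribute in the first example). `ValueParts` and `combine_value_parts` are hypothetical names for this illustration, and nested markup inside a `value` element is not handled.

```python
from html.parser import HTMLParser


class ValueParts(HTMLParser):
    """Collect the values of class="value" elements in document order."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.in_value = False
        self.buf = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "value" in (attrs.get("class") or "").split():
            if attrs.get("datetime"):
                # A machine-readable attribute takes priority over the text.
                self.parts.append(attrs["datetime"])
            else:
                self.in_value, self.buf = True, []

    def handle_data(self, data):
        if self.in_value:
            self.buf.append(data)

    def handle_endtag(self, tag):
        if self.in_value:
            self.parts.append("".join(self.buf).strip())
            self.in_value = False


def combine_value_parts(html):
    parser = ValueParts()
    parser.feed(html)
    parser.close()
    # Assumption: a part containing ":" is a time, otherwise "-" marks a date.
    times = [p for p in parser.parts if ":" in p]
    dates = [p for p in parser.parts if "-" in p and ":" not in p]
    return " ".join(dates[:1] + times[:1])


markup = ('<span class="dt-start"><time class="value">2013-07-19</time> at '
          '<time class="value">15:15</time></span>')
print(combine_value_parts(markup))
# → 2013-07-19 15:15
```

The human-readable " at " between the two elements is ignored, which is what lets the visible text stay locale-specific while the machine value stays unambiguous.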

HTML5: <time> recent enhancements

  • year: © <time>2013</time>
  • year-month: (email list, blog archives)
    <time datetime="2013-07">July 2013</time>
  • month-day
    Birthdate: <time datetime="--03-11">03-11</time>
  • duration: <time>205s</time>
  • album length: <time datetime="42m 59s">42:59</time>
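
The duration forms above ("205s", "42m 59s") are easy to consume. Below is a small Python sketch that converts such strings to seconds; it covers only the number-plus-unit form shown on this slide, not the ISO 8601 style "PT…" form, and `duration_seconds` is a hypothetical helper name. The HTML specification allows a fraction only on the seconds component, which this sketch does not enforce.

```python
import re

UNIT_SECONDS = {"w": 604800, "d": 86400, "h": 3600, "m": 60, "s": 1}
# One component: a number followed by a unit letter, with optional whitespace.
COMPONENT = re.compile(r"\s*(\d+(?:\.\d+)?)\s*([wdhms])", re.IGNORECASE)


def duration_seconds(value):
    """Convert a duration such as '42m 59s' to a number of seconds."""
    value = value.strip()
    if not value:
        raise ValueError("empty duration")
    pos, total = 0, 0.0
    while pos < len(value):
        match = COMPONENT.match(value, pos)
        if not match:
            raise ValueError(f"not a duration: {value!r}")
        total += float(match.group(1)) * UNIT_SECONDS[match.group(2).lower()]
        pos = match.end()
    return total


print(duration_seconds("205s"))     # → 205.0
print(duration_seconds("42m 59s"))  # → 2579.0
```

The visible text "42:59" is deliberately rejected by this helper: it is the human-readable form, and the `datetime` attribute carries the machine-readable one.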

HTML5 new element: <data>

<span class="geo">
  <data class="latitude"
  <data class="longitude"
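
A hedged sketch of how a consumer might read machine-readable values from `<data>` elements. The slide's own coordinates are not shown above, so the markup and numbers in `SAMPLE` are invented purely for illustration; `GeoData` and `read_geo` are hypothetical names.

```python
from html.parser import HTMLParser


class GeoData(HTMLParser):
    """Read latitude/longitude from the value attribute of <data> elements."""

    def __init__(self):
        super().__init__()
        self.coords = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        value = attrs.get("value")
        if tag != "data" or not value:
            return
        for name in (attrs.get("class") or "").split():
            if name in ("latitude", "longitude"):
                # The value attribute holds the machine-readable form;
                # the element's text stays free for human-readable display.
                self.coords[name] = float(value)


def read_geo(html):
    parser = GeoData()
    parser.feed(html)
    parser.close()
    return parser.coords


# Illustrative coordinates only; not taken from the original slide.
SAMPLE = """
<span class="geo">
  <data class="latitude" value="37.7749">37° 46′ N</data>,
  <data class="longitude" value="-122.4194">122° 25′ W</data>
</span>
"""
print(read_geo(SAMPLE))
# → {'latitude': 37.7749, 'longitude': -122.4194}
```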

c. microformats2

microformats2 summary

  1. prefixed class names (h- p- u- dt- e-)
  2. flat sets of properties
  3. single class markup for common uses
  4. live documentation:

microformats2 implementations

microformats2 as API example

WebFWD.org - Mozilla Incubator

microformats2 as JSON read API

JSON read API WebFWD.org

microformats2: indieweb reply

aaronpk indieweb reply

microformats2: indieweb comments

eschnou indieweb comments

microformats2: indieweb rsvp

aaronpk indieweb rsvp

microformats2: indieweb event

aaronpk indieweb event

which microformats2 vocabularies?

microformats support coming to Firefox OS

Screenshots of FirefoxOS home screen and browser start page.

Q & A


1. Which microformats should I use?

  1. a classic microformat on <body> for main page subject
  2. microformats2 for both main subject and nested data: JSON API
  3. site-specific link previews:
    • optional: OGP meta tags for Facebook
    • optional: Twitter Cards meta tags for Twitter
    • Beware of duplicated invisible data drift!

2. How do I validate my microformats?

3. How do I get involved?

Thank you.

Red panda (Firefox). Photo by Yortw.

Tantek Çelik, tantek.com, @t