tantek.com

How URL started as UDI — a brief conversation with @timberners_lee @W3C #TPAC

on (ttk.me b/4Yu1) using BBEdit

fun: showed @timberners_lee my post[1] on URL naming history.
priceless: Tim explained how "URL" started as "UDI".

The following is from a conversation I had during this week's W3C TPAC meetings with Tim Berners-Lee, which he gave me permission to post on my site.

Universal Document Identifier

When Tim developed and implemented the concept / technology we now know as a "URL", he originally called it a "UDI" which stood for:

(U)niversal 
(D)ocument
(I)dentifier

When Tim brought UDI to the IETF (Internet Engineering Task Force) for standardization, they formed a working group to work on it called the "URI working group". Then they objected to the naming of "UDI" and insisted on renaming it.

Universal to Uniform

They objected to "Universal". They said that calling it universal was hubris, even if the technology actually was universal in that its design allowed any identification mechanism to define its own scheme.

So the IETF changed "Universal" to "Uniform".

Document to Resource

They objected to "Document" - they said that was too specific and that such things were better, more generally, referred to as "Resources".

Identifier to Locator

Finally they objected to "Identifier", because in their minds these kinds of things were either a "name" OR an "address" (not both).

Thus they deliberately changed "Identifier" to "Locator", because UDIs were designed as addresses where you went to retrieve something.

They deliberately called them "Locator" to make them sound less reliable, as a warning not to use them as a "name" to identify something, because they wanted people to use URNs instead (e.g. DOIs).

URLs Identify Things, UDI Clues

Today, people use URLs to identify things, including documents, companies, and even people. URNs not so much.

Yes, "URL" was previously called "UDI", and the IETF made Tim Berners-Lee rename it.

You can find clues of this background in a surviving copy of the 1994-03-21 draft of the "Uniform Resource Locators (URL)" specification[2], buried in the "Acknowledgments" section:

"The paper url3 had been generated from udi2 in the light of discussion at the UDI BOF meeting at the Boston IETF in July 1992."

More Digging

Curiously, the "hypertext form" of the "Uniform Resource Locators (URL)" specification[2] that it mentions, 404s:

http://info.cern.ch/hypertext/WWW/Addressing/URL/Overview.html

However, with a little searching I found "W3 Naming Schemes"[3], which is undated yet appears to be even older (likely 1991). It describes URLs / UDIs without mentioning either by name, and links to "W3 address syntax: BNF"[4], which provides names for the different parts of the "W3 addressing syntax" like:

anchoraddress
    docaddress [ # anchor ]
docaddress
    httpaddress | fileaddress | newsaddress | telnetaddress | prosperoaddress | gopheraddress | waisaddress
httpaddress
    h t t p : / / hostport [ / path ] [ ? search ]
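
To see how those names map onto a modern-looking address, here is a minimal Python sketch that splits a string according to just the three rules quoted above. It is my own illustration, not code from the W3 documents: the function name parse_anchoraddress, the AnchorAddress tuple, and the character classes used for hostport, path, search and anchor are all assumptions, since the quoted fragment does not define those parts. Only the httpaddress alternative of docaddress is handled.

```python
import re
from typing import NamedTuple, Optional


class AnchorAddress(NamedTuple):
    """Parts named after the quoted grammar rules (hypothetical container)."""
    hostport: str
    path: Optional[str]
    search: Optional[str]
    anchor: Optional[str]


# anchoraddress = docaddress [ # anchor ]
# httpaddress   = http://hostport [ / path ] [ ? search ]
# Assumption: hostport, path, search and anchor simply run until the next
# delimiter; the quoted fragment does not spell out their contents.
_HTTP_ANCHOR_ADDRESS = re.compile(
    r"http://(?P<hostport>[^/?#]+)"
    r"(?:/(?P<path>[^?#]*))?"
    r"(?:\?(?P<search>[^#]*))?"
    r"(?:#(?P<anchor>.*))?"
)


def parse_anchoraddress(text: str) -> Optional[AnchorAddress]:
    """Return the named parts, or None if text is not an http anchoraddress."""
    match = _HTTP_ANCHOR_ADDRESS.fullmatch(text)
    if match is None:
        return None
    return AnchorAddress(
        match["hostport"], match["path"], match["search"], match["anchor"]
    )
```

For example, parse_anchoraddress("http://info.cern.ch/hypertext/WWW/Addressing/URL/Overview.html") yields a hostport of "info.cern.ch" and a path of "hypertext/WWW/Addressing/URL/Overview.html", with no search or anchor. Addresses in the other schemes (gopheraddress, waisaddress and so on) return None because this sketch does not cover them.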

Look familiar? I'm going to have to update my blog post[1].

References

  1. How many ways can you slice a URL and name the pieces?
  2. Uniform Resource Locators (URL) — A Syntax for the Expression of Access Information of Objects on the Network
  3. W3 Naming Schemes
  4. W3 address syntax: BNF