The Problem with Fragments


This paper explains the problems with URI fragments, specifically in the context of RDF, where I believe they are being used incorrectly.


Some of these problems are theoretical. (In this sense, "theoretical" refers to a difference in theories, not the more common sense of a problem that doesn't manifest itself in real life.)

They're not URIs:

However, "the URI" that results from such a reference includes only the absolute URI after the fragment identifier (if any) is removed and after any relative URI is resolved to its absolute form.

- RFC 2396, the RFC describing URIs.
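To make the quoted rule concrete, here is a minimal sketch using Python's standard `urllib.parse.urldefrag`. The reference `http://example.org/home#publications` is a hypothetical one chosen only for illustration; it does not come from the RFC or from my site.

```python
from urllib.parse import urldefrag

# Hypothetical URI reference, used only for illustration.
reference = "http://example.org/home#publications"

# Split the reference into the URI proper and the fragment identifier.
uri, fragment = urldefrag(reference)

print(uri)       # the part RFC 2396 calls "the URI"
print(fragment)  # the fragment identifier, which is not part of that URI
```

The split happens entirely on the client side: the fragment is a separate piece of the reference, not part of the URI.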

Their semantics only work in the context of a single representation:

The semantics of a fragment identifier is a property of the data resulting from a retrieval action, regardless of the type of URI used in the reference.

- ibid.


They don't integrate well with the HTTP Web: you can't do OPTIONS, HEAD or access control on an HTTP URI with fragments.


Fragments don't allow redirects: Let's say my homepage has a section listing the things I've published. I give it an ID, so folks can reference it with a fragment. Eventually, I publish so many things that there's no longer room for all of them on my homepage, so I decide to move them to their own page. But how can I do this without breaking links to the publications section? Since the fragment is never sent to the server in an HTTP request, I can't do a redirect on it.
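The redirect problem can be sketched with a toy server. Everything named here is an assumption made for illustration: the `example.org` URLs, the `REDIRECTS` table, and the `respond` function are hypothetical, and `urlsplit` only stands in for what a real HTTP client does when it builds a request.

```python
from urllib.parse import urlsplit

# Hypothetical redirect table. A server can only key it on the path it
# receives, because clients strip the fragment before sending the request.
REDIRECTS = {"/old-essays": "/essays"}

def respond(reference: str):
    """Return (status, location) as a toy server would for this reference."""
    path = urlsplit(reference).path or "/"  # the fragment is dropped here
    if path in REDIRECTS:
        return 301, REDIRECTS[path]
    return 200, None

homepage = respond("http://example.org/")
section = respond("http://example.org/#publications")
moved = respond("http://example.org/old-essays")
print(homepage, section, moved)
```

A whole page can be redirected, but the homepage and its publications section are the same request as far as the server is concerned, so there is nothing for a redirect rule to match on.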

Fragments aren't consistent across content-negotiation: Imagine I gave a presentation on crocodile hunting. I put it up on my website as a movie, an Ogg Vorbis file, and an HTML transcript. The HTML version has fragment IDs for the different sections of the speech, but there's no way for these to carry over onto the other formats, since an Ogg Vorbis file doesn't have support for fragments.
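Here is a rough sketch of that mismatch. The `representations` dictionary, the `tracking` ID, the media type labels, and the `fragment_resolves` helper are all hypothetical, and the Ogg bytes are a placeholder, not real audio. The sketch only knows how to look up fragments in HTML, which is the point.

```python
from html.parser import HTMLParser

class IdCollector(HTMLParser):
    """Collect every id attribute found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)

def fragment_resolves(media_type: str, body: bytes, fragment: str) -> bool:
    """Report whether `fragment` identifies anything in this representation."""
    if media_type == "text/html":
        collector = IdCollector()
        collector.feed(body.decode("utf-8"))
        return fragment in collector.ids
    # This sketch knows no fragment syntax for any other media type.
    return False

# Hypothetical representations of one resource, as content negotiation
# might serve them for the same URI.
representations = {
    "text/html": b'<h2 id="tracking">Tracking</h2><p>...</p>',
    "audio/ogg": b"OggS...",  # placeholder bytes, not real audio
}

results = {
    media_type: fragment_resolves(media_type, body, "tracking")
    for media_type, body in representations.items()
}
print(results)
```

The same reference means something in one representation and nothing in the other, depending only on which format the server happens to return.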

Some fragments break even when you change the document: The XPointer system of fragments for XML documents allows you to cite very specific portions of a document. For example, someone might be refuting the third bullet point in my list of reasons for buying new shoes: xpointer(id("reasons")/li[3]). If I were to rearrange my reasons or modify them, I would be "breaking" this link. Can Cool URIs really go so far that I can't update my website?

(Thanks to Sean B. Palmer for this example.)
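The breakage is easy to reproduce in a small sketch. This is not an XPointer implementation: the limited XPath support in Python's `xml.etree.ElementTree` stands in for it, and the list items are invented for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree's XPath subset standing in for xpointer(id("reasons")/li[3]).
PATH = ".//ul[@id='reasons']/li[3]"

# Hypothetical original page.
original = ET.fromstring(
    "<body><ul id='reasons'>"
    "<li>They fit</li>"
    "<li>They are on sale</li>"
    "<li>My old pair has holes</li>"
    "</ul></body>"
)
before = original.find(PATH).text

# Later I move the third reason to the top of the list.
revised = ET.fromstring(
    "<body><ul id='reasons'>"
    "<li>My old pair has holes</li>"
    "<li>They fit</li>"
    "<li>They are on sale</li>"
    "</ul></body>"
)
after = revised.find(PATH).text

print(before)
print(after)
```

The pointer still resolves after the edit, but to a different reason, so the person refuting my third point is now silently refuting something else.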


Fragments in Web Architecture only make sense when referring to a representation of a resource, not a resource itself. URI references worked in HTML, where the set context was of surfing between web pages (representations), and human users could deal with some breakage. However, as we move into formats like RDF, clarity and precision become increasingly important and URI fragments just don't work.

Part of LogicError. Powered by Blogspace, an Aaron Swartz project.