Wednesday, June 29, 2022

Linking to text fragments in web pages – Ole Begemann


Text fragments are a way for web links to specify a word or phrase a browser should highlight on the destination page. Google Chrome added support for them in version 80 (released in February 2020).

For example, opening the link oleb.net/2020/swift-docker-linux/#:~:text=running,container in Chrome should highlight the first heading of the article:



Chromium highlighting the text fragment specified in the URL.

The obvious use case is search engines: when you click a link in a search result, the browser would automatically highlight your search term(s) on the destination page.

I’ve always wanted this feature. I often find myself visiting a page from a search engine, only to immediately hit Cmd+F and re-type my search query into the browser’s Find on Page text field. Needless to say, this is a pain on mobile devices without a proper keyboard.

But text fragments have other uses beyond search engines:

  1. Linking to a particular sentence or paragraph of a long document. I’d use this all the time when linking to API documentation or forum posts. “Normal” URL fragments only work for anchors the author of the destination page created in advance, and readers usually can’t see what anchor tags are available on a page.

  2. Sharing a specific portion of a page. Browsers could facilitate this by offering to include a text fragment in the URL when sharing a link to a text selection.

Here’s the sample URL from above once more:


https://oleb.net/2020/swift-docker-linux/#:~:text=running,container

This part is the text fragment:


#:~:text=running,container

This fragment finds the first mention of “running” (case-insensitive) on the page and highlights everything from that point until it finds “container”. There are a few more variants of the syntax. Read the text fragments draft spec for details.
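To make the syntax concrete, here is a minimal sketch of how a sharing tool might assemble such a URL from a start and an optional end phrase. The function names `encodeTerm` and `buildTextFragmentUrl` are hypothetical, and the sketch covers only the `text=start,end` form shown above, not the other variants in the draft spec. As far as I understand the draft, commas, ampersands, and dashes inside a phrase need to be percent-encoded because they have special meaning in the directive.

```javascript
// Percent-encode one phrase for use in a text directive.
// encodeURIComponent already escapes "," and "&" but leaves "-" alone,
// so the dash is escaped separately (assumption based on the draft spec).
function encodeTerm(term) {
  return encodeURIComponent(term).replace(/-/g, "%2D");
}

// Build a URL of the form <page>#:~:text=<start>[,<end>].
// Any fragment already present on pageUrl is dropped in this sketch.
function buildTextFragmentUrl(pageUrl, textStart, textEnd) {
  const base = pageUrl.split("#")[0];
  const terms = [textStart, textEnd]
    .filter((term) => term !== undefined)
    .map(encodeTerm);
  return `${base}#:~:text=${terms.join(",")}`;
}

const link = buildTextFragmentUrl(
  "https://oleb.net/2020/swift-docker-linux/",
  "running",
  "container"
);
console.log(link);
```

Called with the page and phrases from the example, this reproduces the sample URL from above.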

Search terms may contain sensitive information that users don’t want to share with the destination server. For good reason, search engines stopped reporting the user’s search terms in the referer header a long time ago as part of the widespread move to HTTPS. It would be bad if a new feature reintroduced this old data leak.

The spec considers this issue. Text fragments are designed to be purely a browser-level feature — they’re not exposed to JavaScript code. This screenshot demonstrates that document.location.hash is blank because Chromium stripped the text fragment away:


JavaScript code running in the destination page can’t see the text fragment.

I think this is the right behavior, but the spec authors seem to be considering changing it back because it may constrain some legitimate use cases and because JavaScript can already determine what portions of a page a user is reading by monitoring the viewport rect. I don’t know. Exposing a sensitive search term seems more invasive to me than exposing the scroll position.

It’s worth noting that the privacy concern still exists for browsers that don’t support text fragments: they will treat the text fragment as a normal URL fragment, which can easily be parsed with a bit of JavaScript.
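To illustrate how little code that takes, here is a rough sketch of what a script on the destination page could do in a browser that leaves the text fragment in the hash. The function name `extractTextDirectives` is made up for this example, and the parsing is deliberately simplified: it only handles plain `text=` directives separated by `&` and does not guard against malformed percent-encoding.

```javascript
// Given the value of location.hash, return the phrases of every text
// directive found after the ":~:" delimiter, percent-decoded.
function extractTextDirectives(hash) {
  const marker = ":~:";
  const start = hash.indexOf(marker);
  if (start === -1) {
    return []; // an ordinary fragment, nothing to extract
  }
  return hash
    .slice(start + marker.length)
    .split("&")
    .filter((directive) => directive.startsWith("text="))
    .map((directive) =>
      directive
        .slice("text=".length)
        .split(",")
        .map((term) => decodeURIComponent(term))
    );
}

// In a browser without text fragment support, the page could call
// extractTextDirectives(window.location.hash) and read the search terms.
const leaked = extractTextDirectives("#:~:text=running,container");
console.log(leaked);
```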

As a precaution, search engines and similar sites should probably include text fragments in their links only if the user’s browser supports the feature, which they can detect by checking for window.location.fragmentDirective.
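A feature check along these lines might look like the sketch below. The helper names `supportsTextFragments` and `linkFor` are hypothetical. The sketch also looks for the property on `document`, because to my knowledge later Chromium versions moved it there. Treat that as an assumption and verify it against the browsers you care about. The objects are passed in as parameters so that the logic can be tried outside a browser.

```javascript
// Return true if the browser exposes the fragmentDirective property,
// either on document (assumed location in newer Chromium versions)
// or on location (as described in the article).
function supportsTextFragments(
  doc = globalThis.document,
  loc = globalThis.location
) {
  return Boolean(
    (doc && "fragmentDirective" in doc) ||
    (loc && "fragmentDirective" in loc)
  );
}

// Append the text fragment to a link only when the feature is detected,
// so that unsupported browsers never receive the search terms in the hash.
function linkFor(href, textFragment, supported = supportsTextFragments()) {
  return supported ? `${href}#:~:${textFragment}` : href;
}

console.log(linkFor("https://example.com/page", "text=running,container"));
```

Because the check runs in the user’s browser, a site would have to add the fragment client-side (for instance, when the result links are rendered) rather than on the server.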

Chrome is currently the only browser with text fragment support. From what I’ve read, the WebKit and Firefox teams are generally supportive of the idea but have some reservations about specific design choices.

I hope this or something like it becomes widely supported in the near future.

Update June 22, 2020: I neglected to mention fragmentions, an IndieWeb initiative that aims to solve the same problem and is at least six years old. This feature uses normal URL fragments and client-side JavaScript to find the matching text on the destination page (which is only necessary because there’s no native browser support, of course).
