# Web Annotation Discovery

This document describes a simple mechanism for using [Web Annotation][] in web browsers. The envisioned outcome is that people can discover annotation sources and subscribe to annotation ‘feeds’ while browsing the web, then view the annotations on other pages they visit.

## Introduction

### Context

Various tools, often in the form of browser extensions, have attempted to make possible on the web what is easy enough on paper: highlighting passages and scribbling on the document while reading. Moreover, several such tools go beyond what their paper equivalent permits, and enable sharing such annotations with others via a web service. One could for example view their friends’ comments on a blog post, or their grandma’s personal tips on a recipe, without requiring involvement of the website that was annotated.

However, annotations created in one annotation tool are often only viewable with exactly the same tool. And sharing annotations, when possible, requires that people use exactly the same tool and service. Ideally, people could use any tool of their choice to view annotations from any source of their choice, regardless of the software and/or service chosen by the annotator.

To facilitate interoperability among annotation tools, the [Web Annotation Data Model][] was specified and standardised by W3C in 2017. The specification outlines an extensible JSON-based format for annotations, which encodes the target document (and the targeted passage/part within it) that is being annotated, the ‘body’ of the annotation (i.e. one’s scribbles, if any), and metadata such as the author, creation time, and motivation of the annotation.

The accompanying [Web Annotation Protocol][] specifies an [LDP][]-based protocol for exchanging (collections of) annotations between a client (typically an annotation viewing/editing tool) and a server. That server could e.g.
be a personal data store for creating and viewing one’s own annotations, or a read-only collection of annotations that were created/generated by other parties.

Together, the two specifications provide a standardised way for any annotation viewer to get a collection of annotations from any annotation server. However, they do not specify a way to *discover* such annotation servers.

### Goal

The current proposal addresses the discovery of annotations and annotation sources, so that web browsers can let users ‘import’ or ‘subscribe to’ these annotations. The goal is that the web browser becomes an annotation viewer (possibly through an extension) that shows known annotations on pages the user visits. The displayed annotations can come from any source, and importantly need not be supplied or endorsed by the publisher of the visited website itself; this allows for diverse use cases such as commentary, fact-checking, and many others.

In a sense, it introduces the concept of reverse links: the browser shows documents (annotations) linking *to* the visited page.

Ideally, the browser will also supply ways to create annotations and share/publish them, but this is not part of (and orthogonal to) the current proposal.

### Approach

To show annotations on a visited page, the web browser needs to somehow obtain these annotations. Various previous annotation projects depend on a single global service to index the annotations by their target, which browsers would query for annotations targeting a particular page. To avoid such centralisation and cater for the diversity of use cases, the browser could instead query any annotation services of the user’s choice.

However, querying services for annotations on visited pages has an enormous impact on reader privacy: to find annotations on pages you read, you have to tell the service which pages you read. Subscribing to multiple sources would reveal this information to even more parties.

In many usage scenarios, the set of annotations a person is actually interested in is limited and comes from a known source. Centralised services (e.g. [Hypothes.is][]) can help discover annotations from any other user, but are often used for annotating in well-defined groups: in classrooms, among colleagues, etc. In such cases, there is no need for a central global index, and moreover the total set of annotations of interest could easily fit on the user’s device. This would solve the reader privacy issue as no querying is needed: the browser can simply look up whether it has any relevant annotations for any visited page (and can thereby be much quicker too).

Also for somewhat larger-scale annotation consumption, the total size may well remain manageable. For example, if an investigative journalist subscribes to a thousand colleagues each writing ten annotations per day of 1 KB each, this produces roughly 4 GB in a year (1000 × 10 × 1 KB × 365 days ≈ 3.65 GB); significant, but perhaps worth it for their work (where privacy may be more important than disk storage). This size could still be reduced by an order of magnitude if, of each annotation, only its own URL and the URL it targets are stored (with a tradeoff for latency and privacy, see [further below](#compacted-storage)).

The current proposal omits any querying mechanism and adopts this approach of a local ‘annotation library’. The mechanisms defined below serve to populate this library: how to discover annotation sources and import their current annotations, and how to subscribe to a source/‘feed’ to obtain their future annotations. To this end, it selects and combines existing parts of the Web Annotation specifications.

Two discovery mechanisms are defined:

1. Annotations encountered directly, either served as a file or embedded in a web page.
2. Annotation ‘feeds’: collections of annotations discovered via links in web pages.

Both of these are described and specified below.
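
As a rough illustration of the local ‘annotation library’ approach, the sketch below indexes annotations by the URL they target, so that looking up annotations for a visited page needs no network request. It is only a sketch under stated assumptions: the names `AnnotationLibrary`, `target_urls` and `yearly_storage_bytes` are hypothetical and not part of this proposal or of the Web Annotation specifications, and comparing URLs after dropping their fragment is a simplification chosen for the example. The last function merely reproduces the storage estimate given above.

```python
from collections import defaultdict
from urllib.parse import urldefrag


def target_urls(annotation):
    """Return the document URLs targeted by an annotation (given as a dict)."""
    targets = annotation.get("target", [])
    if not isinstance(targets, list):
        targets = [targets]
    urls = []
    for target in targets:
        if isinstance(target, str):
            url = target  # target given directly as an IRI
        else:
            # A specific resource names its document in "source";
            # other target objects identify themselves with "id".
            url = target.get("source") or target.get("id")
        if isinstance(url, dict):
            url = url.get("id")
        if url:
            # Assumption for this sketch: ignore any #fragment when matching.
            urls.append(urldefrag(url).url)
    return urls


class AnnotationLibrary:
    """Hypothetical in-memory library, indexed by target URL."""

    def __init__(self):
        self._by_target = defaultdict(list)

    def add(self, annotation):
        for url in target_urls(annotation):
            self._by_target[url].append(annotation)

    def lookup(self, page_url):
        """Annotations known locally for a visited page; no querying needed."""
        key = urldefrag(page_url).url
        return list(self._by_target.get(key, []))


def yearly_storage_bytes(sources, annotations_per_day, bytes_per_annotation, days=365):
    """Back-of-the-envelope size of a year's worth of subscribed annotations."""
    return sources * annotations_per_day * bytes_per_annotation * days


library = AnnotationLibrary()
example = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": "https://annotator.example/annotations/1",
    "type": "Annotation",
    "body": {"type": "TextualBody", "value": "Grandma adds more butter here."},
    "target": {"source": "https://recipes.example/pancakes"},
}
library.add(example)
found = library.lookup("https://recipes.example/pancakes#step-2")
estimate = yearly_storage_bytes(1000, 10, 1000)  # 1 KB taken as 1000 bytes
```

Keeping the index keyed on the target URL is what makes the lookup purely local; a real implementation would also need persistent storage and a more careful notion of URL equivalence.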
## Encountering Annotations

The simplest way for a web browser to discover annotations is by stumbling upon them. The Web Annotation Data Model defines the canonical serialisation of annotations in the JSON format, either individually or in an [Annotation Collection][]. An individual annotation or a collection can be encountered in two ways:

- served directly.
- embedded in a ` ``` To embed multiple annotations, one could either use multiple `