Documenting APIs when Preferences Matter

Benjamin Young | August 16th, 2016 | Last Updated: August 12th, 2016


I sat down to write a “Document APIs with Open API and JSON Schema” article. That’s still quite possible, of course, and there are lots of great resources to help you do just that. However, my head’s been ironing out the wrinkles in the Web Annotation Protocol for the last few weeks, and I really wanted to see how the current blend of API documentation and description tooling fared against this rather minimal API.

Here’s what I’ve discovered about the API documentation landscape when it comes to client preferences and describing content negotiation. It’s not bleak… exactly… but it ain’t rosy neither.

Preferential Treatment

Web Annotation Protocol traffics in three things:

Annotations (duh)
Annotation Pages (guess what’s in here?)
Annotation Collections/Containers (essentially the front door)

Annotations are the easy ones. They exist at a single URL. You request them, you get them back in the one and only flavor the specification lets you: Web Annotation Data Model.

Annotation Collections on the other hand work to serve the needs of a wider audience: annotation clients. An annotation client might be a highlighting system, a bookmarking tool, a screenshot app, an image annotation system, a Natural Language Processing bot, or even the comments at the bottom of a web page. Given that wider audience, there are situations where the clients will be written with speed as their prime direction, but there are others who want the biggest, most verbose set of data they can get as one long streamed answer to their question.

Here’s how the spec deals with all these preferences. Clients can ask for:

A minimal Annotation Container containing just a bit of “intro” and links to the Annotation Pages. This one’s optimized for clients that want small sets of content, but are happy to make lots of requests.
A richer (and suggested default) Annotation Container which contains the first Annotation Page and its “items” (the Annotations).

Both of these variations are dictated by a single Prefer header (see RFC 7240 for more info). Here’s the example request from the spec:

GET /annotations/ HTTP/1.1
Host: example.org
Accept: application/ld+json; profile="http://www.w3.org/ns/anno.jsonld"
Prefer: return=representation;include="http://www.w3.org/ns/ldp#PreferMinimalContainer"

This is the example (simple and harmless as it seems) that strained the gears of the current API documentation machinery. Let’s have a look.
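Before the tooling comparison, here is a rough client-side sketch of that same request in Python, using only the standard library. The function name `build_container_request` is made up for illustration, and the URL is the spec's example host, not a real service. The request is built but not sent.

```python
from urllib.request import Request

# Media type and preference IRI copied from the spec's example request.
ANNO_MEDIA_TYPE = 'application/ld+json; profile="http://www.w3.org/ns/anno.jsonld"'
PREFER_MINIMAL = "http://www.w3.org/ns/ldp#PreferMinimalContainer"


def build_container_request(url: str, preference_iri: str) -> Request:
    """Build (but do not send) a GET request for an Annotation Container.

    `url` is a placeholder for wherever a server exposes its container.
    """
    return Request(
        url,
        method="GET",
        headers={
            "Accept": ANNO_MEDIA_TYPE,
            "Prefer": f'return=representation;include="{preference_iri}"',
        },
    )


request = build_container_request("http://example.org/annotations/", PREFER_MINIMAL)
print(request.get_method(), request.full_url)
for name, value in request.header_items():
    print(f"{name}: {value}")
```

Passing `request` to `urllib.request.urlopen` would send it to a running server.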

Round 1: OpenAPI (Swagger)

OpenAPI (formerly known as Swagger) has risen to fame and has growing support from an equally growing list of companies, thanks mostly to its move from a single-vendor (SmartBear) spec to a collaborative project hosted by the Linux Foundation.

Having a collaborative environment (see Governance) gives any specification a richer, longer-lasting foundation than any single vendor product, project, and certainly specification. Foundations and consortiums give things about as much object permanence as one can hope for on the internets, so it’s where I always start.

Sadly, I hit a wall right out the gate.

OpenAPI 2.0 (and its forerunners) are all rather JSON-centric. That didn’t bother me as the Web Annotation Protocol also describes a singular JSON-based data encoding that it traffics in. OpenAPI, like the other options we’ll see here, is also pretty endpoint specific. Also not a real problem. Web Annotation Protocol doesn’t define URLs that you MUST use (that would be a Bad Thing, btw), but it does use some example URLs within the spec to make things more internally cohesive.

My plan was to just mimic all the examples in the Web Annotation Protocol spec, but present them in OpenAPI 2.0, and call it a day. However, things fell over when I hit the section about Preferences:

swagger: '2.0'
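info:
  title: Web Annotation Protocol
  # OpenAPI 2.0 requires a version here; this value is a placeholder
  version: '1.0.0'
host: example.org
schemes:
  - https
# basePath is prepended to every key under paths
basePath: /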
produces:
  - application/ld+json; profile="http://www.w3.org/ns/anno.jsonld"
paths:
  /annotations:
    get:
      summary: Annotation Collection / Container
      description: |
        Contains Annotations. Srsly.
      parameters:
        - name: Prefer
          in: header
          type: string
          enum:
           - 'return=representation;include="http://www.w3.org/ns/ldp#PreferMinimalContainer"'
      responses:
        200:
          description: |
            An Annotation Collection with links to the
            contained Annotation Pages and possibly the
            complete contents of the first page,
            depending on the preference of the client.
          schema:
            $ref: '#/definitions/AnnotationCollection'

All seems well at first blush, but things fall apart at the word responses. The Responses Object contains keys based on HTTP status codes and a default (catch-all) response. The value of each of these keys is a singular Response Object. Meaning, at this stage of the game, OpenAPI 2.0 doesn’t support describing how an HTTP header would affect the outcome of a request to a specific URL.

The above OpenAPI document presents the Prefer field and is even prepped for me to add the other options. However, there’s no way to tell the reader (or the related tooling) how that header may affect the results, unless it results in a unique HTTP status code, which isn’t what I need.
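To make the limitation concrete, here is a toy Python model of it. This is not how any OpenAPI tool stores things internally. It only shows what happens when responses are keyed by status code alone, and the variable names are invented for the illustration.

```python
# Two representations a server could return for GET /annotations/,
# both with status 200, distinguished only by the Prefer header.
variants = [
    ("200", "PreferMinimalContainer", "container with links to pages only"),
    ("200", "PreferContainedDescriptions", "container with first page embedded"),
]

# Keyed the way a status-code-only Responses Object is: one entry per code.
by_status = {}
for status, _preference, description in variants:
    by_status[status] = description  # the second 200 overwrites the first

# Keyed by (status, preference): what documenting both variants would need.
by_status_and_preference = {
    (status, preference): description
    for status, preference, description in variants
}

print(len(by_status), "response(s) survive when keyed by status code")
print(len(by_status_and_preference), "response(s) survive when keyed by status and preference")
```

Only one of the two 200 responses fits in the first mapping, which is the wall described above.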

This also blocks describing other forms of HTTP Content Negotiation (see RFC 7231). So on to Round 2!

Round 2: RAML

RESTful API Modeling Language, a.k.a. RAML, is a specification built primarily by MuleSoft. It’s been around for some time, and as it’s grown, MuleSoft has done a great job of gathering more input from other companies and the community. They have a core Workgroup which oversees the process and progress of the specification. As I was doing my digging, I was happy to see this gradual shift toward “more brains in the game.”

RAML is primarily written in YAML and at first blush feels very much like OpenAPI. So I was at once happy because of the familiarity but also concerned that I’d hit the same glass ceiling. This time, I decided to dive straight into the spec. Here’s what I found.

Like OpenAPI, RAML:

pivots around Template URIs and URI Parameters;
follows those with HTTP Methods;
and then lists request requirements, including optional Header definitions, which get their own distinct space rather than a general parameters collection.
Responses are still grouped by status code.
Resource Types and Traits look like they may provide exactly what I need.

But alas, resource types and traits turned out to mostly be “mixins” conceptually. Very handy to be sure, but unhelpful for fixing this negotiation problem.

Here’s what the RAML (using a resourceType for the Prefer header options) looks like:

#%RAML 1.0
title: Web Annotation Protocol
version: 1.0.0
mediaType: application/ld+json; profile="http://www.w3.org/ns/anno.jsonld"

resourceTypes:
  collection:
    get:
      headers:
        Prefer:
          type: string
          enum:
            - 'return=representation;include="http://www.w3.org/ns/ldp#PreferMinimalContainer"'
            - 'return=representation;include="http://www.w3.org/ns/oa#PreferContainedIRIs"'
            - 'return=representation;include="http://www.w3.org/ns/oa#PreferContainedDescriptions"'

/annotations:
  type: collection
  get:
    responses:
      200:
        body:
          application/ld+json; profile="http://www.w3.org/ns/anno.jsonld":
            example: |
              {
                "@context": [
                  "http://www.w3.org/ns/anno.jsonld",
                  "http://www.w3.org/ns/ldp.jsonld"
                ],
                "id": "http://example.org/annotations/",
                "type": ["BasicContainer", "AnnotationCollection"],
                "total": 42023,
                "label": "A Container for Web Annotations",
                "first": "http://example.org/annotations/?page=0",
                "last": "http://example.org/annotations/?page=42"
              }

One thing that does stand out here is the possibility for doing media type-based content negotiation. You can see just below the body key that there’s this media type:

application/ld+json; profile="http://www.w3.org/ns/anno.jsonld"

It’s also possible to stack those, so if application/json could also be used for the same response, then you’d present the above like so:

body:
    application/ld+json; profile="http://www.w3.org/ns/anno.jsonld":
    application/json:
        example: |
            {"some": "json"}

That’s certainly a nice addition as it means you can describe content negotiation based around the Accept header. However, it does leave out the potential for the other core (as in RFC 2616 and its successor RFC 7231) header-based negotiation: Accept-Charset, Accept-Encoding, and Accept-Language. It also means we’re at an impasse.
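For readers less familiar with what Accept-driven negotiation involves on the server side, here is a deliberately simplified Python sketch. It matches exact media types and `*/*` only, ignores parameters other than `q`, and assumes no commas or semicolons appear inside quoted parameter values. The function names are hypothetical, and RFC 7231 defines more (type wildcards, specificity rules) than this covers.

```python
from typing import List, Optional, Tuple


def parse_accept(accept_header: str) -> List[Tuple[str, float]]:
    """Return (media type, q) pairs, highest q first. Other parameters are ignored."""
    ranges = []
    for part in accept_header.split(","):
        pieces = [piece.strip() for piece in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        ranges.append((media_type, q))
    # sorted() is stable, so equal q values keep the client's order.
    return sorted(ranges, key=lambda item: item[1], reverse=True)


def choose_media_type(accept_header: str, available: List[str]) -> Optional[str]:
    """Pick the first acceptable media type the server can produce, or None."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return available[0]
        if media_type in available:
            return media_type
    return None


available = ["application/ld+json", "application/json"]
print(choose_media_type("application/json;q=0.5, application/ld+json", available))
print(choose_media_type("text/html", available))
```

A `None` result is where a server would typically answer 406 Not Acceptable or fall back to a default representation.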

There sadly doesn’t seem to be a way to negotiate different representations based on an HTTP header in RAML either. So. On we go to Round 3!

Round 3: API Blueprint

Unlike the other two options we’ve covered, API Blueprint isn’t based on YAML and/or JSON. It’s based on Markdown. It is, however, owned and managed by a single vendor: Apiary.io.

The spec is openly licensed under the MIT license, and Apiary does work hard to keep the process open and transparent. They also have a “Request for Comments” repo where heavier changes can be proposed via a clearly defined process. That said, Apiary.io is also an OpenAPI Initiative Member and supports OpenAPI documents in its tooling. Likely, there’s lots of mind (and spec/code) sharing in our collective futures, which is great!

API Blueprint’s Markdown-based approach has always intrigued me. It’s the one format of the three that’s written in a language you might write documentation in anyhow.

Getting used to API Blueprint did take a bit, and understanding the various sections and their nesting options was also a little tricky. The Apiary.io editor was (of course) quite helpful in the validation process, and there are command line tools for doing that locally.

The great news though is that it does support what it calls “Multiple Transactions” for a single endpoint! The key text from the API Blueprint spec’s Action section reads:

An action section may consist of multiple HTTP transaction examples for the given HTTP request method.

Here’s what the Web Annotation Protocol’s Prefer header stuff we’ve been hacking on this whole time looks like in API Blueprint:

FORMAT: 1A

# Web Annotation Protocol

Web Annotation Protocol is for retrieving and storing Web Annotation Data Model documents.

## Annotation Collection [/annotations/]

### GET

+ Request PreferMinimalContainer

    + Headers
    
            Prefer: return=representation;include="http://www.w3.org/ns/ldp#PreferMinimalContainer"


+ Response 200 (application/ld+json; profile="http://www.w3.org/ns/anno.jsonld")

    + Schema

            {
              "$schema": "http://json-schema.org/draft-04/schema#",
              "type": "object",
              "properties": {
                "@context": {
                  "type": "array",
                  "items": {
                    "type": "string"
                  }
                },
                "id": {
                  "type": "string"
                },
                "type": {
                  "type": "array",
                  "items": {
                    "type": "string"
                  }
                },
                "total": {
                  "type": "integer"
                },
                "label": {
                  "type": "string"
                },
                "first": {
                  "type": "string"
                },
                "last": {
                  "type": "string"
                }
              },
              "required": [
                "@context",
                "id",
                "type",
                "total",
                "label",
                "first",
                "last"
              ]
            }


+ Request PreferContainedIRIs

    + Headers
    
            Prefer: return=representation;include="http://www.w3.org/ns/oa#PreferContainedIRIs"


+ Response 200 (application/ld+json; profile="http://www.w3.org/ns/anno.jsonld")

    + Schema
    
            {
              "$schema": "http://json-schema.org/draft-04/schema#",
              "type": "object",
              "properties": {
                "@context": {
                  "type": "array",
                  "items": {
                    "type": "string"
                  }
                },
                "first": {
                  "type": "array",
                  "items": {
                    "type": "string"
                  }
                },
                "id": {
                  "type": "string"
                },
                "label": {
                  "type": "string"
                },
                "last": {
                  "type": "string"
                },
                "total": {
                  "type": "integer"
                },
                "type": {
                  "type": "array",
                  "items": {
                    "type": "string"
                  }
                }
              },
              "required": [
                "@context",
                "first",
                "id",
                "label",
                "last",
                "total",
                "type"
              ]
            }

+ Request PreferContainedDescriptions

    + Headers
    
            Prefer: return=representation;include="http://www.w3.org/ns/oa#PreferContainedDescriptions"

+ Response 200 (application/ld+json; profile="http://www.w3.org/ns/anno.jsonld")

    + Schema
    
            {
              "$schema": "http://json-schema.org/draft-04/schema#",
              "type": "object",
              "properties": {
                "@context": {
                  "type": "array",
                  "items": {
                    "type": "string"
                  }
                },
                "first": {
                  "type": "array",
                  "items": {
                    "type": "object",
                    "properties": {
                      "@context": {
                        "type": "string"
                      },
                      "body": {
                        "type": ["string", "object", "array"]
                      },
                      "id": {
                        "type": "string"
                      },
                      "target": {
                        "type": ["string", "object", "array"]
                      },
                      "type": {
                        "type": ["array", "string"]
                      }
                    },
                    "required": [
                      "@context",
                      "id",
                      "target",
                      "type"
                    ]
                  }
                },
                "id": {
                  "type": "string"
                },
                "label": {
                  "type": "string"
                },
                "last": {
                  "type": "string"
                },
                "total": {
                  "type": "integer"
                },
                "type": {
                  "type": "array",
                  "items": {
                    "type": "string"
                  }
                }
              },
              "required": [
                "@context",
                "first",
                "id",
                "label",
                "last",
                "total",
                "type"
              ]
            }

See? It’s just Markdown with some HTTP and JSON pasted in it. Pretty rad, actually.

The Schema sections contain some quick JSON Schemas that I generated and then tweaked to make sure the basics were in place. This setup allowed me to list and name these multiple request options, each one noting a Prefer header key/value combination that would result in the output seen in the “Response” section that follows it.

The end result is exactly what I’d been attempting:

document a single HTTP endpoint
state unique headers for certain negotiated requests
provide a validation system to confirm the response was as expected
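For a sense of what sits on the other side of that documentation, here is a minimal Python sketch of a server choosing a container shape from the Prefer header. It mirrors the three simplified Schema sections above (where “first” is a string, a list of IRIs, or a list of Annotations) rather than the full protocol. The function names and sample annotation are invented for illustration, and treating `include` as a space-separated list of IRIs is an assumption on my part.

```python
import re
from typing import Dict, List, Set

LDP_MINIMAL = "http://www.w3.org/ns/ldp#PreferMinimalContainer"
OA_IRIS = "http://www.w3.org/ns/oa#PreferContainedIRIs"
OA_DESCRIPTIONS = "http://www.w3.org/ns/oa#PreferContainedDescriptions"

BASE_URL = "http://example.org/annotations/"  # example URL, as in the spec


def included_preferences(prefer_header: str) -> Set[str]:
    """Extract the IRIs listed in include="..." of a Prefer header value."""
    match = re.search(r'include="([^"]*)"', prefer_header)
    return set(match.group(1).split()) if match else set()


def build_container(prefer_header: str, annotations: List[Dict]) -> Dict:
    """Return a container whose "first" value depends on the client's preference."""
    wanted = included_preferences(prefer_header)
    if LDP_MINIMAL in wanted:
        first = BASE_URL + "?page=0"  # link only
    elif OA_IRIS in wanted:
        first = [annotation["id"] for annotation in annotations]
    else:
        # Full descriptions: the richer, suggested default.
        first = annotations
    return {
        "@context": [
            "http://www.w3.org/ns/anno.jsonld",
            "http://www.w3.org/ns/ldp.jsonld",
        ],
        "id": BASE_URL,
        "type": ["BasicContainer", "AnnotationCollection"],
        "total": len(annotations),
        "label": "A Container for Web Annotations",
        "first": first,
        "last": BASE_URL + "?page=0",
    }


sample = [{
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": BASE_URL + "anno1",
    "type": "Annotation",
    "target": "http://example.com/page1",
}]
minimal = build_container(f'return=representation;include="{LDP_MINIMAL}"', sample)
iris = build_container(f'return=representation;include="{OA_IRIS}"', sample)
default = build_container("", sample)
print(minimal["first"])
print(iris["first"])
print(default["first"][0]["type"])
```

One URL, one status code, three different bodies: exactly the situation the Request/Response pairs document.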

Bonus: Doing Stuff With It

Wanting just a bit more than documentation out of the deal, I dove into the tools section and found the fabulous dredd script. Dredd uses the API Blueprint documentation to test a running instance of the documented API!

So, tempting fate, I opened my in-progress, Python-based Web Annotation Protocol implementation and ran: dredd api-documentation.apib http://localhost:8080/

And it worked!

Dredd properly sent the Prefer headers from those Request sections and checked the Responses against the Schema objects.
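To give a feel for the kind of check involved, here is a much-reduced Python stand-in that only looks at “required” keys and top-level property types. This is not Dredd’s code and it covers only a sliver of JSON Schema. The `schema_errors` helper and the trimmed-down schema are made up for illustration.

```python
# Maps the JSON Schema type names used above to Python types.
# Note: Python treats bool as an int; a real validator would not.
JSON_TYPES = {"string": str, "integer": int, "array": list, "object": dict}


def schema_errors(document: dict, schema: dict) -> list:
    """Return human-readable problems; an empty list means the checks passed."""
    errors = []
    for key in schema.get("required", []):
        if key not in document:
            errors.append(f"missing required property: {key}")
    for key, rule in schema.get("properties", {}).items():
        if key not in document:
            continue
        allowed = rule.get("type", [])
        if isinstance(allowed, str):
            allowed = [allowed]  # "type" may be one name or a list of names
        python_types = tuple(JSON_TYPES[name] for name in allowed if name in JSON_TYPES)
        if python_types and not isinstance(document[key], python_types):
            errors.append(f"wrong type for property: {key}")
    return errors


# A cut-down version of the PreferMinimalContainer schema.
minimal_schema = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "total": {"type": "integer"},
        "first": {"type": "string"},
        "last": {"type": "string"},
    },
    "required": ["id", "total", "first", "last"],
}

good = {
    "id": "http://example.org/annotations/",
    "total": 42023,
    "first": "http://example.org/annotations/?page=0",
    "last": "http://example.org/annotations/?page=42",
}
bad = {key: value for key, value in good.items() if key != "total"}
bad["first"] = ["http://example.org/annotations/anno1"]

print(schema_errors(good, minimal_schema))
print(schema_errors(bad, minimal_schema))
```

A response shaped for the wrong preference (a list where a link was expected) fails the check, which is what makes per-request schemas useful as tests.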

I’ve elided some fiddly bits about me learning API Blueprint, getting dredd installed locally, dealing with Windows idiosyncrasies, and the like. It’s boring, mind-numbing stuff. I’m adding this note so that when you attempt to climb this mountain (or any other) and find rocks in your shoes: Don’t be alarmed. Stop. Take them out. And climb some more. It’s worth it in the end.

Conclusion: Negotiable?

API Blueprint wins for API documentation formats that support content negotiation on more than the Accept header. Having the option to craft unique requests at a single endpoint is invaluable and puts API Blueprint at the top of my list for its flexibility and support for more of HTTP than its current contenders.

Do you need content negotiation? That’s negotiable.

Content Negotiation is one of the untapped bits of awesome in HTTP. Often it gets overlooked because the tooling (as we’ve seen here) doesn’t lend itself to pointing a clear path for using it. However, even with wrinkles to iron out in some tools, the ability to have a single endpoint identify a conceptual resource from which you can negotiate various representations is invaluable.

Content Negotiation is a key piece of the web’s architecture that can benefit API developers and consumers. API Blueprint has the goods for documenting these APIs and even an ever-growing list of mock servers, testing clients, and proxies. Amazing.

Reference:

Documenting APIs when Preferences Matter from our WCG partner Benjamin Young at the Codeship blog.