Holmusk | Generating documentation from API types

Context
Requirements
Implementation
Overview
Parsing the API type
Rendering the intermediate structure
Usage
Reflections
Other thoughts

Context

As an intern at Holmusk, I have been involved in projects that improve the quality of life of our programmers. The latest project I worked on, servant-docs-simple, let me work with exciting ideas from type-level programming.

Our backend is currently built in Haskell, but the developers who consume its APIs work in Elm and Flutter. Hence, we need API documentation that everyone can read.

Since our API uses Servant, we first tried servant-docs to document it. However, our use cases differed from the ones it targets, so we created servant-docs-simple to handle them.

Requirements

Supporting various output formats

We want to output a variety of formats. Currently we support PlainText, JSON and PrettyPrint output formats.

These can be used in a variety of ways:

JSON can be served from an endpoint, allowing developers to query API endpoints and look up their documentation.
PlainText lets people read through the whole documentation as plain text.
PrettyPrint allows people to further format text documentation to their liking.

Including format types in our documentation

A format type is a Haskell data type. This type can be serialized to other forms, depending on instances which have been defined for it. For instance, if it has a JSON instance, it can be serialized to and from JSON.

Each endpoint accepts certain format type(s). These format types are specified in the endpoint’s type definition, as part of its parameters. For instance, in its RequestBody parameter, we could say it accepts the CreateUser type and in the Response parameter we could say it returns the UserData type.

Format type in documentation

/api/user/
RequestType:
  Format: CreateUser
  ContentType: ...

Response:
  Format: UserData
  ContentType: ...

We can use these format types to search Haskell source files for their type definitions. We can also search files generated from these types. For example, if a users.proto file is generated for the UserData type, we can search that file for its protobuf definition.

Searching for the UserData type in users.proto

message UserData {
    required string name = 1;
    ...
}

message CreateUser {
    ...
}

.
.
.

The documentation generated by servant-docs does not include the format types mentioned above (CreateUser, UserData). It provides examples instead, which serves a different use case, since we want to use these format types to search our internal source files.

Automatically generating API endpoints documentation

servant-docs generates documentation only if format types have implemented the necessary instances. This is good for extensive documentation of the endpoints as users have to write examples for each format type they introduce.

Instances for format types

instance ToSample User where
    toSamples _ = <some example>

instance ToSample Message where
    toSamples _ = <some example>

However, in our case, we want to document our endpoints automatically, without having to write extra instances for each format type we introduce.

Implementation

Note: Some code snippets below are simplified for understanding. They may be approximations of the actual implementations.

Overview

To support multiple output formats, we should:

Parse the API to an intermediate structure.
Render this structure to multiple output formats.

Parsing the API type

Flattening the API type

The Servant API type is built by chaining types together. The :> and :<|> operators introduce a tree-like structure to the API type.

:> extends the branch and :<|> forks multiple branches.

Sample API type

type API = Route :> ( Request :> Response
                 :<|> Route2 :> Request2 :> Response2
                    )

Since a list is easier to parse than a tree, we can flatten this structure into a type-level list of endpoints. To do so, we use the Endpoints type family, applying it to the API type. For those unfamiliar with type families, you can think of them as functions which act on types.

Flattened API type after applying the Endpoints type family

Endpoints API = '[ Route :> Request :> Response -- Endpt 1
                 , Route :> Route2 :> Request2 :> Response2 -- Endpt 2
                 ]

The result is a type-level list of endpoints, which we can parse by pattern matching on ': (for separating endpoints) and :> (for separating parts of each endpoint). These operators ': and :> are similar to the cons operator : for lists, but on the type-level.
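
To make the flattening step concrete, here is a minimal sketch of such a type family. It is an approximation, not the library's code: the :> and :<|> combinators are local stand-ins for Servant's (with fixities assumed to match), Route, Request and the other types are empty placeholders, and the helper families MapSub and Append and the check flattenedOk are names chosen for this illustration. It needs only base.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, PolyKinds, TypeFamilies,
             TypeOperators, UndecidableInstances #-}

import Data.Kind (Type)
import Data.Proxy (Proxy (..))

-- Local stand-ins for Servant's combinators (assumption: same fixities).
data (path :: k) :> (a :: Type)
infixr 4 :>

data (a :: Type) :<|> (b :: Type)
infixr 3 :<|>

-- Empty placeholder types standing in for real routes and parameters.
data Route
data Route2
data Request
data Request2
data Response
data Response2

-- Flattens the API tree into a type-level list of endpoints.
type family Endpoints (api :: Type) :: [Type] where
  Endpoints (a :<|> b) = Append (Endpoints a) (Endpoints b) -- fork: join both sides
  Endpoints (e :> a)   = MapSub e (Endpoints a)             -- branch: push prefix down
  Endpoints a          = '[ a ]                             -- leaf: a single endpoint

-- Prefixes every endpoint in the list with e.
type family MapSub (e :: k) (xs :: [Type]) :: [Type] where
  MapSub e '[]       = '[]
  MapSub e (x ': xs) = (e :> x) ': MapSub e xs

-- Type-level list concatenation.
type family Append (xs :: [Type]) (ys :: [Type]) :: [Type] where
  Append '[]       ys = ys
  Append (x ': xs) ys = x ': Append xs ys

type API = Route :> ( Request :> Response
                 :<|> Route2 :> Request2 :> Response2 )

type Expected =
  '[ Route :> Request :> Response
   , Route :> Route2 :> Request2 :> Response2
   ]

-- Type-checks only if Endpoints API reduces to Expected.
flattenedOk :: Proxy (Endpoints API) -> Proxy Expected
flattenedOk = id
```

If the two lists differed, flattenedOk would fail to compile, so the check runs entirely at compile time.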

The intermediate structure

In the documentation, certain fields should come before others.

For example, the route of an endpoint should come before its Request or Response parameters. This allows users to easily find the route they need and access the relevant information for that route.

Hence, the structure we parse to should preserve the ordering of fields for each endpoint.

When parsing each endpoint, we want this ordering

Route: ...
Request: ...
Response: ...

Rather than this ordering

Response: ...
Route: ...
Request: ...

Intuitively, the final structure should also permit us to use the route and its field names as keys, and their details as values. This is analogous to a JSON-like / map structure as shown below.

“users/login”, “Response”, “Request”, “ContentType” are the keys below, bound by the (:) to their values.

users/login/ : {
    Response: { ContentType: <value> }
    Request: { ... }
}

Hence, this is the Haskell type we want our API documentation to have. Notice the OMap structure used. This gives us the 2 properties we need, ordered key-value pairs and a map structure.

ApiDocs type

import Data.Map.Ordered (OMap)
import Data.Text (Text)

data ApiDocs = ApiDocs { unApiDocs :: OMap Route Details }
type Route = Text
data Details = Details (OMap Parameter Details) | Detail Text
type Parameter = Text
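
As a rough illustration of what a value of this shape looks like, the sketch below builds the documentation for one endpoint. To stay within base it makes two substitutions that are assumptions of this example only: an association list stands in for OMap (it also keeps insertion order, but lookup is linear) and String stands in for Text. The names loginDocs and parameterOrder are hypothetical.

```haskell
type Route = String
type Parameter = String

-- Stand-in for OMap: an association list keeps insertion order.
data Details
  = Details [(Parameter, Details)]
  | Detail String
  deriving (Eq, Show)

newtype ApiDocs = ApiDocs { unApiDocs :: [(Route, Details)] }
  deriving (Eq, Show)

-- Illustrative documentation for a single endpoint.
loginDocs :: ApiDocs
loginDocs = ApiDocs
  [ ( "users/login/"
    , Details
        [ ("Request",  Details [("Format", Detail "CreateUser")])
        , ("Response", Details [("Format", Detail "UserData")])
        ]
    )
  ]

-- Field names of one route, in the order they were recorded.
parameterOrder :: Route -> ApiDocs -> [Parameter]
parameterOrder route docs =
  case lookup route (unApiDocs docs) of
    Just (Details fields) -> map fst fields
    _                     -> []
```

Here parameterOrder "users/login/" loginDocs evaluates to ["Request", "Response"], which is the ordering property the intermediate structure has to provide.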

Converting the API type to a value

How do we convert the API type to a value of type ApiDocs? Can we have a function which takes in types, returning values? Don’t functions primarily take in values?

To answer those questions, first, we recognize that our API type is built out of Servant types and type combinators.

Examples of Servant types

-- Type representing request body
data ReqBody (contentTypes :: [*]) (a :: *)

-- Type representing headers
data Header (sym :: Symbol) a

Example of Servant type combinator

-- Type combinator (:>) which combines Servant types
data (path :: k) :> (a :: *)

-- Type combinator (:>) is able to take in types as parameters
ReqBody '[()] () :> Header "" ()

Then, using these combinators and types, we can build our API type.

Our api with a single endpoint

-- Here, using the (:>) combinator,
-- we combine a type-level string "users",
-- the ReqBody type and the Post type
type API = "users" :> ReqBody '[()] () :> Post '[()] ()

Inductively, if we can convert these types and type combinators to values, we can convert our API type to a value.

Now, we need to think of a way to pattern match on types to get values. The way would be typeclasses!

We can define different instances of a typeclass which match on different types. These instances can give us different values, so long as they conform to the definitions of that typeclass.

Example of transforming types to values using typeclasses

{-# LANGUAGE AllowAmbiguousTypes, TypeApplications #-}

class HasAlphabet a where
    toAlpha :: String


data A

instance HasAlphabet A where
    toAlpha = "A"


data B

instance HasAlphabet B where
    toAlpha = "B"


-- Selecting the "toAlpha" instance for A and B with a type application

toAlpha @A == "A"

toAlpha @B == "B"

From this, we now know how to parse different Servant types and type combinators. We can create a typeclass, define instances for all Servant types and type combinators, then use that to parse them.

With this knowledge, we can implement a typeclass that parses a single endpoint, since an endpoint is made up of Servant types and combinators.

We match on the :> combinator and other types, converting them to Parameter-Details pairs and inserting them into the Details OMap. We then return a Route-Details pair.

Parsing each endpoint by pattern matching on :>

class HasParsable api where
    document :: Route -> Details -> (Route, Details)

-- Parses a route segment: appends the type-level string to the route
instance (KnownSymbol p, HasParsable a) => HasParsable ((p :: Symbol) :> a) where
    document r d = document @a (r <> symbolVal (Proxy @p)) d


-- 'a' contains the rest of the API chain
instance HasParsable a => HasParsable (Auth :> a) where
    document r d = document @a r
                 $ insert ("Authentication", Detail "true") d

-- Records the content type and format type of the request body
instance ( HasParsable a
         , Typeable ct
         , Typeable typ
         ) => HasParsable (ReqBody ct typ :> a) where

    document r d = document @a r
                 $ insert ( "Request"
                          , Details ( insert ( "Content-type"
                                             , typeRep @ct
                                             )
                                    . insert ( "Format"
                                             , typeRep @typ
                                             )
                                    $ empty
                                    )
                          ) d

We can convert types into their text form by using typeRep from Data.Typeable and type-level strings into their string equivalent by using symbolVal from GHC.TypeLits.
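
For readers who want something they can load into GHCi, the following is a self-contained sketch of the same idea using only base. It is an approximation under stated assumptions, not the library's implementation: :>, ReqBody and Post are local stand-ins for the Servant types, JSON, CreateUser and UserData are empty placeholder types, association lists replace OMap, String replaces Text, and the helper typeDetails is a name invented for this example. Unlike the simplified snippet above, it also inserts a "/" between route segments.

```haskell
{-# LANGUAGE AllowAmbiguousTypes, DataKinds, FlexibleInstances,
             KindSignatures, PolyKinds, ScopedTypeVariables,
             TypeApplications, TypeOperators #-}

import Data.Kind (Type)
import Data.Proxy (Proxy (..))
import Data.Typeable (Typeable, typeRep)
import GHC.TypeLits (KnownSymbol, Symbol, symbolVal)

-- Local stand-ins for Servant's types (assumption: not the real ones).
data (path :: k) :> (a :: Type)
infixr 4 :>
data ReqBody (contentTypes :: [Type]) (a :: Type)
data Post (contentTypes :: [Type]) (a :: Type)

-- Hypothetical content type and format types.
data JSON
data CreateUser
data UserData

type Route = String
type Parameter = String
data Details = Details [(Parameter, Details)] | Detail String
  deriving (Eq, Show)

class HasParsable (api :: Type) where
  -- Takes the route and the fields collected so far.
  document :: Route -> [(Parameter, Details)] -> (Route, Details)

-- A type-level string extends the route.
instance (KnownSymbol path, HasParsable rest)
      => HasParsable ((path :: Symbol) :> rest) where
  document r fields =
    document @rest (r <> "/" <> symbolVal (Proxy @path)) fields

-- A request body appends a "Request" field.
instance (Typeable ct, Typeable typ, HasParsable rest)
      => HasParsable (ReqBody ct typ :> rest) where
  document r fields =
    document @rest r (fields <> [("Request", typeDetails @ct @typ)])

-- The verb ends the chain and appends a "Response" field.
instance (Typeable ct, Typeable typ) => HasParsable (Post ct typ) where
  document r fields =
    (r, Details (fields <> [("Response", typeDetails @ct @typ)]))

-- Turns the format type and content types into text via typeRep.
typeDetails
  :: forall (ct :: [Type]) (typ :: Type). (Typeable ct, Typeable typ)
  => Details
typeDetails = Details
  [ ("Format", Detail (show (typeRep (Proxy @typ))))
  , ("ContentType", Detail (show (typeRep (Proxy @ct))))
  ]

type LoginAPI = "users" :> ReqBody '[JSON] CreateUser :> Post '[JSON] UserData

loginDocs :: (Route, Details)
loginDocs = document @LoginAPI "" []
```

Evaluating loginDocs gives the route "/users" paired with a "Request" field followed by a "Response" field, whose "Format" entries are "CreateUser" and "UserData". The route instance and the ReqBody instance do not overlap because the left-hand side of :> has kind Symbol in one and Type in the other.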

Next, we have a typeclass that destructures the flattened API type. As mentioned earlier, the flattened API type is a type-level list of endpoints. The typeclass recurses through the list, converting each Endpoint type into a Route-Details pair.

Since each endpoint type is an instance of the HasParsable typeclass, we can use the document function for the conversion. We then insert the pair into the rest of the ApiDocs OMap.

Collating the ApiDocs by pattern matching on ':

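class HasCollatable api where
    collate :: ApiDocs

instance (HasParsable endpoint, HasCollatable b) => HasCollatable (endpoint ': b) where
    -- Each endpoint starts from an empty route and empty details
    collate = ApiDocs $ document @endpoint "" (Details empty) `insert` unApiDocs (collate @b)

instance HasCollatable '[] where
    collate = ApiDocs empty -- An empty OMap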
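
The recursion over a type-level list can be tried in isolation with the sketch below. It is deliberately reduced and rests on assumptions: HasParsable here is a cut-down stand-in that returns a route and a one-line summary rather than full Details, Login and Logout are hypothetical endpoint types, and a plain list replaces the ApiDocs OMap. Only base is required.

```haskell
{-# LANGUAGE AllowAmbiguousTypes, DataKinds, FlexibleInstances,
             KindSignatures, ScopedTypeVariables, TypeApplications,
             TypeOperators #-}

import Data.Kind (Type)

type Route = String

-- Cut-down stand-in for the per-endpoint class: each endpoint type
-- yields a route and a one-line summary.
class HasParsable (endpoint :: Type) where
  document :: (Route, String)

-- Hypothetical endpoint types.
data Login
data Logout

instance HasParsable Login where
  document = ("users/login", "Logs a user in")

instance HasParsable Logout where
  document = ("users/logout", "Logs a user out")

class HasCollatable (endpoints :: [Type]) where
  collate :: [(Route, String)]

-- Base case: no endpoints left.
instance HasCollatable '[] where
  collate = []

-- Recursive case: document the head, then collate the tail.
instance (HasParsable e, HasCollatable rest)
      => HasCollatable (e ': rest) where
  collate = document @e : collate @rest

type Endpts = '[ Login, Logout ]

allDocs :: [(Route, String)]
allDocs = collate @Endpts
```

allDocs evaluates to the Login entry followed by the Logout entry, in the same order as the type-level list.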

Rendering the intermediate structure

The ApiDocs structure can then be rendered into other formats using another typeclass.

Renderable typeclass

class Renderable a where
    render :: ApiDocs -> a -- Render to specified format types

instance Renderable JSON
...

instance Renderable PlainText
...
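
As a sketch of what such instances might look like, the example below renders a small ApiDocs value to two formats. The assumptions are the same as in a base-only setting: association lists stand in for OMap and String for Text. RouteIndex is a hypothetical second format added to show that one intermediate value can feed several renderers, and renderField, indent and sampleDocs are illustrative names.

```haskell
type Route = String
type Parameter = String

data Details = Details [(Parameter, Details)] | Detail String
newtype ApiDocs = ApiDocs [(Route, Details)]

-- Output formats; RouteIndex is a hypothetical second format.
newtype PlainText = PlainText String deriving (Eq, Show)
newtype RouteIndex = RouteIndex [Route] deriving (Eq, Show)

class Renderable a where
  render :: ApiDocs -> a

-- Indented "key: value" lines, nested by depth.
instance Renderable PlainText where
  render (ApiDocs endpoints) =
    PlainText (unlines (concatMap (renderField 0) endpoints))

-- Only the routes, in documentation order.
instance Renderable RouteIndex where
  render (ApiDocs endpoints) = RouteIndex (map fst endpoints)

-- Renders one key and everything nested beneath it.
renderField :: Int -> (Parameter, Details) -> [String]
renderField depth (name, Detail value) =
  [indent depth <> name <> ": " <> value]
renderField depth (name, Details fields) =
  (indent depth <> name <> ":")
    : concatMap (renderField (depth + 1)) fields

indent :: Int -> String
indent depth = replicate (2 * depth) ' '

sampleDocs :: ApiDocs
sampleDocs = ApiDocs
  [ ( "users/login"
    , Details
        [ ("Request",  Details [("Format", Detail "CreateUser")])
        , ("Response", Details [("Format", Detail "UserData")])
        ]
    )
  ]
```

The target format is chosen by the expected result type, so render sampleDocs :: PlainText and render sampleDocs :: RouteIndex select different instances from the same input.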

Usage

Now, to get the documentation we can parse the API and then render it to the desired format.

Writing the docs as PlainText to a file

type MyAPI = ...

main :: IO ()
main = writeToFile apiDocs

writeToFile :: PlainText -> IO ()
writeToFile = ...

apiDocs :: PlainText
apiDocs = render @PlainText (parse @MyAPI)

Reflections

What can be improved?

Currently there are some open issues in the repository.

The main ones which come to mind would be:

Trade offs

Documentation may not be sufficiently extensive, as the generated documentation lacks examples by default. These are up to the user to include.

servant-docs has a much more developed ecosystem. This can be seen in packages such as servant-pandoc, which supports many more output formats than servant-docs-simple.

Pros

You get lightweight documentation for free! As long as you have an API type, you can generate simple documentation in a variety of outputs.

It is also easy to extend. We have included a bunch of tutorial scripts you can refer to. These include writing your own Renderable instances for rendering custom output and HasParsable instances to parse custom types.

Other thoughts

Writing this library was very interesting, as it was my first attempt at type-level programming. The feeling was similar to when I first learnt about Monoids, Functors and Monads.

When I started working on this project I was worried but excited. I had heard interesting things about Servant without ever really understanding it, and now I could finally get some experience. Along the way I referred to Servant's documentation and Thinking with Types, which helped me immensely in understanding type-level programming in Haskell.

After working on this project, I understand compile-time safety a lot better. By encoding the API as a type, we can more safely reuse it in a variety of ways, such as documentation (as in this case), on the client-side and so on.

I also learned about the nitty-gritty of packaging a module, such as uploading to Hackage and Stackage, configuring CI, and setting up the project with summoner, among many other things.

For more details, you can reference the README as well as the Hackage docs. We also welcome PRs and Issues to improve the package! Thanks for reading :)