Official API bindings are not always available for every language. Generating code from an API definition makes writing bindings much simpler than translating the API field by field and function by function. When there is no public API definition, one way around the problem is to translate from the official code definitions.

Background

This blog is currently written in Notion and deployed automatically every night by a GitHub Action. See my last project to learn more about this.

That project uses a library written in Go to communicate with Notion’s API. However, the library is maintained by community contributors (who do not work for Notion), which makes it hard to keep up with the official API, and copying definitions by hand introduces many hard-to-notice bugs.

Proposed Solution

One possible solution I read about was using Wasmer to create bindings to the TypeScript API, so that any language supporting Wasmer could build its bindings on top of that WebAssembly runtime. The main strength of this solution is portability. I discarded it immediately because, as you may have guessed, it carries the overhead of depending on a WebAssembly runtime, which is a lot for such a small API.

Another solution, proposed by one of the library contributors, was generating the structs from the official implementation of the API. This would solve all of these problems and also significantly reduce the time needed to keep up to date.

Generating structs based on the typescript types from the official JS sdk · Issue #10 · jomei/notionapi
After going back and forth on #9, it feels like making sure the API reference is properly implemented is a quite meticulous task that is quite error prone and may introduce vicious bugs that will b...

The proposal mentions a really interesting method used by gopls, the Go implementation of the Language Server Protocol (LSP). To keep up with the protocol definition (written in TypeScript), they use the TypeScript compiler API to inspect the definitions and generate Go code from them.

At first, I thought it was a crazy idea (well, I still do). But I couldn’t stop myself from trying to implement it, so here is how it went.

Implementation

There are many open questions about how to translate TypeScript type definitions into usable code of the same quality as a handmade translation.

Type aliases

Notion’s SDK uses TypeScript type aliases to define the model types exchanged between the client and the API.

type IdRequest = string | string

type TextRequest = string

type RichTextItemRequest =
  | {
      text: { content: string; link?: { url: TextRequest } | null }
      type?: "text"
      annotations?: {
        bold?: boolean
        italic?: boolean
        strikethrough?: boolean
        underline?: boolean
        code?: boolean
        color?:
          | "default"
          | "gray"
          | "brown"
          | "orange"
          | "yellow"
          | "green"
          | "blue"
          | "purple"
          | "pink"
          | "red"
          | "gray_background"
          | "brown_background"
          | "orange_background"
          | "yellow_background"
          | "green_background"
          | "blue_background"
          | "purple_background"
          | "pink_background"
          | "red_background"
      }
    }
...

We can easily loop over every type alias in a file using the compiler API.

The problem comes with getting the type representation of each one. To simplify the AST types, I created two structs (TypeScript interfaces, really): one to hold a type definition and another for an attribute definition.

interface TypeDef {
    id?: string;
    // type will be defined if it is a basic type.
    type?: string;
    // value will be defined if it is a literal type
    value?: any;
    // attributes will be defined if it is a type literal.
    attributes?: AttribDef[];
    level?: number;
    isInterface?: boolean;
}
...
interface AttribDef {
    id: string;
    optional?: boolean;
    type: TypeDef;
    jsonName?: string;
}

First of all, sorry for all the optional properties; I don’t know the idiomatic way to do this. These structs store useful information about each type: where it is defined, its identifier, and even its value if it is a constant.

The next step seems easy: walk the AST recursively, storing each type in these structs. But reality strikes when you dig deeper into TypeScript type alias declarations.

type BlockObjectRequest =
  | {
      heading_1: { text: Array<RichTextItemRequest> }
      type?: "heading_1"
      object?: "block"
    }
  | {
      heading_2: { text: Array<RichTextItemRequest> }
      type?: "heading_2"
      object?: "block"
    }
  | {
      heading_3: { text: Array<RichTextItemRequest> }
      type?: "heading_3"
      object?: "block"
    }
  | {
      embed: { url: string; caption?: Array<RichTextItemRequest> }
      type?: "embed"
      object?: "block"
    }

The code above is an example of a type alias used to define blocks in Notion. For anyone who has never touched TypeScript (like me until five days ago), those vertical bars mean it is a union type.

Union types

A union type is a type whose values can be any of the member types of the union. Go has no built-in union types, so expressing this concept there is difficult.

The Go version of the API defines an interface type Block and a concrete type for each member of the union, such as Heading1, Heading2… Those concrete types implement the interface.

The key point is that we have to extract an interface with the attributes common to every member of the union (in the previous example, the type and object attributes). Later we will also have to decide the names of the concrete types.
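As a rough sketch of this pattern (not the actual code of the Go library): the names RichText, GetType, GetObject and Describe below are made up for illustration, and only two members of the union are shown.

```go
package main

import "fmt"

// RichText is a hypothetical, simplified stand-in for RichTextItemRequest.
type RichText struct {
	Content string
}

// Block is the interface extracted from the union: one method per
// attribute shared by every member ("type" and "object").
type Block interface {
	GetType() string
	GetObject() string
}

// Heading1 is the concrete type for the "heading_1" member of the union.
type Heading1 struct {
	Text []RichText
}

func (Heading1) GetType() string   { return "heading_1" }
func (Heading1) GetObject() string { return "block" }

// Heading2 is the concrete type for the "heading_2" member of the union.
type Heading2 struct {
	Text []RichText
}

func (Heading2) GetType() string   { return "heading_2" }
func (Heading2) GetObject() string { return "block" }

// Describe accepts any member of the union and narrows it with a type switch.
func Describe(b Block) string {
	switch v := b.(type) {
	case Heading1:
		return fmt.Sprintf("h1 with %d text item(s)", len(v.Text))
	case Heading2:
		return fmt.Sprintf("h2 with %d text item(s)", len(v.Text))
	default:
		return b.GetObject() + " of type " + b.GetType()
	}
}

func main() {
	blocks := []Block{
		Heading1{Text: []RichText{{Content: "Hello"}}},
		Heading2{},
	}
	for _, b := range blocks {
		fmt.Println(Describe(b))
	}
}
```

The type switch is the price of emulating a union with an interface: the compiler does not check that every member is handled, so a default branch is a sensible safety net.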

Creating the interface

To create the interface, we iterate over all the child AST nodes of the union type to build our own representation of the type tree.

Once we have our own representation of the union members, we can intersect their attribute arrays to keep only the common ones.

The common attributes will be present in all concrete types, and they will also be part of the interface as methods that the concrete types must implement.
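The intersection step can be sketched as follows. My generator works on the TypeScript structures shown earlier; this is a Go rendition for illustration only, where Attrib is a hypothetical, reduced mirror of AttribDef and CommonAttribs is a made-up helper. It assumes attributes are matched by identifier alone, since the literal values (such as "heading_1" and "heading_2") differ between members.

```go
package main

import "fmt"

// Attrib is a hypothetical Go mirror of AttribDef, reduced to the
// fields needed for the intersection.
type Attrib struct {
	ID       string
	Optional bool
}

// CommonAttribs returns the attributes whose ID appears in every member
// of the union, in the order they appear in the first member.
func CommonAttribs(members [][]Attrib) []Attrib {
	if len(members) == 0 {
		return nil
	}
	// Count in how many members each ID appears (at most once per member).
	count := make(map[string]int)
	for _, attrs := range members {
		inMember := make(map[string]bool)
		for _, a := range attrs {
			if !inMember[a.ID] {
				inMember[a.ID] = true
				count[a.ID]++
			}
		}
	}
	var common []Attrib
	for _, a := range members[0] {
		if count[a.ID] == len(members) {
			common = append(common, a)
		}
	}
	return common
}

func main() {
	members := [][]Attrib{
		{{ID: "heading_1"}, {ID: "type", Optional: true}, {ID: "object", Optional: true}},
		{{ID: "heading_2"}, {ID: "type", Optional: true}, {ID: "object", Optional: true}},
		{{ID: "embed"}, {ID: "type", Optional: true}, {ID: "object", Optional: true}},
	}
	for _, a := range CommonAttribs(members) {
		fmt.Println(a.ID)
	}
}
```

With the three members above, only type and object survive, which are exactly the attributes that end up as methods on the interface.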

Embedding types

If you look closely at the official API, the types contain a lot of embedded structs, which have no name. My first implementation simply embedded the types as the official API did, giving them no name. This worked very well and I was able to generate most of the content easily.

But, as you may have imagined, it wasn’t perfect. Yes, the types followed the API definitions. However, keeping the codebase small and reusing common structs is a good practice that I couldn’t just ignore.

Embedding the definition of a RichText (text with formatting, links, etc.) 100 times is probably the worst decision you can make. Keep in mind that you couldn’t practically write a function that processes a RichText without repeating the whole anonymous struct type, and you couldn’t make it satisfy an interface, because Go does not allow methods on unnamed types.

So… I had to follow another path, which I called context naming.

Context naming

Context naming is a way of assigning names to unnamed embedded structs. To do so, we use the closest name we can find that is related to the type. For example:

type BlockEmbed struct {
    Embed struct {
        URL string
    }
    ...
}

Could be easily converted into:

type Embed struct {
    URL string
}

type BlockEmbed struct {
    Embed Embed
}

If a previously defined type with the same name already exists, we can compare the two and check whether they contain the same fields. If the existing type can be used as-is, or extended with optional attributes, there is no need to create a new type.

If the existing type cannot be reused or merged, we can add more context to the name, such as BlockEmbedEmbed.
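A minimal sketch of that naming decision, under some assumptions: StructDef, Registry and Name are hypothetical names, a struct is reduced to a map from field name to Go type name, and the step that merges types by adding optional attributes is left out. PageEmbed is an invented parent used only to show reuse.

```go
package main

import (
	"fmt"
	"reflect"
)

// StructDef is a hypothetical description of a generated struct:
// field name -> Go type name.
type StructDef map[string]string

// Registry holds every named struct generated so far.
type Registry map[string]StructDef

// Name picks a name for the unnamed struct found in field `field` of the
// type `parent`. It tries the closest name first (the field name) and adds
// parent context only on a collision. It reports false when both
// candidates are taken by different definitions.
func (r Registry) Name(parent, field string, def StructDef) (string, bool) {
	for _, candidate := range []string{field, parent + field} {
		existing, taken := r[candidate]
		if !taken {
			r[candidate] = def // free name: register the new type
			return candidate, true
		}
		if reflect.DeepEqual(existing, def) {
			return candidate, true // same fields: reuse the existing type
		}
	}
	return "", false
}

func main() {
	r := Registry{}

	name, _ := r.Name("BlockEmbed", "Embed", StructDef{"URL": "string"})
	fmt.Println(name) // first definition takes the short name

	name, _ = r.Name("PageEmbed", "Embed", StructDef{"URL": "string"})
	fmt.Println(name) // identical fields, so the short name is reused

	name, _ = r.Name("BlockEmbed", "Embed", StructDef{"URL": "string", "Caption": "[]RichText"})
	fmt.Println(name) // different fields, so parent context is added
}
```

A real generator would have to keep adding context (or give up and ask a human) when the longer name collides as well, which is where the names start to get ugly.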

To follow Go’s naming conventions, we also need to convert the generated names from the snake_case used by the API to CamelCase. In addition, initialisms such as Id or Url are conventionally written in all uppercase, like ID and URL.
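A small sketch of that conversion, assuming ASCII-only names as in the API. ToGoName is a hypothetical helper, and the initialism list is a short illustrative subset, not a complete one.

```go
package main

import (
	"fmt"
	"strings"
)

// initialisms is a small, illustrative subset of the words that Go
// convention writes in all uppercase.
var initialisms = map[string]bool{
	"id":   true,
	"url":  true,
	"api":  true,
	"json": true,
	"http": true,
}

// ToGoName converts a snake_case API name into an exported Go identifier.
// It assumes ASCII input, which holds for the names used by the API.
func ToGoName(snake string) string {
	var b strings.Builder
	for _, part := range strings.Split(snake, "_") {
		if part == "" {
			continue // skip empty parts from leading or doubled underscores
		}
		lower := strings.ToLower(part)
		if initialisms[lower] {
			b.WriteString(strings.ToUpper(lower))
			continue
		}
		b.WriteString(strings.ToUpper(lower[:1]) + lower[1:])
	}
	return b.String()
}

func main() {
	for _, name := range []string{"heading_1", "url", "database_id", "gray_background"} {
		fmt.Printf("%s -> %s\n", name, ToGoName(name))
	}
}
```

This handles the mechanical part of naming; it does nothing for the harder problem of choosing a good name in the first place.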

Bad news 😿

Okay… so this is where I decided to leave the project. I didn’t expect that choosing names for the types would become the most tedious task. Go has its own naming conventions, and we also had to follow the style of the existing API. We could have written a new API implementation using our own naming conventions, but the purpose of this project was to automate and simplify (it’s okay, you can laugh) the existing API implementation instead of starting from zero.

It wasn’t that the task was too difficult: I could have solved name collisions by splitting parent names and pulling in more context whenever names collided. It was because I realized an automatic script wouldn’t come up with names as good as a human’s.

Think back to the last embed example, but now using the official API definitions. This would generate:

type Block interface {}

...

type Embed struct {
    Url string
    ...
}

type BlockEmbed struct {
    Embed Embed
    ...
}

If the types don’t match, we end up with names like BlockEmbedEmbed and, trust me, that is not the worst one that could be generated.

Still, this is not the main reason I’m not continuing the project. The main reason is that Notion’s original intention was to publish an OpenAPI definition, so that the model types, server and client would be generated automatically. That definition could be used to generate the Go bindings with a single command using existing tools such as oapi-codegen.

Client basics by aoberoi · Pull Request #1 · makenotion/notion-sdk-js
Goals Establish a project / file structure Establish build configuration (TypeScript) Implement some basic client features (the API was established in an internal RFC) Initializing a client insta...
Type predictions for union types by EnixCoda · Pull Request #115 · makenotion/notion-sdk-js
This PR provides prediction functions that distinguish union types, which are frequently used in request return values. It works very well in my own use cases. Usage example (modified examples/data...

Conclusion

The TypeScript compiler API is an excellent resource for building any kind of inspection tool for the language, such as detecting bad practices, generating documentation or even generating code.

However, translating between languages is not an easy task when they are totally different and, even worse, follow different paradigms. The translation could be made to suit the Notion API by implementing many more heuristics and special cases, but waiting for the “promised” OpenAPI definition may be the smartest solution.

It was a great topic to learn about, and I hope more languages provide this kind of inspection tooling.