Work in progress

This document is not complete. Please check back soon for updates.

Authenticated Transfer Protocol

Glossary

  • Client: The application running on the user's device. Interacts with the network through a PDS.
  • Personal Data Server (PDS): A server hosting user data. Acts as the user's personal agent on the network.
  • Name server: A server mapping domains to DIDs via the com.atproto.identity.resolveHandle() API. Often a PDS.
  • Big Graph Service (BGS): A service that handles all of your events and provides functions such as retrieving large-scale metrics (likes, reposts, followers), content discovery (algorithms), and user search.

Wire protocol (XRPC)

Atproto uses a light wrapper over HTTPS called XRPC. XRPC uses Lexicon, a global schema system, to unify behaviors across hosts. The atproto.com lexicons enumerate all XRPC methods used in ATP.
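
As a rough illustration of the "light wrapper over HTTPS", the sketch below builds the URL for a call to the com.atproto.identity.resolveHandle method named in the glossary. It assumes that a method is addressed by its NSID under an /xrpc/ path with parameters in the query string. The host pds.example.com, the helper name xrpc_url, and the handle parameter name are illustrative assumptions, not taken from this document.

```python
from urllib.parse import urlencode


def xrpc_url(host, nsid, params=None):
    """Build the HTTPS URL for an XRPC method identified by its NSID.

    Hypothetical helper: assumes methods live under the /xrpc/ path.
    """
    url = f"https://{host}/xrpc/{nsid}"
    if params:
        # Parameters are passed as an ordinary URL query string.
        url += "?" + urlencode(params)
    return url


# pds.example.com is a placeholder host, not a real service.
print(xrpc_url(
    "pds.example.com",
    "com.atproto.identity.resolveHandle",
    {"handle": "alice.host.com"},
))
```

The sketch only constructs the URL; sending the request and validating the response against the method's Lexicon schema are left out.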

Identifiers

The following identifiers are used in atproto:

Identifier   | Usage
Domain names | A unique global identifier that weakly identifies repositories.
DID          | A unique global identifier that strongly identifies repositories.
NSID         | A unique global identifier that identifies record types and XRPC methods.
TID          | A timestamp-based ID that identifies records.

Domain names

Domain names (aka "handles") weakly identify repositories. They are a convenience that should be used in UIs but rarely used within records to reference data, because they may change at any time. The repo DID is preferred as a stable identifier.

DIDs

DIDs are unique global identifiers which strongly identify repositories. They are considered "strong" because they should never change during the lifecycle of a user. They should rarely be used in UIs, but should always be used in records to reference data.

Atproto supports two DID methods:

  • Web (did:web). Should be used only when the user is "self-hosting" and therefore directly controls the domain name & server. May also be used during testing.
  • Placeholder (did:plc). A method developed in conjunction with atproto to provide global secure IDs which are host-independent.

DIDs resolve to "DID Documents" which provide the address of the repo's host and the public key used to sign the repo's updates.
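
For the did:web method, the location of the DID Document can be derived from the identifier itself. The sketch below follows the general did:web rules (a bare domain maps to /.well-known/did.json, and extra colon-separated segments become a path). The function name is hypothetical, and whether atproto accepts path-style did:web identifiers is not stated here. Resolving did:plc identifiers needs a separate lookup service and is not covered.

```python
from urllib.parse import unquote


def did_web_document_url(did):
    """Return the HTTPS URL where a did:web DID Document is expected to live."""
    prefix = "did:web:"
    if not did.startswith(prefix):
        raise ValueError("not a did:web identifier")
    # Segments are colon-separated; a port is percent-encoded as %3A.
    parts = [unquote(p) for p in did[len(prefix):].split(":")]
    if any(not p for p in parts):
        raise ValueError("empty segment in did:web identifier")
    host, path = parts[0], parts[1:]
    if path:
        return f"https://{host}/{'/'.join(path)}/did.json"
    return f"https://{host}/.well-known/did.json"


print(did_web_document_url("did:web:alice.host.com"))
```

The document fetched from that URL would then supply the repo host address and the signing key described above.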

Timestamp IDs (TID)

Describe TIDs

URI scheme

Atproto uses the at:// URI scheme (specified here). Some example at:// URIs:

Repository   | at://alice.host.com
Repository   | at://did:plc:bv6ggog3tya2z3vxsub7hnal
Collection   | at://alice.host.com/io.example.song
Record       | at://alice.host.com/io.example.song/3yI5-c1z-cc2p-1a
Record Field | at://bob.com/io.example.song/3yI5-c1z-cc2p-1a#/title
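
The examples above suggest a layout of authority, then an optional collection NSID, then an optional record key, then an optional fragment. The sketch below splits a URI along those lines. It is inferred from the examples only, not from the formal scheme specification, and the names AtUri and parse_at_uri are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class AtUri(NamedTuple):
    authority: str                # handle or DID
    collection: Optional[str]     # NSID of the collection, if present
    record_key: Optional[str]     # key of the record, if present
    fragment: Optional[str]       # field path after '#', if present


# A DID authority contains colons, so a generic URL parser that expects
# host:port is avoided in favor of a small hand-written pattern.
_AT_URI = re.compile(
    r"^at://(?P<authority>[^/#]+)"
    r"(?:/(?P<collection>[^/#]+))?"
    r"(?:/(?P<rkey>[^/#]+))?"
    r"(?:#(?P<fragment>.*))?$"
)


def parse_at_uri(uri):
    """Split an at:// URI into its parts, raising ValueError if it does not fit."""
    match = _AT_URI.match(uri)
    if match is None:
        raise ValueError(f"not a recognizable at:// URI: {uri!r}")
    return AtUri(
        match.group("authority"),
        match.group("collection"),
        match.group("rkey"),
        match.group("fragment"),
    )


print(parse_at_uri("at://bob.com/io.example.song/3yI5-c1z-cc2p-1a#/title"))
```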

Schemas

Atproto uses strict schema definitions for XRPC methods and record types. These schemas are identified using NSIDs and defined using Lexicon.

Repositories

A data repository is a collection of signed records.

It is an implementation of a Merkle Search Tree (MST). The MST is an ordered, insert-order-independent, deterministic tree. Keys are laid out in alphabetical order. The key insight of an MST is that each key is hashed and the leading zeros of the hash are counted to determine which layer the key falls on (5 leading zero bits per layer for a fanout of ~32, since 2^5 = 32).
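
The layer rule can be sketched as follows. The choice of SHA-256 as the hash function and the reading of "5 zeros" as 5 leading zero bits per layer are assumptions made for illustration; the function names are hypothetical.

```python
import hashlib


def leading_zero_bits(digest):
    """Count the zero bits at the start of a byte string."""
    count = 0
    for byte in digest:
        if byte == 0:
            count += 8
            continue
        # bit_length() is the position of the highest set bit in this byte.
        count += 8 - byte.bit_length()
        break
    return count


def mst_layer(key, bits_per_layer=5):
    """Return the layer a key falls on: one layer per 5 leading zero bits.

    Assumption: SHA-256 is used as the key hash. With 5 bits per layer,
    roughly 1 key in 32 is promoted to each next layer up.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return leading_zero_bits(digest) // bits_per_layer


print(mst_layer("io.example.song/3yI5-c1z-cc2p-1a"))
```

Because the layer depends only on the key's hash, the same set of keys yields the same tree shape regardless of insertion order.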

This is a Merkle tree, so each subtree is referred to by its hash (CID). When a leaf is changed, every tree on the path to that leaf is changed as well, thereby updating the root hash.
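
A toy model makes the propagation visible. The sketch below is not the MST or the CID format; it uses plain SHA-256 hex digests over a nested list, purely to show that changing one leaf changes every hash on the path to the root and nothing else.

```python
import hashlib


def node_hash(node):
    """Hash a leaf (bytes) directly, or an inner node (list) via its children's hashes."""
    if isinstance(node, bytes):
        return hashlib.sha256(b"leaf:" + node).hexdigest()
    child_hashes = "".join(node_hash(child) for child in node)
    return hashlib.sha256(b"node:" + child_hashes.encode("ascii")).hexdigest()


before = [[b"a", b"b"], [b"c", b"d"]]
after = [[b"a", b"b"], [b"c", b"D"]]  # only the last leaf differs

# The untouched left subtree keeps its hash ...
print(node_hash(before[0]) == node_hash(after[0]))
# ... while the right subtree and the root both change.
print(node_hash(before[1]) != node_hash(after[1]))
print(node_hash(before) != node_hash(after))
```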

Repo data layout

Provide a more detailed description of the data layout and how the MST is organized.

The repository data layout establishes the units of network-transmissible data. It includes the following three major groupings:

Grouping   | Description
Repository | Repositories are the dataset of a single "user" in the atproto network. Every user has a single repository which is identified by a DID.
Collection | A collection is an ordered list of records. Every collection is identified by an NSID. Collections only contain records of the type identified by their NSID.
Record     | A record is a key/value document. It is the smallest unit of data which can be transmitted over the network. Every record has a type and is identified by a TID.

Every node is an IPLD object (dag-cbor to be specific) which is referenced by a CID hash.

Node Type | Description
Signed Root ("commit") | The Signed Root, or “commit”, is the topmost node in a repo. It contains:
  • root: The CID of the Root node.
  • sig: A signature.
Root | The Root node contains:
  • did: The DID of this repository.
  • prev: The CID(s) of the previous commit node(s) in this repository’s history.
  • data: The Merkle Search Tree topmost node.
  • auth_token: The jwt-encoded UCAN that gives authority to make the write which produced this root.
MST Node | The Merkle Search Tree Nodes contain:
  • l: (Optional) The CID of the leftmost subtree.
  • e: An array of MST Entries.
MST Entry | The Merkle Search Tree Entries contain:
  • p: Prefix count of utf-8 chars that this key shares with the prev key.
  • k: The rest of the key outside the shared prefix.
  • v: The CID of the value of the entry.
  • t: (Optional) The CID of the next subtree (to the right of the leaf).
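
The p and k fields of an MST Entry amount to prefix compression over sorted keys. The sketch below shows one way that could work, leaving out the v and t fields. It counts Python characters, which matches UTF-8 byte counts only for ASCII keys, and the function names and example keys are hypothetical.

```python
def compress_keys(sorted_keys):
    """Turn sorted keys into entries holding a shared-prefix count (p) and remainder (k)."""
    entries = []
    prev = ""
    for key in sorted_keys:
        p = 0
        # Count how many leading characters this key shares with the previous one.
        while p < min(len(prev), len(key)) and prev[p] == key[p]:
            p += 1
        entries.append({"p": p, "k": key[p:]})
        prev = key
    return entries


def expand_keys(entries):
    """Rebuild the full keys from the compressed entries."""
    keys = []
    prev = ""
    for entry in entries:
        key = prev[:entry["p"]] + entry["k"]
        keys.append(key)
        prev = key
    return keys


keys = ["io.example.song/aaa", "io.example.song/aab", "io.example.video/aaa"]
entries = compress_keys(keys)
print(entries)
print(expand_keys(entries) == keys)
```

Since neighboring keys in a sorted tree often share a long prefix (for example a collection name), storing only the differing suffix keeps nodes small.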

Repo encodings

All data in the repository is encoded using CBOR. The following value types are supported:

  • null: A CBOR simple value (major type 7) with a value of 22 (null), encoded as the single byte 0xf6.
  • boolean: A CBOR simple value (major type 7) with a value of 21 (true) or 20 (false), encoded as the single byte 0xf5 or 0xf4.
  • integer: A CBOR integer (major type 0 or 1), choosing the shortest byte representation.
  • float: A CBOR floating-point number (major type 7). All floating-point values MUST be encoded as 64-bit values (additional information 27), even for integral values.
  • string: A CBOR string (major type 3).
  • list: A CBOR array (major type 4), where each element of the list is added, in order, as a value of the array according to its type.
  • map: A CBOR map (major type 5), where each entry is represented as a member of the CBOR map. Each entry key is expressed as a CBOR string (major type 3).
Are we missing value types? Binary? CID/Link?
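
To make the encodings above concrete, here is a minimal hand-written encoder for the listed value types, following the general CBOR rules. It is a sketch rather than a reference implementation: map entries are written in insertion order because the normalization rules are not yet described, and binary and CID/link values are not handled. The names _head and encode are hypothetical.

```python
import struct


def _head(major, n):
    """Encode a major type and unsigned argument using the shortest form."""
    if n < 24:
        return bytes([(major << 5) | n])
    if n < 2**8:
        return bytes([(major << 5) | 24, n])
    if n < 2**16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", n)
    if n < 2**32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", n)
    if n < 2**64:
        return bytes([(major << 5) | 27]) + struct.pack(">Q", n)
    raise ValueError("integer out of range for CBOR major types 0 and 1")


def encode(value):
    """Encode a JSON-like Python value as CBOR bytes."""
    if value is None:
        return b"\xf6"  # simple value 22
    if value is True:
        return b"\xf5"  # simple value 21
    if value is False:
        return b"\xf4"  # simple value 20
    if isinstance(value, int):
        # Negative integers use major type 1 with the argument -1 - n.
        return _head(0, value) if value >= 0 else _head(1, -1 - value)
    if isinstance(value, float):
        # Always 64-bit, even when the value is integral.
        return b"\xfb" + struct.pack(">d", value)
    if isinstance(value, str):
        data = value.encode("utf-8")
        return _head(3, len(data)) + data
    if isinstance(value, list):
        return _head(4, len(value)) + b"".join(encode(v) for v in value)
    if isinstance(value, dict):
        out = _head(5, len(value))
        for key, item in value.items():  # insertion order; ordering rules are an open question
            if not isinstance(key, str):
                raise TypeError("map keys must be strings")
            out += encode(key) + encode(item)
        return out
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(encode({"title": "x", "plays": 3, "rating": 1.0}).hex())
```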

Repo CBOR normalization

Describe normalization algorithm

Repo records

Repo records are CBOR-encoded objects (using only JSON-compatible CBOR types). Each record has a "type" which is defined by a lexicon. The type defines which collection will contain the record as well as the expected schema of the record.

The AT Protocol reserves dollar ($) prefixed fields as system fields. The following fields have a system-defined meaning:

Field | Usage
$type | Declares the type of a record. (Required on records and Union objects)
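
As a small illustration, the sketch below reads the $type field of a record to decide which collection it belongs to, on the reading that a collection's NSID is the same as the type of the records it holds. The record contents and the helper name collection_for are made up for the example.

```python
def collection_for(record):
    """Return the collection NSID for a record, taken from its $type field."""
    record_type = record.get("$type")
    if not isinstance(record_type, str) or not record_type:
        raise ValueError("record is missing the required $type field")
    return record_type


# Hypothetical record using the io.example.song type from the URI examples.
song = {"$type": "io.example.song", "title": "Example Song"}
print(collection_for(song))
```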

Client-to-server API

The client-to-server API drives communication between a client application and the user's PDS. The APIs are dictated by the lexicons implemented by the PDS. It's recommended that every PDS support the full atproto.com lexicon. Application-level lexicons such as bsky.app are also recommended.

Authentication

Authentication is a simple session-oriented process. View the API call in the applications model section of the docs here.

App passwords

We also have app passwords, an initial solution for authentication that lets users use third-party clients without having to trust them with their primary password. In the long term, we plan to add SSO (Single Sign-On) authentication with scoped permissions.

Users can log in to third-party apps with an app password in the same way that they log in with their account password. App passwords have most of the same abilities as the user's account password, but they're restricted from destructive actions such as account deletion or account migration. They are also restricted from creating additional app passwords.

No client changes are required to adopt app passwords. However, we strongly encourage you to prompt users to use an app password on login and to avoid ever entering their account password. For account creation, we encourage redirecting the user to the Bluesky web client.

If you expect that users have used their primary password with your client, we also strongly encourage you to delete all existing refresh tokens and re-fetch access/refresh tokens using an app password.

App passwords are of the form xxxx-xxxx-xxxx-xxxx. For your users' safety, you could run a quick check to ensure that they are logging in with an app password and not their account password.
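
One possible form for that quick check is sketched below. The source only gives the xxxx-xxxx-xxxx-xxxx shape, so the allowed character set (lower-case letters and digits) is an assumption, and the function name is hypothetical.

```python
import re

# Four groups of four characters separated by hyphens.
# Assumption: each "x" is a lower-case letter or a digit.
APP_PASSWORD = re.compile(r"[a-z0-9]{4}(-[a-z0-9]{4}){3}")


def looks_like_app_password(secret):
    """Return True if the string has the xxxx-xxxx-xxxx-xxxx shape."""
    return APP_PASSWORD.fullmatch(secret) is not None


print(looks_like_app_password("abcd-1234-efgh-5678"))
print(looks_like_app_password("hunter2"))
```

A client could use a check like this to warn the user before submitting what appears to be an account password. It cannot confirm that a string really is a valid app password.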

To generate an app password, users navigate to Settings > Advanced > App passwords.

Atproto core lexicon

The com.atproto.* lexicons provide the following behaviors:

Additional lexicons

For atproto to be practically useful, it needs to support a variety of sophisticated queries and behaviors. While these behaviors could be implemented on the user's device, they would run more slowly there than on the server. Therefore, the PDS is expected to implement lexicons which provide higher-level APIs. The reference PDS created by Bluesky implements the bsky.app lexicon.

Server-to-server API

The server-to-server APIs enable federation, event delivery, and global indexing. They may also be used to provide application behaviors such as mail delivery and form submission.

Authentication

Describe how servers may authenticate with each other
