CBOR Credential Database Format

A Collection of Interesting Ideas,

This version:
https://r4gus.github.io/ccdb/index.html
Issue Tracking:
GitHub
Editor:
Not Ready For Implementation

This spec is not yet ready for implementation. It exists in this repository to record the ideas and promote discussion.

Before attempting to implement this spec, please contact the editors.


Abstract

This document describes a format to store secrets at rest based on the CBOR data format. It is designed as an alternative to other file formats like KDBX used with KeePass and KeePassXC.

1. Introduction

Storing secrets securely is an important part of credential management. Unfortunately, most password managers and other applications that manage secrets implement their own, sometimes proprietary, credential database scheme.

This document proposes a new format for storing credentials at rest. The following goals are being pursued:

Extensibility: The format allows new capabilities to be added over time. Third parties should be able to enrich the information embedded in the file with proprietary extensions, and tools unaware of newer extensions should be able to ignore them.

Resilience: The format protects the data using state-of-the-art cryptographic algorithms to ensure the confidentiality and integrity of the protected data.

1.1. CBOR Grammar

This document uses the same grammar as used by [RFC8152]. The CBOR structures are described in prose.

bool

A boolean value (true: major type 7, value 21; false: major type 7, value 20)

int

An unsigned integer or a negative integer.

uint

An unsigned integer (major type 0)

nint

A negative integer (major type 1)

bstr

Byte string (major type 2)

tstr

Text string (major type 3)

map

A CBOR map (major type 5)

[+ FOO]

Indicates that the type FOO appears one or more times in an array. An optional empty array that is part of a map MUST NOT be serialized.
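
As an illustration of the major types listed above, the following Python sketch encodes the corresponding values into CBOR bytes. It is a minimal, non-normative example: the names `_head` and `encode` are hypothetical, only the types from this section (plus arrays) are handled, and map entries are written in insertion order. Note that CBOR itself stores multi-byte lengths and integers in big-endian order.

```python
import struct


def _head(major: int, value: int) -> bytes:
    """Encode a CBOR head: 3-bit major type plus an unsigned argument."""
    if value < 24:
        return bytes([(major << 5) | value])
    # Additional information 24..27 selects a 1, 2, 4 or 8 byte big-endian argument.
    for info, fmt in ((24, ">B"), (25, ">H"), (26, ">I"), (27, ">Q")):
        if value < (1 << (8 * struct.calcsize(fmt))):
            return bytes([(major << 5) | info]) + struct.pack(fmt, value)
    raise ValueError("integer too large for CBOR")


def encode(item) -> bytes:
    """Encode bool, int, bytes, str, list and dict values as CBOR."""
    if item is True:
        return b"\xf5"  # major type 7, value 21
    if item is False:
        return b"\xf4"  # major type 7, value 20
    if isinstance(item, int):
        # uint is major type 0; nint is major type 1 and stores -1 - n.
        return _head(0, item) if item >= 0 else _head(1, -1 - item)
    if isinstance(item, bytes):
        return _head(2, len(item)) + item  # bstr
    if isinstance(item, str):
        data = item.encode("utf-8")
        return _head(3, len(data)) + data  # tstr
    if isinstance(item, list):
        return _head(4, len(item)) + b"".join(encode(x) for x in item)
    if isinstance(item, dict):
        body = b"".join(encode(k) + encode(v) for k, v in item.items())
        return _head(5, len(item)) + body  # map
    raise TypeError(f"unsupported type: {type(item).__name__}")
```

For instance, `encode(-7)` yields the single byte `0x26`, which is the nint encoding used for the alg value in the COSE key example of § 2.2.1.6 Entry.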

1.2. Conventions

Byte

A byte is a value in the range [0, 255] that can be represented with 8 bits.

Unsigned numbers

Unsigned numbers are represented as uN, a number in the range [0, 2^N - 1], e.g., u32 is a number between 0 and 4294967295.

Signed numbers

Signed numbers are represented as iN, a number in the range [-2^(N - 1), 2^(N - 1) - 1], e.g., i32 is a number between -2147483648 and 2147483647.

Endianness

All numbers are stored in the little-endian format, e.g., the u32 number 0x12345678 is stored as 78 56 34 12 consecutively in memory.

Byte sequence

A byte sequence is denoted as byte[N] where N is the number of consecutive bytes in memory.

UUID

Certain elements like ciphers and key derivation functions are encoded as Universally Unique IDentifiers [RFC4122].

URN

The human readable encoding of a UUID. In the context of CBOR this is encoded as a tstr.

String

A UTF-8 string.
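
The integer and endianness conventions above can be illustrated with Python's `struct` module. This is a small, non-normative sketch; the helper names `pack_u32` and `unpack_i32` are hypothetical.

```python
import struct


def pack_u32(value: int) -> bytes:
    """Serialize a u32 in little-endian byte order."""
    if not 0 <= value <= 2**32 - 1:
        raise ValueError("value does not fit into a u32")
    return struct.pack("<I", value)


def unpack_i32(data: bytes) -> int:
    """Read an i32 from four little-endian bytes."""
    return struct.unpack("<i", data)[0]


# The u32 example from the Endianness convention.
print(pack_u32(0x12345678).hex(" "))  # 78 56 34 12
```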

2. Database Format

The CCDB data consists of a public and a confidential part. The database starts with a public header that encodes the properties of the database, including its version, followed by an encrypted block that contains the actual CBOR-encoded data. The integrity of the header as well as the confidential block is verified using message authentication codes.

Outer Header | Body Length (u64) | AEAD Tag | Encrypted Body

2.1. Outer Header

The outer header encodes information required to decrypt the remaining database. The overall structure of the outer header can be described as follows:

Version | Header Field Length (u32) | Header Fields (cid, iv, kdf)

The header begins with the database version (§ 2.1.1 Version) followed by the length of the header fields in bytes, followed by one or more header fields (§ 2.1.2 Header Fields).

Note: The integrity of the header is validated by the selected AEAD cipher.

2.1.1. Version

The initial 8 bytes encode the CCDB version.

Name | Data Type | Description
sig | u32 | CCDB
major version | u16 | Major version number, e.g., 1 if the version is 1.0
minor version | u16 | Minor version number, e.g., 0 if the version is 1.0
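
A possible way to read and write these 8 bytes is sketched below in Python. The sketch is not normative and rests on one assumption: the signature is stored as the four ASCII bytes "CCDB" in file order. The names `SIGNATURE`, `pack_version`, and `unpack_version` are hypothetical.

```python
import struct

SIGNATURE = b"CCDB"  # assumption: the four ASCII bytes appear in file order


def pack_version(major: int, minor: int) -> bytes:
    """Serialize sig, major version and minor version (8 bytes, little-endian)."""
    return struct.pack("<4sHH", SIGNATURE, major, minor)


def unpack_version(data: bytes) -> tuple[int, int]:
    """Validate the signature and return (major, minor)."""
    sig, major, minor = struct.unpack_from("<4sHH", data)
    if sig != SIGNATURE:
        raise ValueError("not a CCDB file")
    return major, minor
```

For version 1.0, `pack_version(1, 0)` produces the bytes 43 43 44 42 01 00 00 00.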

2.1.2. Header Fields

The § 2.1.1 Version is followed by a CBOR map (major type 5) of the following header fields. The keys are encoded as text strings (major type 3) whereas the value types vary. All listed header fields are mandatory and MUST be encoded in the order listed below.

Key Data Type Description
cid tstr Identifier for a cipher suite used with the given database. The cipher suite is encoded as a text string.

The following ciphers MUST be supported:

  • CCDB_XCHACHA20_POLY1305_ARGON2ID: The nonce-extended version of the IETF ChaCha20 variant as authenticated cipher and Argon2id for key derivation.

The following ciphers MAY be supported:

  • CCDB_AES256GCM_ARGON2ID: AES in Galois/Counter Mode (GCM) as authenticated cipher and Argon2id for key derivation.

CCDB_XCHACHA20_POLY1305_ARGON2ID is considered the default AEAD cipher.

iv bstr Initialization vector (nonce) used for encryption. The size of the IV depends on the cipher used for encryption:
  • XChaCha20-Poly1305: 24 bytes

  • AES256GCM: 12 bytes

Note: A new and unique IV MUST be used for every encryption; the same IV MUST NOT be used twice. This can be achieved by using a counter or by generating the IV at random with a cryptographically secure pseudorandom number generator (CSPRNG).

kdf § 2.1.2.1 KDF Parameters Values specific for the § 2.1.3 Key Derivation
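
The IV sizes listed for the iv field can be captured in a small lookup table. The following Python sketch, which is illustrative rather than normative, draws a random IV of the matching size from the operating system's CSPRNG. The names `IV_SIZES` and `new_iv` are hypothetical; a counter-based scheme would be an equally valid choice.

```python
import secrets

# IV size in bytes for each cipher suite identifier (cid).
IV_SIZES = {
    "CCDB_XCHACHA20_POLY1305_ARGON2ID": 24,
    "CCDB_AES256GCM_ARGON2ID": 12,
}


def new_iv(cid: str) -> bytes:
    """Return a fresh random IV sized for the given cipher suite."""
    try:
        size = IV_SIZES[cid]
    except KeyError:
        raise ValueError(f"unsupported cipher suite: {cid}") from None
    return secrets.token_bytes(size)
```

A writer would call `new_iv(cid)` once per encryption and store the result in the iv header field.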
2.1.2.1. KDF Parameters

The KDF parameters are used to derive a secret (§ 2.1.3.4 Symmetric Key). The $UUID is mandatory and defines which algorithm is used as the key derivation function. The other parameters depend on the selected algorithm: all parameters associated with the selected algorithm MUST be present, otherwise the database is malformed. The fields MUST be encoded in the order listed below.

Field | Data Type | Description | Association
I | uint | Iterations, encoded within a u64. | Argon2
M | uint | Memory usage in KiB, encoded within a u64. | Argon2
P | uint | Parallelism, encoded within a u32. | Argon2
S | bstr | Random salt, typically 32 bytes. | Argon2

2.1.3. Key Derivation

The encryption key is derived from one or more sources of key data:

2.1.3.1. Password

A password is the most common source of key data. It is set by a user during database creation and is also referred to as the master password. It is recommended to use a strong password that fulfills the criteria published by reputable authorities, such as OWASP and NIST.

It is further recommended that applications supporting the creation of CCDB databases assist the user in creating a secure master password, for example by suggesting randomized passwords.

2.1.3.2. Key File

A key file can serve as an input either alongside a password or as an alternative to the key derivation function used for deriving a symmetric encryption key. It’s recommended that applications support a range of key file formats to enhance compatibility and flexibility.

Applications SHOULD support the following key file formats:

2.1.3.3. Key Provider

Key material MAY also be obtained from other sources, e.g., using the HMAC Secret Extension of CTAP2 [fido-v2.1].

2.1.3.4. Symmetric Key

To generate a symmetric key, a key derivation function (KDF) is employed, which derives the key from a password, a key file, a key provider, or any combination of them. The input to the selected key derivation function is as follows, with its remaining arguments defined by § 2.1.2.1 KDF Parameters:

password || keyFileContent || keyProviderContent

The string passed to the KDF must not be empty. Therefore, an application must enforce the use of at least one of the sources described above: password, key file, or key provider.

The process for deriving a symmetric key for encryption is as follows:

symKey = KDF(password || keyFileContent || keyProviderContent)

Note: The length of the symmetric key must match the selected cipher (cid). For example, for AES256GCM, the default hash length of 32 bytes can be used for the symmetric key.
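
The concatenation and the non-empty check can be sketched in Python as follows. This is a non-normative illustration: `kdf_input` and `derive_sym_key` are hypothetical names, the sources are assumed to be available as byte strings already, and the `kdf` argument stands for any callable that applies the selected key derivation function with its § 2.1.2.1 KDF Parameters (the Python standard library does not ship Argon2id).

```python
from typing import Callable


def kdf_input(password: bytes = b"",
              key_file_content: bytes = b"",
              key_provider_content: bytes = b"") -> bytes:
    """Build password || keyFileContent || keyProviderContent."""
    data = password + key_file_content + key_provider_content
    if not data:
        # At least one source of key data has to be supplied.
        raise ValueError("at least one key source is required")
    return data


def derive_sym_key(kdf: Callable[[bytes], bytes], **sources: bytes) -> bytes:
    """Apply the caller-supplied KDF to the concatenated key sources."""
    return kdf(kdf_input(**sources))
```

A caller would pass, for example, `derive_sym_key(my_argon2id, password=b"supersecret")`, where `my_argon2id` is a placeholder for an Argon2id binding configured with I, M, P, and S.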

2.1.3.4.1. Argon2id + Password
I: 2, M: 4096, P: 8, S: 0102030401020304010203040102030401020304010203040102030401020304
password: supersecret

symKey = 1800b386aff0488a7a3720e014afd4b57d27c915ead08ed68ede40c225ce4e98 = Argon2id("supersecret")

2.2. Body

Directly after the § 2.1 Outer Header follows the body, consisting of the body length (u64), a tag, and the encrypted body data.

Name Data Type Description
Body Length u64 The length of the Encrypted Body Data in bytes.
Body Tag byte[N] The AEAD tag is the result of encrypting the Body Data using the AEAD cipher defined by cid. The length N of the tag depends on the AEAD cipher used for encryption:
  • AES256GCM: byte[16]

Encrypted Body Data byte[Body Length] The body data encrypted using the AEAD cipher defined by cid of length Body Length.

2.2.1. Body Data Structure

The body data is a nested CBOR map (major type 5) consisting of § 2.2.1.2 Meta, § 2.2.1.3 Group, and § 2.2.1.6 Entry data items.

Field Data Type Description Optional
meta (0x00) map Properties describing the database itself (§ 2.2.1.2 Meta).
entries (0x01) [+ § 2.2.1.6 Entry] All entries of this database.
groups (0x02) [+ § 2.2.1.3 Group] All groups of this database. Groups are arranged as a forest. There is an implicit root, and every item of the given array without a parent is implicitly a child of this root. Every Group MAY have one or more children referenced by an index. Optional
bin (0x03) [+ § 2.2.1.6 Entry ] The bin is an array of elements of type § 2.2.1.6 Entry that represent deleted entries. For every deleted entry, the exp field of Times is set to some point in the future. Applications SHOULD regularly check the exp field and delete expired entries automatically.

This field MAY be omitted if no entries have been deleted.

Optional
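
The automatic clean-up of the bin described above could look like the following Python sketch. It is illustrative only and makes several assumptions: decoded CBOR maps are represented as Python dicts with integer keys, the times map of an entry sits under key 0x02 and its exp value under key 0x02 (as in § 2.2.1.6 Entry and § 2.2.1.4 Times), and entries without an exp value are kept. The name `purge_expired` is hypothetical.

```python
import time
from typing import Optional

TIMES = 0x02  # key of the times map inside an Entry
EXP = 0x02    # key of the exp value inside Times


def purge_expired(bin_entries: list, now: Optional[int] = None) -> list:
    """Return the bin without the entries whose exp time has passed."""
    if now is None:
        now = int(time.time())
    kept = []
    for entry in bin_entries:
        exp = entry.get(TIMES, {}).get(EXP)
        # Entries without an expiry time are kept (assumption of this sketch).
        if exp is None or exp > now:
            kept.append(entry)
    return kept
```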
2.2.1.1. Bin Entry

When a § 2.2.1.6 Entry is deleted it SHOULD NOT be removed directly but instead moved into the Bin. Moving a deleted entry into the bin allows the user to undo a deletion. Each application MAY define a limit after which a deleted entry is permanently removed from the database.

Field Data Type Description Optional
time (0x00) uint Epoch-based date/time the entry was deleted.
entry (0x01) map The deleted § 2.2.1.6 Entry. The exact location where the entry was deleted from is defined by Groups.
2.2.1.2. Meta
Field Data Type Description Optional
gen (0x00) tstr The name of the application that created the database.
name (0x01) tstr The name of the database.
times (0x02) § 2.2.1.4 Times Time stamps. This field has to be updated each time the database content is changed.
2.2.1.3. Group
Field Data Type Description Optional
uuid (0x00) tstr A unique identifier for the given group: a UUID (e.g., UUIDv4 or UUIDv7) encoded as URN, for example:
  • 0e695c28-42f9-43e4-9aca-3f71cd701dc0

name (0x01) tstr A human readable name for the given group.
times (0x02) map Counters and time values (see § 2.2.1.4 Times).
groups (0x03) [+ URN] An array of UUIDs encoded as URN referencing an object in groups. All listed groups are children of the given group.

The UUIDs SHOULD be of type UUIDv7 but applications MUST NOT expect the UUIDs to be sorted in any specific order.

Optional
entries (0x04) [+ URN] An array of UUIDs encoded as URN referencing an object in entries. All listed entries belong to the given group.

The UUIDs SHOULD be of type UUIDv7 but applications MUST NOT expect the UUIDs to be sorted in any specific order.

Optional
group (0x05) URN Points to the parent group. Optional
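
To illustrate how the group forest with its implicit root might be reconstructed from the parent references, consider the following Python sketch. It is not normative: decoded groups are assumed to be dicts with integer keys, the implicit root is represented by `None`, and the name `children_by_parent` is hypothetical.

```python
from collections import defaultdict

UUID = 0x00    # key of the group's own UUID
PARENT = 0x05  # key of the optional parent group reference
ROOT = None    # stands for the implicit root of the forest


def children_by_parent(groups: list) -> dict:
    """Map each parent UUID (or ROOT) to the UUIDs of its child groups."""
    tree = defaultdict(list)
    for group in groups:
        # A group without a parent is implicitly a child of the root.
        tree[group.get(PARENT, ROOT)].append(group[UUID])
    return dict(tree)
```

The sketch relies only on the group (0x05) field; an application could cross-check the result against the groups (0x03) arrays.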
2.2.1.4. Times
Field Data Type Description Optional
creat (0x00) uint Epoch-based date/time the parent was created.
mod (0x01) uint Epoch-based date/time the parent was modified the last time.
exp (0x02) uint Epoch-based date/time the parent will expire. The meaning of this field may vary depending on the parent. Optional
cnt (0x03) uint Counter of how many times the parent was used. The meaning of this field may vary depending on the parent. Optional
2.2.1.5. User
Field Data Type Description Optional
id (0x00) bstr The user handle of the user account. A user handle is an opaque byte sequence with a maximum size of 64 bytes, and is not meant to be displayed to the user. Optional
name (0x01) tstr A human-palatable identifier for a user account. This name is usually chosen by the user, e.g., the user name. For example, "alexm", "alex.mueller@example.com". Optional
display_name (0x02) tstr A human-palatable name for the user account, intended only for display. For example, "Alex Müller" or "田中倫". The Relying Party SHOULD let the user choose this, and SHOULD NOT restrict the choice more than necessary. Optional
2.2.1.6. Entry
Field Data Type Description Optional
uuid (0x00) tstr A unique identifier for the given entry, e.g., UUIDv4 or UUIDv7, encoded as URN.

The UUIDs SHOULD be of type UUIDv7.

name (0x01) tstr A human readable name for the given entry. Optional
times (0x02) map Counters and time values (see § 2.2.1.4 Times).

Note: For applications supporting passkeys, the cnt field (usage counter) might be of particular relevance. Be aware, however, that counters make synchronization between devices difficult and may lead to scenarios where users lock themselves out of their accounts.

notes (0x03) tstr Notes related to the given entry. Optional
secret (0x04) bstr A secret. This can be anything, including a password. The actual meaning of this value depends on the given context.

Note: The format defined within this document deliberately does NOT encrypt data twice (see security considerations; § 3.5.2 No Double Encryption). If you want to protect your secrets using a second level of encryption, you SHOULD encrypt the secret before passing it to a ccdb writer.

Optional
key (0x05) map A CBOR Object Signing and Encryption (COSE) key [RFC8152]. Also see Double Coordinate Curves, Octet Key Pair, and Symmetric Keys.

Double Coordinate Curve:

{
  1: 2, 
  3: -7, 
  -1: 1, 
  -4: h'299ba40f6547f9a591636ba3aabcf52adedeca324d3d6e81c8302d5199de9d0d'
}

A4            # map(4)
   01         # unsigned(1) # kty
   02         # unsigned(2)   # Elliptic Curve keys w/ x- and y-coordinate pair
   03         # unsigned(3) # alg
   26         # negative(6)   # ECDSA w/ SHA-256
   20         # negative(0) # crv 
   01         # unsigned(1)   # NIST P-256 also known as secp256r1
   23         # negative(3) # d (private key)
   58 20      # bytes(32)
      299BA40F6547F9A591636BA3AABCF52ADEDECA324D3D6E81C8302D5199DE9D0D
Optional
url (0x06) tstr A text string representing a URL. Optional
user (0x07) § 2.2.1.5 User The user corresponding to the given credential.

The § 2.2.1.5 User MUST contain at least one field.

Optional
group (0x08) tstr A UUID (URN) referencing a § 2.2.1.3 Group. If not present, the given entry is implicitly associated to the group directly under the document root. Optional
tags (0x09) [+ tstr] One or more tags associated with the given entry. Optional
attach (0x0a) [+ § 2.2.1.7 Attachment ] One or more attachments associated with the given entry. This can for example be a file containing recovery keys. Optional
2.2.1.7. Attachment
Field Data Type Description Optional
desc (0x00) tstr A descriptor, e.g., a file name.
att (0x01) bstr A binary attachment.

2.3. Database Creation

Every application that supports the given standard should provide the flexibility to configure parameters that influence the behavior of the database. These parameters typically include the cipher and the key derivation function. It’s recommended that applications only propose sensible values for ciphers and key derivation functions sourced from reputable authorities, such as OWASP and NIST.

By allowing configuration of these parameters, applications empower users to tailor their security settings to best suit their specific needs and environments. This adaptability ensures compatibility with various security protocols and standards, fostering a robust and customizable security posture for the application.

2.4. Serialization

During usage, the database typically exists in an intermediate form, largely contingent upon the programming language employed. Before persisting it to disk, the database must undergo serialization according to the following steps:

  1. Serialize Header Version: Serialize the version of the header as specified in § 2.1.1 Version.

  2. Header Field Length: Allocate 4 bytes to reserve space for the length of the header fields.

  3. Generate Initialization Vector (IV): Create a new and unique initialization vector (iv), ensuring it is not reused.

  4. Serialize the header fields, incorporating the iv.

  5. Write Serialized Header Length: Record the size of the serialized header fields within the 4 bytes reserved in the previous step.

  6. Encode the Body: The body is encoded as specified by the guidelines outlined in the § 2.2 Body section.

  7. Serialize Body Length: Serialize the length of the body as a u64 data type.

  8. Encrypt the Body: Utilize the cipher specified by cid to encrypt the body with the following parameters:

    • key: The symmetric key specified in § 2.1.3.4 Symmetric Key.

    • Initialization Vector (IV): iv

    • Associated Data (AD): The serialized header and body length.

  9. Write Tag and Encrypted Body: Place the resulting tag immediately after the body length, followed by the encrypted body.
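
The serialization steps above can be sketched in Python as follows. The sketch is non-normative and makes these assumptions: the header fields are passed in as an already CBOR-encoded map that contains the freshly generated iv, the body is already CBOR-encoded, the signature is written as the ASCII bytes "CCDB", and `aead_encrypt` is a hypothetical callable wrapping the cipher selected by cid (the Python standard library provides no AEAD cipher). The names `Aead` and `serialize` are likewise hypothetical.

```python
import struct
from typing import Callable, Tuple

# Hypothetical AEAD interface:
# (key, iv, plaintext, associated_data) -> (ciphertext, tag)
Aead = Callable[[bytes, bytes, bytes, bytes], Tuple[bytes, bytes]]


def serialize(major: int, minor: int, header_fields: bytes, body: bytes,
              key: bytes, iv: bytes, aead_encrypt: Aead) -> bytes:
    """Assemble a database file from its already encoded parts."""
    # Step 1: version (sig, major, minor), little-endian.
    version = struct.pack("<4sHH", b"CCDB", major, minor)
    # Steps 2-5: u32 length of the header fields, then the fields themselves.
    outer_header = version + struct.pack("<I", len(header_fields)) + header_fields
    # Step 7: body length as u64.
    body_length = struct.pack("<Q", len(body))
    # Step 8: encrypt with the outer header and body length as associated data.
    ciphertext, tag = aead_encrypt(key, iv, body, outer_header + body_length)
    if len(ciphertext) != len(body):
        raise ValueError("ciphertext length differs from the declared body length")
    # Step 9: tag directly after the body length, followed by the encrypted body.
    return outer_header + body_length + tag + ciphertext
```

The length check reflects that the body length is written before encryption; it holds for ciphers whose ciphertext is as long as the plaintext when the tag is kept separate.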

2.5. Deserialization

Before an application can use a database it has to be deserialized and decrypted according to the following steps:

  1. Read Serialized Data: Retrieve the serialized data from storage.

  2. Validate Version: Read the § 2.1.1 Version and validate that sig equals CCDB.

  3. Extract Header Length: Extract the length of the header from the serialized data.

  4. Extract Header: Extract the header based on the header length.

  5. Validate Header: Validate that all required header fields are present and contain reasonable values.

  6. Extract Body Length: Extract the length of the body from the serialized data.

  7. Extract Tag and Encrypted Body: Extract the tag followed by the encrypted body from the serialized data.

  8. Decrypt the Body: Utilize the specified cipher and associated parameters to decrypt the encrypted body:

    • key: The symmetric key specified in § 2.1.3.4 Symmetric Key.

    • Initialization Vector (IV): iv

    • Associated Data (AD): Outer Header || Body Length

  9. Decode the Body: Decode the body according to the specifications outlined in the § 2.2 Body section.
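
Steps 2 to 7 above amount to slicing the file into its parts, which the following Python sketch illustrates. It is non-normative and assumes a 16-byte tag (the size listed for AES256GCM in § 2.2; Poly1305 tags are 16 bytes as well) and a signature stored as the ASCII bytes "CCDB". Decoding the CBOR header fields and the actual decryption are left to the caller. The names `TAG_LEN`, `Parsed`, and `parse` are hypothetical.

```python
import struct
from typing import NamedTuple

TAG_LEN = 16  # assumption: 16-byte AEAD tag


class Parsed(NamedTuple):
    major: int
    minor: int
    header_fields: bytes    # CBOR map holding cid, iv and kdf
    associated_data: bytes  # Outer Header || Body Length
    tag: bytes
    ciphertext: bytes


def parse(data: bytes) -> Parsed:
    """Split a serialized database into its parts without decrypting it."""
    # Version (8 bytes) followed by the u32 header field length.
    sig, major, minor, fields_len = struct.unpack_from("<4sHHI", data, 0)
    if sig != b"CCDB":
        raise ValueError("not a CCDB file")
    fields_end = 12 + fields_len
    header_fields = data[12:fields_end]
    if len(header_fields) != fields_len:
        raise ValueError("truncated header")
    # u64 body length, then the tag and the encrypted body.
    (body_len,) = struct.unpack_from("<Q", data, fields_end)
    tag_start = fields_end + 8
    body_start = tag_start + TAG_LEN
    tag = data[tag_start:body_start]
    ciphertext = data[body_start:body_start + body_len]
    if len(tag) != TAG_LEN or len(ciphertext) != body_len:
        raise ValueError("truncated body")
    return Parsed(major, minor, header_fields, data[:tag_start], tag, ciphertext)
```

The `associated_data` slice is what a reader would hand to the AEAD cipher together with the key, the iv from the header fields, the tag, and the ciphertext.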

3. Threat Model

3.1. Assumptions

Identifier | Description
A_UNTRUSTED_STORAGE_LOCATION | The file may be stored in an untrusted location, e.g., an unprotected USB stick or file share.
A_TRUSTED_PROCESSING_ENVIRONMENT | The file is only decrypted and processed in a trusted processing environment.

3.2. Threats

Identifier | Description
T_FILE_ACCESS | The file is accessed at rest by an untrusted person. Applicable because of A_UNTRUSTED_STORAGE_LOCATION. Mitigated by M_ENCRYPTION.
T_FILE_MANIP | The file is manipulated at rest by an untrusted person. Applicable because of A_UNTRUSTED_STORAGE_LOCATION. Mitigated by M_INTEGRITY.
T_MEMORY_ACCESS | The memory is accessed while reading or writing the database by a malicious actor. Not applicable because of A_TRUSTED_PROCESSING_ENVIRONMENT. A ccdb file MUST NOT be processed on an untrusted system.
T_MEMORY_MANIP | The memory is manipulated while reading or writing the database by a malicious actor. Not applicable because of A_TRUSTED_PROCESSING_ENVIRONMENT. A ccdb file MUST NOT be processed on an untrusted system.

3.3. Mitigations

Identifier | Description
M_ENCRYPTION | The file is encrypted using a state-of-the-art cipher.
M_INTEGRITY | The integrity of the file is verified using a message authentication code.

3.4. Policies

The following policies SHOULD be considered by applications supporting the format defined within this specification.

Identifier | Description
P_MEM_PROTECT | The memory of readers and writers should be handled with care. This SHOULD include but is not limited to mitigations like mlock on UNIX-like systems.

3.5. Security Considerations

Files that store sensitive information are of particular interest to adversaries. The threats posed to an encrypted database change depending on its current state, including data-at-rest, data-in-transit, and data-in-use.

3.5.1. No Compression

Compression in combination with encryption can leak information about the plaintext through the length of the ciphertext. Using CBOR as the main data format already yields a small message size, making compression less relevant. For these reasons, CCDB deliberately does not use compression.

3.5.2. No Double Encryption

Some file formats, like KDBX4, encrypt not only their main data but also specific fields such as password entries. The primary reason for this is to prevent the pollution of process memory with secrets, thereby making them more manageable. Assuming that the underlying operating system enforces process separation, including their allocated memory, and that the application protects the memory from being swapped out, the threat model involves an attacker with root privileges who can access process memory to obtain information about the decrypted data.

In the case of KDBX4, a problem arises when an attacker can read the decrypted XML data structure of a KDBX4 database located in main memory. In such a case, one must assume that the attacker can also read the prepended StreamKey. This allows the attacker to parse the XML data structure, collect all "protected" fields, and decrypt them using the StreamCipher with the StreamKey. Consequently, the fields are merely obfuscated, and the application cannot enforce the confidentiality of the data.

We, the authors of this document, believe that no confidential data should be processed on a compromised system and that there are no sufficient protection measures against an attacker with root privileges. Therefore, we assume a trusted processing environment in our threat model. Applications aiming to protect their data with a second layer of encryption should encrypt the data themselves and then store the ciphertext in the secret field of a § 2.2.1.6 Entry.

4. Recommended File Name Extension: .ccdb

The recommended file name extension for the "CBOR Credential Database Format" specified in this document is ".ccdb".

On Windows and macOS, files are distinguished by an extension to their filename. Such an extension is technically not actually required, as applications should be able to automatically detect the ccdb file format through the "magic bytes" at the beginning of the file, as some other UN*X desktop environments do. However, using name extensions makes it easier to work with files (e.g. visually distinguish file formats) so it is recommended - though not required - to use .ccdb as the name extension for files following this specification.

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119
[RFC8152]
J. Schaad. CBOR Object Signing and Encryption (COSE): Initial Algorithms. August 2022. Informational. URL: https://www.rfc-editor.org/rfc/rfc9053

Informative References

[FIDO-V2.1]
Client to Authenticator Protocol (CTAP). Editor's Draft. URL: https://fidoalliance.org/specs/fido-v2.1-ps-20210615/fido-client-to-authenticator-protocol-v2.1-ps-errata-20220621.html
[RFC4122]
K. Davis; B. Peabody; P. Leach. Universally Unique IDentifiers (UUIDs). May 2024. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc9562