JWP-BMT | January 2022 | |
Steele & Prorock | Informational | [Page] |
Merkle proofs are already being used to provide certificate transparency in [RFC9162].¶
The purpose of this specification is to define a generic encoding of merkle audit paths that is suitable for combining with [RFC7515] to construct selective disclosure proofs, that are not bound to the needs of certificate transparency, and that are suitable for more generic applications such as W3C Verifiable Credentials and W3C Decentralized Identifiers.¶
The scheme described herin features many important properties:¶
These properties allow the scheme to be used in applications where privacy and data minimization techniques are desired and/or required.¶
This scheme is meant to expose interfaces compatible with BBS+ Signature Scheme.¶
Many of the same benefits from that scheme apply here:¶
A recent emerging use case applies signature schemes in verifiable credentials. One problem with using simple signature schemes like ECSDA or ED25519 is a holder must disclose the entire signed message and signature for verification. Circuit based logic can be applied to verify these in zero-knowledge like SNARKS or Bulletproofs with R1CS but tend to be complicated. BBS+ on the other hand adds, to verifiable credentials or any other application, the ability to do very efficient zero-knowledge proofs. A holder gains the ability to choose which claims to reveal to a relying party without the need for any additional complicated logic.¶
The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this document, are to be interpreted as described in [RFC2119].¶
We will rely on the general terminology defined in [RFC7515], [RFC7797] and concepts defined in [RFC9162].¶
We introduce primarily 2 new concepts needed to generalize authenticated set membership proofs based on binary merkle trees.¶
The following terminology is used throughout this document:¶
const calculateMessageNonce = ( message: Buffer, index: number, rootNonce: Buffer, hash: Function ) => { const encodedIndex = Buffer.from(varint.encode(index)); return hash(Buffer.concat([message, encodedIndex, rootNonce])); };¶
memberNonce
under a chosen cryptographic hash function. These values represent the leaves of a binary merkle tree that encodes membership proofs for salted members.¶
Salted Members
¶
We restrict this specification to Binary (2-ary) Merkle Trees.¶
In short, increasing the branching factor of the merkle tree above 2 does not yield the desired properties.¶
See Verkle Trees.¶
See Vector Commitments and their Applications.¶
See Zero-Knowledge Sets with short proofs.¶
Vector commitment schemes are not defined in this specification, however we believe that some similar approach using them would be worth considering in future work.¶
The most obvious difference is the limitations imposed on tree construction that differ from the construction of tree's used by Bitcoin.¶
[RFC9162] refer to the IANA registry for Hash Algorithms¶
[RFC9162] refer to IANA registry for Signature Algorithms¶
But only define support for a few signature algorithms:¶
We prefer to generalize and enable support the full range of algorithms availabe under jose.¶
We believe choices regarding agility should be handled at a higher layer, but agree that restriction is a best practice.¶
BBS+ Signatures are in the process of being standardized.¶
Refer to Editors Draft of BBS+ Signature Scheme for details.¶
At a high level, we are seeking a larger degree of agility at both the hash and digital signature layers.¶
We believe that Blake2b as defined in [RFC7693] is the only supported hash algorithm.¶
We believe that BLS signature over BLS12381 as defined in Pairing-Friendly Curves, are the only supported signature scheme.¶
Binary Merkle Trees are constructed via the following algorithm:¶
See compute tree¶
let salted_members be the series defined by hashing the concatonation of each member bytes with its member nonce. let salted_members be the first level of a binary merkle tree, where each salted member is a leaf. for each pair of 2 leaves in the first level of the merkle tree, concatonate the leave and hash the result yielding the next first level of the merkle tree. if the first level of the tree is odd AND its length is not 1 (the end state), promote the odd leaf to the next leve and repeat the process. The algorithm ends when the first level of the merkle tree has length 1, its single element is the merkle root.¶
Here is a simple visualizaton of a tree constructed in this manner:¶
h(h(h(sm0+sm1) + h(sm2+sm3)) + sm4) (merkle root) / \ h(h(sm0+sm1) + h(sm2+sm3)) sm4 (2 siblings) / \ / h(sm0+sm1) h(sm2+sm3) sm4 (3 siblings) / \ / \ / sm0 sm1 sm2 sm3 sm4 (leaves are the base level, 5 siblings) ^ ^ ^ ^ ^ | | | | | m0 m1 m2 m3 m4¶
Note that this tree is "unbalanced" See Forgery Attacks on Unbalanced Binary Merkle Trees.¶
Also note that this algorithm differs from the one proposed in RFC9162.¶
The reason for this difference arises from our desire to support merkle trees and proofs used by Bitcoin, and our desire to encode a generic merkle proof that is not bound to "Certificate Transparency".¶
However, we may adjust the algorithm to match the one described in [RFC9162] in the future.¶
In order to verify that member bytes are included in a merkle root, a verifier requires:¶
This section defines a compact binary encoding of these "member proofs" based on approaches developed by Google and others related to "Protocol Buffers".¶
The main feature of protocol buffers which is used is called "varint".¶
This approach is also refered to as LEB128.¶
This approach is also used in Web Assembly¶
This approach is also used in multiformats.¶
We remain unaware of a best normative reference to provide for "protocol buffer style varints".¶
See this expired internet draft for protocol buffers.¶
See Google's documentation for protocol buffers.¶
We describe a binary encoding of an "audit path" or "inclusion path" and a nonce, such that a verifier with a merkle root may confirm that some given "member bytes" are included in a merkle root.¶
We define a binary encoded "membership proof" or a given "member" bytes as follows:¶
varint
, representing the size of auditDirectionsAsBits
and auditHashes
.¶
auditHashes
¶
See Test Vectors for specific examples.¶
We describe a binary encoding of a Root Nonce, Merkle Root, and series (order matters) of "audit path" or "inclusion paths".¶
These values are encoded via a convention of varint(length of bytes) + bytes
.¶
This is to support hash functions of varying lengths and member proofs of varying lengths.¶
We define a Full Disclosure
membership proof as the result of compression (under zlib v1.2.8) of the following:¶
memberNonce
. See Tree Construction. This value is encoded as varint (root length in bytes) + root bytes.¶
When decoding first the rootNonce
and root
are removed using varint decoding.¶
Then the remaining bytes are reduces by appling varint decoded to obtain the individual member proofs.¶
The compressed representations MUST NOT encode the original member bytes.¶
The order of member bytes MUST be preserved in some external data structure, in order to suppor full disclosure proof verification.¶
We define Selective Disclosure
the same as Full Disclosure
with the exception that the rootNonce
MUST NOT be encoded in the compressed representation.¶
The rootNonce
MUST be ommited in order to ensure that a selective disclosure proof does not reveal information that can be used to brute force siblings of disclosed members. This attack is also refered to as a second pre-image attack, See Disclosure of Root Nonce.¶
We recommend never transmitting the full disclosure proof, and instead deriving a selective disclosure proof even for the full disclosure use case.¶
In order to provide an "issuer" with the capability to provide a set of verifiable claims about a subject, a digital signature over a merkle root and some meta data can be applied.¶
We distinguish this data structure from a Full Disclosure Proof
by definition.¶
Full Disclosure Proof
.¶
Under this scheme a verifier MUST first, verify the signature by obtaining the associated public key from the meta data.¶
If the signature is valid, the verifier MUST second, verify all members of the disclosed proof have a corresponding encoded member proof.¶
If all members have a proof and the merkle root has a signature that is verified by the public key dereferenced by from the meta data we say:¶
The issuer has provided a full disclosure proof for the encoded members.¶
In other words, the issuer claims the members are in a set, and this claim can be verified as originating from the issuer and not having been tampered with under the assumption that the issuer's signing keys have not been compromised.¶
For a more formal definition of digital signature verification see [RFC7515] and [RFC7797].¶
Similar to the Authenticated Full Disclosure Proof
we define an Authenticated Selective Disclosure by definition:¶
Selective Disclosure Proof
.¶
We note that a Selective Disclosure Proof
MUST be derived from an associated Full Disclosure Proof
.¶
This derivation process omits the rootNonce
and filter the associated member proofs
.¶
We refer to the set of disclosed members proofs as:¶
Full Disclosure Proof
, but are ommited in the Selective Disclosure Proof
.¶
We note that the Redacted Member Series
may have size 0, however the absence of the rootNonce
still differentiates the Authenticated Selective Disclosure Proof
from an Authenticated Full Disclosure Proof
.¶
All algorithms that operate on public keys require first validating those keys. We assume all keys are represented according to [RFC7517].¶
Implementers are warned to consider base64url padding concerns.¶
If the root nonce associated with a merkle root is disclosed, a verifier or attacker can compute adjacent membership proofs from sets of disclosed proofs.¶
When a binary merkle tree is unbalanced (last level leaf count is not a power of 2), there is a potential for a forgery attack.¶
See CVE-2012-2459.¶
We remain unable to prove that our salting approach mitigates this potential vulnerability, as such the question remains:¶
does salting members before construction prevent forgery attacks on unbalanced binary merkle trees?¶
A note that [RFC9162] constructs merkle tree's differently.¶
Implementations of the signing algorithm SHOULD protect the secret key from side-channel attacks. One method for protecting against certain side-channel attacks is ensuring that the implementation executes exactly the same sequence of instructions and performs exactly the same memory accesses, for any value of the secret key. ( this copied verbatum from here).¶
It is recommended that the all nonces are from a trusted source of randomness AND all randomness is used for a single purpose.¶
For example, do not reuse randomness associated with Root Nonce for anything else.¶
The disclosure proofs defined herin rely on "Merkle Audit Paths" which are built from binary merkle trees under a chose hash function.¶
The choice of the hash function such as SHA-256 or SHAKE-256 is out of scope for this specification.¶
However, at the time of publishing we recommend a cryptographic hash function with at least 256-bit security strength.¶
Implementers are advised to consult:¶
Alone a Full Disclosure Proof can only prove membership, and is subject to tampering.¶
When combined with digital signature schemes, authentication is achievable.¶
The choice of the signature scheme such as ES256K or EdDSA is out of scope for this specification.¶
Implementers are advised to consult:¶
The following has NOT YET been added to the "JSON Web Proof Algorithm Types" registry:¶
//TODO¶