This specification describes mechanisms for ensuring the authenticity and integrity of structured digital documents using cryptography, such as digital signatures and other digital mathematical proofs.

This is an experimental specification and is undergoing regular revisions. It is not fit for production deployment.

Introduction

Cryptographic proofs enable functionality that is useful to implementors of distributed systems. For example, proofs can be used to:

The term Linked Data is used to describe a recommended best practice for exposing, sharing, and connecting information on the Web using standards, such as URLs, to identify things and their properties. When information is presented as Linked Data, other related information can be easily discovered and new information can be easily linked to it. Linked Data is extensible in a decentralized way, greatly reducing barriers to large scale integration.

With the increase in usage of Linked Data for a variety of applications, there is a need to be able to verify the authenticity and integrity of Linked Data documents. This specification adds authentication and integrity protection to data documents through the use of mathematical proofs without sacrificing Linked Data features such as extensibility and composability.

While this specification provides mechanisms to digitally sign Linked Data, the use of Linked Data is not necessary to gain some of the advantages provided by this specification.

Design Goals and Rationale

The Data Integrity specification achieves the following design goals:

Simple for Developers
The proof format is designed to be easy to use for developers who do not have significant cryptography training. For example, cryptographic suite identifiers are used instead of specific cryptographic parameters, which makes it difficult to accidentally produce a weak digital proof.
Layered Architecture
A number of historical digital signature mechanisms have had monolithic designs which limited use cases by combining data normalization, syntax, digital signature, and serialization into a single specification. This specification layers each component such that a broader range of use cases, such as generalized selective disclosure and serialization-agnostic signatures, are enabled.
Cryptographic Agility
Since digital proof mechanisms might be compromised without warning due to technological advancements, it is important that proof types can be easily and quickly replaced. This specification provides algorithm agility while still keeping the digital proof format easy for developers to understand.
Extensibility
Creating and deploying new proof types is intended to be a fairly trivial undertaking, so that the proof format increases the rate of innovation in the digital proof space.
Syntax Agnostic Proofs
Cryptographic proofs can be serialized in many different but equivalent ways and have often been tightly bound to the original document syntax. This specification enables one to create cryptographic proofs that are not bound to the original document syntax, which enables more advanced use cases such as using a single digital signature across a variety of RDF-based graph serialization syntaxes, such as JSON-LD, N-Quads, and Turtle, without the need to regenerate the proof.

Terminology

Data Model

This section specifies the data model that is used for expressing data integrity proofs and verification methods.

Proofs

A data integrity proof consists of information about the proof, parameters required to verify it, and the proof value itself. All of this information is provided using Linked Data vocabularies such as [[SECURITY-VOCABULARY]].

A data integrity proof typically includes at least the following attributes:

type
Required. The specific proof type used. For example, an Ed25519Signature2020 type indicates that the proof includes a digital signature produced by an Ed25519 cryptographic key.
proofPurpose
Required. The specific intent for the proof, the reason why an entity created it. Acts as a safeguard to prevent the proof from being misused for a purpose other than the one it was intended for. For example, a proof can be used for purposes of authentication, for asserting control of a Verifiable Credential (assertionMethod), and several others.
verificationMethod
Required. A set of parameters required to independently verify the proof, such as an identifier for a public/private key pair that would be used in the proof.
created
Required. The string value of an [[ISO8601]] combined date and time string generated by the Proof Algorithm.
domain
Optional. A string value specifying the restricted domain of the proof.
proofValue
Required. One of any number of valid representations of the proof value generated by the Proof Algorithm.

The terms type, created, and domain above map to URLs. The vocabulary where these terms are defined is the [[SECURITY-VOCABULARY]].
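The attribute list above can be turned into a simple structural check. The following is a minimal sketch, not a normative algorithm: the function names are hypothetical, and the date check assumes only the common `YYYY-MM-DDTHH:MM:SSZ` profile of [[ISO8601]] rather than the full standard.

```python
from datetime import datetime

# Attribute names marked "Required" in the list above; "domain" is optional.
REQUIRED_PROOF_ATTRIBUTES = (
    "type",
    "proofPurpose",
    "verificationMethod",
    "created",
    "proofValue",
)


def missing_proof_attributes(proof: dict) -> list:
    """Return the required attribute names that are absent from a proof map."""
    return [name for name in REQUIRED_PROOF_ATTRIBUTES if name not in proof]


def created_is_parseable(proof: dict) -> bool:
    """Check that "created" is a combined date and time string.

    Assumption: only the "YYYY-MM-DDTHH:MM:SSZ" profile is accepted here.
    """
    try:
        datetime.strptime(proof["created"], "%Y-%m-%dT%H:%M:%SZ")
        return True
    except (KeyError, TypeError, ValueError):
        return False


# Illustrative proof map; the proofValue is a placeholder, not a real signature.
example_proof = {
    "type": "JcsSignature2020",
    "created": "2020-11-05T19:23:24Z",
    "verificationMethod": "https://di.example/issuer#key-1",
    "proofPurpose": "assertionMethod",
    "proofValue": "zPlaceholder",
}

print(missing_proof_attributes(example_proof))
print(created_is_parseable(example_proof))
```

A check like this only confirms the shape of a proof. It says nothing about whether the proof value verifies.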

A proof can be added to a JSON document like the following:

  {
    "title": "Hello world!"
  }
        

by adding the parameters outlined in this section:

  {
    "title": "Hello world!",
    "proof": {
      "type": "JcsSignature2020",
      "created": "2020-11-05T19:23:24Z",
      "verificationMethod": "https://di.example/issuer#z6MkjLrk3gKS2nnkeWcmcxiZPGskmesDpuwRBorgHxUXfxnG",
      "proofPurpose": "assertionMethod",
      "proofValue": "zQeVbY4oey5q2M3XKaxup3tmzN4DRFTLVqpLMweBrSxMY2xHX5XTYV8nQApmEcqaqA3Q1gVHMrXFkXJeV6doDwLWx"
    }
  }
        

The proof example above uses the JcsSignature2020 proof type to produce a verifiable digital proof by canonicalizing the input data using the JSON Canonicalization Scheme [[RFC8785]] and then digitally signing it using an Ed25519 elliptic curve signature.
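The canonicalization step can be illustrated with a short sketch. It is an approximation under stated assumptions, not the JcsSignature2020 algorithm: sorted keys and compact separators match [[RFC8785]] output only for simple documents (for example, no floating-point numbers and no keys outside the Basic Multilingual Plane), and both the removal of the `proof` member before hashing and the choice of SHA-256 are assumptions made for illustration. The signing step is omitted.

```python
import hashlib
import json


def canonicalize_simple_json(document: dict) -> bytes:
    """Approximate RFC 8785 output for simple documents.

    Assumption: values are strings, booleans, null, integers, lists, or maps.
    Full RFC 8785 also fixes number formatting and UTF-16 key ordering,
    which this sketch does not implement.
    """
    text = json.dumps(
        document,
        sort_keys=True,          # deterministic member order
        separators=(",", ":"),   # no insignificant whitespace
        ensure_ascii=False,      # keep non-ASCII characters unescaped
    )
    return text.encode("utf-8")


def digest_without_proof(document: dict) -> str:
    """Hash the canonical form of the document with any "proof" member removed.

    Assumption: the proof is excluded from the signed data and SHA-256 is used.
    """
    unsigned = {key: value for key, value in document.items() if key != "proof"}
    return hashlib.sha256(canonicalize_simple_json(unsigned)).hexdigest()


unsigned_document = {"title": "Hello world!"}
signed_document = {"title": "Hello world!", "proof": {"type": "JcsSignature2020"}}

print(canonicalize_simple_json(unsigned_document))
print(digest_without_proof(signed_document) == digest_without_proof(unsigned_document))
```

Because member order and whitespace are fixed by the canonical form, two parties that serialize the same data differently still arrive at the same bytes to sign or verify.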

Similarly, a proof can be added to a JSON-LD data document like the following:

  {
    "@context": {"title": "https://schema.org#title"},
    "title": "Hello world!"
  }
        

by adding the parameters outlined in this section:

  {
    "@context": [
      {"title": "https://schema.org#title"},
      "https://w3id.org/security/suites/ed25519-2020/v1"
    ],
    "title": "Hello world!",
    "proof": {
      "type": "Ed25519Signature2020",
      "created": "2020-11-05T19:23:24Z",
      "verificationMethod": "https://ldi.example/issuer#z6MkjLrk3gKS2nnkeWcmcxiZPGskmesDpuwRBorgHxUXfxnG",
      "proofPurpose": "assertionMethod",
      "proofValue": "z4oey5q2M3XKaxup3tmzN4DRFTLVqpLMweBrSxMY2xHX5XTYVQeVbY8nQAVHMrXFkXJpmEcqdoDwLWxaqA3Q1geV6"
    }
  }
        

The proof example above uses the Ed25519Signature2020 proof type to produce a verifiable digital proof by canonicalizing the input data using the RDF Dataset Canonicalization algorithm [[RDF-DATASET-C14N]] and then digitally signing it using an Ed25519 elliptic curve signature.

Issue: Create a separate section detailing an optional mechanism for authenticating public key control via bi-directional links. How to establish trust in controllers is out of scope but examples can be given.
Issue: Specify algorithm agility mechanisms (additional attributes from the security vocab can be used to indicate other signing and hash algorithms). Rewrite algorithms to be parameterized on this basis and move `Ed25519Signature2020` definition to a single supported mechanism; specify its identifier as a URL. In order to make it easy to specify a variety of combinations of algorithms, introduce a core type `DataIntegrityProof` that allows for easy filtering/discovery of proof nodes. That type on its own does not specify any default proof or hash algorithms; those need to be given via other properties in the nodes.

The pattern that Data Integrity Signatures use presently leads to a proliferation in signature types and JSON-LD Contexts. This proliferation can be avoided without any loss of the security characteristics of tightly binding a cryptography suite version to one or more acceptable public keys. The following signature suites are currently being contemplated: eddsa-2022, nist-ecdsa-2022, koblitz-ecdsa-2022, rsa-2022, pgp-2022, bbs-2022, eascdsa-2022, ibsa-2022, and jws-2022.

{
  "@context": ["https://w3id.org/security/data-integrity/v1"],
  "type": "DataIntegritySignature",
  "cryptosuite": "ecdsa-2022",
  "created": "2022-11-29T20:35:38Z",
  "verificationMethod": "did:example:123456789abcdefghi#keys-1",
  "proofPurpose": "assertionMethod",
  "proofValue": "z2rb7doJxczUFBTdV5F5pehtbUXPDUgKVugZZ99jniVXCUpojJ9PqLYVevMeB1gCyJ4HqpnTyQwaoRPWaD3afEZboXCBTdV5F5pehtbUXPDUgKVugUpoj"
}
      
Issue: Add an explicit check on key type to prevent an attacker from selecting an algorithm that could abuse how the key is used or interpreted.
Issue: Add a note indicating that selective disclosure proof mechanisms can be compatible with Data Integrity; for example, an algorithm could produce a Merkle tree from a canonicalized set of N-Quads and then sign the root hash. Disclosure would involve including the Merkle paths for each N-Quad that is to be revealed. This mechanism would merely consume the normalized output differently (this, and the proof mechanism, would be modifications to this core spec). It might also be necessary to generate proof parameters such as a private key/seed that can be used along with an algorithm to deterministically generate nonces that are concatenated with each N-Quad to prevent rainbow table or similar attacks.
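The Merkle tree idea described above can be sketched as follows. This is an illustration only, not a mechanism defined by this specification: the leaf and node prefixes, the use of SHA-256, the duplication of the last node on odd levels, and the example N-Quads are all assumptions. The per-quad nonces mentioned above and the signing of the root hash are omitted.

```python
import hashlib


def _hash(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _leaf_hash(quad: str) -> bytes:
    # A distinct prefix separates leaf hashes from interior node hashes.
    return _hash(b"\x00" + quad.encode("utf-8"))


def merkle_root_and_paths(quads):
    """Return (root, paths); paths[i] lists (sibling_hash, sibling_is_left)."""
    if not quads:
        raise ValueError("at least one N-Quad is required")
    level = [_leaf_hash(quad) for quad in quads]
    paths = [[] for _ in quads]
    positions = list(range(len(quads)))  # index of each leaf's ancestor
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # simplification: duplicate the last node
        for leaf_index, position in enumerate(positions):
            sibling = position ^ 1
            paths[leaf_index].append((level[sibling], sibling < position))
            positions[leaf_index] = position // 2
        level = [
            _hash(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0], paths


def verify_path(quad: str, path, root: bytes) -> bool:
    """Recompute the root from one disclosed N-Quad and its Merkle path."""
    node = _leaf_hash(quad)
    for sibling, sibling_is_left in path:
        if sibling_is_left:
            node = _hash(b"\x01" + sibling + node)
        else:
            node = _hash(b"\x01" + node + sibling)
    return node == root


# Hypothetical canonicalized N-Quads.
quads = [
    '<https://example.org/doc> <https://schema.org#author> "Alice" .',
    '<https://example.org/doc> <https://schema.org#date> "2020-11-05" .',
    '<https://example.org/doc> <https://schema.org#title> "Hello world!" .',
]
root, paths = merkle_root_and_paths(quads)
print(verify_path(quads[2], paths[2], root))
```

A holder would reveal only the chosen N-Quads together with their paths, and a verifier would check each path against the signed root.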

Proof Purposes

A proof that describes its purpose helps prevent it from being misused for some other purpose.

Issue: Add a mention of JWK's key_ops parameter and WebCrypto's KeyUsage restrictions; explain that Proof Purpose serves a similar goal but allows for finer-grained restrictions.

The following is a list of commonly used proof purpose values.

authentication
Indicates that a given proof is only to be used for the purposes of an authentication protocol.
assertionMethod
Indicates that a proof can only be used for making assertions, for example signing a Verifiable Credential.
keyAgreement
Indicates that a proof is used for key agreement protocols, such as Elliptic Curve Diffie-Hellman key agreement used by popular encryption libraries.
capabilityDelegation
Indicates that the proof can only be used for delegating capabilities. See the Authorization Capabilities [[ZCAP]] specification for more detail.
capabilityInvocation
Indicates that the proof can only be used for invoking capabilities. See the Authorization Capabilities [[ZCAP]] specification for more detail.

Note: The Authorization Capabilities [[ZCAP]] specification defines additional proof purposes for that use case, such as capabilityInvocation and capabilityDelegation.

Controller Documents

A controller document is a set of data that specifies one or more relationships between a controller and a set of data, such as a set of public cryptographic keys. The controller document SHOULD contain verification relationships that explicitly permit the use of certain verification methods for specific purposes.

Issue: Add examples of common controller documents, such as controller documents published on a ledger-based registry, or on a mutable medium in combination with an integrity protection mechanism such as Hashlinks.

Verification Methods

A controller document can express verification methods, such as cryptographic public keys, which can be used to authenticate or authorize interactions with the controller or associated parties. For example, a cryptographic public key can be used as a verification method with respect to a digital signature; in such usage, it verifies that the signer could use the associated cryptographic private key. Verification methods might take many parameters. An example of this is a set of five cryptographic keys from which any three are required to contribute to a cryptographic threshold signature.

verificationMethod

The verificationMethod property is OPTIONAL. If present, the value MUST be a set of verification methods, where each verification method is expressed using a map. The verification method map MUST include the id, type, controller, and specific verification material properties that are determined by the value of type and are defined in the Verification Material section. A verification method MAY include additional properties. Verification methods SHOULD be registered in the Data Integrity Specification Registries [TBD - DIS-REGISTRIES].

id

The value of the id property for a verification method MUST be a string that conforms to the [[URL]] syntax.

type
The value of the type property MUST be a string that references exactly one verification method type. In order to maximize global interoperability, the verification method type SHOULD be registered in the Data Integrity Specification Registries [TBD -- DIS-REGISTRIES].
controller
The value of the controller property MUST be a string that conforms to the [[URL]] syntax.
    {
      "@context": [
        "https://www.w3.org/ns/did/v1",
        "https://w3id.org/security/suites/jws-2020/v1",
        "https://w3id.org/security/suites/ed25519-2020/v1"
      ],
      "id": "did:example:123456789abcdefghi",
      ...
      "verificationMethod": [{
        "id": ...,
        "type": ...,
        "controller": ...,
        "publicKeyJwk": ...
      }, {
        "id": ...,
        "type": ...,
        "controller": ...,
        "publicKeyMultibase": ...
      }]
    }
          

The semantics of the controller property are the same when the subject of the relationship is the controller document as when the subject of the relationship is a verification method, such as a cryptographic public key. Since a key can't control itself, and the key controller cannot be inferred from the controller document, it is necessary to explicitly express the identity of the controller of the key. The difference is that the value of controller for a verification method is not necessarily a controller of the controller document. Controllers of the document are expressed using the controller property at the highest level of the controller document.

Verification Material

Verification material is any information that is used by a process that applies a verification method. The type of a verification method is expected to be used to determine its compatibility with such processes. Examples of verification material properties are publicKeyJwk or publicKeyMultibase. A cryptographic suite specification is responsible for specifying the verification method type and its associated verification material. For example, see JSON Web Signature 2020 and Ed25519 Signature 2020. For all registered verification method types and associated verification material available for controllers, please see the Data Integrity Specification Registries [TBD - DIS-REGISTRIES].

Ensuring that cryptographic suites are versioned and tightly scoped to a very small set of possible key types and signature schemes (ideally one key type and size and one signature output type) is a design goal for most Data Integrity cryptographic suites. Historically, this has been done by defining both the key type and the cryptographic suite that uses the key type in the same specification. The downside of doing so, however, is that there might be a proliferation of different key types in multikey that result in different cryptosuites defining the same key material differently. For example, one cryptosuite might use compressed Curve P-256 keys while another uses uncompressed values. If that occurs, it will harm interoperability. It will be important in the coming months to years to ensure that this does not happen by fully defining the multikey format in a separate specification so cryptosuite specifications, such as this one, can refer to the multikey specification, thus reducing the chances of multikey type proliferation and improving the chances of maximum interoperability for the multikey format.

To increase the likelihood of interoperable implementations, this specification limits the number of formats for expressing verification material in a controller document. The fewer formats that implementers have to implement, the more likely it will be that they will support all of them. This approach attempts to strike a delicate balance between ease of implementation and supporting formats that have historically had broad deployment. Two supported verification material properties are listed below:

publicKeyJwk

The publicKeyJwk property is OPTIONAL. If present, the value MUST be a map representing a JSON Web Key that conforms to [[RFC7517]]. The map MUST NOT contain "d", or any other members of the private information class as described in Registration Template. It is RECOMMENDED that verification methods that use JWKs [[RFC7517]] to represent their public keys use the value of kid as their fragment identifier. It is RECOMMENDED that JWK kid values are set to the public key fingerprint [[RFC7638]]. See the first key in the example controller document later in this section for an example of a public key with a compound key identifier.

publicKeyMultibase

The publicKeyMultibase property is OPTIONAL. This feature is non-normative. If present, the value MUST be a string representation of a [[?MULTIBASE]] encoded public key.

Note that the [[?MULTIBASE]] specification is not yet a standard and is subject to change. There might be some use cases for this data format where publicKeyMultibase is defined, to allow for expression of public keys, but privateKeyMultibase is not defined, to protect against accidental leakage of secret keys.

A verification method MUST NOT contain multiple verification material properties for the same material. For example, expressing key material in a verification method using both publicKeyJwk and publicKeyMultibase at the same time is prohibited.
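The structural rules for verification method maps can be collected into one check. This is a minimal sketch with hypothetical names, under these assumptions: a non-empty URL scheme is used as a rough stand-in for conformance to [[URL]] syntax, only the two verification material properties listed above are considered, and of the private JWK members only "d" is tested.

```python
from urllib.parse import urlparse

REQUIRED_PROPERTIES = ("id", "type", "controller")
MATERIAL_PROPERTIES = ("publicKeyJwk", "publicKeyMultibase")


def verification_method_problems(method: dict) -> list:
    """Return a list of human-readable problems; empty means none were found."""
    problems = []
    for name in REQUIRED_PROPERTIES:
        if name not in method:
            problems.append("missing " + name)
    for name in ("id", "controller"):
        value = method.get(name)
        # Assumption: having a scheme is treated as "looks like a URL".
        if value is not None and not (isinstance(value, str) and urlparse(value).scheme):
            problems.append(name + " is not a URL string")
    present = [name for name in MATERIAL_PROPERTIES if name in method]
    if len(present) > 1:
        problems.append("multiple verification material properties: " + ", ".join(present))
    elif not present:
        problems.append("no known verification material property")
    jwk = method.get("publicKeyJwk")
    if isinstance(jwk, dict) and "d" in jwk:
        problems.append("publicKeyJwk contains the private member d")
    return problems


multibase_method = {
    "id": "did:example:123456789abcdefghi#keys-1",
    "type": "Ed25519VerificationKey2020",
    "controller": "did:example:pqrstuvwxyz0987654321",
    "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu",
}
print(verification_method_problems(multibase_method))
```

Returning a list of problems rather than raising on the first one lets a caller report every violation in a controller document at once.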

An example of a controller document containing verification methods using both properties above is shown below.

    {
      "@context": [
        "https://www.w3.org/ns/did/v1",
        "https://w3id.org/security/suites/jws-2020/v1",
        "https://w3id.org/security/suites/ed25519-2020/v1"
      ],
      "id": "did:example:123456789abcdefghi",
      ...
      "verificationMethod": [{
        "id": "did:example:123#_Qq0UL2Fq651Q0Fjd6TvnYE-faHiOpRlPVQcY_-tA4A",
        "type": "JsonWebKey2020", // external (property value)
        "controller": "did:example:123",
        "publicKeyJwk": {
          "crv": "Ed25519", // external (property name)
          "x": "VCpo2LMLhn6iWku8MKvSLg2ZAoC-nlOyPVQaO3FxVeQ", // external (property name)
          "kty": "OKP", // external (property name)
          "kid": "_Qq0UL2Fq651Q0Fjd6TvnYE-faHiOpRlPVQcY_-tA4A" // external (property name)
        }
      }, {
        "id": "did:example:123456789abcdefghi#keys-1",
        "type": "Ed25519VerificationKey2020", // external (property value)
        "controller": "did:example:pqrstuvwxyz0987654321",
        "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
      }],
      ...
    }
            

Multikey

The Multikey data model is a specific type of verification method that utilizes the [[MULTICODEC]] specification to encode key types into a single binary stream that is then encoded using the [[MULTIBASE]] specification. To encode a Multikey, the verification method `type` MUST be set to `Multikey` and the `publicKeyMultibase` value MUST be a [[MULTIBASE]] encoded [[MULTICODEC]] value. An example of a Multikey is provided below:

{
  "@context": ["https://w3id.org/security/suites/multikey/v1"],
  "id": "did:example:123456789abcdefghi#keys-1",
  "type": "Multikey",
  "controller": "did:example:123456789abcdefghi",
  "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
}
            

In the example above, the `publicKeyMultibase` value starts with the letter `z`, which is the [[MULTIBASE]] header that conveys that the binary data is base58-encoded using the Bitcoin base-encoding alphabet. The decoded binary data [[MULTICODEC]] header is `0xed`, which specifies that the remaining data is a 32-byte raw Ed25519 public key.
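The decoding steps described above can be sketched as follows. The function names are hypothetical and only the Ed25519 public key case is handled. One detail is stated here as background rather than taken from this specification: Multicodec codes are written as unsigned varints, so the code `0xed` appears in the decoded stream as the two bytes `0xed 0x01`.

```python
BASE58_BTC_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58btc_decode(text: str) -> bytes:
    """Decode a base58 string that uses the Bitcoin alphabet."""
    number = 0
    for char in text:
        # str.index raises ValueError for characters outside the alphabet.
        number = number * 58 + BASE58_BTC_ALPHABET.index(char)
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    # Each leading "1" character encodes one leading zero byte.
    leading_zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_zeros + body


def decode_ed25519_multikey(public_key_multibase: str) -> bytes:
    """Return the 32-byte raw Ed25519 public key from a Multikey value."""
    if not public_key_multibase.startswith("z"):
        raise ValueError("expected the base58-btc Multibase header 'z'")
    decoded = base58btc_decode(public_key_multibase[1:])
    # Varint encoding of the Multicodec code 0xed (Ed25519 public key).
    if decoded[:2] != b"\xed\x01":
        raise ValueError("not an Ed25519 public key Multicodec header")
    key = decoded[2:]
    if len(key) != 32:
        raise ValueError("an Ed25519 public key is 32 bytes long")
    return key


raw_key = decode_ed25519_multikey("z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu")
print(len(raw_key))
```

Checking the header before using the key bytes is what lets a single `publicKeyMultibase` property carry different key types safely.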

The Multikey data model is also capable of encoding secret keys, sometimes referred to as private keys.

{
  "@context": ["https://w3id.org/security/suites/secrets/v1"],
  "id": "did:example:123456789abcdefghi#keys-1",
  "type": "Multikey",
  "controller": "did:example:123456789abcdefghi",
  "secretKeyMultibase": "z3u2fprgdREFtGakrHr6zLyTeTEZtivDnYCPZmcSt16EYCER"
}
            

In the example above, the `secretKeyMultibase` value starts with the letter `z`, which is the [[MULTIBASE]] header that conveys that the binary data is base58-encoded using the Bitcoin base-encoding alphabet. The decoded binary data [[MULTICODEC]] header is `0x1300`, which specifies that the remaining data is a 32-byte raw Ed25519 private key.

Referring to Verification Methods

Verification methods can be embedded in or referenced from properties associated with various verification relationships as described in the Verification Relationships section. Referencing verification methods allows them to be used by more than one verification relationship.

If the value of a verification method property is a map, the verification method has been embedded and its properties can be accessed directly. However, if the value is a URL string, the verification method has been included by reference and its properties will need to be retrieved from elsewhere in the controller document or from another controller document. This is done by dereferencing the URL and searching the resulting resource for a verification method map with an id property whose value matches the URL.

    {
...

      "authentication": [
        // this key is referenced and might be used by
        // more than one verification relationship
        "did:example:123456789abcdefghi#keys-1",
        // this key is embedded and may *only* be used for authentication
        {
          "id": "did:example:123456789abcdefghi#keys-2",
          "type": "Ed25519VerificationKey2020", // external (property value)
          "controller": "did:example:123456789abcdefghi",
          "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
        }
      ],

...
    }
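The distinction between embedded and referenced verification methods can be sketched as a small resolver. This is an illustration with hypothetical names: `load_document` stands in for whatever dereferencing mechanism an implementation uses, and the in-memory `DOCUMENTS` table replaces real network retrieval.

```python
RELATIONSHIP_PROPERTIES = (
    "authentication",
    "assertionMethod",
    "keyAgreement",
    "capabilityInvocation",
    "capabilityDelegation",
)


def iter_method_maps(document: dict):
    """Yield every verification method map expressed in a controller document."""
    for entry in document.get("verificationMethod", []):
        if isinstance(entry, dict):
            yield entry
    for name in RELATIONSHIP_PROPERTIES:
        for entry in document.get(name, []):
            if isinstance(entry, dict):  # embedded method
                yield entry


def resolve_verification_method(value, load_document):
    """Return the verification method map for an embedded map or a URL reference."""
    if isinstance(value, dict):
        return value  # embedded: properties are directly accessible
    # Referenced: dereference the URL (without fragment) and search by id.
    document = load_document(value.split("#", 1)[0])
    for method in iter_method_maps(document):
        if method.get("id") == value:
            return method
    return None


# Hypothetical in-memory stand-in for dereferencing a controller document.
DOCUMENTS = {
    "did:example:123456789abcdefghi": {
        "id": "did:example:123456789abcdefghi",
        "verificationMethod": [{
            "id": "did:example:123456789abcdefghi#keys-1",
            "type": "Ed25519VerificationKey2020",
            "controller": "did:example:123456789abcdefghi",
            "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu",
        }],
        "authentication": ["did:example:123456789abcdefghi#keys-1"],
    }
}


def load_from_memory(url: str) -> dict:
    return DOCUMENTS[url]


method = resolve_verification_method("did:example:123456789abcdefghi#keys-1", load_from_memory)
print(method["type"])
```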
            

Verification Relationships

A verification relationship expresses the relationship between the controller and a verification method.

Different verification relationships enable the associated verification methods to be used for different purposes. It is up to a verifier to ascertain the validity of a verification attempt by checking that the verification method used is contained in the appropriate verification relationship property of the controller document.

The verification relationship between the controller and the verification method is explicit in the controller document. Verification methods that are not associated with a particular verification relationship cannot be used for that verification relationship. For example, a verification method in the value of the authentication property cannot be used to engage in key agreement protocols with the controller—the value of the keyAgreement property needs to be used for that.

The controller document does not express revoked keys using a verification relationship. If a referenced verification method is not in the latest controller document used to dereference it, then that verification method is considered invalid or revoked.
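The verifier-side check described above can be sketched as follows. The function names are hypothetical, and the sketch assumes that the relationship property in the controller document has the same name as the proof purpose, as is the case for the values listed in this specification. Cryptographic verification of the proof value is a separate step that is not shown.

```python
def method_ids_for(controller_document: dict, relationship: str) -> set:
    """Collect the ids of methods referenced or embedded under a relationship."""
    ids = set()
    for entry in controller_document.get(relationship, []):
        if isinstance(entry, str):        # referenced method
            ids.add(entry)
        elif isinstance(entry, dict) and "id" in entry:  # embedded method
            ids.add(entry["id"])
    return ids


def proof_is_authorized(proof: dict, controller_document: dict, expected_purpose: str) -> bool:
    """Check the proof purpose and the verification relationship.

    A method that is absent from the latest controller document is treated
    as invalid or revoked, so this returns False in that case.
    """
    if proof.get("proofPurpose") != expected_purpose:
        return False
    allowed = method_ids_for(controller_document, expected_purpose)
    return proof.get("verificationMethod") in allowed


# Illustrative controller document and proof (values are placeholders).
controller_document = {
    "id": "did:example:123456789abcdefghi",
    "authentication": ["did:example:123456789abcdefghi#keys-1"],
    "assertionMethod": [{
        "id": "did:example:123456789abcdefghi#keys-2",
        "type": "Ed25519VerificationKey2020",
        "controller": "did:example:123456789abcdefghi",
    }],
}
proof = {
    "type": "Ed25519Signature2020",
    "proofPurpose": "assertionMethod",
    "verificationMethod": "did:example:123456789abcdefghi#keys-2",
}
print(proof_is_authorized(proof, controller_document, "assertionMethod"))
```

In this sketch a key listed only under authentication cannot be used to satisfy an assertionMethod proof, which is the separation of purposes that verification relationships are meant to provide.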

The following sections define several useful verification relationships. A controller document MAY include any of these, or other properties, to express a specific verification relationship. In order to maximize global interoperability, any such properties used SHOULD be registered in the Data Integrity Specification Registries [TBD: DIS-REGISTRIES].

Authentication

The authentication verification relationship is used to specify how the controller is expected to be authenticated, for purposes such as logging into a website or engaging in any sort of challenge-response protocol.

authentication
The authentication property is OPTIONAL. If present, the associated value MUST be a set of one or more verification methods. Each verification method MAY be embedded or referenced.
    {
      "@context": [
        "https://www.w3.org/ns/did/v1",
        "https://w3id.org/security/suites/ed25519-2020/v1"
      ],
      "id": "did:example:123456789abcdefghi",
      ...
      "authentication": [
        // this method can be used to authenticate as did:...fghi
        "did:example:123456789abcdefghi#keys-1",
        // this method is *only* approved for authentication, it may not
        // be used for any other proof purpose, so its full description is
        // embedded here rather than using only a reference
        {
          "id": "did:example:123456789abcdefghi#keys-2",
          "type": "Ed25519VerificationKey2020",
          "controller": "did:example:123456789abcdefghi",
          "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
        }
      ],
      ...
    }
            

If authentication is established, it is up to the application to decide what to do with that information.

This is useful to any authentication verifier that needs to check to see if an entity that is attempting to authenticate is, in fact, presenting a valid proof of authentication. When a verifier receives some data (in some protocol-specific format) that contains a proof that was made for the purpose of "authentication", and that says that an entity is identified by the `id`, then that verifier checks to ensure that the proof can be verified using a verification method (e.g., public key) listed under authentication in the controller document.

Note that the verification method indicated by the authentication property of a controller document can only be used to authenticate the controller. To authenticate a different controller, the entity associated with the value of controller needs to authenticate with its own controller document and associated authentication verification relationship.

Assertion

The assertionMethod verification relationship is used to specify how the controller is expected to express claims, such as for the purposes of issuing a Verifiable Credential [[?VC-DATA-MODEL]].

assertionMethod
The assertionMethod property is OPTIONAL. If present, the associated value MUST be a set of one or more verification methods. Each verification method MAY be embedded or referenced.

This property is useful, for example, during the processing of a verifiable credential by a verifier. During verification, a verifier checks to see if a verifiable credential contains a proof created by the controller by checking that the verification method used to assert the proof is associated with the assertionMethod property in the corresponding controller document.

    {
      "@context": [
        "https://www.w3.org/ns/did/v1",
        "https://w3id.org/security/suites/ed25519-2020/v1"
      ],
      "id": "did:example:123456789abcdefghi",
      ...
      "assertionMethod": [
        // this method can be used to assert statements as did:...fghi
        "did:example:123456789abcdefghi#keys-1",
        // this method is *only* approved for assertion of statements, it is not
        // used for any other verification relationship, so its full description is
        // embedded here rather than using a reference
        {
          "id": "did:example:123456789abcdefghi#keys-2",
          "type": "Ed25519VerificationKey2020", // external (property value)
          "controller": "did:example:123456789abcdefghi",
          "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
        }
      ],
      ...
    }
            

Key Agreement

The keyAgreement verification relationship is used to specify how an entity can generate encryption material in order to transmit confidential information intended for the controller, such as for the purposes of establishing a secure communication channel with the recipient.

keyAgreement
The keyAgreement property is OPTIONAL. If present, the associated value MUST be a set of one or more verification methods. Each verification method MAY be embedded or referenced.

An example of when this property is useful is when encrypting a message intended for the controller. In this case, the counterparty uses the cryptographic public key information in the verification method to wrap a decryption key for the recipient.

    {
      "@context": "https://www.w3.org/ns/did/v1",
      "id": "did:example:123456789abcdefghi",
      ...
      "keyAgreement": [
        // this method can be used to perform key agreement as did:...fghi
        "did:example:123456789abcdefghi#keys-1",
        // this method is *only* approved for key agreement usage, it will not
        // be used for any other verification relationship, so its full description is
        // embedded here rather than using only a reference
        {
          "id": "did:example:123#zC9ByQ8aJs8vrNXyDhPHHNNMSHPcaSgNpjjsBYpMMjsTdS",
          "type": "X25519KeyAgreementKey2019", // external (property value)
          "controller": "did:example:123",
          "publicKeyMultibase": "z6LSn6p3HRxx1ZZk1dT9VwcfTBCYgtNWdzdDMKPZjShLNWG7"
        }
      ],
      ...
    }
            

Capability Invocation

The capabilityInvocation verification relationship is used to specify a verification method that might be used by the controller to invoke a cryptographic capability, such as the authorization to update the controller document.

capabilityInvocation
The capabilityInvocation property is OPTIONAL. If present, the associated value MUST be a set of one or more verification methods. Each verification method MAY be embedded or referenced.

An example of when this property is useful is when a controller needs to access a protected HTTP API that requires authorization in order to use it. In order to authorize when using the HTTP API, the controller uses a capability that is associated with a particular URL that is exposed via the HTTP API. The invocation of the capability could be expressed in a number of ways, e.g., as a digitally signed message that is placed into the HTTP Headers.

The server providing the HTTP API is the verifier of the capability and it would need to verify that the verification method referred to by the invoked capability exists in the capabilityInvocation property of the controller document. The verifier would also check to make sure that the action being performed is valid and the capability is appropriate for the resource being accessed. If the verification is successful, the server has cryptographically determined that the invoker is authorized to access the protected resource.

    {
      "@context": [
        "https://www.w3.org/ns/did/v1",
        "https://w3id.org/security/suites/ed25519-2020/v1"
      ],
      "id": "did:example:123456789abcdefghi",
      ...
      "capabilityInvocation": [
        // this method can be used to invoke capabilities as did:...fghi
        "did:example:123456789abcdefghi#keys-1",
        // this method is *only* approved for capability invocation usage, it will not
        // be used for any other verification relationship, so its full description is
        // embedded here rather than using only a reference
        {
          "id": "did:example:123456789abcdefghi#keys-2",
          "type": "Ed25519VerificationKey2020", // external (property value)
          "controller": "did:example:123456789abcdefghi",
          "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
        }
      ],
      ...
    }
            

Capability Delegation

The capabilityDelegation verification relationship is used to specify a mechanism that might be used by the controller to delegate a cryptographic capability to another party, such as delegating the authority to access a specific HTTP API to a subordinate.

capabilityDelegation
The capabilityDelegation property is OPTIONAL. If present, the associated value MUST be a set of one or more verification methods. Each verification method MAY be embedded or referenced.

An example of when this property is useful is when a controller chooses to delegate their capability to access a protected HTTP API to a party other than themselves. In order to delegate the capability, the controller would use a verification method associated with the capabilityDelegation verification relationship to cryptographically sign the capability over to another controller. The delegate would then use the capability in a manner that is similar to the example described in the Capability Invocation section.

    {
      "@context": [
        "https://www.w3.org/ns/did/v1",
        "https://w3id.org/security/suites/ed25519-2020/v1"
      ],
      "id": "did:example:123456789abcdefghi",
      ...
      "capabilityDelegation": [
        // this method can be used to perform capability delegation as did:...fghi
        "did:example:123456789abcdefghi#keys-1",
        // this method is *only* approved for granting capabilities; it will not
        // be used for any other verification relationship, so its full description is
        // embedded here rather than using only a reference
        {
          "id": "did:example:123456789abcdefghi#keys-2",
          "type": "Ed25519VerificationKey2020", // external (property value)
          "controller": "did:example:123456789abcdefghi",
          "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
        }
      ],
      ...
    }
            

Multiple Proofs

The Data Integrity specification supports multiple proofs in a single document. Two multi-proof approaches are identified: Proof Sets (unordered) and Proof Chains (ordered).

Proof Sets

A proof set is useful when the same data needs to be secured by multiple entities, but where the order of proofs does not matter, such as in the case of a set of signatures on a contract. A proof set, which has no order, is represented by associating a set of proofs with the proof key in a document.

{
  "@context": [
    {"title": "https://schema.org#title"},
    "https://w3id.org/security/suites/ed25519-2020/v1"
  ],
  "title": "Hello world!",
  "proof": [{
    "type": "Ed25519Signature2020",
    "created": "2020-11-05T19:23:24Z",
    "verificationMethod": "https://ldi.example/issuer/1#z6MkjLrk3gKS2nnkeWcmcxiZPGskmesDpuwRBorgHxUXfxnG",
    "proofPurpose": "assertionMethod",
    "proofValue": "z4oey5q2M3XKaxup3tmzN4DRFTLVqpLMweBrSxMY2xHX5XTYVQeVbY8nQAVHMrXFkXJpmEcqdoDwLWxaqA3Q1geV6"
  }, {
    "type": "Ed25519Signature2020",
    "created": "2020-11-05T13:08:49Z",
    "verificationMethod": "https://pfps.example/issuer/2#z6MkGskxnGjLrk3gKS2mesDpuwRBokeWcmrgHxUXfnncxiZP",
    "proofPurpose": "assertionMethod",
    "proofValue": "z5QLBrp19KiWXerb8ByPnAZ9wujVFN8PDsxxXeMoyvDqhZ6Qnzr5CG9876zNht8BpStWi8H2Mi7XCY3inbLrZrm95"
  }]
  }]
}
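
A verifier might process a proof set roughly as follows. This is a sketch under stated assumptions: proofs_of, verify_proof_set, verify_one, and TRUSTED are hypothetical names, the shortened verification method URLs are illustrative, and verify_one is a stand-in that performs no cryptography. It only shows that each proof is checked independently against the document with its proofs removed, so the order of the set does not affect the result.

```python
def proofs_of(document):
    """Return the proofs attached to a document as a list.

    The "proof" value may be a single proof object or a set of proofs.
    """
    value = document.get("proof", [])
    return value if isinstance(value, list) else [value]


def verify_proof_set(document, verify_one):
    """Verify every proof independently; a document without proofs fails."""
    proofs = proofs_of(document)
    if not proofs:
        return False
    unsecured = {key: value for key, value in document.items() if key != "proof"}
    return all(verify_one(unsecured, proof) for proof in proofs)


# Hypothetical stand-in verifier: it only checks for a known verification
# method and performs NO cryptographic verification.
TRUSTED = {
    "https://ldi.example/issuer/1#key-1",
    "https://pfps.example/issuer/2#key-1",
}


def verify_one(unsecured_document, proof):
    return proof.get("verificationMethod") in TRUSTED


document = {
    "title": "Hello world!",
    "proof": [
        {"type": "Ed25519Signature2020",
         "verificationMethod": "https://ldi.example/issuer/1#key-1"},
        {"type": "Ed25519Signature2020",
         "verificationMethod": "https://pfps.example/issuer/2#key-1"},
    ],
}

print(verify_proof_set(document, verify_one))  # True
```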
        

Proof Chains

A proof chain is useful when the same data needs to be signed by multiple entities and the order of when the proofs occurred matters, such as in the case of a notary counter-signing a proof that had been created on a document. A proof chain, where order needs to be preserved, is represented by associating an ordered list of proofs with the proofChain key in a document.

{
  "@context": [
    {"title": "https://schema.org#title"},
    "https://w3id.org/security/suites/ed25519-2020/v1"
  ],
  "title": "Hello world!",
  "proofChain": [{
    "type": "Ed25519Signature2020",
    "created": "2020-11-05T19:23:42Z",
    "verificationMethod": "https://ldi.example/issuer/1#z6MkjLrk3gKS2nnkeWcmcxiZPGskmesDpuwRBorgHxUXfxnG",
    "proofPurpose": "assertionMethod",
    "proofValue": "zVbY8nQAVHMrXFkXJpmEcqdoDwLWxaqA3Q1geV64oey5q2M3XKaxup3tmzN4DRFTLVqpLMweBrSxMY2xHX5XTYVQe"
  }, {
    "type": "Ed25519Signature2020",
    "created": "2020-11-05T21:28:14Z",
    "verificationMethod": "https://pfps.example/issuer/2#z6MkGskxnGjLrk3gKS2mesDpuwRBokeWcmrgHxUXfnncxiZP",
    "proofPurpose": "assertionMethod",
    "proofValue": "z6Qnzr5CG9876zNht8BpStWi8H2Mi7XCY3inbLrZrm955QLBrp19KiWXerb8ByPnAZ9wujVFN8PDsxxXeMoyvDqhZ"
  }]
  }]
}
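
Because the order of a proof chain is significant, a consumer might want a quick sanity check that the created timestamps do not run backwards. The sketch below is an assumption about what an implementation could choose to do, not a requirement of this specification; is_chronological and created_times are hypothetical helper names, and the check is no substitute for verifying each proof.

```python
from datetime import datetime


def created_times(proof_chain):
    """Parse the created value of each proof, assuming the UTC "Z" form."""
    return [
        datetime.strptime(proof["created"], "%Y-%m-%dT%H:%M:%SZ")
        for proof in proof_chain
    ]


def is_chronological(proof_chain):
    """Return True if no proof is dated earlier than its predecessor."""
    times = created_times(proof_chain)
    return all(earlier <= later for earlier, later in zip(times, times[1:]))


# Timestamps taken from the proof chain example above.
proof_chain = [
    {"type": "Ed25519Signature2020", "created": "2020-11-05T19:23:42Z"},
    {"type": "Ed25519Signature2020", "created": "2020-11-05T21:28:14Z"},
]

print(is_chronological(proof_chain))        # True
print(is_chronological(proof_chain[::-1]))  # False
```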
        

Proof Types

Signatures

A data integrity signature is a type of cryptographic proof that comprises information about the signature, the parameters required to verify it, and the signature value itself. All of this information is provided using Linked Data vocabularies such as the [[!SECURITY-VOCABULARY]].

A data integrity signature typically includes at least the following attributes:

type (required)
A URI that identifies the digital cryptographic suite that was used to create the signature. For example: Ed25519Signature2020.
created (required)
The string value of an [[!ISO8601]] combined date and time string generated by the Proof Algorithm.
domain (optional)
A string value specifying the restricted domain of the signature.
nonce (optional, but strongly recommended)
A string value that is included in the digital signature and MUST only be used once for a particular domain and window of time. This value is used to mitigate replay attacks.
signature value (required)
One of any number of valid representations of the signature value generated by the Proof Algorithm. Example: jws for detached JSON Web Signatures.

The terms type, created, domain, nonce, and jws above map to URLs. The vocabulary where these terms are defined is the [[SECURITY-VOCABULARY]].
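
The single-use requirement on nonce can be enforced by a verifier that remembers which values it has already seen. The following is a minimal in-memory sketch; NonceRegistry, its window_seconds parameter, and the caller-supplied clock value now are hypothetical choices made for illustration, and a real deployment would likely need shared, persistent storage.

```python
class NonceRegistry:
    """Remember (domain, nonce) pairs for a fixed window of time."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._first_seen = {}  # (domain, nonce) -> time the pair was first seen

    def accept(self, domain, nonce, now):
        """Return True the first time a pair is seen within the window."""
        # Forget pairs whose window has elapsed.
        self._first_seen = {
            key: seen
            for key, seen in self._first_seen.items()
            if now - seen < self.window_seconds
        }
        key = (domain, nonce)
        if key in self._first_seen:
            return False  # replay within the window
        self._first_seen[key] = now
        return True


registry = NonceRegistry(window_seconds=60)
print(registry.accept("example.com", "a1b2c3", now=0))     # True
print(registry.accept("example.com", "a1b2c3", now=10))    # False
print(registry.accept("other.example", "a1b2c3", now=10))  # True
```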

A signature can be added to a data document like the following:

{
  "@context": {"title": "https://schema.org#title"},
  "title": "Hello world!"
}
        

by adding the parameters outlined in this section:

{
  "@context": [
    {"title": "https://schema.org#title"},
    "https://w3id.org/security/suites/ed25519-2020/v1"
  ],
  "title": "Hello world!",
  "proof": {
    "type": "Ed25519Signature2020",
    "created": "2020-11-05T19:23:24Z",
    "verificationMethod": "https://ldi.example/issuer#z6MkjLrk3gKS2nnkeWcmcxiZPGskmesDpuwRBorgHxUXfxnG",
    "proofPurpose": "assertionMethod",
    "proofValue": "z4oey5q2M3XKaxup3tmzN4DRFTLVqpLMweBrSxMY2xHX5XTYVQeVbY8nQAVHMrXFkXJpmEcqdoDwLWxaqA3Q1geV6"
  }
  }
}
        

The signature example above uses the Ed25519Signature2020 cryptographic suite to produce a verifiable digital signature.

Create a separate section detailing an optional mechanism for authenticating public key control via bi-directional links. How to establish trust in key controller entities is out of scope but examples can be given.
Specify algorithm agility mechanisms (additional attributes from the security vocab can be used to indicate other signing and hash algorithms). Rewrite algorithms to be parameterized on this basis and move `RsaSignature2018` definition to a single supported mechanism; specify its identifier as a URL. In order to make it easy to specify a variety of combinations of algorithms, introduce a core type `LinkedDataSignature` that allows for easy filtering/discovery of signature nodes, but that type on its own doesn't specify any default signature or hash algorithms; those must be given via other properties in the nodes.
Add a note indicating that this specification should not be construed to indicate that public key controllers should be restricted to a single public key or that systems that use this spec and involve real people should identify each person as only ever being a single entity rather than perhaps N entities with M keys. There are no such restrictions and in many cases those kinds of restrictions are ill-advised due to privacy considerations.
Add an explicit check on key type to prevent an attacker from selecting an algorithm that may abuse how the key is used/interpreted.
Add a note indicating that selective disclosure signature mechanisms can be compatible with data integrity signatures; for example, an algorithm could produce a Merkle tree from a canonicalized set of N-Quads and then sign the root hash. Disclosure would involve including the Merkle paths for each N-Quad that is to be revealed. This mechanism would merely consume the normalized output differently (this, and the proof mechanism, would be modifications to this core spec). It may also be necessary to generate signature parameters such as a private key/seed that can be used along with an algorithm to deterministically generate nonces that are concatenated with each N-Quad to prevent rainbow table or similar attacks.

Other Proof Types

TODO: Add links and examples to proof types that are not data integrity signatures, such as proof of existence, proof of work, proof of elapsed time, and proof of registration.

Advanced Terminology

These terms are relevant only to implementors of new cryptographic suites.

canonicalization algorithm
An algorithm that takes an input document that has more than one possible representation and always transforms it into a deterministic representation. For example, alphabetically sorting a list of items is a type of canonicalization. This process is sometimes also called normalization.
message digest algorithm
An algorithm that takes an input message and produces a cryptographic output message that is often many orders of magnitude smaller than the input message. These algorithms are often 1) very fast, 2) non-reversible, 3) cause the output to change significantly when even one bit of the input message changes, and 4) make it infeasible to find two different inputs for the same output.
proof algorithm
An algorithm that takes an input message and produces an output value where the receiver of the message can mathematically verify that the message has not been modified in transit and came from someone possessing a particular secret.
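
The first two terms can be illustrated with a short Python sketch. It rests on a loud assumption: serializing JSON with sorted keys is used here as a simplified stand-in for a canonicalization algorithm and is not URDNA2015. The helper names and the sample author value are hypothetical. SHA-256 from the standard library serves as the message digest algorithm.

```python
import hashlib
import json


def canonicalize(document):
    """Simplified stand-in: sorted-key JSON, NOT an RDF canonicalization."""
    text = json.dumps(
        document, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def digest(document):
    """Hash the canonical form with SHA-256 and return the hex digest."""
    return hashlib.sha256(canonicalize(document)).hexdigest()


first = {"title": "Hello world!", "author": "Alice"}
second = {"author": "Alice", "title": "Hello world!"}   # same data, other key order
changed = {"author": "Alice", "title": "Hello world?"}  # one character differs

print(canonicalize(first) == canonicalize(second))  # True
print(digest(first) == digest(second))              # True
print(digest(first) == digest(changed))             # False
print(len(digest(first)))                           # 64 hex characters (32 bytes)
```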

Creating New Proof Types

A data integrity proof is designed to be easy to use by developers and therefore strives to minimize the amount of information one has to remember to generate a proof. Often, just the cryptographic suite name (e.g. Ed25519Signature2020) is required from developers to initiate the creation of a proof. These cryptographic suites are often created or reviewed by people that have the requisite cryptographic training to ensure that safe combinations of cryptographic primitives are used.

This section details the cryptographic primitives that are available to proof type developers.

At a minimum, a proof type is expected to have the following attributes:

id
A URL that identifies the cryptographic suite. For example: https://w3id.org/security#Ed25519Signature2020.
type
The value ProofSuite.
canonicalizationAlgorithm
A URL that identifies the canonicalization algorithm to use on the document. For example: https://w3id.org/security#URDNA2015.
digestAlgorithm
A URL that identifies the message digest algorithm to use on the canonicalized document. For example: https://www.ietf.org/assignments/jwa-parameters#SHA256
proofAlgorithm
A URL that identifies the proof algorithm to use on the data to be signed. For example: https://w3id.org/security#ed25519

A complete example of a proof type is shown in the next example:

{
  "id": "https://w3id.org/security#Ed25519Signature2020",
  "type": "ProofSuite",
  "canonicalizationAlgorithm": "https://w3id.org/security#URDNA2015",
  "digestAlgorithm": "https://www.ietf.org/assignments/jwa-parameters#SHA256",
  "proofAlgorithm": "https://w3id.org/security#ed25519"
}
      

Algorithms

The algorithms defined below are generalized in that they require a specific canonicalization algorithm, message digest algorithm, and proof algorithm to be used to achieve the algorithm's intended outcome.

Proof Algorithm

The proof parameters should be included as headers and values in the data to be signed.

The following algorithm specifies how to create a digital proof that can later be used to verify the authenticity and integrity of an unsigned data document. An unsigned data document, document, proof options, options, and a private key, privateKey, are required inputs. The proof options MUST contain an identifier for the public/private key pair, and an [[!ISO8601]] combined date and time string, created, containing the current date and time, accurate to at least one second, in Coordinated Universal Time (UTC). A domain might also be specified in the options. A signed data document is produced as output. Whenever this algorithm encodes strings, it MUST use UTF-8 encoding.

  1. Create a copy of document, hereafter referred to as output.
  2. Generate a canonicalized document by canonicalizing document according to a canonicalization algorithm (e.g. the URDNA2015 [[!RDF-DATASET-C14N]] algorithm).
  3. Create a value tbs that represents the data to be signed, and set it to the result of running the Create Verify Hash Algorithm, passing the information in options.
  4. Digitally sign tbs using the privateKey and the digital proof algorithm (e.g. JSON Web Proof using the RSASSA-PKCS1-v1_5 algorithm). The resulting string is the proofValue.
  5. Add a proof node to output containing a data integrity proof using the appropriate type and proofValue values as well as all of the data in the proof options (e.g. created, and if given, any additional proof options such as domain).
  6. Return output as the signed data document.
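
The six steps above can be sketched in Python. Several parts are assumptions made so that the example runs with the standard library alone: canonicalize uses sorted-key JSON rather than URDNA2015, the signer is an HMAC-SHA256 stand-in (a symmetric MAC, not a digital signature), and the names create_proof, sign, SECRET, the ExampleProof2020 type, and the verification method URL are hypothetical. A real cryptographic suite would replace sign with its own proof algorithm.

```python
import copy
import hashlib
import hmac
import json


def canonicalize(value):
    """Simplified stand-in for a canonicalization algorithm (NOT URDNA2015)."""
    return json.dumps(value, sort_keys=True, separators=(",", ":")).encode("utf-8")


def create_verify_hash(canonical_document, options):
    """SHA-256 of the canonical options followed by SHA-256 of the document."""
    options = {k: v for k, v in options.items() if k != "proofValue"}
    return (hashlib.sha256(canonicalize(options)).digest()
            + hashlib.sha256(canonical_document).digest())


def create_proof(document, options, sign):
    output = copy.deepcopy(document)                           # step 1
    canonical_document = canonicalize(document)                # step 2
    tbs = create_verify_hash(canonical_document, options)      # step 3
    proof_value = sign(tbs)                                    # step 4
    output["proof"] = {**options, "proofValue": proof_value}   # step 5
    return output                                              # step 6


SECRET = b"demo-secret"  # hypothetical key material for the stand-in signer


def sign(tbs):
    # HMAC stand-in only; a real suite would use e.g. an Ed25519 private key.
    return hmac.new(SECRET, tbs, hashlib.sha256).hexdigest()


document = {"title": "Hello world!"}
options = {
    "type": "ExampleProof2020",  # hypothetical suite name
    "created": "2020-11-05T19:23:24Z",
    "verificationMethod": "https://ldi.example/issuer#key-1",
    "proofPurpose": "assertionMethod",
}

signed = create_proof(document, options, sign)
print("proofValue" in signed["proof"])  # True
print("proof" in document)              # False: the input is left unchanged
```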

Proof Verification Algorithm

This algorithm is highly specific to digital signatures and needs to be generalized to other proof mechanisms such as Equihash.

The following algorithm specifies how to check the authenticity and integrity of a signed data document by verifying its digital proof. This algorithm takes a signed data document, signed document, and outputs a true or false value based on whether or not the digital proof on signed document was verified. Whenever this algorithm encodes strings, it MUST use UTF-8 encoding.

Specify how the public key can be obtained, through some out-of-band process and passed in or it can be retrieved by dereferencing its URL identifier, etc.
  1. Get the public key by dereferencing its URL identifier in the proof node of the default graph of signed document. Confirm that the unsigned data document that describes the public key specifies its controller and that its controller's URL identifier can be dereferenced to reveal a bi-directional link back to the key. Ensure that the key's controller is a trusted entity before proceeding to the next step.
  2. Let document be a copy of signed document.
  3. Remove any proof nodes from the default graph in document and save it as proof.
  4. Generate a canonicalized document by canonicalizing document according to the canonicalization algorithm (e.g. the URDNA2015 [[!RDF-DATASET-C14N]] algorithm).
  5. Create a value tbv that represents the data to be verified, and set it to the result of running the Create Verify Hash Algorithm, passing the information in proof.
  6. Pass the proofValue, tbv, and the public key to the proof algorithm (e.g. JSON Web Proof using RSASSA-PKCS1-v1_5 algorithm). Return the resulting boolean value.
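
Steps 2 through 6 can be sketched as follows. The same caveats apply as for any standard-library illustration: canonicalize is a sorted-key JSON stand-in rather than URDNA2015, the verifier uses HMAC-SHA256 in place of a public-key proof algorithm, and verify_proof, mac, verify, SECRET, and the ExampleProof2020 type are hypothetical names. Step 1 (obtaining and vetting the key) is assumed to have happened before verify is called.

```python
import copy
import hashlib
import hmac
import json


def canonicalize(value):
    """Simplified stand-in for a canonicalization algorithm (NOT URDNA2015)."""
    return json.dumps(value, sort_keys=True, separators=(",", ":")).encode("utf-8")


def create_verify_hash(canonical_document, options):
    """SHA-256 of the canonical options followed by SHA-256 of the document."""
    options = {k: v for k, v in options.items() if k != "proofValue"}
    return (hashlib.sha256(canonicalize(options)).digest()
            + hashlib.sha256(canonical_document).digest())


def verify_proof(signed_document, verify):
    document = copy.deepcopy(signed_document)               # step 2
    proof = document.pop("proof", None)                     # step 3
    if not isinstance(proof, dict) or not isinstance(proof.get("proofValue"), str):
        return False
    canonical_document = canonicalize(document)             # step 4
    tbv = create_verify_hash(canonical_document, proof)     # step 5
    return verify(proof["proofValue"], tbv)                 # step 6


SECRET = b"demo-secret"  # hypothetical key material for the stand-in


def mac(data):
    return hmac.new(SECRET, data, hashlib.sha256).hexdigest()


def verify(proof_value, tbv):
    # HMAC stand-in only; a real suite would check a public-key signature.
    return hmac.compare_digest(proof_value, mac(tbv))


# Build a signed document by hand so the sketch is self-contained.
document = {"title": "Hello world!"}
proof = {
    "type": "ExampleProof2020",  # hypothetical suite name
    "created": "2020-11-05T19:23:24Z",
    "verificationMethod": "https://ldi.example/issuer#key-1",
    "proofPurpose": "assertionMethod",
}
proof["proofValue"] = mac(create_verify_hash(canonicalize(document), proof))
signed_document = {**document, "proof": proof}

print(verify_proof(signed_document, verify))  # True
tampered = {**signed_document, "title": "Goodbye world!"}
print(verify_proof(tampered, verify))         # False
```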

Create Verify Hash Algorithm

This algorithm is too specific to digital signatures and needs to be generalized for algorithms such as Equihash.

The following algorithm specifies how to create the data that is used to generate or verify a digital proof. It takes a canonicalized unsigned data document, canonicalized document, canonicalization algorithm, a message digest algorithm, and proof options, input options (by reference). The proof options MUST contain an identifier for the public/private key pair, and an [[!ISO8601]] combined date and time string, created, containing the current date and time, accurate to at least one second, in Coordinated Universal Time (UTC). A domain might also be specified in the options. Its output is data that can be used to generate or verify a digital proof (it is usually further hashed as part of the verification or signing process).

  1. Let options be a copy of input options.
  2. If the proofValue parameter, such as jws, exists in options, remove the entry.
  3. If created does not exist in options, add an entry with a value that is an [[!ISO8601]] combined date and time string containing the current date and time accurate to at least one second, in Coordinated Universal Time (UTC). For example: 2017-11-13T20:21:34Z.
  4. Generate output by:
    1. Creating a canonicalized options document by canonicalizing options according to the canonicalization algorithm (e.g. the URDNA2015 [[!RDF-DATASET-C14N]] algorithm).
    2. Hash canonicalized options document using the message digest algorithm (e.g. SHA-256) and set output to the result.
    3. Hash canonicalized document using the message digest algorithm (e.g. SHA-256) and append it to output.
  5. This last step needs further clarification. Signing implementations usually automatically perform their own integrated hashing of an input message, i.e. signing algorithms are a combination of a raw signing mechanism and a hashing mechanism such as RS256 (RSA + SHA-256). Current implementations of RSA-based data integrity proof suites therefore do not perform this last step before passing the data to a signing algorithm as it will be performed internally. The Ed25519Proof2018 algorithm also does not perform this last step -- and, in fact, uses SHA-512 internally. In short, this last step should better communicate that the 64 bytes produced from concatenating the SHA-256 of the canonicalized options with the SHA-256 of the canonicalized document are passed into the signing algorithm with a presumption that the signing algorithm will include hashing of its own.
    Note: It is presumed that the 64-byte output will be used in a signing algorithm that includes its own hashing algorithm, such as RS256 (RSA + SHA-256) or EdDSA (Ed25519, which uses SHA-512).
  6. Return output.
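
A Python sketch of steps 1 through 4 and step 6 follows; step 5 is left out because the text itself marks it as unresolved. The sketch assumes SHA-256 as the message digest algorithm and uses sorted-key JSON as a stand-in for the canonicalization algorithm (not URDNA2015). The optional now parameter is a hypothetical addition that makes the created default reproducible in tests.

```python
import hashlib
import json
from datetime import datetime, timezone


def canonicalize(value):
    """Simplified stand-in for a canonicalization algorithm (NOT URDNA2015)."""
    return json.dumps(value, sort_keys=True, separators=(",", ":")).encode("utf-8")


def create_verify_hash(canonical_document, input_options, now=None):
    options = dict(input_options)                    # step 1: work on a copy
    for name in ("proofValue", "jws"):               # step 2: drop the proof value
        options.pop(name, None)
    if "created" not in options:                     # step 3: default timestamp
        now = now or datetime.now(timezone.utc)
        options["created"] = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    canonical_options = canonicalize(options)                 # step 4.1
    output = hashlib.sha256(canonical_options).digest()       # step 4.2
    output += hashlib.sha256(canonical_document).digest()     # step 4.3
    return output                                             # step 6


options = {"created": "2017-11-13T20:21:34Z", "proofValue": "zExample"}
result = create_verify_hash(b"canonical document bytes", options)

print(len(result))  # 64: two concatenated 32-byte SHA-256 digests
```

The first 32 bytes depend only on the proof options and the last 32 bytes only on the document, so a change to either one changes the value that is signed or verified.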

Security Considerations

The following section describes security considerations that developers implementing this specification should be aware of in order to create secure software.

TODO: We need to add a complete list of security considerations.

Verification Method Binding

Implementers must ensure that a verification method is bound to a particular controller by going from the verification method to the controller document, and then ensuring that the controller document also contains the verification method.
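
The two-way check can be sketched as follows. The sketch is illustrative only: is_bound, listed_method_ids, and the resolve callable are hypothetical, and the property names in RELATIONSHIPS other than capabilityInvocation and capabilityDelegation are assumptions about where a controller document lists its verification methods.

```python
# Assumed property names under which a controller document lists methods.
RELATIONSHIPS = (
    "verificationMethod", "authentication", "assertionMethod",
    "keyAgreement", "capabilityInvocation", "capabilityDelegation",
)


def listed_method_ids(controller_document):
    """Collect IDs of all referenced or embedded verification methods."""
    ids = set()
    for name in RELATIONSHIPS:
        for entry in controller_document.get(name, []):
            entry_id = entry if isinstance(entry, str) else entry.get("id")
            if entry_id:
                ids.add(entry_id)
    return ids


def is_bound(verification_method, resolve):
    """Follow method -> controller document, then check the link back."""
    method_id = verification_method.get("id")
    controller_id = verification_method.get("controller")
    if not method_id or not controller_id:
        return False
    controller_document = resolve(controller_id)
    if not controller_document or controller_document.get("id") != controller_id:
        return False
    return method_id in listed_method_ids(controller_document)


# Hypothetical in-memory resolver standing in for dereferencing a URL.
documents = {
    "did:example:123456789abcdefghi": {
        "id": "did:example:123456789abcdefghi",
        "capabilityInvocation": ["did:example:123456789abcdefghi#keys-1"],
    }
}

bound = {"id": "did:example:123456789abcdefghi#keys-1",
         "controller": "did:example:123456789abcdefghi"}
unbound = {"id": "did:example:attacker#keys-1",
           "controller": "did:example:123456789abcdefghi"}

print(is_bound(bound, documents.get))    # True
print(is_bound(unbound, documents.get))  # False: the controller does not list it
```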

Verification Relationship Validation

Implementers need to ensure that, when a verification method is used, it matches the verification relationship associated with it and lines up with the proof purpose.

Canonicalization Method Correctness

Canonicalization mechanisms utilized for normalizing input to hashing functions need to have vetted mathematical proofs associated with them. Canonicalization mechanisms that create collisions in hash functions can be used to attack digital signatures.

Privacy Considerations

The following section describes privacy considerations that developers implementing this specification should be aware of in order to create privacy enhancing software.

TODO: We need to add a complete list of privacy considerations.

Unlinkability

When the contents of a digitally signed payload contain correlatable identifiers, those identifiers can be used to track individuals. A static digital signature is a correlatable identifier. There are digital signature schemes that provide uncorrelatable digital signatures.

Selective Disclosure

When the contents of a digitally signed payload contain correlatable identifiers, those identifiers can be used to track individuals. A static digital signature is a correlatable identifier. There are selective disclosure digital signature schemes, such as BBS+, that are capable of not disclosing correlatable identifiers and ensuring that a different but valid digital signature is re-created upon every presentation.