PKI Fundamentals

The substrate everything else in this atlas assumes. Skim if you’ve shipped a TSP. Read if you’ve ever been surprised by a “valid” certificate that wasn’t.

Every later chapter — eIDAS, post-quantum migration, AI agents in the trust stack, the EU AI Act mapping, the operational playbooks — treats X.509 path validation, revocation, and the canonical signing pipeline as load-bearing background. We resist the urge to compress this. Half the “AI signing” startups that broke between 2023 and 2025 broke because somebody skipped a CRL freshness check or silently soft-failed OCSP. The post-quantum chapter assumes you know what an AuthorityKeyIdentifier is and why it matters. The AI Act chapter assumes you can read a chain of trust without help.

This chapter exists so we never have to apologise for that assumption again.

We aim at engineers and architects who are competent in security broadly — TLS, JWT, OAuth — but not necessarily in the formal-PKI disciplines that ETSI, the IETF PKIX working group, and qualified trust-service providers grew up in. We give you enough vocabulary and enough judgement to read the rest of the atlas; we do not give you a CA-engineer certification. For the latter we recommend the PKI Consortium’s references and ETSI EN 319 411-1 directly.

1.2 The X.509 certificate as a structured assertion

A certificate is not a key. A certificate is a signed assertion that a particular public key belongs to a particular subject, that the assertion was made by a particular issuer, that it is valid in a particular time window, and that it can be used for particular purposes. The key is a participant in the assertion; it is not the assertion.

The on-the-wire format is ASN.1 DER (Distinguished Encoding Rules) following the structure laid down in RFC 5280. The top-level fields most engineers care about are:

| Field | What it claims |
| --- | --- |
| version | Always v3 in practice (02 01 02); pre-v3 fields are obsolete |
| serialNumber | Unique within the issuer’s namespace; sometimes within the TSP |
| signatureAlgorithm | The algorithm used by the issuer to sign this certificate |
| issuer | Distinguished Name of the certificate authority that issued it |
| validity | notBefore / notAfter — the time window of legitimacy |
| subject | Distinguished Name of the entity the certificate is about |
| subjectPublicKeyInfo | The public key being bound (RSA / ECDSA / ML-DSA / hybrid) |
| extensions | Where most of the practical claims actually live |

The pre-extension fields are the original 1988 X.509v1 frame. They are necessary but no longer sufficient for any modern decision: a real-world deployment hangs almost everything operationally significant on extensions. We treat the extensions individually because they are the difference between “validates” and “validates correctly.”
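The 02 01 02 encoding in the table is an ordinary DER TLV triple (tag, length, value). A minimal reader, short-form lengths only and purely illustrative, shows how the version integer decodes:

```python
def der_tlv(data: bytes) -> tuple[int, bytes, bytes]:
    """Parse one DER TLV triple with a short-form length (< 128 bytes).
    Returns (tag, value, rest). Real DER also uses long-form lengths;
    this is a teaching sketch, not a parser."""
    tag, length = data[0], data[1]
    if length >= 0x80:
        raise ValueError("long-form length: use a real DER library")
    return tag, data[2 : 2 + length], data[2 + length :]

# The version field from the table: tag 0x02 (INTEGER), length 1, value 2.
tag, value, rest = der_tlv(bytes.fromhex("020102"))
version = int.from_bytes(value, "big") + 1  # X.509 counts from 0: v3 is INTEGER 2
```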

A Distinguished Name (DN) is an ordered sequence of attribute-value pairs. The encoding is hierarchical: country (C=), organisation (O=), organisational unit (OU=), common name (CN=), and others. A DN has exact-match semantics in PKI — O=Acme Corp and O=Acme Corp. (note the trailing period) are different organisations under EN 319 102-1’s path-validation rules. RFC 5280 specifies name-comparison rules (case folding, whitespace compression) intended to soften this, but operationally the safest assumption is that DNs match if and only if the bytes match.

The Common Name is, historically, where servers put their hostname (“CN=www.example.com”). It is not where you should look for hostnames anymore. Modern TLS specifies that hostname matching uses the Subject Alternative Name extension; CN is an antiquarian fallback. Putting a hostname in CN is at best meaningless decoration and at worst a covert channel for hostnames the SAN does not list.

subjectPublicKeyInfo carries the key being bound, plus the algorithm identifier specifying how to interpret the bytes. Common algorithms in 2026:

  • RSA with 2048-bit (legacy minimum) or 3072-bit (preferred) keys. Still ubiquitous; will be ubiquitous for years.
  • ECDSA over P-256 or P-384. Smaller, faster, and (under most threat models) at least as strong.
  • EdDSA over Curve25519 (Ed25519). Common in modern code-signing and SSH; underrepresented in QTSP-grade certificates because qualified-signature regulations were slow to recognise it.
  • ML-DSA (FIPS 204, standardised from CRYSTALS-Dilithium). The NIST post-quantum signature standard. Beginning to appear in certificates as part of hybrid schemes; pure ML-DSA certificates are rare in production.

When a certificate carries a hybrid public key (e.g. RSA + ML-DSA), the encoding is one of the still-evolving compositions described in Chapter 3. For now, treat hybrid certificates as a special case and verify both component keys independently.
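As a sketch of that special-case handling, with each component verification abstracted as a callable (the function and policy names are illustrative, not a standard interface):

```python
def verify_hybrid(component_verifiers, policy: str = "all") -> bool:
    """component_verifiers: one zero-argument callable per component key
    (e.g. one wrapping the RSA check, one wrapping the ML-DSA check).
    Every component runs regardless of policy, so a failure is always
    observable; the policy then decides what counts as success."""
    results = [verify() for verify in component_verifiers]
    if policy == "all":
        return all(results)
    if policy == "any":
        return any(results)
    raise ValueError(f"unknown policy {policy!r}")
```

Requiring "all" is the conservative default during migration: it fails closed if either component scheme turns out to be broken.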

Five extensions are load-bearing for routine validation. Skipping any of them is a security incident waiting to happen.

KeyUsage restricts the purposes the certificate can be used for. A certificate marked digitalSignature, nonRepudiation cannot be used for key-encipherment, no matter what the application thinks. A certificate without a KeyUsage extension is, depending on profile, either “all uses” or “must be rejected”; ETSI profiles require it, web-PKI profiles do not.

ExtendedKeyUsage (EKU) narrows further. id-kp-codeSigning means a certificate that may sign code; using it to sign documents violates the issuer’s policy, and any verifier worth its OCSP fetch should reject. EKU values form a union of permitted purposes: a certificate with both id-kp-codeSigning and id-kp-emailProtection is good for signing both code and email, but only those two.
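A minimal EKU gate, using the standard dotted OIDs for the purposes named above; whether a missing EKU extension means “reject” or “any purpose” is a profile decision you must make explicitly:

```python
# Standard id-kp OIDs from RFC 5280.
ID_KP_SERVER_AUTH = "1.3.6.1.5.5.7.3.1"
ID_KP_CODE_SIGNING = "1.3.6.1.5.5.7.3.3"
ID_KP_EMAIL_PROTECTION = "1.3.6.1.5.5.7.3.4"

def eku_permits(cert_ekus, required: str) -> bool:
    """Strict reading: a missing EKU extension (None) is rejected. Some
    profiles instead treat absence as 'any purpose'; pick one and
    document it."""
    return cert_ekus is not None and required in cert_ekus
```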

SubjectAlternativeName (SAN) is where modern hostnames, e-mail addresses, IP addresses, URIs, and other identifiers live. For TLS this is the only place hostnames may appear; CN-derived hostnames have been obsolete since RFC 6125 (2011) and formally banned in TLS-targeting CA/Browser Forum rules since 2017. For e-signature certificates, SAN often carries a pseudonym or a national identifier in directoryName form.

BasicConstraints declares whether the certificate is a CA (cA=TRUE) or an end-entity (cA=FALSE), and if CA, the maximum path length below it (pathLenConstraint). A leaf certificate without cA=FALSE is a programming error; a CA certificate with pathLenConstraint=0 may not issue further intermediate CAs. Verifiers that ignore BasicConstraints have, historically, been the path through which leaf-certificate compromise becomes infrastructure compromise. Always check it.

AuthorityKeyIdentifier (AKI) identifies the issuing key by a short hash, allowing efficient chain-building when a CA has rotated keys but kept the same DN. SubjectKeyIdentifier (SKI) does the same for the subject’s own key. The two together let a verifier construct chains by SKI/AKI matching rather than by DN matching alone — important when a CA reuses a DN across key rotations, which happens.
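Chain construction by SKI/AKI matching fits in a few lines. The Cert shape below is hypothetical, and a production builder also falls back to DN matching and explores cross-signed forks (RFC 4158):

```python
from typing import NamedTuple, Optional

class Cert(NamedTuple):          # hypothetical shape, not a real X.509 API
    name: str
    ski: bytes                   # SubjectKeyIdentifier
    aki: Optional[bytes]         # AuthorityKeyIdentifier; equals ski on a root

def build_chain(leaf: Cert, pool: list) -> list:
    """Walk leaf -> root by AKI -> SKI lookup."""
    by_ski = {c.ski: c for c in pool}
    chain = [leaf]
    while chain[-1].aki is not None and chain[-1].aki != chain[-1].ski:
        parent = by_ski.get(chain[-1].aki)
        if parent is None:
            raise LookupError(f"no candidate issuer with SKI {chain[-1].aki.hex()}")
        chain.append(parent)
    return chain
```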

A handful of further extensions appear in production: AuthorityInfoAccess (AIA) carries URLs to the issuer’s certificate and OCSP responder; CRLDistributionPoints (CDP) carries URLs to CRLs; CertificatePolicies references the policy OIDs the certificate was issued under; NameConstraints lets a sub-CA be restricted to a particular DN tree. You will encounter all of them; you do not need to memorise their DER encoding.

1.3 Path validation and trust anchors

A single certificate, on its own, is an unsupported claim. Path validation is the procedure by which a verifier decides whether to believe the claim, and it is the single most undertested piece of code in most PKI deployments.

The standard procedure is RFC 5280 §6, which is the single most important normative reference for anyone who writes verifying code. ETSI EN 319 102-1 then refines the procedure for European trust-service contexts (qualified signatures, AdES profiles); web-PKI verifiers follow CA/Browser Forum profiles that further constrain it. All three converge on the same shape:

  1. Build the chain. Starting from the leaf, walk upward by AKI/SKI matching (or DN matching as a fallback) until you reach a certificate whose subject and issuer are equal — that is a self-signed certificate, your candidate trust anchor.
  2. Validate against a configured trust anchor. A trust anchor is not a certificate that “exists in the chain”; it is a certificate the verifier has independently configured to trust. Trust anchors may be self-signed certificates, but they may also be raw subjectPublicKeyInfo blobs paired with metadata.
  3. For each certificate, in order from anchor to leaf:
    1. Verify the signature using the issuer’s public key (the parent in the chain).
    2. Verify the validity window includes the time of interest. For TLS, that is “now”. For document signatures, that is the claimed signing time, ideally cross-checked with an RFC 3161 timestamp (see Chapter 6).
    3. Verify the certificate has not been revoked at the time of interest (see §1.4).
    4. Verify the issuer’s BasicConstraints allows it to issue (this is what protects against leaf-as-CA).
    5. Process KeyUsage, ExtendedKeyUsage, NameConstraints, PolicyConstraints, PolicyMappings, CertificatePolicies, and the inhibit-policy-mapping bit. Most of these are inert in most chains, but you must process them in case one isn’t.
  4. Verify the leaf is appropriate for the use. TLS verifiers check hostname match against SAN; document-signature verifiers check subject identity and EKU.
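Steps 3.i through 3.iv, abridged into a single per-certificate check; the certificate shape and the stubbed signature and revocation callables are illustrative, not a real X.509 API:

```python
from datetime import datetime, timezone

def check_link(cert: dict, parent: dict, at: datetime) -> None:
    """One iteration of the anchor-to-leaf loop (RFC 5280 §6, abridged).
    Policy processing (step 3.v) is deliberately omitted."""
    if not cert["sig_verifies_under"](parent["public_key"]):      # 3.i
        raise ValueError("signature does not verify under issuer key")
    if not (cert["not_before"] <= at <= cert["not_after"]):       # 3.ii
        raise ValueError("outside validity window at time of interest")
    if cert["revoked_at"](at):                                    # 3.iii
        raise ValueError("revoked at time of interest")
    if not parent["is_ca"]:                                       # 3.iv
        raise ValueError("issuer lacks BasicConstraints cA=TRUE")
```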

Most production PKI bugs hide in step 3.iii (revocation) and step 3.v (policy processing). We discuss revocation in §1.4. Policy processing is well covered by RFC 5280 §6 and RFC 4158; we do not reproduce the algorithm here.

A common confusion is to treat the trust-anchor set as a property of “PKI”. It is not. It is a property of the verifier. The set of roots in a Mozilla browser, in /etc/ssl/certs, in an Estonian ID-card middleware, and in a payment terminal are four different sets of decisions. They overlap heavily but are not identical.

Engineering implication: if you write code that “trusts the system roots”, you are inheriting whatever curation discipline the system provides. For applications under regulatory obligation — anything in scope of eIDAS or the AI Act — this is rarely sufficient. You need a trust-anchor decision that the auditor can read and approve.

The single most common path-validation bug is to validate as of “now” when the question is “was the signature valid when made?”. The two are not the same. A certificate that was valid on 2024-04-15 (when the signature was made) may have expired or been revoked by 2026-05-05 (when you verify). Both are normal. The procedure that distinguishes them is long-term validation (LTV); ETSI EN 319 102-1 specifies it explicitly, and the AdES profiles (CAdES-LT, XAdES-LT, PAdES-LT) carry the captured revocation and timestamp data necessary to make the historical validation reproducible.
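The distinction fits in one function. The validity window below is invented to mirror the dates in the example:

```python
from datetime import datetime, timezone

def valid_at(not_before: datetime, not_after: datetime, at: datetime) -> bool:
    """Validity is a property of a point in time, never implicitly 'now'."""
    return not_before <= at <= not_after

# Hypothetical window 2023-06-01 .. 2025-06-01; signature made 2024-04-15,
# verification run 2026-05-05: valid at sign time, expired at verify time.
nb = datetime(2023, 6, 1, tzinfo=timezone.utc)
na = datetime(2025, 6, 1, tzinfo=timezone.utc)
sign_time = datetime(2024, 4, 15, tzinfo=timezone.utc)
verify_time = datetime(2026, 5, 5, tzinfo=timezone.utc)
```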

PAdES-LTA (long-term with archival) goes further: periodically, new timestamps are added over the entire signed structure to defend against algorithm deprecation. This is the operational pattern that makes a 30-year retention promise plausible. We discuss it in Chapter 6.

1.4 Revocation: CRLs vs. OCSP vs. stapling

A certificate is valid until either it expires or its issuer declares it revoked. Revocation is the mechanism by which the issuer makes that declaration visible to verifiers. It is the hardest part of PKI to get right, and the one where production incidents most often occur (see the EJBCA OCSP-responder-key expiry incident discussed in Chapter 6).

A leaf certificate’s private key may be compromised. An employee may leave the company. A device may be retired. A subordinate CA may turn out to have issued certificates it should not have. In all of these cases the issuer needs to tell the world do not believe certificates with these serial numbers any more. Without a working revocation channel, the only defence against compromise is short certificate lifetimes — and short lifetimes have their own costs (rotation overhead, CT-log spam, churn in dependents).

A CRL is a signed list, published by an issuer, of revoked serial numbers and the reason and time of revocation. CRLs are good when the population of revoked certificates is small relative to the population of issued certificates, and when verifiers can tolerate fetching the entire list. They are bad when the issuer has many revoked certificates (a megabyte CRL is operationally awkward) or when verifiers are bandwidth-sensitive (mobile devices, embedded hardware).

Verifiers must fetch CRLs before relying on them, and the certificate’s CRLDistributionPoints extension typically tells them where. CRLs have a nextUpdate field that determines how fresh “fresh” is. A CRL whose nextUpdate is past should not be trusted; an issuer that lets its CRLs go stale is announcing that its revocation channel is down, even if the underlying certificates are fine.
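A freshness gate on nextUpdate, with an illustrative rather than normative clock-skew allowance:

```python
from datetime import datetime, timedelta, timezone

def crl_is_fresh(next_update: datetime, now: datetime,
                 skew: timedelta = timedelta(minutes=5)) -> bool:
    """A CRL past its nextUpdate must not be relied on. The five-minute
    skew allowance is an illustrative local choice, not a standard value."""
    return now <= next_update + skew
```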

RFC 6960 specifies OCSP as a request-response protocol for asking an issuer’s responder about the status of a single certificate. The verifier sends a CertID (issuer name hash, issuer key hash, serial number); the responder replies with good, revoked, or unknown, plus a signature.

OCSP is good for verifiers that need fresh status without paying the bandwidth cost of a full CRL. It is bad for verifier privacy (the responder learns which certificates the verifier is checking) and for issuer scalability (a busy responder is a busy HSM session). It also has a notorious soft-fail problem: many TLS clients, when the OCSP responder is unreachable, treat the unreachable response as “good”, reasoning that an attacker who can block the responder presumably could block the connection anyway. The reasoning is sound for TLS in some threat models; it is dangerous to extrapolate to other applications.

OCSP responses are signed by either the issuer’s CA key directly (rare, because exposing the CA key to an OCSP responder is operationally bad) or by a delegated OCSP responder certificate issued by the same CA. The responder certificate is itself subject to expiry and revocation; an OCSP responder whose own certificate has expired produces signed responses that no verifier should accept. This is the failure mode that takes infrastructure down when a maintenance team forgets the responder-cert renewal.

OCSP stapling moves the OCSP fetch from the verifier to the signer (or, in TLS, the server). The signer fetches a fresh OCSP response and includes it in the signature or TLS handshake. The verifier then validates the OCSP response without contacting the responder.

For document signatures, this is the pattern used by AdES profiles in long-term-validation modes: the OCSP response is captured at sign time and included in the signature container. This is the right default for AI evidence packages too, and is what the EATF .aep format does (carrying OCSP in the rev field; see the preprint companion to this atlas).

For TLS, OCSP stapling solves the responder-availability problem at scale and improves privacy by removing the verifier from the responder’s view. It is gradually displacing client-side OCSP for TLS endpoints, with the caveat that stapled responses must still be reasonably fresh.

A practical decision matrix:

| Scenario | Recommendation |
| --- | --- |
| TLS, public services | OCSP stapling, CT-log monitoring |
| Document signatures, short-term validation | OCSP fetch at sign time, embed |
| Document signatures, long-term validation | Stapled OCSP + RFC 3161 TSA + LTA |
| AI evidence packages (this atlas) | OCSP captured at sign time, in .aep rev field |
| Highly bandwidth-constrained verifiers | CRL pre-fetched and cached centrally |
| Privacy-critical verifiers | OCSP stapling or local CRL only |

The crucial discipline, regardless of mechanism, is to capture revocation state at the moment of signing, not at the moment of verification. Verification at any time after signing must be able to resolve what the issuer believed at sign time, not what the issuer believes now. This is non-obvious to engineers coming from TLS, where the question is always “now”; it is foundational for any application that produces evidence intended for later audit.
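The discipline reduces to freezing the verifier’s future inputs at sign time. A sketch with illustrative field names (not the EATF .aep schema):

```python
import hashlib
from datetime import datetime, timezone

def capture_evidence(payload: bytes, ocsp_der: bytes, signed_at: datetime) -> dict:
    """Freeze the future verifier's inputs at sign time: the payload digest,
    the OCSP response exactly as fetched, and the claimed signing time."""
    return {
        "payload_sha256": hashlib.sha256(payload).hexdigest(),
        "ocsp_captured": ocsp_der.hex(),   # exact bytes, re-verifiable later
        "signed_at": signed_at.isoformat(),
    }
```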

1.5 The canonical signing pipeline

A signature without canonical encoding is a signature waiting to fail. The canonical signing pipeline has four steps:

canonical payload → hash → sign → verify

Each step has subtle failure modes worth treating individually.

The same logical payload — say, a JSON object with three fields — can be encoded as different byte sequences depending on key order, whitespace, number formatting, character encoding, and any extension allowed by the format. Two encodings that look the same to a human can hash to different SHA-256 outputs. A signature made over one encoding will not validate against the other.

The fix is to specify a canonical encoding: one that, given a logical payload, produces exactly one byte sequence. Common canonical encodings:

  • DER (ASN.1). Used by X.509 itself, CAdES, RFC 3161 timestamps, CMS, S/MIME. Deterministic and old. Not human-readable; bring a parser.
  • Canonical XML / XML Exclusive Canonicalization. Used by XAdES. Notorious for subtle interop hazards between implementations; if you can avoid it, do.
  • Deterministic CBOR (RFC 8949 §4.2.1). Used by COSE, EATF .aep packages, several recent IETF protocols. Compact, fast to encode and decode, sane defaults.
  • JCS (JSON Canonicalization Scheme, RFC 8785). A canonical form for JSON. Useful when human-readability matters and you cannot escape JSON. Has its own pitfalls around number precision.
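For JSON, most of the benefit comes from sorted keys, minimal separators, and UTF-8. A JCS-flavoured sketch; full RFC 8785 additionally pins number serialisation to ECMAScript rules, which this deliberately does not attempt:

```python
import hashlib
import json

def canonical_json(obj) -> bytes:
    """JCS-flavoured canonical form: sorted keys, no whitespace, UTF-8.
    Only safe for strings, booleans, null, and integers; full RFC 8785
    also constrains number serialisation."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")

a = {"model": "m-1", "release": 3}
b = {"release": 3, "model": "m-1"}   # same logical payload, different order
```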

The cardinal sin is to “canonicalise by re-serialisation” — sign your payload, then re-encode it before storing it, on the theory that the verifier will re-canonicalise before hashing. The verifier may not. The verifier may use a different parser. The verifier may be running on a different operating system. Do not rely on re-canonicalisation; sign exactly the bytes you are willing to verify against, and store exactly those bytes.

The hash function reduces an arbitrary-length payload to a fixed-length digest that the signature operation can sign. SHA-256 is the contemporary baseline; the SHA-3 family (Keccak), SHA3-256 in particular, is a good alternative where independence from the SHA-2 construction is wanted. SHA-1 is broken for collision resistance and must not be used in new constructions; certificates that still use SHA-1 in their signatureAlgorithm are legacy artefacts and should be aggressively migrated.

For very large payloads, consider chunked hashing or Merkle-tree constructions, but be aware that these change the verifier’s algorithm and require explicit profile support.
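Plain incremental hashing, by contrast, needs no profile support, because it produces exactly the digest of the concatenated payload:

```python
import hashlib

def hash_stream(chunks, algorithm: str = "sha256") -> bytes:
    """Incremental hashing yields the same digest as hashing the whole
    payload at once. A Merkle construction changes the digest and must
    be declared in the signature profile."""
    h = hashlib.new(algorithm)
    for chunk in chunks:
        h.update(chunk)
    return h.digest()
```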

The sign step uses the signer’s private key to produce a signature over the hash. The pairing of (hash, signature algorithm) defines the signature scheme: SHA-256 + RSA-PSS, SHA-256 + ECDSA-P256, Ed25519 (which hashes internally with SHA-512), ML-DSA-65 (which signs the message directly in pure mode; HashML-DSA is the pre-hash variant), etc. RSA-PSS is the modern choice for RSA: it includes randomised padding that defends against several attacks on the older PKCS#1 v1.5 padding.

For hybrid signatures, the sign step is repeated under each scheme, and the resulting signatures are bundled together. The verifier’s policy decides whether to require all, any, or a specific subset. See the EATF .aep schema in the preprint for one concrete encoding.

The verifier reverses the chain: parse the signature container, canonicalise the payload, hash it, run the verify primitive against the signer’s public key, and check that the certificate path discussed in §1.3 and §1.4 endorses that public key for that purpose at that time.

Common verifier bugs:

  • Trust the encoded hash. Some signature containers carry both the digest and the payload. A naive verifier checks the signature against the carried digest without re-hashing the payload — the signature validates, but over a digest the container was free to pair with any payload. Always re-hash the payload independently.
  • Skip OCSP because “the signature is valid”. A signature can be cryptographically valid against a revoked certificate. The validity of the signature is necessary, not sufficient.
  • Treat unknown extensions as harmless. RFC 5280 specifies a critical flag on extensions. An unknown critical extension must cause the verifier to reject. Many verifiers do not enforce this.
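The first and third bugs have mechanical fixes, sketched here with a hypothetical extension representation:

```python
import hashlib
import hmac

def digest_matches(payload: bytes, claimed_digest: bytes) -> bool:
    """Always re-hash the payload; never trust an embedded digest field.
    hmac.compare_digest avoids timing side channels in the comparison."""
    return hmac.compare_digest(hashlib.sha256(payload).digest(), claimed_digest)

def check_critical_extensions(extensions, known: set) -> None:
    """Reject on any unknown *critical* extension, per RFC 5280.
    'extensions' is a hypothetical list of (oid, critical) pairs."""
    for oid, critical in extensions:
        if critical and oid not in known:
            raise ValueError(f"unknown critical extension {oid}")
```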

1.6 Common pitfalls

A short, opinionated list of failure modes the authors have personally seen in production deployments — including their own. The PKI-incident genre is well-developed; a fuller catalogue appears in Chapter 6.

“TLS validates everything.” TLS validates the chain, the hostname, and (with stapling or live OCSP) the revocation state. It does not validate the purpose: a TLS server certificate is not necessarily allowed to sign code, and using it that way is either a programming error or a policy violation. Read the EKU.

Soft-fail OCSP. Treating an unreachable OCSP responder as “the certificate is good” is a documented anti-pattern. In TLS, soft-fail is a policy decision with explicit threat-model justification. In document and AI-evidence signing, where the audit question is asked years later, soft-fail at sign time silently produces unauditable evidence. Capture OCSP at sign time and store it; if you cannot capture it, do not sign.

Trust on first use creep. Trust anchors should be configured explicitly, audited periodically, and rotated deliberately. “Whatever the OS shipped” is not an audited trust-anchor decision. For applications under regulatory obligation, this position is indefensible.

SAN-only thinking. Hostnames belong in SAN. Identities for e-signatures may belong in SAN’s directoryName form, or in the Subject DN, depending on profile. Do not assume one place; read the profile.

Not capturing OCSP at sign time. Mentioned above; mentioned again because it is the single most expensive mistake we have observed. The cost compounds: every signed artefact made under this mistake is, in retrospect, a signed artefact whose validity cannot be reconstructed. Migration is not always possible.

Ignoring path-length constraints. A pathLenConstraint=0 on an intermediate CA prevents the issuance of further intermediates. Verifiers that ignore this constraint inherit a much larger trust surface than the issuer intended.

Mismatched canonical forms between signer and verifier. A signer using JCS-canonicalised JSON and a verifier using ad-hoc re-serialisation will produce a signature that fails to validate on payloads that look identical. Specify the canonical form in the protocol.

The four operational diagrams of this chapter are rendered inline below as Mermaid sources, then re-cast as polished SVGs through MiMo-V2-Omni once the grant is active. The Mermaid versions are load-bearing: they validate against the same schematic content the SVGs will, so a reader who only ever sees this page still gets the full picture.

flowchart TB
    classDef anchor fill:#1e40af,stroke:#1e40af,color:#fff
    classDef cert fill:#fef3c7,stroke:#92400e,color:#1f2937
    classDef check fill:#dbeafe,stroke:#1e40af,color:#1e3a8a

    A[Trust Anchor <br>self-signed root<br>cA=TRUE]:::anchor
    A -->|signature verified| C1[Per-step checks]:::check
    C1 --> B[Intermediate CA <br>Issuer = Root<br>cA=TRUE pathLen ≥ 0]:::cert
    B -->|signature verified| C2[Per-step checks]:::check
    C2 --> L[Leaf Certificate <br>Issuer = Intermediate<br>cA=FALSE]:::cert
    L --> C3[Final checks: revocation, KeyUsage, EKU, SAN match]:::check
    C3 --> R[Decision: VALID or INVALID]

Per-step checks at each link include: signature verifies under the parent’s public key; validity window covers the time of interest; the parent has not been revoked at that time; the parent’s BasicConstraints permits issuance; KeyUsage allows certificate signing; NameConstraints permit the next subject DN; policy processing as required.

flowchart LR
    classDef good fill:#dcfce7,stroke:#15803d,color:#14532d
    classDef warn fill:#fef9c3,stroke:#a16207,color:#713f12
    classDef bad fill:#fee2e2,stroke:#b91c1c,color:#7f1d1d

    subgraph CRL["CRL — legacy"]
        V1[Verifier] -->|fetch CRL| I1[Issuer pub.]
        I1 --> V1
        V1 --> D1{serial in CRL?}
    end

    subgraph OCSP["OCSP — live"]
        V2[Verifier] -->|CertID| R2[OCSP Responder]
        R2 -->|status| V2
        V2 --> D2{good / revoked / unknown}:::warn
    end

    subgraph STAP["OCSP stapling"]
        S3[Server / Signer] -->|fetch| R3[OCSP Responder]
        R3 --> S3
        S3 -->|stapled OCSP in TLS / signature| V3[Verifier]:::good
    end

Stapling is the operationally preferred default for new deployments: fresh status, verifier privacy, tolerates responder outages. AI evidence packages (.aep) follow the document-signature pattern — capture OCSP at sign time, embed it in the package, never re-fetch live unless deployer policy requires.

The four-stage pipeline, with the canonical-encoding choice as the first decision.

flowchart LR
    P[payload <br> canonical encoding] --> H[hash <br> SHA-256 / SHA3-256]
    H --> S[sign <br> RSA-PSS / ECDSA / Ed25519 / ML-DSA / hybrid]
    S --> V[verify <br> parse + rehash + signature validate]

    P -. choose .-> P1[DER ASN.1 <br> Canonical XML <br> Deterministic CBOR <br> JCS RFC 8785]
    H -. choose .-> H1[SHA-256 default]
    S -. choose .-> S1[per algorithm + key]
    V -. checks .-> V1[path + revocation + time]

The cardinal anti-patterns: re-canonicalising before verification; trusting the encoded hash instead of re-hashing the payload; treating unknown critical extensions as harmless; soft-failing on OCSP unreachability without an explicit policy decision.

1.7.4 X.509 certificate anatomy (informal)

A certificate is a structured signed assertion. The signed body (tbsCertificate) contains the version, the (per-issuer) serial number, the signature-algorithm choice, the issuer DN, the validity window, the subject DN, the subject public key info, and a list of extensions. The five load-bearing extensions for everyday validation — KeyUsage, ExtendedKeyUsage, SubjectAlternativeName, BasicConstraints, and the AuthorityKeyIdentifier / SubjectKeyIdentifier pair — are highlighted in the V2-Omni-bound SVG; here we list them in prose and refer the reader to §1.2 for the discussion.

  • RFC 5280 — Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile, 2008. The load-bearing PKIX standard.
  • RFC 6960 — X.509 Internet Public Key Infrastructure Online Certificate Status Protocol — OCSP, 2013.
  • RFC 6125 — Representation and Verification of Domain-Based Application Service Identity within Internet Public Key Infrastructure Using X.509 (PKIX) Certificates in the Context of Transport Layer Security (TLS), 2011. SAN-based hostname matching.
  • RFC 8949 — Concise Binary Object Representation (CBOR), 2020. Including §4.2.1 deterministic encoding.
  • RFC 8785 — JSON Canonicalization Scheme (JCS), 2020.
  • ETSI EN 319 102-1 — Procedures for Creation and Validation of AdES Digital Signatures. The European refinement of the RFC 5280 procedure for AdES contexts.
  • CA/Browser Forum — Baseline Requirements for the Issuance and Management of Publicly-Trusted Certificates. The web-PKI refinement.
  • Chapter 2 · eIDAS adds a regulatory layer on top of the primitives above.
  • Chapter 3 · Post-quantum migration replaces the signature-scheme box in the canonical pipeline.
  • Chapter 6 · Operational playbooks for everything that goes wrong in §1.6.