Internet-Draft | Detecting RRDP Session Desynchronization | August 2024 |
Snijders & de Kock | Expires 7 February 2025 | [Page] |
This document describes an approach for Resource Public Key Infrastructure (RPKI) Relying Parties to detect a particular form of RPKI Repository Delta Protocol (RRDP) session desynchronization and how to recover.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 7 February 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
The Resource Public Key Infrastructure (RPKI) Repository Delta Protocol (RRDP) [RFC8182] is a one-way synchronization protocol for distributing RPKI data in the form of differences (deltas) between sequential repository states. Relying Parties apply a contiguous chain of deltas to synchronize their local copy of the repository with the current state of the remote Repository Server. Delta files for any given session_id and serial number are expected to contain an immutable record of the state of the Repository Server at that given point in time, but this is not always the case.¶
This document describes an approach for Relying Parties (RPs) to detect a particular form of RRDP session desynchronization and how to recover.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
Section 3.1 of [RFC8182] describes how discrete publication events such as the addition, modification, or deletion of one or more repository objects can be communicated as immutable files, highlighting advantages for publishers, such as the ability to pre-calculate files and make use of caching infrastructure.¶
While the global RPKI is understood to present a loosely consistent view that depends on timing, updating, and fetching (see Section 6 of [RFC7115]), different caches holding different data for the same RRDP session at the same serial violates the principle of least astonishment.¶
If an RRDP server over time serves differing data for a given session_id and serial number, distinct RP instances (depending on the moment they connected to the RRDP server) would end up with divergent local repositories. Comparing only the server-provided session_id and latest serial number across distinct RP instances would not bring such divergence to light.¶
The [RFC8182] specification does allude to immutability being a property of RRDP files, but doesn't make it clear that immutability is an absolute requirement for the RRDP protocol to work well. A future update to [RFC8182] should set a hard rule to establish that the immutability of RRDP files must not be violated after publication, and that RPs should check for unexpected mutations.¶
Relying Parties can implement a mechanism to keep a record of the serial and hash attribute values in delta elements of the previous successful fetch of an Update Notification File. Then, after fetching a new Update Notification File, the Relying Party should check whether the hash values recorded for previously seen serials match those listed for the same serials in the newly fetched file. If any differences are detected, the Delta files were unexpectedly mutated, and the RP should proceed to Section 4.¶
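The record-and-compare step described above could look roughly like the following Python sketch. It is illustrative only and not taken from any RP implementation: the function names delta_state and mutated_serials are hypothetical, hash values are compared case-insensitively as an assumption, and a changed session_id is treated as "not comparable" because [RFC8182] already handles a new session separately.

```python
import xml.etree.ElementTree as ET

# Namespace used by RRDP Update Notification Files.
RRDP_NS = "{http://www.ripe.net/rpki/rrdp}"


def delta_state(notification_xml):
    """Return (session_id, {serial: hash}) for one Update Notification File."""
    root = ET.fromstring(notification_xml)
    hashes = {
        # Assumption: hex digests are normalized to lowercase before storing.
        int(delta.get("serial")): delta.get("hash").lower()
        for delta in root.iter(RRDP_NS + "delta")
    }
    return root.get("session_id"), hashes


def mutated_serials(previous, current):
    """List serials present in both states whose hash values differ."""
    prev_session, prev_hashes = previous
    cur_session, cur_hashes = current
    if prev_session != cur_session:
        # A different session cannot be compared serial by serial.
        return []
    return sorted(
        serial
        for serial in prev_hashes.keys() & cur_hashes.keys()
        if prev_hashes[serial] != cur_hashes[serial]
    )
```

A non-empty result from mutated_serials indicates that at least one previously listed delta now carries a different hash. Serials that appear in only one of the two files are ignored, since deltas are expected to be added and withdrawn over time.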
RP implementations decide on the number of Delta Files to process before switching to downloading the latest Snapshot File. The same upper bound can be used as a limit to the number of delta element serial and hash values to track.¶
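As a minimal sketch of such a bound, the recorded state can be pruned to the most recent serials each time it is updated. The helper name prune_state and the max_deltas parameter are hypothetical; the actual limit is whatever threshold the RP implementation already uses for preferring a Snapshot File.

```python
def prune_state(hashes, max_deltas):
    """Keep only the max_deltas highest serials from a {serial: hash} map."""
    if max_deltas <= 0:
        return {}
    # Highest serials are the most recent deltas in the session.
    newest = sorted(hashes, reverse=True)[:max_deltas]
    return {serial: hashes[serial] for serial in newest}
```

Older serials dropped by this step can no longer be checked for mutation, which is acceptable in this sketch because an RP that far behind would fetch the Snapshot File anyway.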
This section contains two versions of an Update Notification File to demonstrate an unexpected mutation. The initial Update Notification File is as follows:¶
<notification xmlns="http://www.ripe.net/rpki/rrdp" version="1"
    session_id="fe528335-db5f-48b2-be7e-bf0992d0b5ec" serial="1774">
  <snapshot uri="https://rrdp.example.net/1774/snapshot.xml"
      hash="4b5f27b099737b8bf288a33796bfe825fb2014a69fd6aa99080380299952f2e2"/>
  <delta serial="1774"
      hash="effac94afd30bbf1cd6e180e7f445a4d4653cb4c91068fa9e7b669d49b5aaa00"
      uri="https://rrdp.example.net/1774/delta.xml" />
  <delta serial="1773"
      hash="731169254dd5de0ede94ba6999bda63b0fae9880873a3710e87a71bafb64761a"
      uri="https://rrdp.example.net/1773/delta.xml" />
  <delta serial="1772"
      hash="d4087585323fd6b7fd899ebf662ef213c469d39f53839fa6241847f4f6ceb939"
      uri="https://rrdp.example.net/1772/delta.xml" />
</notification>¶
Based on the above Update Notification File, an RP implementation could record the following state:¶
fe528335-db5f-48b2-be7e-bf0992d0b5ec
1774 effac94afd30bbf1cd6e180e7f445a4d4653cb4c91068fa9e7b669d49b5aaa00
1773 731169254dd5de0ede94ba6999bda63b0fae9880873a3710e87a71bafb64761a
1772 d4087585323fd6b7fd899ebf662ef213c469d39f53839fa6241847f4f6ceb939¶
A new version of the Update Notification File is then published, as follows:¶
<notification xmlns="http://www.ripe.net/rpki/rrdp" version="1"
    session_id="fe528335-db5f-48b2-be7e-bf0992d0b5ec" serial="1775">
  <snapshot uri="https://rrdp.example.net/1775/snapshot.xml"
      hash="cd430c386deacb04bda55301c2aa49f192b529989b739f412aea01c9a77e5389"/>
  <delta serial="1775"
      hash="d199376e98a9095dbcf14ccd49208b4223a28a1327669f89566475d94b2b08cc"
      uri="https://rrdp.example.net/1775/delta.xml" />
  <delta serial="1774"
      hash="10ca28480a584105a059f95df5ca8369142fd7c8069380f84ebe613b8b89f0d3"
      uri="https://rrdp.example.net/1774/delta.xml" />
  <delta serial="1773"
      hash="731169254dd5de0ede94ba6999bda63b0fae9880873a3710e87a71bafb64761a"
      uri="https://rrdp.example.net/1773/delta.xml" />
</notification>¶
Using its previously recorded state (Section 3.1), the RP can compare the hash values for serials 1773 and 1774. For serial 1774, compared to the earlier version of the Update Notification File, a different hash value is now listed, meaning an unexpected delta mutation occurred.¶
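The comparison for this example can be reproduced with a few lines of Python. The serial and hash values below are copied from the two Update Notification Files in this section; the variable names are illustrative only.

```python
# State recorded from the initial Update Notification File (serial 1774).
previous = {
    1774: "effac94afd30bbf1cd6e180e7f445a4d4653cb4c91068fa9e7b669d49b5aaa00",
    1773: "731169254dd5de0ede94ba6999bda63b0fae9880873a3710e87a71bafb64761a",
    1772: "d4087585323fd6b7fd899ebf662ef213c469d39f53839fa6241847f4f6ceb939",
}

# Delta elements listed in the new Update Notification File (serial 1775).
current = {
    1775: "d199376e98a9095dbcf14ccd49208b4223a28a1327669f89566475d94b2b08cc",
    1774: "10ca28480a584105a059f95df5ca8369142fd7c8069380f84ebe613b8b89f0d3",
    1773: "731169254dd5de0ede94ba6999bda63b0fae9880873a3710e87a71bafb64761a",
}

# Only serials present in both files can be compared.
overlap = sorted(previous.keys() & current.keys())
mutated = [serial for serial in overlap if previous[serial] != current[serial]]

print(overlap)  # [1773, 1774]
print(mutated)  # [1774]
```

Serial 1772 has dropped out of the new file and serial 1775 is new, so neither takes part in the comparison; only serial 1774 is flagged.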
Following the detection of RRDP session desynchronization, the RP implementation SHOULD issue a warning and SHOULD download the latest Snapshot File and process it as described in Section 3.4.3 of [RFC8182].¶
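One possible shape for this recovery decision is sketched below in Python. The function name choose_sync_method, the logger name, and the returned strings are hypothetical; actually fetching and processing the Snapshot File is left to the RP's existing [RFC8182] code path.

```python
import logging

log = logging.getLogger("rrdp")


def choose_sync_method(session_id, mutated):
    """Return "snapshot" when mutated serials were detected, else "deltas"."""
    if mutated:
        # Warn the operator, then fall back to the latest Snapshot File.
        log.warning(
            "RRDP session %s: delta hash changed for serial(s) %s; "
            "falling back to snapshot",
            session_id,
            ", ".join(str(serial) for serial in mutated),
        )
        return "snapshot"
    return "deltas"
```

In this sketch a return value of "deltas" only means that no mutation was detected; an RP may still prefer the Snapshot File for other reasons, such as having too many deltas to apply.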
Due to the lifetime of RRDP sessions (often measured in months), desynchronization can persist for an extended period if undetected.¶
Caches in a desynchronized state pose a risk by emitting a different set of Validated Payloads than they would otherwise emit with a consistent repository copy. Through the interaction of the desynchronization and the failed fetch mechanism described in Section 6.6 of [RFC9286], Relying Parties could spuriously omit Validated Payloads or emit Validated Payloads that the Certification Authority intended to withdraw. In a desynchronized state, all bets are off.¶
Missing Validated Payloads negatively impact the ability to validate BGP announcements using mechanisms such as those described in [RFC6811] and [I-D.ietf-sidrops-aspa-verification].¶
Section 6.6 of [RFC9286] advises RP implementations to continue to use cached versions of objects, but only until such time as they become stale. By detecting whether the remote Repository Server is in an inconsistent state and then immediately switching to the latest Snapshot File, RPs increase the probability of successfully replacing objects before they become stale.¶
No IANA actions required.¶
During the hallway track at RIPE 86, Ties de Kock shared the idea for detecting this particular form of RRDP desynchronization, after which Claudio Jeker, Job Snijders, and Theo Buehler produced an implementation based on rpki-client. With tooling in place to detect this particular error condition, it became apparent in subsequent months that unexpected delta mutations in the global RPKI repositories do happen from time to time.¶
The authors wish to thank Theo Buehler, Mikhail Puzanov, Alberto Leiva, Tom Harrison, and Warren Kumari for their careful review and feedback on this document.¶
This section is to be removed before publishing as an RFC.¶
This section records the status of known implementations of the protocol defined by this specification at the time of posting of this Internet-Draft, and is based on a proposal described in RFC 7942. The description of implementations in this section is intended to assist the IETF in its decision processes in progressing drafts to RFCs. Please note that the listing of any individual implementation here does not imply endorsement by the IETF. Furthermore, no effort has been spent to verify the information presented here that was supplied by IETF contributors. This is not intended as, and must not be construed to be, a catalog of available implementations or their features. Readers are advised to note that other implementations may exist.¶
According to RFC 7942, "this will allow reviewers and working groups to assign due consideration to documents that have the benefit of running code, which may serve as evidence of valuable experimentation and feedback that have made the implemented protocols more mature. It is up to the individual working groups to use this information as they see fit".¶