Internet-Draft The Restatement Anti-Pattern March 2024
Bormann Expires 3 September 2024
Workgroup:
Network Working Group
Internet-Draft:
draft-bormann-restatement-01
Published:
Intended Status:
Informational
Expires:
3 September 2024
Author:
C. Bormann
Universität Bremen TZI

The Restatement Anti-Pattern

Abstract

Normative documents that cite other normative documents often restate normative content extracted out of the cited document in their own words.

The present memo explains why this can be an Anti-Pattern, and how it can be mitigated.

About This Document

This note is to be removed before publishing as an RFC.

Status information for this document may be found at https://datatracker.ietf.org/doc/draft-bormann-restatement/.

Source for this draft and an issue tracker can be found at https://github.com/cabo/restatement.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 3 September 2024.

1. Introduction

Normative documents that cite other normative documents often restate normative content extracted out of the cited document in their own words.

The present memo explains why this can be an Anti-Pattern [KOENIG][ANTIPATTERN], and how it can be mitigated.

1.1. Conventions and Definitions

Although this document is not an IETF Standards Track publication, it adopts the conventions for normative language to provide clarity of instructions to the implementer. The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [BCP14] when, and only when, they appear in all capitals, as shown here.

2. The Restatement Anti-Pattern

A Restatement is the attempted expression of information that is already expressed elsewhere.

In this document, we are mostly concerned with Normative Restatements, i.e., statements that are intended (or look like they are intended!) to be normative.

Restatements are rarely verbatim copies of the original statement together with the context needed to interpret it, so they tend to introduce uncertainty about how the restatement is to be interpreted.

Authors often presume a reader is well-versed enough to infer that such an uncertainty (or outright contradiction) is not intended and how it is to be resolved. There is little reason to believe this is actually the case.

An internal restatement is a restatement of information that has been provided previously in the document under discussion. Note that an unambiguous internal reference is not a restatement, as it points to the original text and its context. (There may still be uncertainty about how to interpret the referenced text in the additional context.)

A reference is unambiguous if the previous passage is clearly identified and delimited.

An external restatement is a restatement of information that has been provided in (one or more) external documents. Here, there is an increased danger that the scope of the reference is unclear, often because it points to an entire document where only a specific passage is actually intended to be referenced.

Restatements can be entirely hidden, i.e., there is no indication that information given is a restatement. Restatements can also be explicit by clearly being identified as such, typically no longer using normative language.

Restatements can be an Anti-Pattern because they can be "a common response to a recurring problem that is usually ineffective and risks being highly counterproductive" [WP-ANTIPATTERN]. Section 3 discusses the recurring problems as perceived by document authors, Section 4 explains why restatements can be ineffective and counterproductive, and Section 5 discusses how to use restatements in a way that is neither.

3. Reasons for making restatements

There are many reasons that cause document authors to include restatements in their work, many of which are actually good reasons once the perils of restatements are properly managed.

3.1. Integrating a Complicated Base Standard Ecosystem

Sometimes the source of the actual normative statement is complex and would require considerable time to digest. A simplifying restatement tries to shield the reader by rephrasing or summarizing information from that source.

Such a restatement can be well-intentioned, or it can try to hide complexity that the referencing document actually does make it necessary to incur.

3.2. Trying to be a Textbook for the Implementer

More generally, a restatement can attempt to be a directly useful source for an implementer or user of a standard, e.g., by giving a mere checklist of items (not necessarily complete!) that must be implemented instead of actually identifying where the requirement and possibly its finer points come from.

3.3. Increasing Availability from a Source with Restricted Access

In some cases, normative information from a cited document is not openly available, but only under specific conditions that cannot be expected to be satisfied by all users of the referencing document, such as membership in an organization or payment of a non-trivial fee. It may be appropriate to restate information from such a source so the referencing document becomes useful.

3.4. Trying to Raise Attention to a Detail Deemed Surprising

The author of the referencing document may see a need to alert the reader to a detail of the cited document that might seem unintuitive (i.e., not familiar) to the author. By restating the detail in terms more familiar to readers of the referencing document, this alert can be more useful.

3.5. Limitations in Formal Description Techniques

Formal description techniques (FDT, such as ABNF) are usually designed to document a single specific artifact, not its evolution or its embedding into another artifact. This can lead to wholesale imports of FDT material, without indication whether just the FDT was imported or whether the importing document is intended to evolve with the donor document. See Section 6.1 and Section 6.3 for additional illustration of this reason.

4. Perils of restatements

The danger of restatements is that they might not express exactly the same normative statement that the cited document makes.

One form of this is the incomplete restatement.

Abridged copies of a normative statement from the cited document often leave open whether the abridgment is intentional: Is the referencing document only importing some of the requirements of the cited document? In the worst case, the restatement may appear to be forking an ecosystem, i.e., an implementation of the cited document cannot be used because it makes additional constraints that are not meant to be included in the referencing document. (This peril of course is also present with intentional changes to the normative statements in a cited document, but is likely to receive much more attention during review.)

Section 6.5 presents an example for the situation where a reader might infer behavior based on the common law statute interpretation rule "expressio unius est exclusio alterius", which states that the reader is to presume that expressly referencing one matter implies that other similar matters are intentionally not mentioned and therefore are excluded. This is particularly problematic with abridged statements, where this rule may be invoked by the reader without an author being aware of it.

Restatements may be slightly semantically different from the cited document, in particular if the latter is based on a relatively inaccessible (possibly poorly documented or poorly developed) terminology. Both authors and readers may not be aware that they need to use tools that are commonplace in the ecosystem of the cited document.

A large danger originates from restatements for which it is unclear whether a new normative requirement is intended or just a restatement of known normative requirements of the cited document. This is, of course, particularly dangerous for hidden restatements.

A restatement can cause maintainability hazards, as illustrated in Section 6.1; it also can cause a referencing document to decouple from the ecosystem of a cited document once that is repaired (Section 6.3).

Finally, to readers familiar with the cited document, the restatement can be surprising; if there really is no information in the restatement, the reader automatically searches for a specific reason this restatement is made and starts to reinterpret it until it means something specific that would justify its presence. If the restatement is not clearly identified as such, this is likely to cause misinterpretations, as if the usage envisioned attempts to fork the cited ecosystem. (Often, the people who need to interpret the document in question are actually more familiar with the cited document and the surrounding ecosystem than the authors of the referencing document, who may just be pulling in the ecosystem to solve one of their problems.)

5. Defusing restatements

A general recommendation for readers of a referencing document is that they should try to detect restatements and read them in full knowledge of their perils (Section 4). If a resolution is required, the RFC errata process may provide a (poor) mechanism to obtain the resolution and ensure it is documented in the context of the referencing document. Mailing list discussions are also a good way to obtain a resolution, but for additional readers they can be hard to find, and, when found, it can be hard to extract any consensus that was formed.

The rest of this section provides a summary of the recommendations made by this document, employing RFC2119 keywords as an instruction to the potential implementers of this document, i.e., document authors and reviewers.

Much of the danger of restatements can be averted if they are sufficiently identified by the authors as such.

If a larger copy from a cited document is made, it SHOULD be made verbatim and differences introduced deliberately should be explicitly identified, possibly in a second step. Note that the FDT mechanisms and their evolution can make verbatim copies less useful, in which case a systematic approach of first copying and fixing and then, if necessary, modifying can help the reader. For instance, [RFC2397] uses a variant form of ABNF that can be parsed only once the variant ":=" syntax is replaced by "=". (This is an active specification and was cited as recently as in Section 4.3 of [RFC9399], which provides a clearly identified restatement in modern ABNF, with errata applied and rules referenced from elsewhere added [we ignore the innocuous redefinition of "hex" from "HEXDIG"].)

By making the copy informative, repairs from the base document (in the [RFC2397] example e.g. [errata2397]) can be imported, even future ones.
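The mechanical part of this copy-then-fix approach can be sketched in a few lines of Python. The sketch assumes a short excerpt written in the ":=" variant used by [RFC2397] (abridged here for illustration only; the RFC remains the authoritative text) and a hypothetical helper, modernize_abnf, that rewrites only the rule-defining operator and leaves "=" literals inside rules untouched:

```python
import re

# Abridged excerpt in the ":=" variant; illustrative only, see RFC 2397.
VARIANT_ABNF = '''\
dataurl    := "data:" [ mediatype ] [ ";base64" ] "," data
parameter  := attribute "=" value
'''

# A rule name at the start of a line, followed by the variant operator.
RULE_DEF = re.compile(r"^(\s*[A-Za-z][A-Za-z0-9-]*\s*):=", re.MULTILINE)

def modernize_abnf(text: str) -> str:
    """Replace the variant ':=' rule operator by '=' (hypothetical helper)."""
    return RULE_DEF.sub(r"\1=", text)

print(modernize_abnf(VARIANT_ABNF))
```

Any deliberate differences would then be introduced, and identified as such, in a separate second step.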

Where the copy is made because the cited document is not openly available, this also often requires more processing than a verbatim copy, increasing the probability of introducing errors and misunderstandings. This can be somewhat mitigated by clearly stating the purpose of a restatement, and the intended result when the restatement and the original diverge.

5.1. Summary of Recommendations

(...Add nice checklist text for authors and reviewers based on Section 5 later...)

6. Examples

6.1. Example: Web linking [RFC8288]

This example is about an internal, FDT-induced restatement in [RFC5988], which turned into an external restatement in [RFC6690], which was not healed by the update to [RFC5988] in [RFC8288].

Section 5 of [RFC5988] defines a serialization of web links in a Link Header Field. A link can have zero or more link-param parameters, each of which has the form (simplified):

link-extension = parmname [ "=" ( ptoken / quoted-string ) ]

So link-extensions can always be written as a quoted-string, or, alternatively, without quotes as a ptoken if the more limited character repertoire of ptokens suffices.

However, [RFC5988] also defines the specifics of a few link parameters. When simply inserting this into the overall ABNF, the ABNF given for these link parameters needs to restate the ABNF for link parameters in their common syntax (simplified):

link-param  = ( "rel" "=" relation-types )
            / ( "anchor" "=" <"> URI-Reference <"> )
            / ( "rev" "=" relation-types )
            / ( "hreflang" "=" Language-Tag )
            / ( "media" "=" ( MediaDesc / ( <"> MediaDesc <"> ) ) )
            / ( "title" "=" quoted-string )
            / ( "type" "=" ( media-type / quoted-mt ) )

This restatement loses the intended choice between ptoken and quoted-string for many predefined link parameters, only keeping it for "media" and "type" (and "rel" in the definition of relation-types, which is arguably faulty by allowing non-ptoken characters in an unquoted URI).

One could say that this restatement was caused by a limitation of ABNF: ABNF cannot separately express both the overall syntax of link-params (which yields the link-param value) and the specific syntax for the predefined link-params, contaminating the former with the latter. The specific syntax would really need to be in terms of the value yielded as opposed to restating the link-param syntax that yields the value.

Section 3 of [RFC8288] finally repairs this:

link-param = token BWS [ "=" BWS ( token / quoted-string ) ]

Note that any link-param can be generated with values using either the token or the quoted-string syntax; therefore, recipients MUST be able to parse both forms. In other words, the following parameters are equivalent:

x=y
x="y"

Previous definitions of the Link header did not equate the token and quoted-string forms explicitly; the title parameter was always quoted, and the hreflang parameter was always a token. Senders wishing to maximize interoperability will send them in those forms.

Individual link-params specify their syntax in terms of the value after any necessary unquoting (as per [RFC7230], Section 3.2.6).
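The idea of defining parameter syntax in terms of the value after unquoting can be illustrated with a minimal Python sketch. The function name parse_link_param and the regular expressions are assumptions made for this illustration; they approximate token and quoted-string from [RFC7230] and are not a complete Link header parser:

```python
import re

# Approximations of the RFC 7230 productions (assumed, simplified).
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
QUOTED = r'"(?:[^"\\]|\\.)*"'
LINK_PARAM = re.compile(rf"({TOKEN})[ \t]*(?:=[ \t]*({TOKEN}|{QUOTED}))?")

def parse_link_param(text: str):
    """Return (name, value) with the value unquoted; value may be None."""
    m = LINK_PARAM.fullmatch(text)
    if m is None:
        raise ValueError(f"not a link-param: {text!r}")
    name, raw = m.group(1), m.group(2)
    if raw is not None and raw.startswith('"'):
        # Strip the surrounding quotes, then undo quoted-pair escapes.
        raw = re.sub(r"\\(.)", r"\1", raw[1:-1])
    return name, raw

# Both serializations yield the same extracted value.
print(parse_link_param('x=y') == parse_link_param('x="y"'))
```

A per-parameter rule can then be checked against the returned value, independently of which of the two forms was used on the wire.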

Unfortunately, [RFC6690] adds an external restatement copying from [RFC5988] in defining a few more link-params (simplified):

link-param     = ( "rel" "=" relation-types )  ; ...
               / ( "type" "=" ( media-type / quoted-mt ) )
               / ( "rt" "=" relation-types )
               / ( "if" "=" relation-types )
               / ( "sz" "=" cardinal )
cardinal       = "0" / ( %x31-39 *DIGIT )

The letter of this specification for instance prohibits sz="47" (requiring this to be represented as sz=47). The repair in [RFC8288] cannot quite fix this as:

  • it is not clear that the repair actually applies to [RFC6690] (a general problem with updated ["obsoleted"] references)

  • the ABNF in [RFC6690] would need to be rewritten to apply the rule cardinal to the extracted value of the link-param.
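The difference between the two readings can be made concrete with a small Python sketch. The regular expressions and function names below are assumptions for illustration: one check follows the letter of the restated ABNF, the other applies the cardinal rule to the extracted value (quoted-pair escapes are omitted for brevity):

```python
import re

# cardinal = "0" / ( %x31-39 *DIGIT )
CARDINAL = re.compile(r"0|[1-9][0-9]*")
# Letter of the restated ABNF: the cardinal must directly follow "sz=".
SZ_BY_LETTER = re.compile(r"sz=(?:0|[1-9][0-9]*)")
# Generic shape: a quoted or an unquoted value (simplified).
SZ_GENERIC = re.compile(r'sz=(?:"([^"\\]*)"|([^";,\s]+))')

def sz_ok_by_letter(param: str) -> bool:
    """Validate against the restated link-param ABNF as written."""
    return SZ_BY_LETTER.fullmatch(param) is not None

def sz_ok_by_value(param: str) -> bool:
    """Extract the value first, then apply the cardinal rule to it."""
    m = SZ_GENERIC.fullmatch(param)
    if m is None:
        return False
    value = m.group(1) if m.group(1) is not None else m.group(2)
    return CARDINAL.fullmatch(value) is not None

for param in ('sz=47', 'sz="47"'):
    print(param, sz_ok_by_letter(param), sz_ok_by_value(param))
```

Under the first check only the unquoted form passes, while the second accepts both forms and still rejects values such as "047" that are not cardinals.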

6.2. Example: Restatement of [ISO8601:1988] in [RFC3339]

[RFC3339] was largely intended as a freely available restatement of the paywalled [ISO8601:1988], with focus added on formally defining the parts that might be useful in the Internet. However, when [ISO8601:2000] introduced additional text that seemed to disallow the syntax used for one extension that Section 4.3 of [RFC3339] had made to the semantics of [ISO8601:1988], the precedence remained unclear. Implementers of Internet-related standards largely ignored the additional semantics of that extension anyway, while implementers of [ISO8601:1988] in general often performed input validation that made sure the extension made by [RFC3339] wouldn't work. (This is only now being addressed by Section 2 of [I-D.ietf-sedate-datetime-extended].)
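To illustrate how easily the additional semantics get lost, the following Python sketch assumes that the extension in question is the "-00:00" unknown-local-offset convention of Section 4.3 of [RFC3339]. The standard library parser accepts the syntax but treats it like "+00:00"; the hypothetical helper parse_with_offset_flag shows one way an implementation could retain the distinction:

```python
from datetime import datetime

UNKNOWN_OFFSET = "-00:00"  # convention from Section 4.3 of RFC 3339

def parse_with_offset_flag(text: str):
    """Return (timestamp, offset_known); hypothetical helper."""
    offset_known = not text.endswith(UNKNOWN_OFFSET)
    return datetime.fromisoformat(text), offset_known

unknown = datetime.fromisoformat("1996-12-19T16:39:57-00:00")
utc = datetime.fromisoformat("1996-12-19T16:39:57+00:00")
print(unknown == utc)  # the parsed values no longer differ

stamp, offset_known = parse_with_offset_flag("1996-12-19T16:39:57-00:00")
print(offset_known)
```

An implementation that validates input strictly against the ISO text might instead reject the first timestamp outright, which is the interoperability problem described above.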

6.3. Example: Date-Time in YANG (RFC6991)

[RFC6991] defines a YANG type date-and-time on page 11, restating parts of [RFC3339] (the restatement is also faulty in its item (b), with an attempted cleanup in [I-D.ietf-netmod-rfc6991-bis]). Now that [RFC3339] is being bug-fixed via Section 2 of [I-D.ietf-sedate-datetime-extended], it is not clear whether the change applies to the YANG type as well. This is more of a problem for YANG than it might be otherwise, as it might trigger the YANG concept of a "non-backwards-compatible" change to that datatype — a problem that is not entirely caused by restatements but gets much harder to discuss.

6.4. Example: ACME for Subdomains (RFC9444-to-be)

A late draft of what became [RFC9444] defines a new feature added to [RFC8555], referencing the base standard in a number of places.

Reviewing the draft [I-D.draft-ietf-acme-subdomains-04], [acme-comment] states:

  • ## restatement vs. new normative content

    Providing a specification of a new feature added to ACME, the text explains a number basic ACME mechanisms that are relevant to this specification.

    One pervasive problem is that these restatements of RFC 8555 content are not always easy to distinguish from new, normative statements made by this document. E.g., 4.2 contains a statement about "is defined" that is part of a paragraph restating RFC 8555 -- this one, however, appears to be new normative content. (Languagetool diagnoses overuse of passive voice, which exacerbates this problem.)

    (The first paragraph of section 4 repeats the last paragraph of section 3. But that is not a problem; redundancy can be good if it improves the flow, and this is clearly labeled as a restatement.) The introduction of section 4 is a summary/restatement of RFC 8555; section 4.1 introduces new normative content without warning (and leads the reader astray by actually referencing RFC 8555).

(These problems were ultimately addressed in [RFC9444].)

6.5. Example: Base64 Encoding variants in draft-ietf-rats-eat-20

Base64 encoding is defined in [RFC4648], but comes in a number of variants. These often have default settings that are to be used "unless the specification referring to this document explicitly states otherwise" (e.g., Section 3.2 of [RFC4648]).

Documents that reference [RFC4648] normatively are surprisingly often sloppy in doing so. Not [I-D.draft-ietf-rats-eat-20]: Its Section 2 (terminology) defines the term "Base64url Encoding", referencing [RFC4648] as well as [RFC7515] to fill in the open questions from [RFC4648] (i.e., Section 5 and not Section 4, no '=' padding that would be default, no extra characters).

While this was a good start, incomplete restatements in the following text cause a problem [rats-comment]:

A term "base64url encoded" [...] is used in multiple places. One of the places restates its own reference to RFC 4648, but doesn’t restate the reference to RFC 7515 and the text required with that. This restatement is very misleading as it strongly implies RFC 7515 is not used here; the reference needs to be removed. In the other places I find the term is simply used, which assumes the reader will think to look up the term in the terminology [...]

(The problem in the draft was quickly addressed in the next revision.)
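What is at stake in such an incomplete restatement can be shown with a short Python sketch of the variants involved. The helper names base64url_encode and base64url_decode are assumptions for this illustration; they follow the unpadded URL-safe form that the reference to [RFC7515] selects:

```python
import base64

def base64url_encode(data: bytes) -> str:
    """URL-safe alphabet (RFC 4648, Section 5) with '=' padding removed."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def base64url_decode(text: str) -> bytes:
    """Restore the omitted padding before decoding."""
    padded = text + "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(padded)

sample = b"\xfb\xff"
print(base64.b64encode(sample))          # Section 4 alphabet, padded
print(base64.urlsafe_b64encode(sample))  # Section 5 alphabet, padded
print(base64url_encode(sample))          # Section 5 alphabet, unpadded
```

A reader who follows only the reference to [RFC4648] could reasonably produce the padded form, which a strict recipient of the unpadded form would reject. Note that Python's decoder by default ignores characters outside the alphabet, so a strict "no extra characters" check would need additional validation.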

7. Security Considerations

Restatements about security requirements and properties can create the same uncertainties and interoperability problems as restatements in other contexts. Security considerations sections have turned out to be an attractor for such problems. They are meant "both to encourage document authors to consider security in their designs and to inform the reader of relevant security issues" (Section 1 of [RFC3552]). In practice, they tend to be the first point in a document at which security issues are considered at all, so they often contain both normative statements that appear nowhere else in the document and security-conscious restatements of other normative statements in the document, the latter with all the perils that this memo is about. The fact that security considerations sections are often heavily fleshed out during IESG processing can exacerbate the problem.

8. IANA Considerations

This document has no IANA actions.

9. References

9.1. Normative References

[BCP14]
Best Current Practice 14, <https://www.rfc-editor.org/info/bcp14>.
At the time of writing, this BCP comprises the following:
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.

9.2. Informative References

[acme-comment]
Bormann, C., "[Last-Call] Artart last call review of draft-ietf-acme-subdomains-04", <https://mailarchive.ietf.org/arch/msg/last-call/v0RYQkByhAII9yvaD6gbKWx0WtA>.
[ANTIPATTERN]
"Anti Pattern", C2 Wiki, <http://wiki.c2.com/?AntiPattern>.
[errata2397]
"RFC Errata Report » RFC Editor", search result, n.d., <https://www.rfc-editor.org/errata/rfc2397>.
[I-D.draft-ietf-acme-subdomains-04]
Friel, O., Barnes, R., Hollebeek, T., and M. Richardson, "ACME for Subdomains", Work in Progress, Internet-Draft, draft-ietf-acme-subdomains-04, <https://datatracker.ietf.org/doc/html/draft-ietf-acme-subdomains-04>.
[I-D.draft-ietf-rats-eat-20]
Lundblade, L., Mandyam, G., O'Donoghue, J., and C. Wallace, "The Entity Attestation Token (EAT)", Work in Progress, Internet-Draft, draft-ietf-rats-eat-20, <https://datatracker.ietf.org/doc/html/draft-ietf-rats-eat-20>.
[I-D.ietf-netmod-rfc6991-bis]
Schönwälder, J., "Common YANG Data Types", Work in Progress, Internet-Draft, draft-ietf-netmod-rfc6991-bis-15, <https://datatracker.ietf.org/doc/html/draft-ietf-netmod-rfc6991-bis-15>.
[I-D.ietf-sedate-datetime-extended]
Sharma, U. and C. Bormann, "Date and Time on the Internet: Timestamps with additional information", Work in Progress, Internet-Draft, draft-ietf-sedate-datetime-extended-11, <https://datatracker.ietf.org/doc/html/draft-ietf-sedate-datetime-extended-11>.
[ISO8601:1988]
ISO, "Data elements and interchange formats — Information interchange — Representation of dates and times", ISO 8601:1988, <https://www.iso.org/standard/15903.html>. Also available from <https://nvlpubs.nist.gov/nistpubs/Legacy/FIPS/fipspub4-1-1991.pdf>.
[ISO8601:2000]
ISO, "Data elements and interchange formats — Information interchange — Representation of dates and times", ISO 8601:2000, <https://www.iso.org/standard/26780.html>.
[KOENIG]
Koenig, A., "Patterns and Antipatterns", J. Object Oriented Program. 8(1): pp. 46-48.
[rats-comment]
Bormann, C., "Re: [Rats] I-D Action: draft-ietf-rats-eat-20.txt", n.d., <https://mailarchive.ietf.org/arch/msg/rats/H8qXwQywD0W6x4QcC9Iwd5LYl2s>.
[RFC2397]
Masinter, L., "The "data" URL scheme", RFC 2397, DOI 10.17487/RFC2397, <https://www.rfc-editor.org/rfc/rfc2397>.
[RFC3339]
Klyne, G. and C. Newman, "Date and Time on the Internet: Timestamps", RFC 3339, DOI 10.17487/RFC3339, <https://www.rfc-editor.org/rfc/rfc3339>.
[RFC3552]
Rescorla, E. and B. Korver, "Guidelines for Writing RFC Text on Security Considerations", BCP 72, RFC 3552, DOI 10.17487/RFC3552, <https://www.rfc-editor.org/rfc/rfc3552>.
[RFC4648]
Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", RFC 4648, DOI 10.17487/RFC4648, <https://www.rfc-editor.org/rfc/rfc4648>.
[RFC5988]
Nottingham, M., "Web Linking", RFC 5988, DOI 10.17487/RFC5988, <https://www.rfc-editor.org/rfc/rfc5988>.
[RFC6690]
Shelby, Z., "Constrained RESTful Environments (CoRE) Link Format", RFC 6690, DOI 10.17487/RFC6690, <https://www.rfc-editor.org/rfc/rfc6690>.
[RFC6991]
Schoenwaelder, J., Ed., "Common YANG Data Types", RFC 6991, DOI 10.17487/RFC6991, <https://www.rfc-editor.org/rfc/rfc6991>.
[RFC7230]
Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing", RFC 7230, DOI 10.17487/RFC7230, <https://www.rfc-editor.org/rfc/rfc7230>.
[RFC7515]
Jones, M., Bradley, J., and N. Sakimura, "JSON Web Signature (JWS)", RFC 7515, DOI 10.17487/RFC7515, <https://www.rfc-editor.org/rfc/rfc7515>.
[RFC8288]
Nottingham, M., "Web Linking", RFC 8288, DOI 10.17487/RFC8288, <https://www.rfc-editor.org/rfc/rfc8288>.
[RFC8555]
Barnes, R., Hoffman-Andrews, J., McCarney, D., and J. Kasten, "Automatic Certificate Management Environment (ACME)", RFC 8555, DOI 10.17487/RFC8555, <https://www.rfc-editor.org/rfc/rfc8555>.
[RFC9399]
Santesson, S., Housley, R., Freeman, T., and L. Rosenthol, "Internet X.509 Public Key Infrastructure: Logotypes in X.509 Certificates", RFC 9399, DOI 10.17487/RFC9399, <https://www.rfc-editor.org/rfc/rfc9399>.
[RFC9444]
Friel, O., Barnes, R., Hollebeek, T., and M. Richardson, "Automated Certificate Management Environment (ACME) for Subdomains", RFC 9444, DOI 10.17487/RFC9444, <https://www.rfc-editor.org/rfc/rfc9444>.
[WP-ANTIPATTERN]
"Anti-pattern", Wikipedia page, <https://en.wikipedia.org/w/index.php?title=Anti-pattern&oldid=1144938932>.

Acknowledgments

Julian Reschke opened the author's eyes to the fundamental problem of restatements, possibly not using this word. Many IETFers over decades have worked on mitigating restatements; the author apologizes that examples in this memo naturally mainly come from the author's own recollection.

Author's Address

Carsten Bormann
Universität Bremen TZI
Postfach 330440
D-28359 Bremen
Germany