Timestamp:
Nov 14, 2008, 11:49:55 PM
Author:
fielding@…
Message:

Deprecate line folding, addresses #77.
Require that invalid whitespace around field-names be rejected, addresses #30.
Make non-ASCII content obsolete and opaque in header fields
and reason phrase, addresses #63, #74, #94, #111.

File:
1 edited

  • draft-ietf-httpbis/latest/p1-messaging.xml

    r391 r395  
    304304<section title="Syntax Notation" anchor="notation">
    305305<iref primary="true" item="Grammar" subitem="ALPHA"/>
    306 <iref primary="true" item="Grammar" subitem="CHAR"/>
    307306<iref primary="true" item="Grammar" subitem="CR"/>
    308307<iref primary="true" item="Grammar" subitem="CRLF"/>
     
    311310<iref primary="true" item="Grammar" subitem="DQUOTE"/>
    312311<iref primary="true" item="Grammar" subitem="HEXDIG"/>
    313 <iref primary="true" item="Grammar" subitem="HTAB"/>
    314312<iref primary="true" item="Grammar" subitem="LF"/>
    315313<iref primary="true" item="Grammar" subitem="OCTET"/>
    316314<iref primary="true" item="Grammar" subitem="SP"/>
     315<iref primary="true" item="Grammar" subitem="VCHAR"/>
    317316<iref primary="true" item="Grammar" subitem="WSP"/>
    318317<t anchor="core.rules">
    319318  <x:anchor-alias value="ALPHA"/>
    320   <x:anchor-alias value="CHAR"/>
    321319  <x:anchor-alias value="CTL"/>
    322320  <x:anchor-alias value="CR"/>
     
    325323  <x:anchor-alias value="DQUOTE"/>
    326324  <x:anchor-alias value="HEXDIG"/>
    327   <x:anchor-alias value="HTAB"/>
    328325  <x:anchor-alias value="LF"/>
    329326  <x:anchor-alias value="OCTET"/>
    330327  <x:anchor-alias value="SP"/>
     328  <x:anchor-alias value="VCHAR"/>
    331329  <x:anchor-alias value="WSP"/>
    332330   This specification uses the Augmented Backus-Naur Form (ABNF) notation
    333331   of <xref target="RFC5234"/>.  The following core rules are included by
    334332   reference, as defined in <xref target="RFC5234" x:fmt="," x:sec="B.1"/>:
    335    ALPHA (letters), CHAR (any <xref target="USASCII"/> character,
    336    excluding NUL), CR (carriage return), CRLF (CR LF), CTL (controls),
     333   ALPHA (letters), CR (carriage return), CRLF (CR LF), CTL (controls),
    337334   DIGIT (decimal 0-9), DQUOTE (double quote),
    338    HEXDIG (hexadecimal 0-9/A-F/a-f), HTAB (horizontal tab),
    339    LF (line feed), OCTET (any 8-bit sequence of data), SP (space)
     335   HEXDIG (hexadecimal 0-9/A-F/a-f), LF (line feed),
     336   OCTET (any 8-bit sequence of data), SP (space),
     337   VCHAR (any visible <xref target="USASCII"/> character),
    340338   and WSP (white space).
    341339</t>
     
    388386</t>
    389387<t anchor="rule.LWS">
    390    All linear white space (LWS) in header field-values has the same semantics as SP. A
    391    recipient &MAY; replace any such linear white space with a single SP before
     388   This specification uses three rules to denote the use of linear
     389   whitespace: OWS (optional whitespace), RWS (required whitespace), and
     390   BWS ("bad" whitespace).
     391</t>
     392<t>
     393   The OWS rule is used where zero or more linear white space characters may
     394   appear. OWS &SHOULD; either not be produced or be produced as a single SP
     395   character. Multiple OWS characters that occur within field-content &SHOULD;
     396   be replaced with a single SP before interpreting the field value or
     397   forwarding the message downstream.
     398</t>
     399<t>
     400   RWS is used when at least one linear white space character is required to
     401   separate field tokens. RWS &SHOULD; be produced as a single SP character.
     402   Multiple RWS characters that occur within field-content &SHOULD; be
     403   replaced with a single SP before interpreting the field value or
     404   forwarding the message downstream.
     405</t>
     406<t>
     407   BWS is used where the grammar allows optional whitespace for historical
     408   reasons but senders &SHOULD-NOT; produce it in messages. HTTP/1.1
     409   recipients &MUST; accept such bad optional whitespace and remove it before
    392410   interpreting the field value or forwarding the message downstream.
    393 </t>
    394 <t>
    395    Historically, HTTP/1.1 header field values allow linear white space folding across
    396    multiple lines. However, this specification deprecates its use; senders &MUST-NOT;
    397    produce messages that include LWS folding (i.e., use the obs-fold rule), except
    398    within the message/http media type (<xref target="internet.media.type.message.http"/>).
    399    Receivers &SHOULD; still parse folded linear white space.
    400 </t>
    401 <t>
    402    This specification uses three rules to denote the use of linear white space;
    403    BWS ("Bad" White Space), OWS (Optional White Space), and RWS (Required White Space).
    404 </t>
    405 <t>
    406    "Bad" white space is allowed by the BNF, but senders &SHOULD-NOT; produce it in messages.
    407    Receivers &MUST; accept it in incoming messages.
    408 </t>
    409 <t>
    410    Required white space is used when at least one linear white space character
    411    is required to separate field tokens. In all such cases, a single SP character
    412    &SHOULD; be used.
    413411</t>
    414412<t anchor="rule.whitespace">
     
    427425  <x:ref>obs-fold</x:ref>       = <x:ref>CRLF</x:ref>
    428426</artwork></figure>
    429 <t anchor="rule.TEXT">
    430   <x:anchor-alias value="TEXT"/>
    431    The TEXT rule is only used for descriptive field contents and values
    432    that are not intended to be interpreted by the message parser. Words
    433    of *TEXT &MAY; contain characters from character sets other than ISO-8859-1
    434    <xref target="ISO-8859-1"/> only when encoded according to the rules of
    435    <xref target="RFC2047"/>.
    436 </t>
    437 <figure><artwork type="abnf2616"><iref primary="true" item="Grammar" subitem="TEXT"/>
    438   <x:ref>TEXT</x:ref>           = %x20-7E / %x80-FF / <x:ref>OWS</x:ref>
    439                  ; any <x:ref>OCTET</x:ref> except <x:ref>CTL</x:ref>s, but including <x:ref>OWS</x:ref>
    440 </artwork></figure>
    441 <t>
    442    A CRLF is allowed in the definition of TEXT only as part of a header
    443    field continuation. It is expected that the folding LWS will be
    444    replaced with a single SP before interpretation of the TEXT value.
    445 </t>
    446427<t anchor="rule.token.separators">
    447428  <x:anchor-alias value="tchar"/>
    448429  <x:anchor-alias value="token"/>
    449    Many HTTP/1.1 header field values consist of words separated by LWS
     430   Many HTTP/1.1 header field values consist of words separated by whitespace
    450431   or special characters. These special characters &MUST; be in a quoted
    451432   string to be used within a parameter value (as defined in
     
    459440  <x:ref>token</x:ref>          = 1*<x:ref>tchar</x:ref>
    460441</artwork></figure>
    461 <t anchor="rule.comment">
    462   <x:anchor-alias value="comment"/>
    463   <x:anchor-alias value="ctext"/>
    464    Comments can be included in some HTTP header fields by surrounding
    465    the comment text with parentheses. Comments are only allowed in
    466    fields containing "comment" as part of their field value definition.
    467    In all other fields, parentheses are considered part of the field
    468    value.
    469 </t>
    470 <figure><artwork type="abnf2616"><iref primary="true" item="Grammar" subitem="comment"/><iref primary="true" item="Grammar" subitem="ctext"/>
    471   <x:ref>comment</x:ref>        = "(" *( <x:ref>ctext</x:ref> / <x:ref>quoted-pair</x:ref> / <x:ref>comment</x:ref> ) ")"
    472   <x:ref>ctext</x:ref>          = &lt;any <x:ref>TEXT</x:ref> excluding "(" and ")"&gt;
    473 </artwork></figure>
    474442<t anchor="rule.quoted-string">
    475443  <x:anchor-alias value="quoted-string"/>
    476444  <x:anchor-alias value="qdtext"/>
     445  <x:anchor-alias value="obs-text"/>
    477446   A string of text is parsed as a single word if it is quoted using
    478447   double-quote marks.
    479448</t>
    480 <figure><artwork type="abnf2616"><iref primary="true" item="Grammar" subitem="quoted-string"/><iref primary="true" item="Grammar" subitem="qdtext"/>
     449<figure><artwork type="abnf2616"><iref primary="true" item="Grammar" subitem="quoted-string"/><iref primary="true" item="Grammar" subitem="qdtext"/><iref primary="true" item="Grammar" subitem="obs-text"/>
    481450  <x:ref>quoted-string</x:ref>  = <x:ref>DQUOTE</x:ref> *(<x:ref>qdtext</x:ref> / <x:ref>quoted-pair</x:ref> ) <x:ref>DQUOTE</x:ref>
    482   <x:ref>qdtext</x:ref>         = &lt;any <x:ref>TEXT</x:ref> excluding <x:ref>DQUOTE</x:ref> and "\">
     451  <x:ref>qdtext</x:ref>         = *( <x:ref>OWS</x:ref> / %x21 / %x23-5B / %x5D-7E / <x:ref>obs-text</x:ref> )
     452  <x:ref>obs-text</x:ref>       = %x80-FF
    483453</artwork></figure>
    484454<t anchor="rule.quoted-pair">
     
    564534</t>
    565535<figure><artwork type="abnf2616"><iref primary="true" item="Grammar" subitem="URI-reference"/><iref primary="true" item="Grammar" subitem="absolute-URI"/><iref primary="true" item="Grammar" subitem="authority"/><iref primary="true" item="Grammar" subitem="path-absolute"/><iref primary="true" item="Grammar" subitem="port"/><iref primary="true" item="Grammar" subitem="query"/><iref primary="true" item="Grammar" subitem="uri-host"/>
    566   <x:ref>URI</x:ref>           = &lt;URI, defined in <xref target="RFC3986" x:fmt="," x:sec="3"/>>
    567   <x:ref>URI-reference</x:ref> = &lt;URI-reference, defined in <xref target="RFC3986" x:fmt="," x:sec="4.1"/>>
    568   <x:ref>absolute-URI</x:ref>  = &lt;absolute-URI, defined in <xref target="RFC3986" x:fmt="," x:sec="4.3"/>>
    569   <x:ref>relative-part</x:ref> = &lt;relative-part, defined in <xref target="RFC3986" x:fmt="," x:sec="4.2"/>>
    570   <x:ref>authority</x:ref>     = &lt;authority, defined in <xref target="RFC3986" x:fmt="," x:sec="3.2"/>>
    571   <x:ref>fragment</x:ref>      = &lt;fragment, defined in <xref target="RFC3986" x:fmt="," x:sec="3.5"/>>
    572   <x:ref>path-abempty</x:ref>  = &lt;path-abempty, defined in <xref target="RFC3986" x:fmt="," x:sec="3.3"/>>
    573   <x:ref>path-absolute</x:ref> = &lt;path-absolute, defined in <xref target="RFC3986" x:fmt="," x:sec="3.3"/>>
    574   <x:ref>port</x:ref>          = &lt;port, defined in <xref target="RFC3986" x:fmt="," x:sec="3.2.3"/>>
    575   <x:ref>query</x:ref>         = &lt;query, defined in <xref target="RFC3986" x:fmt="," x:sec="3.4"/>>
    576   <x:ref>uri-host</x:ref>      = &lt;host, defined in <xref target="RFC3986" x:fmt="," x:sec="3.2.2"/>>
     536  <x:ref>URI</x:ref>           = &lt;URI, defined in <xref target="RFC3986" x:fmt="," x:sec="3"/>&gt;
     537  <x:ref>URI-reference</x:ref> = &lt;URI-reference, defined in <xref target="RFC3986" x:fmt="," x:sec="4.1"/>&gt;
     538  <x:ref>absolute-URI</x:ref>  = &lt;absolute-URI, defined in <xref target="RFC3986" x:fmt="," x:sec="4.3"/>&gt;
     539  <x:ref>relative-part</x:ref> = &lt;relative-part, defined in <xref target="RFC3986" x:fmt="," x:sec="4.2"/>&gt;
     540  <x:ref>authority</x:ref>     = &lt;authority, defined in <xref target="RFC3986" x:fmt="," x:sec="3.2"/>&gt;
     541  <x:ref>fragment</x:ref>      = &lt;fragment, defined in <xref target="RFC3986" x:fmt="," x:sec="3.5"/>&gt;
     542  <x:ref>path-abempty</x:ref>  = &lt;path-abempty, defined in <xref target="RFC3986" x:fmt="," x:sec="3.3"/>&gt;
     543  <x:ref>path-absolute</x:ref> = &lt;path-absolute, defined in <xref target="RFC3986" x:fmt="," x:sec="3.3"/>&gt;
     544  <x:ref>port</x:ref>          = &lt;port, defined in <xref target="RFC3986" x:fmt="," x:sec="3.2.3"/>&gt;
     545  <x:ref>query</x:ref>         = &lt;query, defined in <xref target="RFC3986" x:fmt="," x:sec="3.4"/>&gt;
     546  <x:ref>uri-host</x:ref>      = &lt;host, defined in <xref target="RFC3986" x:fmt="," x:sec="3.2.2"/>&gt;
    577547 
    578548  <x:ref>partial-URI</x:ref>   = relative-part [ "?" query ]
     
    906876   abbreviation for time zone, and &MUST; be assumed when reading the
    907877   asctime format. HTTP-date is case sensitive and &MUST-NOT; include
    908    additional LWS beyond that specifically included as SP in the
     878   additional whitespace beyond that specifically included as SP in the
    909879   grammar.
    910880</t>
     
    12111181   extra CRLF.
    12121182</t>
     1183<t>
     1184   Whitespace (WSP) &MUST-NOT; be sent between the start-line and the first
     1185   header field. The presence of whitespace might be an attempt to trick a
     1186   noncompliant implementation of HTTP into ignoring that field or processing
     1187   the next line as a new request, either of which may result in security
     1188   issues when implementations within the request chain interpret the
     1189   same message differently. HTTP/1.1 servers &MUST; reject such a message
     1190   with a 400 (Bad Request) response.
     1191</t>
    12131192</section>
    12141193
     
    12191198  <x:anchor-alias value="message-header"/>
    12201199<t>
    1221    HTTP header fields, which include general-header (<xref target="general.header.fields"/>),
    1222    request-header (&request-header-fields;), response-header (&response-header-fields;), and
    1223    entity-header (&entity-header-fields;) fields, follow the same generic format as
    1224    that given in <xref target="RFC5322" x:fmt="of" x:sec="2.1"/>. Each header field consists
    1225    of a name followed by a colon (":") and the field value. Field names
    1226    are case-insensitive. The field value &MAY; be preceded by any amount
    1227    of LWS, though a single SP is preferred. Header fields can be
    1228    extended over multiple lines by preceding each extra line with at
    1229    least one SP or HTAB. Applications ought to follow "common form", where
    1230    one is known or indicated, when generating HTTP constructs, since
    1231    there might exist some implementations that fail to accept anything
    1232    beyond the common forms.
     1200   HTTP header fields follow the same general format as Internet messages in
     1201   <xref target="RFC5322" x:fmt="of" x:sec="2.1"/>. Each header field consists
     1202   of a name followed by a colon (":"), optional whitespace, and the field
     1203   value. Field names are case-insensitive.
    12331204</t>
    12341205<figure><artwork type="abnf2616"><iref primary="true" item="Grammar" subitem="message-header"/><iref primary="true" item="Grammar" subitem="field-name"/><iref primary="true" item="Grammar" subitem="field-value"/><iref primary="true" item="Grammar" subitem="field-content"/>
    1235   <x:ref>message-header</x:ref> = <x:ref>field-name</x:ref> ":" [ <x:ref>field-value</x:ref> ]
     1206  <x:ref>message-header</x:ref> = <x:ref>field-name</x:ref> ":" OWS [ <x:ref>field-value</x:ref> ] OWS
    12361207  <x:ref>field-name</x:ref>     = <x:ref>token</x:ref>
    12371208  <x:ref>field-value</x:ref>    = *( <x:ref>field-content</x:ref> / <x:ref>OWS</x:ref> )
    1238   <x:ref>field-content</x:ref>  = &lt;field content&gt;
    1239 </artwork></figure>
    1240 <t>
    1241   <cref>whitespace between field-name and colon is an error and MUST NOT be accepted</cref>
    1242 </t>
    1243 <t>
    1244    The field-content does not include any leading or trailing LWS:
    1245    linear white space occurring before the first non-whitespace
    1246    character of the field-value or after the last non-whitespace
    1247    character of the field-value. Such leading or trailing LWS &MAY; be
    1248    removed without changing the semantics of the field value. Any LWS
    1249    that occurs between field-content &MAY; be replaced with a single SP
    1250    before interpreting the field value or forwarding the message
    1251    downstream.
    1252 </t>
     1209  <x:ref>field-content</x:ref>  = *( <x:ref>WSP</x:ref> / <x:ref>VCHAR</x:ref> / <x:ref>obs-text</x:ref> )
     1210</artwork></figure>
     1211<t>
     1212   Historically, HTTP has allowed field-content with text in the ISO-8859-1
     1213   <xref target="ISO-8859-1"/> character encoding (allowing other character sets
     1214   through use of <xref target="RFC2047"/> encoding). In practice, most HTTP
     1215   header field-values use only a subset of the US-ASCII charset
     1216   <xref target="USASCII"/>. Newly defined header fields &SHOULD; constrain
     1217   their field-values to US-ASCII characters. Recipients &SHOULD; treat other
     1218   (obs-text) octets in field-content as opaque data.
     1219</t>
     1220<t>
     1221   No whitespace is allowed between the header field-name and colon. For
     1222   security reasons, any request message received containing such whitespace
     1223   &MUST; be rejected with a response code of 400 (Bad Request) and any such
     1224   whitespace in a response message &MUST; be removed.
     1225</t>
     1226<t>
     1227   The field value &MAY; be preceded by optional white space; a single SP is
     1228   preferred. The field-value does not include any leading or trailing white
     1229   space: OWS occurring before the first non-whitespace character of the
     1230   field-value or after the last non-whitespace character of the field-value
     1231   is ignored and &MAY; be removed without changing the meaning of the header
     1232   field.
     1233</t>
     1234<t>
     1235   Historically, HTTP header field values could be extended over multiple
     1236   lines by preceding each extra line with at least one space or horizontal
     1237   tab character (line folding). This specification deprecates such line
     1238   folding except within the message/http media type
     1239   (<xref target="internet.media.type.message.http"/>).
     1240   HTTP/1.1 senders &MUST-NOT; produce messages that include line folding
     1241   (i.e., that contain any field-content that matches the obs-fold rule) unless
     1242   the message is intended for packaging within the message/http media type.
     1243   HTTP/1.1 recipients &SHOULD; accept line folding and replace any embedded
     1244   obs-fold whitespace with a single SP prior to interpreting the field value
     1245   or forwarding the message downstream.
     1246</t>
     1247<t anchor="rule.comment">
     1248  <x:anchor-alias value="comment"/>
     1249  <x:anchor-alias value="ctext"/>
     1250   Comments can be included in some HTTP header fields by surrounding
     1251   the comment text with parentheses. Comments are only allowed in
     1252   fields containing "comment" as part of their field value definition.
     1253   In all other fields, parentheses are considered part of the field
     1254   value.
     1255</t>
     1256<figure><artwork type="abnf2616"><iref primary="true" item="Grammar" subitem="comment"/><iref primary="true" item="Grammar" subitem="ctext"/>
     1257  <x:ref>comment</x:ref>        = "(" *( <x:ref>ctext</x:ref> / <x:ref>quoted-pair</x:ref> / <x:ref>comment</x:ref> ) ")"
     1258  <x:ref>ctext</x:ref>          = *( <x:ref>OWS</x:ref> / %x21-27 / %x2A-7E / <x:ref>obs-text</x:ref> )
     1259</artwork></figure>
    12531260<t>
    12541261   The order in which header fields with differing field names are
     
    16921699<figure><artwork type="abnf2616"><iref primary="true" item="Grammar" subitem="Status-Code"/><iref primary="true" item="Grammar" subitem="extension-code"/><iref primary="true" item="Grammar" subitem="Reason-Phrase"/>
    16931700  <x:ref>Status-Code</x:ref>    = 3<x:ref>DIGIT</x:ref>
    1694   <x:ref>Reason-Phrase</x:ref>  = *&lt;<x:ref>TEXT</x:ref>, excluding <x:ref>CR</x:ref>, <x:ref>LF</x:ref>&gt;
     1701  <x:ref>Reason-Phrase</x:ref>  = *( <x:ref>WSP</x:ref> / <x:ref>VCHAR</x:ref> / <x:ref>obs-text</x:ref> )
    16951702</artwork></figure>
    16961703</section>
     
    38653872   Clients &SHOULD; be tolerant in parsing the Status-Line and servers
    38663873   tolerant when parsing the Request-Line. In particular, they &SHOULD;
    3867    accept any amount of SP or HTAB characters between fields, even though
     3874   accept any amount of WSP characters between fields, even though
    38683875   only a single SP is required.
    38693876</t>
     
    40774084  Rules about implicit linear white space between certain grammar productions
    40784085  have been removed; now it's only allowed when specifically pointed out
    4079   in the ABNF.
    4080   The CHAR rule does not allow the NUL character anymore (this affects
    4081   the comment and quoted-string rules).  Furthermore, the quoted-pair
    4082   rule does not allow escaping NUL, CR or LF anymore.
     4086  in the ABNF. The NUL character is no longer allowed in comment and quoted-string
     4087  text. The quoted-pair rule no longer allows escaping NUL, CR or LF.
    40834088  (<xref target="basic.rules"/>)
    40844089</t>
     
    44634468    </t>
    44644469    <t>
    4465       Use names of RFC4234 core rules DQUOTE and HTAB,
     4470      Use names of RFC4234 core rules DQUOTE and WSP,
    44664471      fix broken ABNF for chunk-data
    44674472      (work in progress on <eref target="http://tools.ietf.org/wg/httpbis/trac/ticket/36"/>)
     
    45104515    </t>
    45114516    <t>
    4512       Synchronize core rules with RFC5234 (this includes a change to CHAR
    4513       which now excludes NUL).
     4517      Synchronize core rules with RFC5234.
    45144518    </t>
    45154519    <t>
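The recipient-side rules this changeset adds to the header field section (replace line folding with a single SP, reject whitespace between field-name and colon in requests but strip it in responses, ignore OWS around the field value) can be sketched as below. This is a minimal illustration and not part of the changeset: the names `BadRequest`, `unfold`, and `parse_header_fields` are hypothetical, the input is assumed to be header lines already split on CRLF, and treating a line without a colon as a 400 is an assumption the diff does not state.

```python
WSP = " \t"  # SP and HTAB


class BadRequest(Exception):
    """Hypothetical signal that the message deserves a 400 (Bad Request)."""


def unfold(lines):
    """Replace obs-fold line folding with a single SP (recipient behaviour)."""
    out = []
    for line in lines:
        if line.startswith((" ", "\t")):
            if not out:
                # Whitespace between the start-line and the first header field.
                # Simplification: raised for requests and responses alike.
                raise BadRequest("whitespace before the first header field")
            out[-1] = out[-1].rstrip(WSP) + " " + line.strip(WSP)
        else:
            out.append(line)
    return out


def parse_header_fields(lines, is_request=True):
    """Return (field-name, field-value) pairs from CRLF-split header lines."""
    fields = []
    for line in unfold(lines):
        name, sep, value = line.partition(":")
        if not sep:
            # Assumption: a line with no colon is treated as a bad request.
            raise BadRequest("header line has no colon")
        if name != name.rstrip(WSP):
            if is_request:
                raise BadRequest("whitespace between field-name and colon")
            name = name.rstrip(WSP)  # responses: remove the whitespace
        # Leading and trailing OWS is not part of the field-value.
        fields.append((name, value.strip(WSP)))
    return fields
```

For example, `parse_header_fields(["Host: example.org", "X-Long: a,", "\t b  "])` unfolds the third line into the second before splitting on the colon. The sketch does not validate field-name against the token rule or field-content against VCHAR / obs-text; a real parser would need both.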