Changeset 1177 for draft-ietf-httpbis/latest/p1-messaging.html
- Timestamp: 14/03/11 03:25:50
- File: 1 edited
draft-ietf-httpbis/latest/p1-messaging.html
r1175 r1177 383 383 <link rel="Chapter" title="1 Introduction" href="#rfc.section.1"> 384 384 <link rel="Chapter" title="2 HTTP-related architecture" href="#rfc.section.2"> 385 <link rel="Chapter" title="3 HTTP Message" href="#rfc.section.3"> 385 <link rel="Chapter" title="3 Message Format" href="#rfc.section.3"> 386 386 <link rel="Chapter" title="4 Request" href="#rfc.section.4"> 387 387 <link rel="Chapter" title="5 Response" href="#rfc.section.5"> … … 564 564 </ul> 565 565 </li> 566 <li>3. <a href="#http.message"> HTTP Message</a><ul> 566 <li>3. <a href="#http.message">Message Format</a><ul> 567 567 <li>3.1 <a href="#message.robustness">Message Parsing Robustness</a></li> 568 568 <li>3.2 <a href="#header.fields">Header Fields</a></li> … … 815 815 </p> 816 816 </div> 817 <p id="rfc.section.1.2.2.p.3">The OWS rule is used where zero or more linear whitespace characters might appear. OWS <em class="bcp14">SHOULD</em> either not be produced or be produced as a single SP character. Multiple OWS characters that occur within field-content <em class="bcp14">SHOULD</em> be replaced with a single SP before interpreting the field value or forwarding the message downstream. 818 </p> 819 <p id="rfc.section.1.2.2.p.4">RWS is used when at least one linear whitespace character is required to separate field tokens. RWS <em class="bcp14">SHOULD</em> be produced as a single SP character. Multiple RWS characters that occur within field-content <em class="bcp14">SHOULD</em> be replaced with a single SP before interpreting the field value or forwarding the message downstream. 817 <p id="rfc.section.1.2.2.p.3">The OWS rule is used where zero or more linear whitespace octets might appear. OWS <em class="bcp14">SHOULD</em> either not be produced or be produced as a single SP. Multiple OWS octets that occur within field-content <em class="bcp14">SHOULD</em> be replaced with a single SP before interpreting the field value or forwarding the message downstream.
818 </p> 819 <p id="rfc.section.1.2.2.p.4">RWS is used when at least one linear whitespace octet is required to separate field tokens. RWS <em class="bcp14">SHOULD</em> be produced as a single SP. Multiple RWS octets that occur within field-content <em class="bcp14">SHOULD</em> be replaced with a single SP before interpreting the field value or forwarding the message downstream. 820 820 </p> 821 821 <p id="rfc.section.1.2.2.p.5">BWS is used where the grammar allows optional whitespace for historical reasons but senders <em class="bcp14">SHOULD NOT</em> produce it in messages. HTTP/1.1 recipients <em class="bcp14">MUST</em> accept such bad optional whitespace and remove it before interpreting the field value or forwarding the message downstream. … … 857 857 <a href="#rule.quoted-string" class="smpl">obs-text</a> = %x80-FF 858 858 </pre><div id="rule.quoted-pair"> 859 <p id="rfc.section.1.2.2.p.12"> The backslash character ("\") can be used as a single-character quoting mechanism within quoted-string constructs:</p> 859 <p id="rfc.section.1.2.2.p.12"> The backslash octet ("\") can be used as a single-octet quoting mechanism within quoted-string constructs:</p> 860 860 </div> 861 861 <div id="rfc.figure.u.11"></div><pre class="inline"><span id="rfc.iref.g.23"></span> <a href="#rule.quoted-pair" class="smpl">quoted-pair</a> = "\" ( <a href="#core.rules" class="smpl">WSP</a> / <a href="#core.rules" class="smpl">VCHAR</a> / <a href="#rule.quoted-string" class="smpl">obs-text</a> ) 862 </pre><p id="rfc.section.1.2.2.p.14"> Producers <em class="bcp14">SHOULD NOT</em> escape characters that do not require escaping (i.e., other than DQUOTE and the backslash character). 862 </pre><p id="rfc.section.1.2.2.p.14">Senders <em class="bcp14">SHOULD NOT</em> escape octets that do not require escaping (i.e., other than DQUOTE and the backslash octet).
863 863 </p> 864 864 <h1 id="rfc.section.2"><a href="#rfc.section.2">2.</a> <a id="architecture" href="#architecture">HTTP-related architecture</a></h1> … … 1022 1022 <div id="rfc.figure.u.17"></div><pre class="inline"><span id="rfc.iref.g.25"></span><span id="rfc.iref.g.26"></span> <a href="#http.version" class="smpl">HTTP-Version</a> = <a href="#http.version" class="smpl">HTTP-Prot-Name</a> "/" 1*<a href="#core.rules" class="smpl">DIGIT</a> "." 1*<a href="#core.rules" class="smpl">DIGIT</a> 1023 1023 <a href="#http.version" class="smpl">HTTP-Prot-Name</a> = %x48.54.54.50 ; "HTTP", case-sensitive 1024 </pre><p id="rfc.section.2.5.p.4">The HTTP version number consists of two non-negative decimal integers separated by the "." (period or decimal point) character. 1025 The first number ("major version") indicates the HTTP messaging syntax, whereas the second number ("minor version") indicates 1026 the highest minor version to which the sender is at least conditionally compliant and able to understand for future communication. 1027 The minor version advertises the sender's communication capabilities even when the sender is only using a backwards-compatible 1024 </pre><p id="rfc.section.2.5.p.4">The HTTP version number consists of two non-negative decimal integers separated by a "." (period or decimal point). The first 1025 number ("major version") indicates the HTTP messaging syntax, whereas the second number ("minor version") indicates the highest 1026 minor version to which the sender is at least conditionally compliant and able to understand for future communication. The 1027 minor version advertises the sender's communication capabilities even when the sender is only using a backwards-compatible 1028 1028 subset of the protocol, thereby letting the recipient know that more advanced features can be used in response (by servers) 1029 1029 or in future requests (by clients).
… … 1167 1167 http://EXAMPLE.com/%7Esmith/home.html 1168 1168 http://EXAMPLE.com:/%7esmith/home.html 1169 </pre><h1 id="rfc.section.3"><a href="#rfc.section.3">3.</a> <a id="http.message" href="#http.message"> HTTP Message</a></h1> 1169 </pre><h1 id="rfc.section.3"><a href="#rfc.section.3">3.</a> <a id="http.message" href="#http.message">Message Format</a></h1> 1170 1170 <div id="rfc.iref.h.3"></div> 1171 1171 <div id="rfc.iref.h.4"></div> 1172 1172 <div id="rfc.iref.h.5"></div> 1173 <p id="rfc.section.3.p.1">All HTTP/1.1 messages consist of a start-line followed by a sequence of characters in a format similar to the Internet Message 1173 <p id="rfc.section.3.p.1">All HTTP/1.1 messages consist of a start-line followed by a sequence of octets in a format similar to the Internet Message 1174 1174 Format <a href="#RFC5322" id="rfc.xref.RFC5322.2"><cite title="Internet Message Format">[RFC5322]</cite></a>: zero or more header fields (collectively referred to as the "headers" or the "header section"), an empty line indicating 1175 1175 the end of the header section, and an optional message-body. … … 1186 1186 [ <a href="#message.body" class="smpl">message-body</a> ] 1187 1187 <a href="#http.message" class="smpl">start-line</a> = <a href="#request-line" class="smpl">Request-Line</a> / <a href="#status-line" class="smpl">Status-Line</a> 1188 </pre><p id="rfc.section.3.p.4">Whitespace (WSP) <em class="bcp14">MUST NOT</em> be sent between the start-line and the first header field. The presence of whitespace might be an attempt to trick a noncompliant 1189 implementation of HTTP into ignoring that field or processing the next line as a new request, either of which might result 1190 in security issues when implementations within the request chain interpret the same message differently. HTTP/1.1 servers <em class="bcp14">MUST</em> reject such a message with a 400 (Bad Request) response.
1188 </pre><p id="rfc.section.3.p.4">Implementations <em class="bcp14">MUST NOT</em> send whitespace between the start-line and the first header field. The presence of such whitespace in a request might be an 1189 attempt to trick a server into ignoring that field or processing the line after it as a new request, either of which might 1190 result in a security vulnerability if other implementations within the request chain interpret the same message differently. 1191 Likewise, the presence of such whitespace in a response might be ignored by some clients or cause others to cease parsing. 1191 1192 </p> 1192 1193 <h2 id="rfc.section.3.1"><a href="#rfc.section.3.1">3.1</a> <a id="message.robustness" href="#message.robustness">Message Parsing Robustness</a></h2> … … 1198 1199 the client <em class="bcp14">MUST</em> include the terminating CRLF octets as part of the message-body length. 1199 1200 </p> 1200 <p id="rfc.section.3.1.p.3">The normal procedure for parsing an HTTP message is to read the start-line into a structure, read each header field into a 1201 <p id="rfc.section.3.1.p.3">When a server listening only for HTTP request messages, or processing what appears from the start-line to be an HTTP request 1202 message, receives a sequence of octets that does not match the HTTP-message grammar aside from the robustness exceptions listed 1203 above, the server <em class="bcp14">MUST</em> respond with an HTTP/1.1 400 (Bad Request) response. 1204 </p> 1205 <p id="rfc.section.3.1.p.4">The normal procedure for parsing an HTTP message is to read the start-line into a structure, read each header field into a 1201 1206 hash table by field name until the empty line, and then use the parsed data to determine if a message-body is expected. 
If 1202 1207 a message-body has been indicated, then it is read as a stream until an amount of octets equal to the message-body length … … 1205 1210 might introduce security flaws due to the differing ways that such parsers interpret invalid characters. 1206 1211 </p> 1207 <p id="rfc.section.3.1.p.4">HTTP allows the set of defined header fields to be extended without changing the protocol version (see <a href="#header.field.registration" title="Header Field Registration">Section 10.1</a>). Unrecognized header fields <em class="bcp14">MUST</em> be forwarded by a proxy unless the proxy is specifically configured to block or otherwise transform such fields. Unrecognized 1212 <p id="rfc.section.3.1.p.5">HTTP allows the set of defined header fields to be extended without changing the protocol version (see <a href="#header.field.registration" title="Header Field Registration">Section 10.1</a>). Unrecognized header fields <em class="bcp14">MUST</em> be forwarded by a proxy unless the proxy is specifically configured to block or otherwise transform such fields. Unrecognized 1208 1213 header fields <em class="bcp14">SHOULD</em> be ignored by other recipients. 1209 1214 </p> … … 1220 1225 </p> 1221 1226 <p id="rfc.section.3.2.p.4">A field value <em class="bcp14">MAY</em> be preceded by optional whitespace (OWS); a single SP is preferred. The field value does not include any leading or trailing 1222 white space: OWS occurring before the first non-whitespace character of the field value or after the last non-whitespace character 1227 white space: OWS occurring before the first non-whitespace octet of the field value or after the last non-whitespace octet 1223 1228 of the field value is ignored and <em class="bcp14">SHOULD</em> be removed before further processing (as this does not change the meaning of the header field).
1224 1229 </p> … … 1240 1245 </div> 1241 1246 <p id="rfc.section.3.2.p.8">Historically, HTTP header field values could be extended over multiple lines by preceding each extra line with at least one 1242 space or horizontal tab character (line folding). This specification deprecates such line folding except within the message/http 1247 space or horizontal tab octet (line folding). This specification deprecates such line folding except within the message/http 1243 1248 media type (<a href="#internet.media.type.message.http" title="Internet Media Type message/http">Section 10.3.1</a>). HTTP/1.1 senders <em class="bcp14">MUST NOT</em> produce messages that include line folding (i.e., that contain any field-content that matches the obs-fold rule) unless the 1244 1249 message is intended for packaging within the message/http media type. HTTP/1.1 recipients <em class="bcp14">SHOULD</em> accept line folding and replace any embedded obs-fold whitespace with a single SP prior to interpreting the field value or 1245 1250 forwarding the message downstream. 1246 1251 </p> 1247 <p id="rfc.section.3.2.p.9">Historically, HTTP has allowed field content with text in the ISO-8859-1 <a href="#ISO-8859-1" id="rfc.xref.ISO-8859-1.1"><cite title="Information technology -- 8-bit single-byte coded graphic character sets -- Part 1: Latin alphabet No. 1">[ISO-8859-1]</cite></a> character encoding and supported other character sets only through use of <a href="#RFC2047" id="rfc.xref.RFC2047.1"><cite title="MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text">[RFC2047]</cite></a> encoding. In practice, most HTTP header field values use only a subset of the US-ASCII character encoding <a href="#USASCII" id="rfc.xref.USASCII.2"><cite title="Coded Character Set -- 7-bit American Standard Code for Information Interchange">[USASCII]</cite></a>. Newly defined header fields <em class="bcp14">SHOULD</em> limit their field values to US-ASCII characters.
Recipients <em class="bcp14">SHOULD</em> treat other (obs-text) octets in field content as opaque data. 1252 <p id="rfc.section.3.2.p.9">Historically, HTTP has allowed field content with text in the ISO-8859-1 <a href="#ISO-8859-1" id="rfc.xref.ISO-8859-1.1"><cite title="Information technology -- 8-bit single-byte coded graphic character sets -- Part 1: Latin alphabet No. 1">[ISO-8859-1]</cite></a> character encoding and supported other character sets only through use of <a href="#RFC2047" id="rfc.xref.RFC2047.1"><cite title="MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text">[RFC2047]</cite></a> encoding. In practice, most HTTP header field values use only a subset of the US-ASCII character encoding <a href="#USASCII" id="rfc.xref.USASCII.2"><cite title="Coded Character Set -- 7-bit American Standard Code for Information Interchange">[USASCII]</cite></a>. Newly defined header fields <em class="bcp14">SHOULD</em> limit their field values to US-ASCII octets. Recipients <em class="bcp14">SHOULD</em> treat other (obs-text) octets in field content as opaque data.
1248 1253 </p> 1249 1254 <div id="rule.comment"> … … 1256 1261 ; <a href="#rule.whitespace" class="smpl">OWS</a> / <<a href="#core.rules" class="smpl">VCHAR</a> except "(", ")", and "\"> / <a href="#rule.quoted-string" class="smpl">obs-text</a> 1257 1262 </pre><div id="rule.quoted-cpair"> 1258 <p id="rfc.section.3.2.p.12"> The backslash character ("\") can be used as a single-character quoting mechanism within comment constructs:</p> 1263 <p id="rfc.section.3.2.p.12"> The backslash octet ("\") can be used as a single-octet quoting mechanism within comment constructs:</p> 1259 1264 </div> 1260 1265 <div id="rfc.figure.u.25"></div><pre class="inline"><span id="rfc.iref.g.43"></span> <a href="#rule.quoted-cpair" class="smpl">quoted-cpair</a> = "\" ( <a href="#core.rules" class="smpl">WSP</a> / <a href="#core.rules" class="smpl">VCHAR</a> / <a href="#rule.quoted-string" class="smpl">obs-text</a> ) 1261 </pre><p id="rfc.section.3.2.p.14"> Producers <em class="bcp14">SHOULD NOT</em> escape characters that do not require escaping (i.e., other than the backslash character "\" and the parentheses "(" and ")"). 1266 </pre><p id="rfc.section.3.2.p.14">Senders <em class="bcp14">SHOULD NOT</em> escape octets that do not require escaping (i.e., other than the backslash octet "\" and the parentheses "(" and ")").
1262 1267 </p> 1263 1268 <h2 id="rfc.section.3.3"><a href="#rfc.section.3.3">3.3</a> <a id="message.body" href="#message.body">Message Body</a></h2> … … 1394 1399 </div> 1395 1400 <h1 id="rfc.section.4"><a href="#rfc.section.4">4.</a> <a id="request" href="#request">Request</a></h1> 1396 <p id="rfc.section.4.p.1">A request message from a client to a server includes, within the first line of that message, the method to be applied to the 1397 resource, the identifier of the resource, and the protocol version in use. 1401 <p id="rfc.section.4.p.1">A request message from a client to a server begins with a Request-Line, followed by zero or more header fields, an empty line 1402 signifying the end of the header block, and an optional message body. 1398 1403 </p> 1399 1404 <div id="rfc.figure.u.27"></div><pre class="inline"><span id="rfc.iref.g.45"></span> <a href="#request" class="smpl">Request</a> = <a href="#request-line" class="smpl">Request-Line</a> ; <a href="#request-line" title="Request-Line">Section 4.1</a> … … 1402 1407 [ <a href="#message.body" class="smpl">message-body</a> ] ; <a href="#message.body" title="Message Body">Section 3.3</a> 1403 1408 </pre><h2 id="rfc.section.4.1"><a href="#rfc.section.4.1">4.1</a> <a id="request-line" href="#request-line">Request-Line</a></h2> 1404 <p id="rfc.section.4.1.p.1">The Request-Line begins with a method token, followed by the request-target and the protocol version, and ending with CRLF. 1405 The elements are separated by SP characters. No CR or LF is allowed except in the final CRLF sequence. 1409 <p id="rfc.section.4.1.p.1">The Request-Line begins with a method token, followed by a single space (SP), the request-target, another single space (SP), 1410 the protocol version, and ending with CRLF.
1406 1411 </p> 1407 1412 <div id="rfc.figure.u.28"></div><pre class="inline"><span id="rfc.iref.g.46"></span> <a href="#request-line" class="smpl">Request-Line</a> = <a href="#method" class="smpl">Method</a> <a href="#core.rules" class="smpl">SP</a> <a href="#request-target" class="smpl">request-target</a> <a href="#core.rules" class="smpl">SP</a> <a href="#http.version" class="smpl">HTTP-Version</a> <a href="#core.rules" class="smpl">CRLF</a> … … 1511 1516 TCP connection, 1512 1517 </li> 1513 <li>the character sequence "://",</li> 1518 <li>the octet sequence "://",</li> 1514 1519 <li>the authority component, as specified in the Host header field (<a href="#header.host" id="rfc.xref.header.host.1" title="Host">Section 9.4</a>), and 1515 1520 </li> … … 1541 1546 [ <a href="#message.body" class="smpl">message-body</a> ] ; <a href="#message.body" title="Message Body">Section 3.3</a> 1542 1547 </pre><h2 id="rfc.section.5.1"><a href="#rfc.section.5.1">5.1</a> <a id="status-line" href="#status-line">Status-Line</a></h2> 1543 <p id="rfc.section.5.1.p.1">The first line of a Response message is the Status-Line, consisting of the protocol version followed by a numeric status code 1544 and its associated textual phrase, with each element separated by SP characters. No CR or LF is allowed except in the final 1545 CRLF sequence. 1548 <p id="rfc.section.5.1.p.1">The first line of a Response message is the Status-Line, consisting of the protocol version, a space (SP), the status code, 1549 another space, a possibly-empty textual phrase describing the status code, and ending with CRLF.
1546 1550 </p> 1547 1551 <div id="rfc.figure.u.39"></div><pre class="inline"><span id="rfc.iref.g.50"></span> <a href="#status-line" class="smpl">Status-Line</a> = <a href="#http.version" class="smpl">HTTP-Version</a> <a href="#core.rules" class="smpl">SP</a> <a href="#status.code.and.reason.phrase" class="smpl">Status-Code</a> <a href="#core.rules" class="smpl">SP</a> <a href="#status.code.and.reason.phrase" class="smpl">Reason-Phrase</a> <a href="#core.rules" class="smpl">CRLF</a> … … 1804 1808 <div id="rfc.figure.u.53"></div><pre class="text"> User-Agent: CERN-LineMode/2.15 libwww/2.17b3 1805 1809 Server: Apache/0.8.4 1806 </pre><p id="rfc.section.6.3.p.5">Product tokens <em class="bcp14">SHOULD</em> be short and to the point. They <em class="bcp14">MUST NOT</em> be used for advertising or other non-essential information. Although any token character <em class="bcp14">MAY</em> appear in a product-version, this token <em class="bcp14">SHOULD</em> only be used for a version identifier (i.e., successive versions of the same product <em class="bcp14">SHOULD</em> only differ in the product-version portion of the product value). 1810 </pre><p id="rfc.section.6.3.p.5">Product tokens <em class="bcp14">SHOULD</em> be short and to the point. They <em class="bcp14">MUST NOT</em> be used for advertising or other non-essential information. Although any token octet <em class="bcp14">MAY</em> appear in a product-version, this token <em class="bcp14">SHOULD</em> only be used for a version identifier (i.e., successive versions of the same product <em class="bcp14">SHOULD</em> only differ in the product-version portion of the product value). 1807 1811 </p> 1808 1812 <h2 id="rfc.section.6.4"><a href="#rfc.section.6.4">6.4</a> <a id="quality.values" href="#quality.values">Quality Values</a></h2> … … 2976 2980 can be interpreted unambiguously.
2977 2981 </p> 2978 <p id="rfc.section.A.p.2">Clients <em class="bcp14">SHOULD</em> be tolerant in parsing the Status-Line and servers <em class="bcp14">SHOULD</em> be tolerant when parsing the Request-Line. In particular, they <em class="bcp14">SHOULD</em> accept any amount of WSP characters between fields, even though only a single SP is required. 2979 </p> 2980 <p id="rfc.section.A.p.3">The line terminator for header fields is the sequence CRLF. However, we recommend that applications, when parsing such headers 2982 <p id="rfc.section.A.p.2">The line terminator for header fields is the sequence CRLF. However, we recommend that applications, when parsing such headers 2981 2983 fields, recognize a single LF as a line terminator and ignore the leading CR. 2982 2984 </p> 2983 <p id="rfc.section.A.p.4">The character set of a representation <em class="bcp14">SHOULD</em> be labeled as the lowest common denominator of the character codes used within that representation, with the exception that 2985 <p id="rfc.section.A.p.3">The character encoding of a representation <em class="bcp14">SHOULD</em> be labeled as the lowest common denominator of the character codes used within that representation, with the exception that 2984 2986 not labeling the representation is preferred over labeling the representation with the labels US-ASCII or ISO-8859-1. See <a href="#Part3" id="rfc.xref.Part3.6"><cite title="HTTP/1.1, part 3: Message Payload and Content Negotiation">[Part3]</cite></a>. 2985 2987 </p> 2986 <p id="rfc.section.A.p.5">Additional rules for requirements on parsing and encoding of dates and other potential problems with date encodings include:</p>
2987 <p id="rfc.section.A.p.6"> </p> 2988 <p id="rfc.section.A.p.4">Additional rules for requirements on parsing and encoding of dates and other potential problems with date encodings include:</p> 2989 <p id="rfc.section.A.p.5"> </p> 2988 2990 <ul> 2989 2991 <li>HTTP/1.1 clients and caches <em class="bcp14">SHOULD</em> assume that an RFC-850 date which appears to be more than 50 years in the future is in fact in the past (this helps solve … … 3073 3075 </p> 3074 3076 <p id="rfc.section.B.3.p.2">Rules about implicit linear whitespace between certain grammar productions have been removed; now it's only allowed when specifically 3075 pointed out in the ABNF. The NUL character is no longer allowed in comment and quoted-string text. The quoted-pair rule no 3076 longer allows escaping control characters other than HTAB. Non-ASCII content in header fields and reason phrase has been obsoleted 3077 pointed out in the ABNF. The NUL octet is no longer allowed in comment and quoted-string text. The quoted-pair rule no longer 3078 allows escaping control characters other than HTAB. Non-ASCII content in header fields and reason phrase has been obsoleted 3077 3079 and made opaque (the TEXT rule was removed) (<a href="#basic.rules" title="Basic Rules">Section 1.2.2</a>) 3078 3080 </p>
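For readers who want to see what the tightened start-line wording in r1177 implies for a parser, the following is a minimal Python sketch, not part of the changeset. It assumes an RFC 2616-style token for Method, treats request-target as any run of non-whitespace octets, and accepts any octets other than CR and LF in Reason-Phrase; those simplifications and the function names (parse_request_line, parse_status_line, first_header_is_indented) are illustrative assumptions rather than anything defined by the draft.

```python
import re

# Assumption: Method is an RFC 2616-style token (visible US-ASCII minus separators).
_TOKEN = rb"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"

# Request-Line = Method SP request-target SP HTTP-Version CRLF
# Assumption: request-target is simplified to "no whitespace octets".
_REQUEST_LINE = re.compile(
    rb"(" + _TOKEN + rb") ([^ \t\r\n]+) (HTTP/[0-9]+\.[0-9]+)\r\n"
)

# Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
# Assumption: Reason-Phrase is simplified to "anything but CR or LF"; it may be empty.
_STATUS_LINE = re.compile(
    rb"(HTTP/[0-9]+\.[0-9]+) ([0-9]{3}) ([^\r\n]*)\r\n"
)


def parse_request_line(line: bytes):
    """Return (method, request_target, version), or None if the line does not
    match the grammar exactly (the case a server would answer with 400)."""
    m = _REQUEST_LINE.fullmatch(line)
    return m.groups() if m else None


def parse_status_line(line: bytes):
    """Return (version, status_code, reason_phrase), or None on mismatch."""
    m = _STATUS_LINE.fullmatch(line)
    return m.groups() if m else None


def first_header_is_indented(head: bytes) -> bool:
    """True if SP or HTAB appears between the start-line and the first header
    field, which the revised Section 3 says implementations MUST NOT send."""
    _, _, rest = head.partition(b"\r\n")
    return rest[:1] in (b" ", b"\t")
```

The patterns use exactly one SP between elements and a case-sensitive "HTTP" prefix, so a Request-Line with doubled spaces or a lowercase protocol name is rejected rather than tolerated. This mirrors the removal of the Appendix A tolerance paragraph, but a real implementation would still apply the robustness exceptions listed in Section 3.1.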