Changeset 1176 for draft-ietf-httpbis
- Timestamp: 14/03/11 03:25:25 (11 years ago)
- Location: draft-ietf-httpbis/latest
- Files: 2 edited
draft-ietf-httpbis/latest/p1-messaging.xml
--- r1175
+++ r1176
@@ -427,14 +427,14 @@
 </t>
 <t>
-The OWS rule is used where zero or more linear whitespace characters might
-appear. OWS &SHOULD; either not be produced or be produced as a single SP
-character. Multiple OWS characters that occur within field-content &SHOULD;
+The OWS rule is used where zero or more linear whitespace octets might
+appear. OWS &SHOULD; either not be produced or be produced as a single
+SP. Multiple OWS octets that occur within field-content &SHOULD;
 be replaced with a single SP before interpreting the field value or
 forwarding the message downstream.
 </t>
 <t>
-RWS is used when at least one linear whitespace character is required to
-separate field tokens. RWS &SHOULD; be produced as a single SP character.
-Multiple RWS characters that occur within field-content &SHOULD; be
+RWS is used when at least one linear whitespace octet is required to
+separate field tokens. RWS &SHOULD; be produced as a single SP.
+Multiple RWS octets that occur within field-content &SHOULD; be
 replaced with a single SP before interpreting the field value or
 forwarding the message downstream.
@@ -503,5 +503,5 @@
 <t anchor="rule.quoted-pair">
 <x:anchor-alias value="quoted-pair"/>
-The backslash character ("\") can be used as a single-character
+The backslash octet ("\") can be used as a single-octet
 quoting mechanism within quoted-string constructs:
 </t>
@@ -510,6 +510,6 @@
 </artwork></figure>
 <t>
-Producers &SHOULD-NOT; escape characters that do not require escaping
-(i.e., other than DQUOTE and the backslash character).
+Senders &SHOULD-NOT; escape octets that do not require escaping
+(i.e., other than DQUOTE and the backslash octet).
 </t>
 </section>
@@ -819,5 +819,5 @@
 <t>
 The HTTP version number consists of two non-negative decimal integers
-separated by the "." (period or decimal point) character. The first
+separated by a "." (period or decimal point). The first
 number ("major version") indicates the HTTP messaging syntax, whereas
 the second number ("minor version") indicates the highest minor
@@ -1133,5 +1133,5 @@
 </section>
 
-<section title="HTTP Message" anchor="http.message">
+<section title="Message Format" anchor="http.message">
 <x:anchor-alias value="generic-message"/>
 <x:anchor-alias value="message.types"/>
@@ -1143,5 +1143,5 @@
 <t>
 All HTTP/1.1 messages consist of a start-line followed by a sequence of
-characters in a format similar to the Internet Message Format
+octets in a format similar to the Internet Message Format
 <xref target="RFC5322"/>: zero or more header fields (collectively
 referred to as the "headers" or the "header section"), an empty line
@@ -1168,11 +1168,12 @@
 </artwork></figure>
 <t>
-Whitespace (WSP) &MUST-NOT; be sent between the start-line and the first
-header field. The presence of whitespace might be an attempt to trick a
-noncompliant implementation of HTTP into ignoring that field or processing
-the next line as a new request, either of which might result in security
-issues when implementations within the request chain interpret the
-same message differently. HTTP/1.1 servers &MUST; reject such a message
-with a 400 (Bad Request) response.
+Implementations &MUST-NOT; send whitespace between the start-line and
+the first header field. The presence of such whitespace in a request
+might be an attempt to trick a server into ignoring that field or
+processing the line after it as a new request, either of which might
+result in a security vulnerability if other implementations within
+the request chain interpret the same message differently.
+Likewise, the presence of such whitespace in a response might be
+ignored by some clients or cause others to cease parsing.
 </t>
 
@@ -1193,4 +1194,11 @@
 client &MUST; include the terminating CRLF octets as part of the
 message-body length.
+</t>
+<t>
+When a server listening only for HTTP request messages, or processing
+what appears from the start-line to be an HTTP request message,
+receives a sequence of octets that does not match the HTTP-message
+grammar aside from the robustness exceptions listed above, the
+server &MUST; respond with an HTTP/1.1 400 (Bad Request) response.
 </t>
 <t>
@@ -1242,6 +1250,6 @@
 A field value &MAY; be preceded by optional whitespace (OWS); a single SP is
 preferred. The field value does not include any leading or trailing white
-space: OWS occurring before the first non-whitespace character of the
-field value or after the last non-whitespace character of the field value
+space: OWS occurring before the first non-whitespace octet of the
+field value or after the last non-whitespace octet of the field value
 is ignored and &SHOULD; be removed before further processing (as this does
 not change the meaning of the header field).
@@ -1283,5 +1291,5 @@
 Historically, HTTP header field values could be extended over multiple
 lines by preceding each extra line with at least one space or horizontal
-tab character (line folding). This specification deprecates such line
+tab octet (line folding). This specification deprecates such line
 folding except within the message/http media type
 (<xref target="internet.media.type.message.http"/>).
@@ -1299,5 +1307,5 @@
 In practice, most HTTP header field values use only a subset of the
 US-ASCII character encoding <xref target="USASCII"/>. Newly defined
-header fields &SHOULD; limit their field values to US-ASCII characters.
+header fields &SHOULD; limit their field values to US-ASCII octets.
 Recipients &SHOULD; treat other (obs-text) octets in field content as
 opaque data.
@@ -1317,5 +1325,5 @@
 <t anchor="rule.quoted-cpair">
 <x:anchor-alias value="quoted-cpair"/>
-The backslash character ("\") can be used as a single-character
+The backslash octet ("\") can be used as a single-octet
 quoting mechanism within comment constructs:
 </t>
@@ -1324,6 +1332,6 @@
 </artwork></figure>
 <t>
-Producers &SHOULD-NOT; escape characters that do not require escaping
-(i.e., other than the backslash character "\" and the parentheses "(" and
+Senders &SHOULD-NOT; escape octets that do not require escaping
+(i.e., other than the backslash octet "\" and the parentheses "(" and
 ")").
 </t>
@@ -1547,7 +1555,8 @@
 <x:anchor-alias value="Request"/>
 <t>
-A request message from a client to a server includes, within the
-first line of that message, the method to be applied to the resource,
-the identifier of the resource, and the protocol version in use.
+A request message from a client to a server begins with a
+Request-Line, followed by zero or more header fields, an empty
+line signifying the end of the header block, and an optional
+message body.
 </t>
 <!-- Host ; should be moved here eventually -->
@@ -1562,8 +1571,7 @@
 <x:anchor-alias value="Request-Line"/>
 <t>
-The Request-Line begins with a method token, followed by the
-request-target and the protocol version, and ending with CRLF. The
-elements are separated by SP characters. No CR or LF is allowed
-except in the final CRLF sequence.
+The Request-Line begins with a method token, followed by a single
+space (SP), the request-target, another single space (SP), the
+protocol version, and ending with CRLF.
 </t>
 <figure><artwork type="abnf2616"><iref primary="true" item="Grammar" subitem="Request-Line"/>
@@ -1787,5 +1795,5 @@
 </t>
 <t>
-the character sequence "://",
+the octet sequence "://",
 </t>
 <t>
@@ -1863,7 +1871,7 @@
 <t>
 The first line of a Response message is the Status-Line, consisting
-of the protocol version followed by a numeric status code and its
-associated textual phrase, with each element separated by SP
-characters. No CR or LF is allowed except in the final CRLF sequence.
+of the protocol version, a space (SP), the status code, another space,
+a possibly-empty textual phrase describing the status code, and
+ending with CRLF.
 </t>
 <figure><artwork type="abnf2616"><iref primary="true" item="Grammar" subitem="Status-Line"/>
@@ -2350,5 +2358,5 @@
 Product tokens &SHOULD; be short and to the point. They &MUST-NOT; be
 used for advertising or other non-essential information. Although any
-token character &MAY; appear in a product-version, this token &SHOULD;
+token octet &MAY; appear in a product-version, this token &SHOULD;
 only be used for a version identifier (i.e., successive versions of
 the same product &SHOULD; only differ in the product-version portion of
@@ -4847,10 +4855,4 @@
 </t>
 <t>
-Clients &SHOULD; be tolerant in parsing the Status-Line and servers
-&SHOULD; be tolerant when parsing the Request-Line. In particular, they
-&SHOULD; accept any amount of WSP characters between fields, even though
-only a single SP is required.
-</t>
-<t>
 The line terminator for header fields is the sequence CRLF.
 However, we recommend that applications, when parsing such headers fields,
@@ -4858,5 +4860,5 @@
 </t>
 <t>
-The character set of a representation &SHOULD; be labeled as the lowest
+The character encoding of a representation &SHOULD; be labeled as the lowest
 common denominator of the character codes used within that representation, with
 the exception that not labeling the representation is preferred over labeling
@@ -5027,5 +5029,5 @@
 Rules about implicit linear whitespace between certain grammar productions
 have been removed; now it's only allowed when specifically pointed out
-in the ABNF. The NUL character is no longer allowed in comment and quoted-string
+in the ABNF. The NUL octet is no longer allowed in comment and quoted-string
 text. The quoted-pair rule no longer allows escaping control characters other than HTAB.
 Non-ASCII content in header fields and reason phrase has been obsoleted and
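For illustration only, the stricter request parsing that r1176 describes for p1-messaging.xml might be sketched in Python as follows. The names `parse_request_head` and `BadRequest` are hypothetical and do not come from the draft. The token pattern is a simplified stand-in for the draft's ABNF, and the sketch rejects every line that starts with whitespace, which is stricter than the draft's treatment of deprecated line folding.

```python
import re

# Simplified stand-ins for the draft's ABNF (assumptions, not the full grammar).
FIELD_NAME = re.compile(rb"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")
REQUEST_LINE = re.compile(
    rb"([!#$%&'*+\-.^_`|~0-9A-Za-z]+) ([^ \t\r\n]+) (HTTP/[0-9]+\.[0-9]+)"
)


class BadRequest(Exception):
    """Hypothetical signal that the server should answer 400 (Bad Request)."""


def parse_request_head(head: bytes):
    """Parse the Request-Line and header fields from a block of octets."""
    lines = head.split(b"\r\n")
    # Request-Line: method SP request-target SP HTTP-version, single SPs only.
    match = REQUEST_LINE.fullmatch(lines[0])
    if match is None:
        raise BadRequest("malformed Request-Line")
    method, target, version = match.groups()

    fields = []
    for line in lines[1:]:
        if line == b"":
            break  # empty line: end of the header block
        if line[:1] in (b" ", b"\t"):
            # Covers whitespace between the start-line and the first header
            # field; folded continuation lines are also rejected here, which
            # is a simplification of this sketch.
            raise BadRequest("unexpected leading whitespace")
        name, sep, value = line.partition(b":")
        if not sep or FIELD_NAME.fullmatch(name) is None:
            raise BadRequest("malformed header field")
        # Leading and trailing OWS is not part of the field value.
        fields.append((name, value.strip(b" \t")))
    return method, target, version, fields
```

A caller would map `BadRequest` to a 400 response. Everything is kept as `bytes` because the changeset consistently describes the message in terms of octets rather than characters.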
draft-ietf-httpbis/latest/p3-payload.xml
--- r1173
+++ r1176
@@ -336,35 +336,12 @@
 <section title="Protocol Parameters" anchor="protocol.parameters">
 
-<section title="Character Sets" anchor="character.sets">
-<t>
-HTTP uses the same definition of the term "character set" as that
-described for MIME:
-</t>
-<t>
-The term "character set" is used in this document to refer to a
-method used with one or more tables to convert a sequence of octets
-into a sequence of characters. Note that unconditional conversion in
-the other direction is not required, in that not all characters might
-be available in a given character set and a character set might provide
-more than one sequence of octets to represent a particular character.
-This definition is intended to allow various kinds of character
-encoding, from simple single-table mappings such as US-ASCII to
-complex table switching methods such as those that use ISO-2022's
-techniques. However, the definition associated with a MIME character
-set name &MUST; fully specify the mapping to be performed from octets
-to characters. In particular, use of external profiling information
-to determine the exact mapping is not permitted.
-</t>
-<x:note>
-<t>
-<x:h>Note:</x:h> This use of the term "character set" is more commonly
-referred to as a "character encoding". However, since HTTP and
-MIME share the same registry, it is important that the terminology
-also be shared.
-</t>
-</x:note>
+<section title="Character Encodings (charset)" anchor="character.sets">
+<t>
+HTTP uses charset names to indicate the character encoding of a
+textual representation.
+</t>
 <t anchor="rule.charset">
 <x:anchor-alias value="charset"/>
-HTTP character sets are identified by case-insensitive tokens. The
+A character encoding is identified by a case-insensitive token. The
 complete set of tokens is defined by the IANA Character Set registry
 (<eref target="http://www.iana.org/assignments/character-sets"/>).
@@ -376,7 +353,7 @@
 Although HTTP allows an arbitrary token to be used as a charset
 value, any token that has a predefined value within the IANA
-Character Set registry &MUST; represent the character set defined
+Character Set registry &MUST; represent the character encoding defined
 by that registry. Applications &SHOULD; limit their use of character
-sets to those defined by the IANA registry.
+encodings to those defined within the IANA registry.
 </t>
 <t>
@@ -1094,8 +1071,9 @@
 <t>
 The "Accept-Charset" header field can be used by user agents to
-indicate what response character sets are acceptable. This field allows
+indicate what character encodings are acceptable in a response
+payload. This field allows
 clients capable of understanding more comprehensive or special-purpose
-character sets to signal that capability to a server which is capable of
-representing documents in those character sets.
+character encodings to signal that capability to a server which is capable of
+representing documents in those character encodings.
 </t>
 <figure><artwork type="abnf2616"><iref primary="true" item="Grammar" subitem="Accept-Charset"/><iref primary="true" item="Grammar" subitem="Accept-Charset-v"/>
@@ -1106,7 +1084,8 @@
 </artwork></figure>
 <t>
-Character set values are described in <xref target="character.sets"/>. Each charset &MAY;
-be given an associated quality value which represents the user's
-preference for that charset. The default value is q=1. An example is
+Character encoding values (a.k.a., charsets) are described in
+<xref target="character.sets"/>. Each charset &MAY; be given an
+associated quality value which represents the user's preference
+for that charset. The default value is q=1. An example is
 </t>
 <figure><artwork type="example">
@@ -1115,7 +1094,7 @@
 <t>
 The special value "*", if present in the Accept-Charset field,
-matches every character set (including ISO-8859-1) which is not
+matches every character encoding (including ISO-8859-1) which is not
 mentioned elsewhere in the Accept-Charset field. If no "*" is present
-in an Accept-Charset field, then all character sets not explicitly
+in an Accept-Charset field, then all character encodings not explicitly
 mentioned get a quality value of 0, except for ISO-8859-1, which gets
 a quality value of 1 if not explicitly mentioned.
@@ -1123,5 +1102,5 @@
 <t>
 If no Accept-Charset header field is present, the default is that any
-character set is acceptable. If an Accept-Charset header field is present,
+character encoding is acceptable. If an Accept-Charset header field is present,
 and if the server cannot send a response which is acceptable
 according to the Accept-Charset header field, then the server &SHOULD; send
@@ -2544,10 +2523,11 @@
 Where it is possible, a proxy or gateway from HTTP to a strict MIME
 environment &SHOULD; translate all line breaks within the text media
-types described in <xref target="canonicalization.and.text.defaults"/> of this document to the RFC 2049
+types described in <xref target="canonicalization.and.text.defaults"/>
+of this document to the RFC 2049
 canonical form of CRLF. Note, however, that this might be complicated
 by the presence of a Content-Encoding and by the fact that HTTP
-allows the use of some character sets which do not use octets 13 and
-10 to represent CR and LF, as is the case for some multi-byte
-character sets.
+allows the use of some character encodings which do not use octets 13 and
+10 to represent CR and LF, respectively, as is the case for some multi-byte
+character encodings.
 </t>
 <t>
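The Accept-Charset rules touched by this changeset in p3-payload.xml (default q=1, the "*" wildcard, and the special case for ISO-8859-1) can be read as a small lookup procedure. The Python sketch below is one possible reading, not text from the draft: the function names are hypothetical, the parser is deliberately lenient, and a malformed q-value would raise `ValueError` rather than be handled.

```python
from typing import Dict, Optional


def parse_accept_charset(value: str) -> Dict[str, float]:
    """Map each lower-cased charset token to its quality value."""
    prefs: Dict[str, float] = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        quality = 1.0  # the default value is q=1
        key, _, raw = params.strip().partition("=")
        if key.strip().lower() == "q" and raw.strip():
            quality = float(raw)
        # Charset tokens are case-insensitive, so normalize the key.
        prefs[name.strip().lower()] = quality
    return prefs


def charset_quality(charset: str, header: Optional[str]) -> float:
    """Return the quality assigned to `charset` by an Accept-Charset value."""
    if header is None:
        return 1.0  # no header field: any character encoding is acceptable
    prefs = parse_accept_charset(header)
    name = charset.lower()
    if name in prefs:
        return prefs[name]
    if "*" in prefs:
        return prefs["*"]  # "*" matches everything not mentioned elsewhere
    if name == "iso-8859-1":
        return 1.0  # implicit q=1 when not explicitly mentioned
    return 0.0
```

With the assumed sample value `iso-8859-5, unicode-1-1;q=0.8`, this gives 1.0 for ISO-8859-5, 0.8 for unicode-1-1, 1.0 for the unmentioned ISO-8859-1, and 0.0 for anything else such as UTF-8.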