Changeset 647 for draft-ietf-httpbis/latest/p1-messaging.xml
- Timestamp:
- 28/07/09 15:02:09 (12 years ago)
- File:
-
- 1 edited
draft-ietf-httpbis/latest/p1-messaging.xml
r646 r647
 427  427    ; "bad" whitespace
 428  428    <x:ref>obs-fold</x:ref> = <x:ref>CRLF</x:ref>
 429       - ; see <xref target="message.headers"/>
      429  + ; see <xref target="header.fields"/>
 430  430    </artwork></figure>
 431  431    <t anchor="rule.token.separators">
   …    …
1019 1019
1020 1020    <section title="HTTP Message" anchor="http.message">
1021       -
1022       - <section title="Message Types" anchor="message.types">
1023       - <x:anchor-alias value="generic-message"/>
1024       - <x:anchor-alias value="HTTP-message"/>
1025       - <x:anchor-alias value="start-line"/>
1026       - <t>
1027       - HTTP messages consist of requests from client to server and responses
1028       - from server to client.
     1021  + <x:anchor-alias value="generic-message"/>
     1022  + <x:anchor-alias value="message.types"/>
     1023  + <x:anchor-alias value="HTTP-message"/>
     1024  + <x:anchor-alias value="start-line"/>
     1025  + <iref item="header section"/>
     1026  + <iref item="headers"/>
     1027  + <iref item="header field"/>
     1028  + <t>
     1029  + All HTTP/1.1 messages consist of a start-line followed by a sequence of
     1030  + characters in a format similar to the Internet Message Format
     1031  + <xref target="RFC5322"/>: zero or more header fields (collectively
     1032  + referred to as the "headers" or the "header section"), an empty line
     1033  + indicating the end of the header section, and an optional message-body.
     1034  + </t>
     1035  + <t>
     1036  + An HTTP message can either be a request from client to server or a
     1037  + response from server to client. Syntactically, the two types of message
     1038  + differ only in the start-line, which is either a Request-Line (for requests)
     1039  + or a Status-Line (for responses), and in the algorithm for determining
     1040  + the length of the message-body (<xref target="message.length"/>).
     1041  + In theory, a client could receive requests and a server could receive
     1042  + responses, distinguishing them by their different start-line formats,
     1043  + but in practice servers are implemented to only expect a request
     1044  + (a response is interpreted as an unknown or invalid request method)
     1045  + and clients are implemented to only expect a response.
1029 1046    </t>
1030 1047    <figure><artwork type="abnf2616"><iref primary="true" item="Grammar" subitem="HTTP-message"/>
1031       - <x:ref>HTTP-message</x:ref> = <x:ref>Request</x:ref> / <x:ref>Response</x:ref> ; HTTP/1.1 messages
1032       - </artwork></figure>
1033       - <t>
1034       - Request (<xref target="request"/>) and Response (<xref target="response"/>) messages use the generic
1035       - message format of <xref target="RFC5322"/> for transferring entities (the payload
1036       - of the message). Both types of message consist of a start-line, zero
1037       - or more header fields (also known as "headers"), an empty line (i.e.,
1038       - a line with nothing preceding the CRLF) indicating the end of the
1039       - header fields, and possibly a message-body.
1040       - </t>
1041       - <figure><artwork type="abnf2616"><iref primary="true" item="Grammar" subitem="generic-message"/><iref primary="true" item="Grammar" subitem="start-line"/>
1042       - <x:ref>generic-message</x:ref> = <x:ref>start-line</x:ref>
1043       - *( <x:ref>message-header</x:ref> <x:ref>CRLF</x:ref> )
     1048  + <x:ref>HTTP-message</x:ref> = <x:ref>start-line</x:ref>
     1049  + *( <x:ref>header-field</x:ref> <x:ref>CRLF</x:ref> )
1044 1050    <x:ref>CRLF</x:ref>
1045 1051    [ <x:ref>message-body</x:ref> ]
1046 1052    <x:ref>start-line</x:ref> = <x:ref>Request-Line</x:ref> / <x:ref>Status-Line</x:ref>
1047 1053    </artwork></figure>
1048       - <t>
1049       - In the interest of robustness, servers &SHOULD; ignore any empty
1050       - line(s) received where a Request-Line is expected. In other words, if
1051       - the server is reading the protocol stream at the beginning of a
1052       - message and receives a CRLF first, it should ignore the CRLF.
1053       - </t>
1054       - <t>
1055       - Certain buggy HTTP/1.0 client implementations generate extra CRLF's
1056       - after a POST request. To restate what is explicitly forbidden by the
1057       - BNF, an HTTP/1.1 client &MUST-NOT; preface or follow a request with an
1058       - extra CRLF.
1059       - </t>
1060 1054    <t>
1061 1055    Whitespace (WSP) &MUST-NOT; be sent between the start-line and the first
   …    …
1067 1061    with a 400 (Bad Request) response.
1068 1062    </t>
1069       - </section>
1070       -
1071       - <section title="Message Headers" anchor="message.headers">
     1063  +
     1064  + <section title="Message Parsing Robustness" anchor="message.robustness">
     1065  + <t>
     1066  + In the interest of robustness, servers &SHOULD; ignore at least one
     1067  + empty line received where a Request-Line is expected. In other words, if
     1068  + the server is reading the protocol stream at the beginning of a
     1069  + message and receives a CRLF first, it should ignore the CRLF.
     1070  + </t>
     1071  + <t>
     1072  + Some old HTTP/1.0 client implementations generate an extra CRLF
     1073  + after a POST request as a lame workaround for some early server
     1074  + applications that failed to read message-body content that was
     1075  + not terminated by a line-ending. An HTTP/1.1 client &MUST-NOT;
     1076  + preface or follow a request with an extra CRLF. If terminating
     1077  + the request message-body with a line-ending is desired, then the
     1078  + client &MUST; include the terminating CRLF octets as part of the
     1079  + message-body length.
     1080  + </t>
     1081  + <t>
     1082  + The normal procedure for parsing an HTTP message is to read the
     1083  + start-line into a structure, read each header field into a hash
     1084  + table by field name until the empty line, and then use the parsed
     1085  + data to determine if a message-body is expected. If a message-body
     1086  + has been indicated, then it is read as a stream until an amount
     1087  + of OCTETs equal to the message-length is read or the connection
     1088  + is closed. Care must be taken to parse an HTTP message as a sequence
     1089  + of OCTETs in an encoding that is a superset of US-ASCII. Attempting
     1090  + to parse HTTP as a stream of Unicode characters in a character encoding
     1091  + like UTF-16 may introduce security flaws due to the differing ways
     1092  + that such parsers interpret invalid characters.
     1093  + </t>
     1094  + </section>
     1095  +
     1096  + <section title="Header Fields" anchor="header.fields">
     1097  + <x:anchor-alias value="header-field"/>
1072 1098    <x:anchor-alias value="field-content"/>
1073 1099    <x:anchor-alias value="field-name"/>
1074 1100    <x:anchor-alias value="field-value"/>
1075       - <x:anchor-alias value="message-header"/>
1076       - <t>
1077       - HTTP header fields follow the same general format as Internet messages in
1078       - <xref target="RFC5322" x:fmt="of" x:sec="2.1"/>. Each header field consists
1079       - of a name followed by a colon (":"), optional whitespace, and the field
1080       - value. Field names are case-insensitive.
1081       - </t>
1082       - <figure><artwork type="abnf2616"><iref primary="true" item="Grammar" subitem="message-header"/><iref primary="true" item="Grammar" subitem="field-name"/><iref primary="true" item="Grammar" subitem="field-value"/><iref primary="true" item="Grammar" subitem="field-content"/>
1083       - <x:ref>message-header</x:ref> = <x:ref>field-name</x:ref> ":" OWS [ <x:ref>field-value</x:ref> ] OWS
     1101  + <x:anchor-alias value="OWS"/>
     1102  + <t>
     1103  + Each HTTP header field consists of a case-insensitive field name
     1104  + followed by a colon (":"), optional whitespace, and the field value.
     1105  + </t>
     1106  + <figure><artwork type="abnf2616"><iref primary="true" item="Grammar" subitem="header-field"/><iref primary="true" item="Grammar" subitem="field-name"/><iref primary="true" item="Grammar" subitem="field-value"/><iref primary="true" item="Grammar" subitem="field-content"/>
     1107  + <x:ref>header-field</x:ref> = <x:ref>field-name</x:ref> ":" OWS [ <x:ref>field-value</x:ref> ] OWS
1084 1108    <x:ref>field-name</x:ref> = <x:ref>token</x:ref>
1085 1109    <x:ref>field-value</x:ref> = *( <x:ref>field-content</x:ref> / <x:ref>OWS</x:ref> )
   …    …
1087 1111    </artwork></figure>
1088 1112    <t>
1089       - Historically, HTTP has allowed field-content with text in the ISO-8859-1
1090       - <xref target="ISO-8859-1"/> character encoding (allowing other character sets
1091       - through use of <xref target="RFC2047"/> encoding). In practice, most HTTP
1092       - header field-values use only a subset of the US-ASCII charset
1093       - <xref target="USASCII"/>. Newly defined header fields &SHOULD; constrain
1094       - their field-values to US-ASCII characters. Recipients &SHOULD; treat other
1095       - (obs-text) octets in field-content as opaque data.
1096       - </t>
1097       - <t>
1098       - No whitespace is allowed between the header field-name and colon. For
     1113  + No whitespace is allowed between the header field name and colon. For
1099 1114    security reasons, any request message received containing such whitespace
1100       - &MUST; be rejected with a response code of 400 (Bad Request) and any such
1101       - whitespace in a response message &MUST; be removed.
1102       - </t>
1103       - <t>
1104       - The field value &MAY; be preceded by optional whitespace; a single SP is
1105       - preferred. The field-value does not include any leading or trailing white
     1115  + &MUST; be rejected with a response code of 400 (Bad Request). A proxy
     1116  + &MUST; remove any such whitespace from a response message before
     1117  + forwarding the message downstream.
     1118  + </t>
     1119  + <t>
     1120  + A field value &MAY; be preceded by optional whitespace (OWS); a single SP is
     1121  + preferred. The field value does not include any leading or trailing white
1106 1122    space: OWS occurring before the first non-whitespace character of the
1107       - field-value or after the last non-whitespace character of the field-value
1108       - is ignored and &MAY; be removed without changing the meaning of the header
     1123  + field value or after the last non-whitespace character of the field value
     1124  + is ignored and &SHOULD; be removed without changing the meaning of the header
1109 1125    field.
1110 1126    </t>
     1127  + <t>
     1128  + The order in which header fields with differing field names are
     1129  + received is not significant. However, it is "good practice" to send
     1130  + header fields that contain control data first, such as Host on
     1131  + requests and Date on responses, so that implementations can decide
     1132  + when not to handle a message as early as possible. A server &MUST;
     1133  + wait until the entire header section is received before interpreting
     1134  + a request message, since later header fields might include conditionals,
     1135  + authentication credentials, or deliberately misleading duplicate
     1136  + header fields that would impact request processing.
     1137  + </t>
     1138  + <t>
     1139  + Multiple header fields with the same field name &MAY; be
     1140  + sent in a message if and only if the entire field value for that
     1141  + header field is defined as a comma-separated list [i.e., #(values)].
     1142  + Multiple header fields with the same field name can be combined into
     1143  + one "field-name: field-value" pair, without changing the semantics of the
     1144  + message, by appending each subsequent field value to the combined
     1145  + field value in order, separated by a comma. The order in which
     1146  + header fields with the same field name are received is therefore
     1147  + significant to the interpretation of the combined field value;
     1148  + a proxy &MUST-NOT; change the order of these field values when
     1149  + forwarding a message.
     1150  + </t>
     1151  + <x:note>
     1152  + <t>
     1153  + <x:h>Note:</x:h> the "Set-Cookie" header as implemented in
     1154  + practice (as opposed to how it is specified in <xref target="RFC2109"/>)
     1155  + can occur multiple times, but does not use the list syntax, and thus cannot
     1156  + be combined into a single line. (See Appendix A.2.3 of <xref target="Kri2001"/>
     1157  + for details.) Also note that the Set-Cookie2 header specified in
     1158  + <xref target="RFC2965"/> does not share this problem.
     1159  + </t>
     1160  + </x:note>
1111 1161    <t>
1112 1162    Historically, HTTP header field values could be extended over multiple
   …    …
1122 1172    or forwarding the message downstream.
1123 1173    </t>
     1174  + <t>
     1175  + Historically, HTTP has allowed field content with text in the ISO-8859-1
     1176  + <xref target="ISO-8859-1"/> character encoding and supported other
     1177  + character sets only through use of <xref target="RFC2047"/> encoding.
     1178  + In practice, most HTTP header field values use only a subset of the
     1179  + US-ASCII character encoding <xref target="USASCII"/>. Newly defined
     1180  + header fields &SHOULD; limit their field values to US-ASCII characters.
     1181  + Recipients &SHOULD; treat other (obs-text) octets in field content as
     1182  + opaque data.
     1183  + </t>
1124 1184    <t anchor="rule.comment">
1125 1185    <x:anchor-alias value="comment"/>
   …    …
1128 1188    the comment text with parentheses. Comments are only allowed in
1129 1189    fields containing "comment" as part of their field value definition.
1130       - In all other fields, parentheses are considered part of the field
1131       - value.
1132 1190    </t>
1133 1191    <figure><artwork type="abnf2616"><iref primary="true" item="Grammar" subitem="comment"/><iref primary="true" item="Grammar" subitem="ctext"/>
   …    …
1136 1194    ; <x:ref>OWS</x:ref> / <<x:ref>VCHAR</x:ref> except "(", ")", and "\"> / <x:ref>obs-text</x:ref>
1137 1195    </artwork></figure>
1138       - <t>
1139       - The order in which header fields with differing field names are
1140       - received is not significant. However, it is "good practice" to send
1141       - general-header fields first, followed by request-header or response-header
1142       - fields, and ending with the entity-header fields.
1143       - </t>
1144       - <t>
1145       - Multiple message-header fields with the same field-name &MAY; be
1146       - present in a message if and only if the entire field-value for that
1147       - header field is defined as a comma-separated list [i.e., #(values)].
1148       - It &MUST; be possible to combine the multiple header fields into one
1149       - "field-name: field-value" pair, without changing the semantics of the
1150       - message, by appending each subsequent field-value to the first, each
1151       - separated by a comma. The order in which header fields with the same
1152       - field-name are received is therefore significant to the
1153       - interpretation of the combined field value, and thus a proxy &MUST-NOT;
1154       - change the order of these field values when a message is forwarded.
1155       - </t>
1156       - <x:note>
1157       - <t>
1158       - <x:h>Note:</x:h> the "Set-Cookie" header as implemented in
1159       - practice (as opposed to how it is specified in <xref target="RFC2109"/>)
1160       - can occur multiple times, but does not use the list syntax, and thus cannot
1161       - be combined into a single line. (See Appendix A.2.3 of <xref target="Kri2001"/>
1162       - for details.) Also note that the Set-Cookie2 header specified in
1163       - <xref target="RFC2965"/> does not share this problem.
1164       - </t>
1165       - </x:note>
1166 1196
1167 1197    </section>
   …    …
1194 1224    The presence of a message-body in a request is signaled by the
1195 1225    inclusion of a Content-Length or Transfer-Encoding header field in
1196       - the request's message-headers.
     1226  + the request's header fields.
1197 1227    When a request message contains both a message-body of non-zero
1198 1228    length and a method that does not define any semantics for that
   …    …
2371 2401
2372 2402
2373       - <section title="Header Field Definitions" anchor="header.fields">
     2403  + <section title="Header Field Definitions" anchor="header.field.definitions">
2374 2404    <t>
2375 2405    This section defines the syntax and semantics of HTTP/1.1 header fields
   …    …
4059 4089    </t>
4060 4090    <t>
4061       - The line terminator for message-header fields is the sequence CRLF.
     4091  + The line terminator for header fields is the sequence CRLF.
4062 4092    However, we recommend that applications, when parsing such headers,
4063 4093    recognize a single LF as a line terminator and ignore the leading CR.
   …    …
4297 4327    <t>
4298 4328    Require that invalid whitespace around field-names be rejected.
4299       - (<xref target="message.headers"/>)
     4329  + (<xref target="header.fields"/>)
4300 4330    </t>
4301 4331    <t>
   …    …
4387 4417    <x:ref>HTTP-Version</x:ref> = HTTP-Prot-Name "/" 1*DIGIT "." 1*DIGIT
4388 4418    <x:ref>HTTP-date</x:ref> = rfc1123-date / obs-date
4389       - <x:ref>HTTP-message</x:ref> = Request / Response
     4419  + <x:ref>HTTP-message</x:ref> = start-line *( header-field CRLF ) CRLF [
     4420  + message-body ]
4390 4421    <x:ref>Host</x:ref> = "Host:" OWS Host-v
4391 4422    <x:ref>Host-v</x:ref> = uri-host [ ":" port ]
   …    …
4475 4506    <x:ref>general-header</x:ref> = Cache-Control / Connection / Date / Pragma / Trailer
4476 4507    / Transfer-Encoding / Upgrade / Via / Warning
4477       - <x:ref>generic-message</x:ref> = start-line *( message-header CRLF ) CRLF [
4478       - message-body ]
4479 4508
4480 4509    <x:ref>hour</x:ref> = 2DIGIT
   …    …
4486 4515    <x:ref>message-body</x:ref> = entity-body /
4487 4516    <entity-body encoded as per Transfer-Encoding>
4488       - <x:ref>message-header</x:ref> = field-name ":" OWS [ field-value ] OWS
     4517  + <x:ref>header-field</x:ref> = field-name ":" OWS [ field-value ] OWS
4489 4518    <x:ref>minute</x:ref> = 2DIGIT
4490 4519    <x:ref>month</x:ref> = %x4A.61.6E ; Jan
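
Illustration (not part of the changeset): the parsing procedure that r647 adds under "Message Parsing Robustness", together with the rules for header field whitespace and for combining repeated field names, might be sketched in Python roughly as below. The function name parse_http_message and its return shape are hypothetical, and the sketch makes several simplifying assumptions: the whole request is already in memory, only Content-Length (not Transfer-Encoding) signals a message-body, obs-fold continuation lines are rejected rather than unfolded, and field names are not checked against the token grammar.

```python
def parse_http_message(raw: bytes):
    """Parse one in-memory HTTP/1.1 request (hypothetical helper, a sketch only)."""
    # Work on octets throughout; never decode the stream as UTF-16 or similar.
    # Robustness: skip empty line(s) where the Request-Line is expected.
    # The draft asks for at least one; skipping all of them is this sketch's choice.
    while raw.startswith(b"\r\n"):
        raw = raw[2:]

    # The empty line (CRLF CRLF) ends the header section.
    head, sep, rest = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete header section")

    lines = head.split(b"\r\n")
    start_line = lines[0].decode("ascii")

    headers = {}  # hash table keyed by lowercased (case-insensitive) field name
    for line in lines[1:]:
        name, colon, value = line.partition(b":")
        # Leading whitespace (also covers obs-fold here) or whitespace between
        # the field name and the colon is rejected, as for a 400 (Bad Request).
        if not colon or not name or name != name.strip(b" \t"):
            raise ValueError("400 Bad Request: malformed header field")
        key = name.decode("ascii").lower()
        # Leading/trailing OWS is not part of the field value. Other octets
        # are kept opaque by mapping each octet to one code point (latin-1).
        text = value.strip(b" \t").decode("latin-1")
        if key in headers:
            # Repeated field names are combined in order, separated by a comma.
            # Assumption: no special case for Set-Cookie, which cannot be combined.
            headers[key] = headers[key] + ", " + text
        else:
            headers[key] = text

    # Assumption: only Content-Length indicates a body in this sketch.
    length = int(headers.get("content-length", "0"))
    body = rest[:length]
    return start_line, headers, body
```

For example, calling parse_http_message on a request that begins with a stray CRLF and carries two Accept header fields yields the start-line unchanged and a single combined "accept" entry, while a field written as "Host : x" raises ValueError.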