Changeset 1177


Timestamp:
Mar 13, 2011, 8:25:50 PM
Author:
fielding@…
Message:

update generated HTML

Location:
draft-ietf-httpbis/latest
Files:
2 edited

  • draft-ietf-httpbis/latest/p1-messaging.html

    r1175 r1177  
    383383      <link rel="Chapter" title="1 Introduction" href="#rfc.section.1">
    384384      <link rel="Chapter" title="2 HTTP-related architecture" href="#rfc.section.2">
    385       <link rel="Chapter" title="3 HTTP Message" href="#rfc.section.3">
     385      <link rel="Chapter" title="3 Message Format" href="#rfc.section.3">
    386386      <link rel="Chapter" title="4 Request" href="#rfc.section.4">
    387387      <link rel="Chapter" title="5 Response" href="#rfc.section.5">
     
    564564            </ul>
    565565         </li>
    566          <li>3.&nbsp;&nbsp;&nbsp;<a href="#http.message">HTTP Message</a><ul>
     566         <li>3.&nbsp;&nbsp;&nbsp;<a href="#http.message">Message Format</a><ul>
    567567               <li>3.1&nbsp;&nbsp;&nbsp;<a href="#message.robustness">Message Parsing Robustness</a></li>
    568568               <li>3.2&nbsp;&nbsp;&nbsp;<a href="#header.fields">Header Fields</a></li>
     
    815815         </p>
    816816      </div>
    817       <p id="rfc.section.1.2.2.p.3">The OWS rule is used where zero or more linear whitespace characters might appear. OWS <em class="bcp14">SHOULD</em> either not be produced or be produced as a single SP character. Multiple OWS characters that occur within field-content <em class="bcp14">SHOULD</em> be replaced with a single SP before interpreting the field value or forwarding the message downstream.
    818       </p>
    819       <p id="rfc.section.1.2.2.p.4">RWS is used when at least one linear whitespace character is required to separate field tokens. RWS <em class="bcp14">SHOULD</em> be produced as a single SP character. Multiple RWS characters that occur within field-content <em class="bcp14">SHOULD</em> be replaced with a single SP before interpreting the field value or forwarding the message downstream.
     817      <p id="rfc.section.1.2.2.p.3">The OWS rule is used where zero or more linear whitespace octets might appear. OWS <em class="bcp14">SHOULD</em> either not be produced or be produced as a single SP. Multiple OWS octets that occur within field-content <em class="bcp14">SHOULD</em> be replaced with a single SP before interpreting the field value or forwarding the message downstream.
     818      </p>
     819      <p id="rfc.section.1.2.2.p.4">RWS is used when at least one linear whitespace octet is required to separate field tokens. RWS <em class="bcp14">SHOULD</em> be produced as a single SP. Multiple RWS octets that occur within field-content <em class="bcp14">SHOULD</em> be replaced with a single SP before interpreting the field value or forwarding the message downstream.
    820820      </p>
    821821      <p id="rfc.section.1.2.2.p.5">BWS is used where the grammar allows optional whitespace for historical reasons but senders <em class="bcp14">SHOULD NOT</em> produce it in messages. HTTP/1.1 recipients <em class="bcp14">MUST</em> accept such bad optional whitespace and remove it before interpreting the field value or forwarding the message downstream.
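The whitespace handling described in this hunk (replace multiple OWS/RWS octets with a single SP before interpreting the field value) could be sketched roughly as follows. This is a minimal illustration, not part of the changeset: the function name `normalize_field_value` is hypothetical, whitespace is assumed to mean SP and HTAB only, and quoted-strings and obsolete line folding are deliberately not handled.

```python
import re

# Assumption: "linear whitespace octets" means SP (0x20) and HTAB (0x09).
_WS_RUN = re.compile(rb"[ \t]+")


def normalize_field_value(raw: bytes) -> bytes:
    """Collapse each run of SP/HTAB to a single SP, then drop
    leading and trailing whitespace from the field value.

    Simplification: whitespace inside quoted-strings is collapsed too,
    which a real recipient would probably want to avoid.
    """
    collapsed = _WS_RUN.sub(b" ", raw)
    return collapsed.strip(b" \t")
```

For example, `normalize_field_value(b"  text/html ;\t\t q=0.8  ")` yields `b"text/html ; q=0.8"`. Working on `bytes` rather than `str` mirrors the changeset's shift from "characters" to "octets".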
     
    857857  <a href="#rule.quoted-string" class="smpl">obs-text</a>       = %x80-FF
    858858</pre><div id="rule.quoted-pair">
    859          <p id="rfc.section.1.2.2.p.12">  The backslash character ("\") can be used as a single-character quoting mechanism within quoted-string constructs:</p>
     859         <p id="rfc.section.1.2.2.p.12">  The backslash octet ("\") can be used as a single-octet quoting mechanism within quoted-string constructs:</p>
    860860      </div>
    861861      <div id="rfc.figure.u.11"></div><pre class="inline"><span id="rfc.iref.g.23"></span>  <a href="#rule.quoted-pair" class="smpl">quoted-pair</a>    = "\" ( <a href="#core.rules" class="smpl">WSP</a> / <a href="#core.rules" class="smpl">VCHAR</a> / <a href="#rule.quoted-string" class="smpl">obs-text</a> )
    862 </pre><p id="rfc.section.1.2.2.p.14">Producers <em class="bcp14">SHOULD NOT</em> escape characters that do not require escaping (i.e., other than DQUOTE and the backslash character).
     862</pre><p id="rfc.section.1.2.2.p.14">Senders <em class="bcp14">SHOULD NOT</em> escape octets that do not require escaping (i.e., other than DQUOTE and the backslash octet).
    863863      </p>
    864864      <h1 id="rfc.section.2"><a href="#rfc.section.2">2.</a>&nbsp;<a id="architecture" href="#architecture">HTTP-related architecture</a></h1>
     
    10221022      <div id="rfc.figure.u.17"></div><pre class="inline"><span id="rfc.iref.g.25"></span><span id="rfc.iref.g.26"></span>  <a href="#http.version" class="smpl">HTTP-Version</a>   = <a href="#http.version" class="smpl">HTTP-Prot-Name</a> "/" 1*<a href="#core.rules" class="smpl">DIGIT</a> "." 1*<a href="#core.rules" class="smpl">DIGIT</a>
    10231023  <a href="#http.version" class="smpl">HTTP-Prot-Name</a> = %x48.54.54.50 ; "HTTP", case-sensitive
    1024 </pre><p id="rfc.section.2.5.p.4">The HTTP version number consists of two non-negative decimal integers separated by the "." (period or decimal point) character.
    1025          The first number ("major version") indicates the HTTP messaging syntax, whereas the second number ("minor version") indicates
    1026          the highest minor version to which the sender is at least conditionally compliant and able to understand for future communication.
    1027          The minor version advertises the sender's communication capabilities even when the sender is only using a backwards-compatible
     1024</pre><p id="rfc.section.2.5.p.4">The HTTP version number consists of two non-negative decimal integers separated by a "." (period or decimal point). The first
     1025         number ("major version") indicates the HTTP messaging syntax, whereas the second number ("minor version") indicates the highest
     1026         minor version to which the sender is at least conditionally compliant and able to understand for future communication. The
     1027         minor version advertises the sender's communication capabilities even when the sender is only using a backwards-compatible
    10281028         subset of the protocol, thereby letting the recipient know that more advanced features can be used in response (by servers)
    10291029         or in future requests (by clients).
     
    11671167   http://EXAMPLE.com/%7Esmith/home.html
    11681168   http://EXAMPLE.com:/%7esmith/home.html
    1169 </pre><h1 id="rfc.section.3"><a href="#rfc.section.3">3.</a>&nbsp;<a id="http.message" href="#http.message">HTTP Message</a></h1>
     1169</pre><h1 id="rfc.section.3"><a href="#rfc.section.3">3.</a>&nbsp;<a id="http.message" href="#http.message">Message Format</a></h1>
    11701170      <div id="rfc.iref.h.3"></div>
    11711171      <div id="rfc.iref.h.4"></div>
    11721172      <div id="rfc.iref.h.5"></div>
    1173       <p id="rfc.section.3.p.1">All HTTP/1.1 messages consist of a start-line followed by a sequence of characters in a format similar to the Internet Message
     1173      <p id="rfc.section.3.p.1">All HTTP/1.1 messages consist of a start-line followed by a sequence of octets in a format similar to the Internet Message
    11741174         Format <a href="#RFC5322" id="rfc.xref.RFC5322.2"><cite title="Internet Message Format">[RFC5322]</cite></a>: zero or more header fields (collectively referred to as the "headers" or the "header section"), an empty line indicating
    11751175         the end of the header section, and an optional message-body.
     
    11861186                    [ <a href="#message.body" class="smpl">message-body</a> ]
    11871187  <a href="#http.message" class="smpl">start-line</a>      = <a href="#request-line" class="smpl">Request-Line</a> / <a href="#status-line" class="smpl">Status-Line</a>
    1188 </pre><p id="rfc.section.3.p.4">Whitespace (WSP) <em class="bcp14">MUST NOT</em> be sent between the start-line and the first header field. The presence of whitespace might be an attempt to trick a noncompliant
    1189          implementation of HTTP into ignoring that field or processing the next line as a new request, either of which might result
    1190          in security issues when implementations within the request chain interpret the same message differently. HTTP/1.1 servers <em class="bcp14">MUST</em> reject such a message with a 400 (Bad Request) response.
     1188</pre><p id="rfc.section.3.p.4">Implementations <em class="bcp14">MUST NOT</em> send whitespace between the start-line and the first header field. The presence of such whitespace in a request might be an
     1189         attempt to trick a server into ignoring that field or processing the line after it as a new request, either of which might
     1190         result in a security vulnerability if other implementations within the request chain interpret the same message differently.
     1191         Likewise, the presence of such whitespace in a response might be ignored by some clients or cause others to cease parsing.
    11911192      </p>
    11921193      <h2 id="rfc.section.3.1"><a href="#rfc.section.3.1">3.1</a>&nbsp;<a id="message.robustness" href="#message.robustness">Message Parsing Robustness</a></h2>
     
    11981199         the client <em class="bcp14">MUST</em> include the terminating CRLF octets as part of the message-body length.
    11991200      </p>
    1200       <p id="rfc.section.3.1.p.3">The normal procedure for parsing an HTTP message is to read the start-line into a structure, read each header field into a
     1201      <p id="rfc.section.3.1.p.3">When a server listening only for HTTP request messages, or processing what appears from the start-line to be an HTTP request
     1202         message, receives a sequence of octets that does not match the HTTP-message grammar aside from the robustness exceptions listed
     1203         above, the server <em class="bcp14">MUST</em> respond with an HTTP/1.1 400 (Bad Request) response.
     1204      </p>
     1205      <p id="rfc.section.3.1.p.4">The normal procedure for parsing an HTTP message is to read the start-line into a structure, read each header field into a
    12011206         hash table by field name until the empty line, and then use the parsed data to determine if a message-body is expected. If
    12021207         a message-body has been indicated, then it is read as a stream until an amount of octets equal to the message-body length
     
    12051210         might introduce security flaws due to the differing ways that such parsers interpret invalid characters.
    12061211      </p>
    1207       <p id="rfc.section.3.1.p.4">HTTP allows the set of defined header fields to be extended without changing the protocol version (see <a href="#header.field.registration" title="Header Field Registration">Section&nbsp;10.1</a>). Unrecognized header fields <em class="bcp14">MUST</em> be forwarded by a proxy unless the proxy is specifically configured to block or otherwise transform such fields. Unrecognized
     1212      <p id="rfc.section.3.1.p.5">HTTP allows the set of defined header fields to be extended without changing the protocol version (see <a href="#header.field.registration" title="Header Field Registration">Section&nbsp;10.1</a>). Unrecognized header fields <em class="bcp14">MUST</em> be forwarded by a proxy unless the proxy is specifically configured to block or otherwise transform such fields. Unrecognized
    12081213         header fields <em class="bcp14">SHOULD</em> be ignored by other recipients.
    12091214      </p>
     
    12201225      </p>
    12211226      <p id="rfc.section.3.2.p.4">A field value <em class="bcp14">MAY</em> be preceded by optional whitespace (OWS); a single SP is preferred. The field value does not include any leading or trailing
    1222          white space: OWS occurring before the first non-whitespace character of the field value or after the last non-whitespace character
     1227         white space: OWS occurring before the first non-whitespace octet of the field value or after the last non-whitespace octet
    12231228         of the field value is ignored and <em class="bcp14">SHOULD</em> be removed before further processing (as this does not change the meaning of the header field).
    12241229      </p>
     
    12401245      </div>
    12411246      <p id="rfc.section.3.2.p.8">Historically, HTTP header field values could be extended over multiple lines by preceding each extra line with at least one
    1242          space or horizontal tab character (line folding). This specification deprecates such line folding except within the message/http
     1247         space or horizontal tab octet (line folding). This specification deprecates such line folding except within the message/http
    12431248         media type (<a href="#internet.media.type.message.http" title="Internet Media Type message/http">Section&nbsp;10.3.1</a>). HTTP/1.1 senders <em class="bcp14">MUST NOT</em> produce messages that include line folding (i.e., that contain any field-content that matches the obs-fold rule) unless the
    12441249         message is intended for packaging within the message/http media type. HTTP/1.1 recipients <em class="bcp14">SHOULD</em> accept line folding and replace any embedded obs-fold whitespace with a single SP prior to interpreting the field value or
    12451250         forwarding the message downstream.
    12461251      </p>
    1247       <p id="rfc.section.3.2.p.9">Historically, HTTP has allowed field content with text in the ISO-8859-1 <a href="#ISO-8859-1" id="rfc.xref.ISO-8859-1.1"><cite title="Information technology -- 8-bit single-byte coded graphic character sets -- Part 1: Latin alphabet No. 1">[ISO-8859-1]</cite></a> character encoding and supported other character sets only through use of <a href="#RFC2047" id="rfc.xref.RFC2047.1"><cite title="MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text">[RFC2047]</cite></a> encoding. In practice, most HTTP header field values use only a subset of the US-ASCII character encoding <a href="#USASCII" id="rfc.xref.USASCII.2"><cite title="Coded Character Set -- 7-bit American Standard Code for Information Interchange">[USASCII]</cite></a>. Newly defined header fields <em class="bcp14">SHOULD</em> limit their field values to US-ASCII characters. Recipients <em class="bcp14">SHOULD</em> treat other (obs-text) octets in field content as opaque data.
     1252      <p id="rfc.section.3.2.p.9">Historically, HTTP has allowed field content with text in the ISO-8859-1 <a href="#ISO-8859-1" id="rfc.xref.ISO-8859-1.1"><cite title="Information technology -- 8-bit single-byte coded graphic character sets -- Part 1: Latin alphabet No. 1">[ISO-8859-1]</cite></a> character encoding and supported other character sets only through use of <a href="#RFC2047" id="rfc.xref.RFC2047.1"><cite title="MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text">[RFC2047]</cite></a> encoding. In practice, most HTTP header field values use only a subset of the US-ASCII character encoding <a href="#USASCII" id="rfc.xref.USASCII.2"><cite title="Coded Character Set -- 7-bit American Standard Code for Information Interchange">[USASCII]</cite></a>. Newly defined header fields <em class="bcp14">SHOULD</em> limit their field values to US-ASCII octets. Recipients <em class="bcp14">SHOULD</em> treat other (obs-text) octets in field content as opaque data.
    12481253      </p>
    12491254      <div id="rule.comment">
     
    12561261                 ; <a href="#rule.whitespace" class="smpl">OWS</a> / &lt;<a href="#core.rules" class="smpl">VCHAR</a> except "(", ")", and "\"&gt; / <a href="#rule.quoted-string" class="smpl">obs-text</a>
    12571262</pre><div id="rule.quoted-cpair">
    1258          <p id="rfc.section.3.2.p.12">  The backslash character ("\") can be used as a single-character quoting mechanism within comment constructs:</p>
     1263         <p id="rfc.section.3.2.p.12">  The backslash octet ("\") can be used as a single-octet quoting mechanism within comment constructs:</p>
    12591264      </div>
    12601265      <div id="rfc.figure.u.25"></div><pre class="inline"><span id="rfc.iref.g.43"></span>  <a href="#rule.quoted-cpair" class="smpl">quoted-cpair</a>    = "\" ( <a href="#core.rules" class="smpl">WSP</a> / <a href="#core.rules" class="smpl">VCHAR</a> / <a href="#rule.quoted-string" class="smpl">obs-text</a> )
    1261 </pre><p id="rfc.section.3.2.p.14">Producers <em class="bcp14">SHOULD NOT</em> escape characters that do not require escaping (i.e., other than the backslash character "\" and the parentheses "(" and ")").
     1266</pre><p id="rfc.section.3.2.p.14">Senders <em class="bcp14">SHOULD NOT</em> escape octets that do not require escaping (i.e., other than the backslash octet "\" and the parentheses "(" and ")").
    12621267      </p>
    12631268      <h2 id="rfc.section.3.3"><a href="#rfc.section.3.3">3.3</a>&nbsp;<a id="message.body" href="#message.body">Message Body</a></h2>
     
    13941399      </div>
    13951400      <h1 id="rfc.section.4"><a href="#rfc.section.4">4.</a>&nbsp;<a id="request" href="#request">Request</a></h1>
    1396       <p id="rfc.section.4.p.1">A request message from a client to a server includes, within the first line of that message, the method to be applied to the
    1397          resource, the identifier of the resource, and the protocol version in use.
     1401      <p id="rfc.section.4.p.1">A request message from a client to a server begins with a Request-Line, followed by zero or more header fields, an empty line
     1402         signifying the end of the header block, and an optional message body.
    13981403      </p>
    13991404      <div id="rfc.figure.u.27"></div><pre class="inline"><span id="rfc.iref.g.45"></span>  <a href="#request" class="smpl">Request</a>       = <a href="#request-line" class="smpl">Request-Line</a>              ; <a href="#request-line" title="Request-Line">Section&nbsp;4.1</a>
     
    14021407                  [ <a href="#message.body" class="smpl">message-body</a> ]          ; <a href="#message.body" title="Message Body">Section&nbsp;3.3</a>
    14031408</pre><h2 id="rfc.section.4.1"><a href="#rfc.section.4.1">4.1</a>&nbsp;<a id="request-line" href="#request-line">Request-Line</a></h2>
    1404       <p id="rfc.section.4.1.p.1">The Request-Line begins with a method token, followed by the request-target and the protocol version, and ending with CRLF.
    1405          The elements are separated by SP characters. No CR or LF is allowed except in the final CRLF sequence.
     1409      <p id="rfc.section.4.1.p.1">The Request-Line begins with a method token, followed by a single space (SP), the request-target, another single space (SP),
     1410         the protocol version, and ending with CRLF.
    14061411      </p>
    14071412      <div id="rfc.figure.u.28"></div><pre class="inline"><span id="rfc.iref.g.46"></span>  <a href="#request-line" class="smpl">Request-Line</a>   = <a href="#method" class="smpl">Method</a> <a href="#core.rules" class="smpl">SP</a> <a href="#request-target" class="smpl">request-target</a> <a href="#core.rules" class="smpl">SP</a> <a href="#http.version" class="smpl">HTTP-Version</a> <a href="#core.rules" class="smpl">CRLF</a>
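The revised Request-Line wording (exactly one SP between elements, terminated by CRLF) maps fairly directly onto a strict parser. The sketch below is illustrative only: `RequestLine` and `parse_request_line` are hypothetical names, the set of token octets for Method is an assumption not shown in this diff, and request-target is simplified to "any octets other than SP, CR, or LF".

```python
import re
from typing import NamedTuple, Optional


class RequestLine(NamedTuple):
    method: bytes
    target: bytes
    major: int
    minor: int


_REQUEST_LINE = re.compile(
    rb"([!#$%&'*+\-.^_`|~0-9A-Za-z]+) "  # Method (assumed token octets), then one SP
    rb"([^ \r\n]+) "                     # request-target (simplified), then one SP
    rb"HTTP/([0-9]+)\.([0-9]+)\r\n"      # HTTP-Version, then CRLF
)


def parse_request_line(line: bytes) -> Optional[RequestLine]:
    """Return the parsed Request-Line, or None if it does not match."""
    m = _REQUEST_LINE.fullmatch(line)
    if m is None:
        return None
    return RequestLine(m.group(1), m.group(2), int(m.group(3)), int(m.group(4)))
```

With this sketch, `parse_request_line(b"GET /index.html HTTP/1.1\r\n")` returns a `RequestLine`, while a line with two spaces after the method, or one ending in a bare LF, returns `None`. A server following the robustness text added in Section 3.1 would answer such a mismatch with 400 (Bad Request).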
     
    15111516            TCP connection,
    15121517         </li>
    1513          <li>the character sequence "://",</li>
     1518         <li>the octet sequence "://",</li>
    15141519         <li>the authority component, as specified in the Host header field (<a href="#header.host" id="rfc.xref.header.host.1" title="Host">Section&nbsp;9.4</a>), and
    15151520         </li>
     
    15411546                  [ <a href="#message.body" class="smpl">message-body</a> ]          ; <a href="#message.body" title="Message Body">Section&nbsp;3.3</a>
    15421547</pre><h2 id="rfc.section.5.1"><a href="#rfc.section.5.1">5.1</a>&nbsp;<a id="status-line" href="#status-line">Status-Line</a></h2>
    1543       <p id="rfc.section.5.1.p.1">The first line of a Response message is the Status-Line, consisting of the protocol version followed by a numeric status code
    1544          and its associated textual phrase, with each element separated by SP characters. No CR or LF is allowed except in the final
    1545          CRLF sequence.
     1548      <p id="rfc.section.5.1.p.1">The first line of a Response message is the Status-Line, consisting of the protocol version, a space (SP), the status code,
     1549         another space, a possibly-empty textual phrase describing the status code, and ending with CRLF.
    15461550      </p>
    15471551      <div id="rfc.figure.u.39"></div><pre class="inline"><span id="rfc.iref.g.50"></span>  <a href="#status-line" class="smpl">Status-Line</a> = <a href="#http.version" class="smpl">HTTP-Version</a> <a href="#core.rules" class="smpl">SP</a> <a href="#status.code.and.reason.phrase" class="smpl">Status-Code</a> <a href="#core.rules" class="smpl">SP</a> <a href="#status.code.and.reason.phrase" class="smpl">Reason-Phrase</a> <a href="#core.rules" class="smpl">CRLF</a>
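The new Status-Line wording makes explicit that the textual phrase can be empty while the SP before it is still required. A rough sketch of a parser that reflects this is shown below. `StatusLine` and `parse_status_line` are hypothetical names, the status code is assumed to be exactly three digits (that rule is not shown in this diff), and the reason phrase is simplified to "any octets other than CR or LF".

```python
import re
from typing import NamedTuple, Optional


class StatusLine(NamedTuple):
    major: int
    minor: int
    status: int
    reason: bytes


_STATUS_LINE = re.compile(
    rb"HTTP/([0-9]+)\.([0-9]+) "  # HTTP-Version, then one SP
    rb"([0-9]{3}) "               # Status-Code (assumed 3 digits), then one SP
    rb"([^\r\n]*)\r\n"            # possibly-empty Reason-Phrase, then CRLF
)


def parse_status_line(line: bytes) -> Optional[StatusLine]:
    """Return the parsed Status-Line, or None if it does not match."""
    m = _STATUS_LINE.fullmatch(line)
    if m is None:
        return None
    return StatusLine(int(m.group(1)), int(m.group(2)), int(m.group(3)), m.group(4))
```

Under these assumptions, `b"HTTP/1.1 200 \r\n"` parses with an empty reason phrase, whereas `b"HTTP/1.1 200\r\n"` is rejected because the second SP is missing.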
     
    18041808      <div id="rfc.figure.u.53"></div><pre class="text">  User-Agent: CERN-LineMode/2.15 libwww/2.17b3
    18051809  Server: Apache/0.8.4
    1806 </pre><p id="rfc.section.6.3.p.5">Product tokens <em class="bcp14">SHOULD</em> be short and to the point. They <em class="bcp14">MUST NOT</em> be used for advertising or other non-essential information. Although any token character <em class="bcp14">MAY</em> appear in a product-version, this token <em class="bcp14">SHOULD</em> only be used for a version identifier (i.e., successive versions of the same product <em class="bcp14">SHOULD</em> only differ in the product-version portion of the product value).
     1810</pre><p id="rfc.section.6.3.p.5">Product tokens <em class="bcp14">SHOULD</em> be short and to the point. They <em class="bcp14">MUST NOT</em> be used for advertising or other non-essential information. Although any token octet <em class="bcp14">MAY</em> appear in a product-version, this token <em class="bcp14">SHOULD</em> only be used for a version identifier (i.e., successive versions of the same product <em class="bcp14">SHOULD</em> only differ in the product-version portion of the product value).
    18071811      </p>
    18081812      <h2 id="rfc.section.6.4"><a href="#rfc.section.6.4">6.4</a>&nbsp;<a id="quality.values" href="#quality.values">Quality Values</a></h2>
     
    29762980         can be interpreted unambiguously.
    29772981      </p>
    2978       <p id="rfc.section.A.p.2">Clients <em class="bcp14">SHOULD</em> be tolerant in parsing the Status-Line and servers <em class="bcp14">SHOULD</em> be tolerant when parsing the Request-Line. In particular, they <em class="bcp14">SHOULD</em> accept any amount of WSP characters between fields, even though only a single SP is required.
    2979       </p>
    2980       <p id="rfc.section.A.p.3">The line terminator for header fields is the sequence CRLF. However, we recommend that applications, when parsing such headers
     2982      <p id="rfc.section.A.p.2">The line terminator for header fields is the sequence CRLF. However, we recommend that applications, when parsing such headers
    29812983         fields, recognize a single LF as a line terminator and ignore the leading CR.
    29822984      </p>
    2983       <p id="rfc.section.A.p.4">The character set of a representation <em class="bcp14">SHOULD</em> be labeled as the lowest common denominator of the character codes used within that representation, with the exception that
     2985      <p id="rfc.section.A.p.3">The character encoding of a representation <em class="bcp14">SHOULD</em> be labeled as the lowest common denominator of the character codes used within that representation, with the exception that
    29842986         not labeling the representation is preferred over labeling the representation with the labels US-ASCII or ISO-8859-1. See <a href="#Part3" id="rfc.xref.Part3.6"><cite title="HTTP/1.1, part 3: Message Payload and Content Negotiation">[Part3]</cite></a>.
    29852987      </p>
    2986       <p id="rfc.section.A.p.5">Additional rules for requirements on parsing and encoding of dates and other potential problems with date encodings include:</p>
    2987       <p id="rfc.section.A.p.6"> </p>
     2988      <p id="rfc.section.A.p.4">Additional rules for requirements on parsing and encoding of dates and other potential problems with date encodings include:</p>
     2989      <p id="rfc.section.A.p.5"> </p>
    29882990      <ul>
    29892991         <li>HTTP/1.1 clients and caches <em class="bcp14">SHOULD</em> assume that an RFC-850 date which appears to be more than 50 years in the future is in fact in the past (this helps solve
     
    30733075      </p>
    30743076      <p id="rfc.section.B.3.p.2">Rules about implicit linear whitespace between certain grammar productions have been removed; now it's only allowed when specifically
    3075          pointed out in the ABNF. The NUL character is no longer allowed in comment and quoted-string text. The quoted-pair rule no
    3076          longer allows escaping control characters other than HTAB. Non-ASCII content in header fields and reason phrase has been obsoleted
     3077         pointed out in the ABNF. The NUL octet is no longer allowed in comment and quoted-string text. The quoted-pair rule no longer
     3078         allows escaping control characters other than HTAB. Non-ASCII content in header fields and reason phrase has been obsoleted
    30773079         and made opaque (the TEXT rule was removed) (<a href="#basic.rules" title="Basic Rules">Section&nbsp;1.2.2</a>)
    30783080      </p>
  • draft-ietf-httpbis/latest/p3-payload.html

    r1174 r1177  
    546546         </li>
    547547         <li>2.&nbsp;&nbsp;&nbsp;<a href="#protocol.parameters">Protocol Parameters</a><ul>
    548                <li>2.1&nbsp;&nbsp;&nbsp;<a href="#character.sets">Character Sets</a><ul>
     548               <li>2.1&nbsp;&nbsp;&nbsp;<a href="#character.sets">Character Encodings (charset)</a><ul>
    549549                     <li>2.1.1&nbsp;&nbsp;&nbsp;<a href="#missing.charset">Missing Charset</a></li>
    550550                  </ul>
     
    685685  <a href="#abnf.dependencies" class="smpl">qvalue</a>         = &lt;qvalue, defined in <a href="#Part1" id="rfc.xref.Part1.8"><cite title="HTTP/1.1, part 1: URIs, Connections, and Message Parsing">[Part1]</cite></a>, <a href="p1-messaging.html#quality.values" title="Quality Values">Section 6.4</a>&gt;
    686686</pre><h1 id="rfc.section.2"><a href="#rfc.section.2">2.</a>&nbsp;<a id="protocol.parameters" href="#protocol.parameters">Protocol Parameters</a></h1>
    687       <h2 id="rfc.section.2.1"><a href="#rfc.section.2.1">2.1</a>&nbsp;<a id="character.sets" href="#character.sets">Character Sets</a></h2>
    688       <p id="rfc.section.2.1.p.1">HTTP uses the same definition of the term "character set" as that described for MIME:</p>
    689       <p id="rfc.section.2.1.p.2">The term "character set" is used in this document to refer to a method used with one or more tables to convert a sequence
    690          of octets into a sequence of characters. Note that unconditional conversion in the other direction is not required, in that
    691          not all characters might be available in a given character set and a character set might provide more than one sequence of
    692          octets to represent a particular character. This definition is intended to allow various kinds of character encoding, from
    693          simple single-table mappings such as US-ASCII to complex table switching methods such as those that use ISO-2022's techniques.
    694          However, the definition associated with a MIME character set name <em class="bcp14">MUST</em> fully specify the mapping to be performed from octets to characters. In particular, use of external profiling information
    695          to determine the exact mapping is not permitted.
    696       </p>
    697       <div class="note" id="rfc.section.2.1.p.3">
    698          <p> <b>Note:</b> This use of the term "character set" is more commonly referred to as a "character encoding". However, since HTTP and MIME
    699             share the same registry, it is important that the terminology also be shared.
    700          </p>
    701       </div>
     687      <h2 id="rfc.section.2.1"><a href="#rfc.section.2.1">2.1</a>&nbsp;<a id="character.sets" href="#character.sets">Character Encodings (charset)</a></h2>
     688      <p id="rfc.section.2.1.p.1">HTTP uses charset names to indicate the character encoding of a textual representation.</p>
    702689      <div id="rule.charset">
    703          <p id="rfc.section.2.1.p.4">  HTTP character sets are identified by case-insensitive tokens. The complete set of tokens is defined by the IANA Character
     690         <p id="rfc.section.2.1.p.2">  A character encoding is identified by a case-insensitive token. The complete set of tokens is defined by the IANA Character
    704691            Set registry (&lt;<a href="http://www.iana.org/assignments/character-sets">http://www.iana.org/assignments/character-sets</a>&gt;).
    705692         </p>
    706693      </div>
    707694      <div id="rfc.figure.u.3"></div><pre class="inline"><span id="rfc.iref.g.1"></span>  <a href="#rule.charset" class="smpl">charset</a> = <a href="#core.rules" class="smpl">token</a>
    708 </pre><p id="rfc.section.2.1.p.6">Although HTTP allows an arbitrary token to be used as a charset value, any token that has a predefined value within the IANA
    709          Character Set registry <em class="bcp14">MUST</em> represent the character set defined by that registry. Applications <em class="bcp14">SHOULD</em> limit their use of character sets to those defined by the IANA registry.
    710       </p>
    711       <p id="rfc.section.2.1.p.7">HTTP uses charset in two contexts: within an Accept-Charset request header field (in which the charset value is an unquoted
     695</pre><p id="rfc.section.2.1.p.4">Although HTTP allows an arbitrary token to be used as a charset value, any token that has a predefined value within the IANA
     696         Character Set registry <em class="bcp14">MUST</em> represent the character encoding defined by that registry. Applications <em class="bcp14">SHOULD</em> limit their use of character encodings to those defined within the IANA registry.
     697      </p>
     698      <p id="rfc.section.2.1.p.5">HTTP uses charset in two contexts: within an Accept-Charset request header field (in which the charset value is an unquoted
    712699         token) and as the value of a parameter in a Content-Type header field (within a request or response), in which case the parameter
    713700         value of the charset parameter can be quoted.
    714701      </p>
    715       <p id="rfc.section.2.1.p.8">Implementors need to be aware of IETF character set requirements <a href="#RFC3629" id="rfc.xref.RFC3629.1"><cite title="UTF-8, a transformation format of ISO 10646">[RFC3629]</cite></a>  <a href="#RFC2277" id="rfc.xref.RFC2277.1"><cite title="IETF Policy on Character Sets and Languages">[RFC2277]</cite></a>.
     702      <p id="rfc.section.2.1.p.6">Implementors need to be aware of IETF character set requirements <a href="#RFC3629" id="rfc.xref.RFC3629.1"><cite title="UTF-8, a transformation format of ISO 10646">[RFC3629]</cite></a>  <a href="#RFC2277" id="rfc.xref.RFC2277.1"><cite title="IETF Policy on Character Sets and Languages">[RFC2277]</cite></a>.
    716703      </p>
    717704      <h3 id="rfc.section.2.1.1"><a href="#rfc.section.2.1.1">2.1.1</a>&nbsp;<a id="missing.charset" href="#missing.charset">Missing Charset</a></h3>
     
    810797      <p id="rfc.section.2.3.1.p.3">If a representation is encoded with a content-coding, the underlying data <em class="bcp14">MUST</em> be in a form defined above prior to being encoded.
    811798      </p>
    812       <p id="rfc.section.2.3.1.p.4">The "charset" parameter is used with some media types to define the character encoding (<a href="#character.sets" title="Character Sets">Section&nbsp;2.1</a>) of the data. When no explicit charset parameter is provided by the sender, media subtypes of the "text" type are defined
     799      <p id="rfc.section.2.3.1.p.4">The "charset" parameter is used with some media types to define the character encoding (<a href="#character.sets" title="Character Encodings (charset)">Section&nbsp;2.1</a>) of the data. When no explicit charset parameter is provided by the sender, media subtypes of the "text" type are defined
    813800         to have a default charset value of "ISO-8859-1" when received via HTTP. Data in character encodings other than "ISO-8859-1"
    814801         or its subsets <em class="bcp14">MUST</em> be labeled with an appropriate charset value. See <a href="#missing.charset" title="Missing Charset">Section&nbsp;2.1.1</a> for compatibility problems.
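
The default-charset rule in the hunk above can be illustrated with a minimal sketch. It assumes the Content-Type field value is already available as a string; the helper name `effective_charset` is hypothetical and is not part of the draft. It relies on Python's standard `email.message.Message.get_param`, which also removes the quotes from a quoted parameter value.

```python
from email.message import Message


def effective_charset(content_type):
    """Return the charset that applies to a Content-Type field value.

    Hypothetical helper: an explicit charset parameter wins; otherwise
    media subtypes of "text" default to ISO-8859-1 when received via HTTP.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_param() unquotes the value, so charset="UTF-8" yields UTF-8.
    charset = msg.get_param("charset")
    if charset is not None:
        return charset.lower()
    if msg.get_content_maintype() == "text":
        return "iso-8859-1"
    # No default is defined here for non-text media types.
    return None
```

Under these assumptions, `effective_charset("text/plain")` yields the ISO-8859-1 default, while a non-text type without a charset parameter yields `None`.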
     
    11511138      <div id="rfc.iref.h.2"></div>
    11521139      <h2 id="rfc.section.6.2"><a href="#rfc.section.6.2">6.2</a>&nbsp;<a id="header.accept-charset" href="#header.accept-charset">Accept-Charset</a></h2>
    1153       <p id="rfc.section.6.2.p.1">The "Accept-Charset" header field can be used by user agents to indicate what response character sets are acceptable. This
    1154          field allows clients capable of understanding more comprehensive or special-purpose character sets to signal that capability
    1155          to a server which is capable of representing documents in those character sets.
     1140      <p id="rfc.section.6.2.p.1">The "Accept-Charset" header field can be used by user agents to indicate what character encodings are acceptable in a response
     1141         payload. This field allows clients capable of understanding more comprehensive or special-purpose character encodings to signal
     1142         that capability to a server which is capable of representing documents in those character encodings.
    11561143      </p>
    11571144      <div id="rfc.figure.u.15"></div><pre class="inline"><span id="rfc.iref.g.16"></span><span id="rfc.iref.g.17"></span>  <a href="#header.accept-charset" class="smpl">Accept-Charset</a>   = "Accept-Charset" ":" <a href="#core.rules" class="smpl">OWS</a>
     
    11591146  <a href="#header.accept-charset" class="smpl">Accept-Charset-v</a> = 1#( ( <a href="#rule.charset" class="smpl">charset</a> / "*" )
    11601147                         [ <a href="#core.rules" class="smpl">OWS</a> ";" <a href="#core.rules" class="smpl">OWS</a> "q=" <a href="#abnf.dependencies" class="smpl">qvalue</a> ] )
    1161 </pre><p id="rfc.section.6.2.p.3">Character set values are described in <a href="#character.sets" title="Character Sets">Section&nbsp;2.1</a>. Each charset <em class="bcp14">MAY</em> be given an associated quality value which represents the user's preference for that charset. The default value is q=1. An
     1148</pre><p id="rfc.section.6.2.p.3">Character encoding values (a.k.a., charsets) are described in <a href="#character.sets" title="Character Encodings (charset)">Section&nbsp;2.1</a>. Each charset <em class="bcp14">MAY</em> be given an associated quality value which represents the user's preference for that charset. The default value is q=1. An
    11621149         example is
    11631150      </p>
    11641151      <div id="rfc.figure.u.16"></div><pre class="text">  Accept-Charset: iso-8859-5, unicode-1-1;q=0.8
    1165 </pre><p id="rfc.section.6.2.p.5">The special value "*", if present in the Accept-Charset field, matches every character set (including ISO-8859-1) which is
    1166          not mentioned elsewhere in the Accept-Charset field. If no "*" is present in an Accept-Charset field, then all character sets
    1167          not explicitly mentioned get a quality value of 0, except for ISO-8859-1, which gets a quality value of 1 if not explicitly
    1168          mentioned.
    1169       </p>
    1170       <p id="rfc.section.6.2.p.6">If no Accept-Charset header field is present, the default is that any character set is acceptable. If an Accept-Charset header
    1171          field is present, and if the server cannot send a response which is acceptable according to the Accept-Charset header field,
    1172          then the server <em class="bcp14">SHOULD</em> send an error response with the 406 (Not Acceptable) status code, though the sending of an unacceptable response is also allowed.
     1152</pre><p id="rfc.section.6.2.p.5">The special value "*", if present in the Accept-Charset field, matches every character encoding (including ISO-8859-1) which
     1153         is not mentioned elsewhere in the Accept-Charset field. If no "*" is present in an Accept-Charset field, then all character
     1154         encodings not explicitly mentioned get a quality value of 0, except for ISO-8859-1, which gets a quality value of 1 if not
     1155         explicitly mentioned.
     1156      </p>
     1157      <p id="rfc.section.6.2.p.6">If no Accept-Charset header field is present, the default is that any character encoding is acceptable. If an Accept-Charset
     1158         header field is present, and if the server cannot send a response which is acceptable according to the Accept-Charset header
     1159         field, then the server <em class="bcp14">SHOULD</em> send an error response with the 406 (Not Acceptable) status code, though the sending of an unacceptable response is also allowed.
    11731160      </p>
    11741161      <div id="rfc.iref.a.3"></div>
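
The quality-value rules for Accept-Charset described in Section 6.2 above can be sketched as follows. This is an illustrative reading of the text, not a reference implementation: the function names `parse_accept_charset` and `charset_quality` are hypothetical, the parser handles only the `charset [;q=value]` list form shown in the ABNF, and malformed input is not validated.

```python
def parse_accept_charset(field_value):
    """Map each listed charset (or "*") to its quality value."""
    prefs = {}
    for element in field_value.split(","):
        element = element.strip()
        if not element:
            continue  # tolerate empty list elements
        name, _, params = element.partition(";")
        q = 1.0  # the default quality value is q=1
        params = params.strip()
        if params.lower().startswith("q="):
            q = float(params[2:])
        prefs[name.strip().lower()] = q
    return prefs


def charset_quality(field_value, charset):
    """Return the quality assigned to `charset`.

    `field_value` is None when no Accept-Charset field was sent.
    """
    if field_value is None:
        return 1.0  # no field present: any character encoding is acceptable
    prefs = parse_accept_charset(field_value)
    charset = charset.lower()
    if charset in prefs:
        return prefs[charset]
    if "*" in prefs:
        return prefs["*"]  # "*" matches everything not mentioned elsewhere
    if charset == "iso-8859-1":
        return 1.0  # special case when neither it nor "*" is mentioned
    return 0.0
```

With the example field value from the draft, `iso-8859-5, unicode-1-1;q=0.8`, this sketch gives ISO-8859-1 a quality of 1 and any other unlisted charset a quality of 0. A server following the text would then consider a 406 (Not Acceptable) response when every available encoding scores 0, although sending an unacceptable response is also allowed.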
     
    17731760      </p>
    17741761      <p id="rfc.section.A.2.p.2">Where it is possible, a proxy or gateway from HTTP to a strict MIME environment <em class="bcp14">SHOULD</em> translate all line breaks within the text media types described in <a href="#canonicalization.and.text.defaults" title="Canonicalization and Text Defaults">Section&nbsp;2.3.1</a> of this document to the RFC 2049 canonical form of CRLF. Note, however, that this might be complicated by the presence of
    1775          a Content-Encoding and by the fact that HTTP allows the use of some character sets which do not use octets 13 and 10 to represent
    1776          CR and LF, as is the case for some multi-byte character sets.
     1762         a Content-Encoding and by the fact that HTTP allows the use of some character encodings which do not use octets 13 and 10
     1763         to represent CR and LF, respectively, as is the case for some multi-byte character encodings.
    17771764      </p>
    17781765      <p id="rfc.section.A.2.p.3">Conversion will break any cryptographic checksums applied to the original content unless the original content is already in
     
    18131800      </p>
    18141801      <h1 id="rfc.section.C"><a href="#rfc.section.C">C.</a>&nbsp;<a id="changes.from.rfc.2616" href="#changes.from.rfc.2616">Changes from RFC 2616</a></h1>
    1815       <p id="rfc.section.C.p.1">Clarify contexts that charset is used in. (<a href="#character.sets" title="Character Sets">Section&nbsp;2.1</a>)
     1802      <p id="rfc.section.C.p.1">Clarify contexts that charset is used in. (<a href="#character.sets" title="Character Encodings (charset)">Section&nbsp;2.1</a>)
    18161803      </p>
    18171804      <p id="rfc.section.C.p.2">Remove base URI setting semantics for Content-Location due to poor implementation support, which was caused by too many broken