14/12/12 00:39:56

Clean up of MIME and conneg; partly addresses #419

1 edited


  • draft-ietf-httpbis/latest/p2-semantics.xml

    r2047 r2050  
    388 388  <x:anchor-alias value="rule.charset"/>
    390    HTTP uses charset names to indicate or negotiate the character encoding
    391    scheme of a textual representation <xref target="RFC6365"/>.
     390   HTTP uses <x:dfn>charset</x:dfn> names to indicate or negotiate the
     391   character encoding scheme of a textual representation
     392   <xref target="RFC6365"/>.
    392 393   A charset is identified by a case-insensitive token.
    404 405 <section title="Canonicalization and Text Defaults" anchor="canonicalization.and.text.defaults">
    406    Internet media types are registered with a canonical form. A
    407    representation transferred via HTTP messages &MUST; be in the
    408    appropriate canonical form prior to its transmission except for
    409    "text" types, as defined in the next paragraph.
    410 </t>
    411 <t>
    412    When in canonical form, media subtypes of the "text" type use CRLF as
    413    the text line break. HTTP relaxes this requirement and allows the
    414    transport of text media with plain CR or LF alone representing a line
    415    break when it is done consistently for an entire representation. HTTP
    416    applications &MUST; accept CRLF, bare CR, and bare LF as indicating
    417    a line break in text media received via HTTP. In
    418    addition, if the text is in a charset that does not
    419    use octets 13 and 10 for CR and LF respectively, as is the case for
    420    some multi-byte charsets, HTTP allows the use of whatever octet
    421    sequences are defined by that charset to represent the
    422    equivalent of CR and LF for line breaks. This flexibility regarding
    423    line breaks applies only to text media in the payload body; a bare CR
    424    or LF &MUST-NOT; be substituted for CRLF within any of the HTTP control
    425    structures (such as header fields and multipart boundaries).
     407   Internet media types are registered with a canonical form in order to be
     408   interoperable among systems with varying native encoding formats.
     409   Representations selected or transferred via HTTP ought to be in canonical
     410   form, for many of the same reasons described by MIME
     411   <xref target="RFC2049"/>.
     412   However, the performance characteristics of email deployments (i.e., store
     413   and forward messages to peers) are significantly different from those
     414   common to HTTP and the Web (server-based information services).
     415   Furthermore, MIME's constraints for the sake of compatibility with older
     416   mail transfer protocols do not apply to HTTP
     417   (see <xref target="differences.between.http.and.mime"/>).
     420   MIME's canonical form requires that media subtypes of the "text"
     421   type use CRLF as the text line break. HTTP allows the
     422   transfer of text media with plain CR or LF alone representing a line
     423   break, when such line breaks are consistent for an entire representation.
     424   HTTP senders &MAY; generate, and recipients &MUST; be able to parse,
     425   line breaks in text media that consist of CRLF, bare CR, or bare LF.
     426   In addition, text media in HTTP is not limited to charsets that
     427   use octets 13 and 10 for CR and LF, respectively.
     428   This flexibility regarding line breaks applies only to text within a
     429   representation that has been assigned a "text" media type; it does not
     430   apply to "multipart" types or HTTP elements outside the payload body
     431   (e.g., header fields).
    428 434   If a representation is encoded with a content-coding, the underlying
    429    data &MUST; be in a form defined above prior to being encoded.
     435   data ought to be in a form defined above prior to being encoded.
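
Reviewer sketch (not part of the changeset): the revised text says recipients must be able to parse line breaks in text media that consist of CRLF, bare CR, or bare LF. A minimal illustration in Python follows. It assumes the payload has already been decoded to a string using its charset, so charsets that do not use octets 13 and 10 are covered by the decoding step. The function names are hypothetical.

```python
from __future__ import annotations

import re

# CRLF is listed first so that "\r\n" counts as one line break, not two.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")


def split_text_lines(text: str) -> list[str]:
    """Split decoded text media on CRLF, bare CR, or bare LF."""
    return _LINE_BREAK.split(text)


def to_canonical_crlf(text: str) -> str:
    """Rewrite every accepted line break as CRLF (MIME canonical form)."""
    return "\r\n".join(split_text_lines(text))
```

This applies only to the body of a "text" representation; header fields and multipart boundaries still require CRLF.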
    443    In general, HTTP treats a multipart message body no differently than
    444    any other media type: strictly as payload.  HTTP does not use the
    445    multipart boundary as an indicator of message body length.
    446    <!-- jre: re-insert removed text pointing to caching? -->
    447    In all other respects, an HTTP user agent &SHOULD; follow the same or similar
    448    behavior as a MIME user agent would upon receipt of a multipart type.
    449    The MIME header fields within each body-part of a multipart message body
    450    do not have any significance to HTTP beyond that defined by
    451    their MIME semantics.
    452 </t>
    453 <t>
    454    A recipient &MUST; treat an unrecognized multipart subtype
    455    as being equivalent to "multipart/mixed".
    456 </t>
    457 <x:note>
    458   <t>
    459     &Note; The "multipart/form-data" type has been specifically defined
    460     for carrying form data suitable for processing via the POST
    461     request method, as described in <xref target="RFC2388"/>.
    462   </t>
    463 </x:note>
     449   HTTP message framing does not use the multipart boundary as an indicator
     450   of message body length, though it might be used by implementations that
     451   generate or process the payload. For example, the "multipart/form-data"
     452   type is often used for carrying form data in a request, as described in
     453   <xref target="RFC2388"/>, and the "multipart/byteranges" type is defined
     454   by this specification for use in some <x:ref>206 (Partial Content)</x:ref>
     455   responses <xref target="Part5"/>.
    470 463   The "Content-Type" header field indicates the media type of the
    471    representation, which defines both the data format and how that data
    472    &SHOULD; be processed by the recipient (within the scope of the request
    473    method semantics) after any <x:ref>Content-Encoding</x:ref> is decoded.
    474    For responses to the HEAD method, the media type is
    475    that which would have been sent had the request been a GET.
     464   associated representation: either the representation enclosed in
     465   the message payload or the selected representation, as determined by the
     466   message semantics.  The indicated media type defines both the data format
     467   and how that data is intended to be processed by a recipient, within the
     468   scope of the received message semantics, after any content codings
     469   indicated by <x:ref>Content-Encoding</x:ref> are decoded.
    477 471 <figure><artwork type="abnf2616"><iref primary="true" item="Grammar" subitem="Content-Type"/>
    481    Media types are defined in <xref target="media.type"/>. An example of the field is
     475   Media types are defined in <xref target="media.type"/>. An example of the
     476   field is
    483 478 <figure><artwork type="example">
    585 580   If the media type includes an inherent encoding, such as a data format
    586    that is always compressed, then that encoding would not be restated as
    587    a Content-Encoding even if it happens to be the same algorithm as one
     581   that is always compressed, then that encoding would not be restated in
     582   Content-Encoding even if it happens to be the same algorithm as one
    588 583   of the content codings.  Such a content coding would only be listed if,
    589 584   for some bizarre reason, it is applied a second time to form the
    598    If the content-coding of a representation in a request message is not
    599    acceptable to the origin server, the server &SHOULD; respond with a
    600    status code of 415 (Unsupported Media Type).
     593   An origin server &MAY; respond with a status code of
     594   <x:ref>415 (Unsupported Media Type)</x:ref> if a representation in the
     595   request message has a content coding that is not acceptable.
    904 899 <section title="Content Negotiation" anchor="content.negotiation">
    905 <t>
    906    HTTP responses include a representation which contains information for
    907    interpretation, whether by a human user or for further processing.
    908    Often, the server has different ways of representing the
    909    same information; for example, in different formats, languages,
    910    or using different charsets.
    911 </t>
    912 <t>
    913    HTTP clients and their users might have different or variable
    914    capabilities, characteristics or preferences which would influence
    915    which representation, among those available from the server,
    916    would be best for the server to deliver. For this reason, HTTP
    917    provides mechanisms for "content negotiation" &mdash; a process of
    918    allowing selection of a representation of a given resource,
    919    when more than one is available.
    920 </t>
    921 <t>
    922    This specification defines two patterns of content negotiation;
     900  <x:anchor-alias value="content negotiation"/>
     902   When responses convey a representation, whether indicating a success or
     903   an error, the origin server often has different ways of representing that
     904   information; for example, in different formats, languages, or encodings.
     905   Likewise, different users or user agents might have differing capabilities,
     906   characteristics, or preferences that could influence which representation,
     907   among those available, would be best to deliver. For this reason, HTTP
     908   provides mechanisms for <x:ref>content negotiation</x:ref>.
     911   This specification defines two patterns of content negotiation that can
     912   be made visible within the protocol:
    923 913   "proactive", where the server selects the representation based
    924    upon the client's stated preferences, and "reactive" negotiation,
    925    where the server provides a list of representations for the client to
    926    choose from, based upon their metadata. In addition,  there are
    927    other patterns: some applications use an "active content" pattern,
    928    where the server returns active content which runs on the client
    929    and, based on client available parameters, selects additional
    930    resources to invoke. "Transparent Content Negotiation" (<xref target="RFC2295"/>)
    931    has also been proposed.
    932 </t>
    933 <t>
    934    These patterns are all widely used, and have trade-offs in applicability
    935    and practicality. In particular, when the number of preferences or
    936    capabilities to be expressed by a client are large (such as when many
    937    different formats are supported by a user agent), proactive
    938    negotiation becomes unwieldy, and might not be appropriate. Conversely,
    939    when the number of representations to choose from is very large,
    940    reactive negotiation might not be appropriate.
    941 </t>
    942 <t>
    943    Note that, in all cases, the supplier of representations has the
    944    responsibility for determining which representations might be
    945    considered to be the "same information".
     914   upon the user agent's stated preferences, and "reactive" negotiation,
     915   where the server provides a list of representations for the user agent to
     916   choose from. Other patterns of content negotiation include
     917   "conditional content", where the representation consists of multiple
     918   parts that are selectively rendered based on user agent parameters,
     919   "active content", where the representation contains a script that
     920   makes additional (more specific) requests based on the user agent
     921   characteristics, and "Transparent Content Negotiation"
     922   (<xref target="RFC2295"/>), where content selection is performed by
     923   an intermediary. These patterns are not mutually exclusive, and each has
     924   trade-offs in applicability and practicality.
     927   Note that, in all cases, the supplier of representations to the origin
     928   server determines which representations might be considered to be the
     929   "same information".
    948 932 <section title="Proactive Negotiation" anchor="proactive.negotiation">
    949 <t>
    950    If the selection of the best representation for a response is made by
    951    an algorithm located at the server, it is called proactive
    952    negotiation. Selection is based on the available representations of
    953    the response (the dimensions over which it can vary; e.g., language,
    954    content-coding, etc.) and the contents of particular header fields in
    955    the request message or on other information pertaining to the request
    956    (such as the network address of the client).
     933  <x:anchor-alias value="proactive negotiation"/>
     934  <x:anchor-alias value="server-driven negotiation"/>
     936   When content negotiation preferences are sent by the user agent in a
     937   request in order to encourage an algorithm located at the server to
     938   select the preferred representation, it is called
     939   <x:dfn>proactive negotiation</x:dfn>
     940   (a.k.a., <x:dfn>server-driven negotiation</x:dfn>). Selection is based on
     941   the available representations for a response (the dimensions over which it
     942   might vary; e.g., language, content-coding, etc.) compared to various
     943   information supplied in the request, including both the explicit
     944   negotiation fields of <xref target="request.conneg"/> and implicit
     945   characteristics, such as the client's network address or parts of the
     946   <x:ref>User-Agent</x:ref> field.
    959 949   Proactive negotiation is advantageous when the algorithm for
    960 950   selecting from among the available representations is difficult to
    961    describe to the user agent, or when the server desires to send its
    962    "best guess" to the client along with the first response (hoping to
     951   describe to a user agent, or when the server desires to send its
     952   "best guess" to the user agent along with the first response (hoping to
    963 953   avoid the round-trip delay of a subsequent request if the "best
    964 954   guess" is good enough for the user). In order to improve the server's
    965    guess, the user agent &MAY; include request header fields (<x:ref>Accept</x:ref>,
    966    <x:ref>Accept-Language</x:ref>, <x:ref>Accept-Encoding</x:ref>, etc.) which describe its
    967    preferences for such a response.
    968 </t>
    969 <t>
    970    Proactive negotiation has disadvantages:
    971   <list style="numbers">
     955   guess, a user agent &MAY; send request header fields that describe
     956   its preferences.
     959   Proactive negotiation has serious disadvantages:
     960  <list style="symbols">
    972 961    <t>
    973 962         It is impossible for the server to accurately determine what
    975 964         complete knowledge of both the capabilities of the user agent
    976 965         and the intended use for the response (e.g., does the user want
    977          to view it on screen or print it on paper?).
     966         to view it on screen or print it on paper?);
    978 967    </t>
    979 968    <t>
    981 970         request can be both very inefficient (given that only a small
    982 971         percentage of responses have multiple representations) and a
    983          potential violation of the user's privacy.
     972         potential risk to the user's privacy;
    984 973    </t>
    985 974    <t>
    986 975         It complicates the implementation of an origin server and the
    987          algorithms for generating responses to a request.
     976         algorithms for generating responses to a request; and,
    988 977    </t>
    989 978    <t>
    990          It might limit a public cache's ability to use the same response
    991          for multiple user's requests.
     979         It limits the reusability of responses for shared caching.
    992 980    </t>
    993 981  </list>
    996    Proactive negotiation allows the user agent to specify its preferences,
    997    but it cannot expect responses to always honor them. For example, the origin
    998    server might not implement proactive negotiation, or it might decide that
    999    sending a response that doesn't conform to the user agent's preferences is
    1000    better than sending a <x:ref>406 (Not Acceptable)</x:ref> response.
    1001 </t>
    1002 <t>
    1003    HTTP/1.1 includes the following header fields for enabling
    1004    proactive negotiation through description of user agent
    1005    capabilities and user preferences: <x:ref>Accept</x:ref>
    1006    (<xref target="header.accept"/>), <x:ref>Accept-Charset</x:ref>
    1007    (<xref target="header.accept-charset"/>), <x:ref>Accept-Encoding</x:ref>
    1008    (<xref target="header.accept-encoding"/>), <x:ref>Accept-Language</x:ref>
    1009    (<xref target="header.accept-language"/>), and <x:ref>User-Agent</x:ref>
    1010    (&header-user-agent;).
    1011    However, an origin server is not limited to these dimensions and &MAY; vary
    1012    the response based on any aspect of the request, including aspects
    1013    of the connection (e.g., IP address) or information within extension
    1014    header fields not defined by this specification.
    1015 </t>
    1016 <x:note>
    1017   <t>
    1018     &Note; In practice, <x:ref>User-Agent</x:ref> based negotiation is fragile,
    1019     because new clients might not be recognized.
    1020   </t>
    1021 </x:note>
    1022 <t>
    1023    The <x:ref>Vary</x:ref> header field (<xref target="header.vary"/>) can be
    1024    used to express the parameters the server uses to select a representation
    1025    that is subject to proactive negotiation.
     984   A user agent cannot rely on proactive negotiation preferences being
     985   consistently honored, since the origin server might not implement proactive
     986   negotiation for the requested resource or might decide that sending a
     987   response that doesn't conform to the user agent's preferences is better
     988   than sending a <x:ref>406 (Not Acceptable)</x:ref> response.
     991   An origin server &MAY; generate a <x:ref>Vary</x:ref> header field
     992   (<xref target="header.vary"/>) in responses that are subject to proactive
     993   negotiation to indicate what parameters of request information might
     994   be used in its selection algorithm, thereby providing a means for
     995   recipients to determine the reusability of that same response for
     996   user agents with differing request information.
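
Reviewer sketch (not part of the changeset): the proactive negotiation text above can be illustrated with a small server-side selection routine. This is only a sketch under stated assumptions: it handles the Accept field alone, ignores media type parameters other than q, and expects the server's available types in lowercase. All function names are hypothetical. When nothing matches, it falls back to a default representation rather than sending 406 (Not Acceptable), which the text permits, and it adds a Vary field so that recipients can judge how reusable the response is.

```python
from __future__ import annotations


def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an Accept field value into (media_range, q) pairs."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0  # q defaults to 1 when no weight is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        prefs.append((parts[0].lower(), q))
    return prefs


def _specificity(media_range: str, media_type: str) -> int:
    """3 = exact match, 2 = type/*, 1 = */*, 0 = no match."""
    if media_range == media_type:
        return 3
    if media_range == media_type.split("/")[0] + "/*":
        return 2
    if media_range == "*/*":
        return 1
    return 0


def select_representation(accept: str, available: list[str]) -> str | None:
    """Return the available type with the highest q, or None if none match."""
    prefs = parse_accept(accept)
    best, best_q = None, 0.0
    for media_type in available:
        # The most specific matching range determines this type's weight.
        match = max(prefs, key=lambda p: _specificity(p[0], media_type),
                    default=None)
        if match is None or _specificity(match[0], media_type) == 0:
            continue
        if match[1] > best_q:
            best, best_q = media_type, match[1]
    return best


def negotiate(accept: str, available: list[str], default: str):
    """Pick a representation, falling back to `default` instead of 406."""
    chosen = select_representation(accept, available) or default
    return chosen, {"Content-Type": chosen, "Vary": "Accept"}
```

A real server would also weigh Accept-Language, Accept-Encoding, and its own quality judgments. Ties here go to whichever type the server lists first.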
    1029 1000 <section title="Reactive Negotiation" anchor="reactive.negotiation">
    1030 <t>
    1031    With reactive negotiation, selection of the best representation
    1032    for a response is performed by the user agent after receiving an
    1033    initial response from the origin server. Selection is based on a list
    1034    of the available representations of the response included within the
    1035    header fields or body of the initial response, with each
    1036    representation identified by its own URI. Selection from among the
    1037    representations can be performed automatically (if the user agent is
    1038    capable of doing so) or manually by the user selecting from a
    1039    generated (possibly hypertext) menu.
     1001  <x:anchor-alias value="reactive negotiation"/>
     1002  <x:anchor-alias value="agent-driven negotiation"/>
     1004   With <x:dfn>reactive negotiation</x:dfn>
     1005   (a.k.a., <x:dfn>agent-driven negotiation</x:dfn>), selection of the best
     1006   representation for a response is performed by the user agent after
     1007   receiving an initial response from the origin server with a list of
     1008   alternative resources. If the user agent is not satisfied by the initial
     1009   response, it can perform a GET request on one or more of the alternative
     1010   resources, selected based on metadata included in the list, to obtain a
     1011   different form of representation. Selection of alternatives might be
     1012   performed automatically by the user agent or manually by the user selecting
     1013   from a generated (possibly hypertext) menu.
     1016   The <x:ref>300 (Multiple Choices)</x:ref> and
     1017   <x:ref>406 (Not Acceptable)</x:ref> status codes indicate reactive
     1018   negotiation when the origin server is unwilling or unable to provide a
     1019   varying response using proactive negotiation.
    1049    Reactive negotiation suffers from the disadvantage of needing a
    1050    second request to obtain the best alternate representation. This
    1051    second request is only efficient when caching is used. In addition,
    1052    this specification does not define any mechanism for supporting
    1053    automatic selection, though it also does not prevent any such
    1054    mechanism from being developed as an extension and used within
    1055    HTTP/1.1.
    1056 </t>
    1057 <t>
    1058    This specification defines the <x:ref>300 (Multiple Choices)</x:ref> and
    1059    <x:ref>406 (Not Acceptable)</x:ref> status codes for enabling reactive
    1060    negotiation when the server is unwilling or unable to provide a varying
    1061    response using proactive negotiation.
     1029   Reactive negotiation suffers from the disadvantages of transmitting
     1030   a list of alternatives to the user agent, which degrades user-perceived
     1031   latency if transmitted in the header section, and needing a second request
     1032   to obtain an alternate representation. Furthermore, this specification
     1033   does not define a mechanism for supporting automatic selection, though it
     1034   does not prevent such a mechanism from being developed as an extension.
    3324 3297   The server is refusing to service the request because the request
    3325 3298   payload is in a format not supported by this request method on the
    3326    target resource.
     3299   target resource. The format problem might be due to the request's
     3300   indicated <x:ref>Content-Type</x:ref> or <x:ref>Content-Encoding</x:ref>,
     3301   or as a result of inspecting the data directly.