Changeset 209


Timestamp:
Feb 12, 2008, 5:46:02 AM
Author:
julian.reschke@…
Message:

Remove character set defaulting for text media types (to be done: add security considerations WRT charset sniffing); relates to #20.
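
As a rough illustration of what this change means for a recipient, the sketch below contrasts the old rule (text media types without a charset parameter default to ISO-8859-1) with the new text (no default is defined). It is not part of the changeset: the function names `declared_charset` and `charset_before_r209` are hypothetical, and the parameter parsing is deliberately simplified (it does not handle semicolons inside quoted values).

```python
from typing import Optional


def declared_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None if absent.

    Mirrors the text after this change: no default is assumed for text/* types.
    """
    # "text/html; charset=UTF-8" -> ["text/html", " charset=UTF-8"]
    parts = content_type.split(";")
    for param in parts[1:]:
        name, sep, value = param.partition("=")
        if sep and name.strip().lower() == "charset":
            # Drop surrounding whitespace and optional quotes; charset names
            # are case-insensitive, so normalize to lowercase.
            value = value.strip().strip('"').lower()
            return value or None
    return None


def charset_before_r209(content_type: str) -> Optional[str]:
    """Old behavior (hypothetical helper): text/* without charset means ISO-8859-1."""
    charset = declared_charset(content_type)
    media_type = content_type.split(";")[0].strip().lower()
    if charset is None and media_type.startswith("text/"):
        return "iso-8859-1"
    return charset
```

With the default removed, a caller that gets `None` has to decide for itself how to proceed. The security considerations this message lists as still to be written concern the risks of doing that by sniffing the content.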

Location:
draft-ietf-httpbis/latest
Files:
3 edited

  • draft-ietf-httpbis/latest/outlineALL.html

    r208 r209  
    324324               <li class="tocline0">2.&nbsp;&nbsp;&nbsp;<a href="p3-payload.html#notation">Notational Conventions and Generic Grammar</a></li>
    325325               <li class="tocline0">3.&nbsp;&nbsp;&nbsp;<a href="p3-payload.html#protocol.parameters">Protocol Parameters</a><ul class="toc">
    326                      <li class="tocline1">3.1&nbsp;&nbsp;&nbsp;<a href="p3-payload.html#character.sets">Character Sets</a><ul class="toc">
    327                            <li class="tocline1">3.1.1&nbsp;&nbsp;&nbsp;<a href="p3-payload.html#missing.charset">Missing Charset</a></li>
    328                         </ul>
    329                      </li>
     326                     <li class="tocline1">3.1&nbsp;&nbsp;&nbsp;<a href="p3-payload.html#character.sets">Character Sets</a></li>
    330327                     <li class="tocline1">3.2&nbsp;&nbsp;&nbsp;<a href="p3-payload.html#content.codings">Content Codings</a></li>
    331328                     <li class="tocline1">3.3&nbsp;&nbsp;&nbsp;<a href="p3-payload.html#media.types">Media Types</a><ul class="toc">
  • draft-ietf-httpbis/latest/p3-payload.html

    r207 r209  
    486486         <li class="tocline0">2.&nbsp;&nbsp;&nbsp;<a href="#notation">Notational Conventions and Generic Grammar</a></li>
    487487         <li class="tocline0">3.&nbsp;&nbsp;&nbsp;<a href="#protocol.parameters">Protocol Parameters</a><ul class="toc">
    488                <li class="tocline1">3.1&nbsp;&nbsp;&nbsp;<a href="#character.sets">Character Sets</a><ul class="toc">
    489                      <li class="tocline1">3.1.1&nbsp;&nbsp;&nbsp;<a href="#missing.charset">Missing Charset</a></li>
    490                   </ul>
    491                </li>
     488               <li class="tocline1">3.1&nbsp;&nbsp;&nbsp;<a href="#character.sets">Character Sets</a></li>
    492489               <li class="tocline1">3.2&nbsp;&nbsp;&nbsp;<a href="#content.codings">Content Codings</a></li>
    493490               <li class="tocline1">3.3&nbsp;&nbsp;&nbsp;<a href="#media.types">Media Types</a><ul class="toc">
     
    632629      <p id="rfc.section.3.1.p.8">Implementors should be aware of IETF character set requirements <a href="#RFC3629" id="rfc.xref.RFC3629.1"><cite title="UTF-8, a transformation format of ISO 10646">[RFC3629]</cite></a>  <a href="#RFC2277" id="rfc.xref.RFC2277.1"><cite title="IETF Policy on Character Sets and Languages">[RFC2277]</cite></a>.
    633630      </p>
    634       <h3 id="rfc.section.3.1.1"><a href="#rfc.section.3.1.1">3.1.1</a>&nbsp;<a id="missing.charset" href="#missing.charset">Missing Charset</a></h3>
    635       <p id="rfc.section.3.1.1.p.1">Some HTTP/1.0 software has interpreted a Content-Type header without charset parameter incorrectly to mean "recipient should
    636          guess." Senders wishing to defeat this behavior <em class="bcp14">MAY</em> include a charset parameter even when the charset is ISO-8859-1 (<a href="#ISO-8859-1" id="rfc.xref.ISO-8859-1.1"><cite title="Information technology -- 8-bit single-byte coded graphic character sets -- Part 1: Latin alphabet No. 1">[ISO-8859-1]</cite></a>) and <em class="bcp14">SHOULD</em> do so when it is known that it will not confuse the recipient.
    637       </p>
    638       <p id="rfc.section.3.1.1.p.2">Unfortunately, some older HTTP/1.0 clients did not deal properly with an explicit charset parameter. HTTP/1.1 recipients <em class="bcp14">MUST</em> respect the charset label provided by the sender; and those user agents that have a provision to "guess" a charset <em class="bcp14">MUST</em> use the charset from the content-type field if they support that charset, rather than the recipient's preference, when initially
    639          displaying a document. See <a href="#canonicalization.and.text.defaults" title="Canonicalization and Text Defaults">Section&nbsp;3.3.1</a>.
    640       </p>
    641631      <h2 id="rfc.section.3.2"><a href="#rfc.section.3.2">3.2</a>&nbsp;<a id="content.codings" href="#content.codings">Content Codings</a></h2>
    642632      <p id="rfc.section.3.2.p.1">Content coding values indicate an encoding transformation that has been or can be applied to an entity. Content codings are
     
    721711      <p id="rfc.section.3.3.1.p.3">If an entity-body is encoded with a content-coding, the underlying data <em class="bcp14">MUST</em> be in a form defined above prior to being encoded.
    722712      </p>
    723       <p id="rfc.section.3.3.1.p.4">The "charset" parameter is used with some media types to define the character set (<a href="#character.sets" title="Character Sets">Section&nbsp;3.1</a>) of the data. When no explicit charset parameter is provided by the sender, media subtypes of the "text" type are defined
    724          to have a default charset value of "ISO-8859-1" when received via HTTP. Data in character sets other than "ISO-8859-1" or
    725          its subsets <em class="bcp14">MUST</em> be labeled with an appropriate charset value. See <a href="#missing.charset" title="Missing Charset">Section&nbsp;3.1.1</a> for compatibility problems.
     713      <p id="rfc.section.3.3.1.p.4">HTTP/1.1 recipients <em class="bcp14">MUST</em> respect the charset label provided by the sender; and those user agents that have a provision to "guess" a charset <em class="bcp14">MUST</em> use the charset from the content-type field if they support that charset, rather than the recipient's preference, when initially
     714         displaying a document.
    726715      </p>
    727716      <h3 id="rfc.section.3.3.2"><a href="#rfc.section.3.3.2">3.3.2</a>&nbsp;<a id="multipart.types" href="#multipart.types">Multipart Types</a></h3>
     
    12031192         as described by this document. The discussion does not include definitive solutions to the problems revealed, though it does
    12041193         make some suggestions for reducing security risks.
     1194      </p>
     1195      <p id="rfc.section.8.p.2"> <span class="comment">[sec.charset.sniffing: Point out the risks related to character set sniffing, in particular for UTF-7. See &lt;<a href="http://tools.ietf.org/wg/httpbis/trac/ticket/20#comment:4">http://tools.ietf.org/wg/httpbis/trac/ticket/20#comment:4</a>&gt;.]</span>
    12051196      </p>
    12061197      <h2 id="rfc.section.8.1"><a href="#rfc.section.8.1">8.1</a>&nbsp;<a id="privacy.issues.connected.to.accept.headers" href="#privacy.issues.connected.to.accept.headers">Privacy Issues Connected to Accept Headers</a></h2>
     
    15001491      <p id="rfc.section.C.2.p.1">Clarify contexts that charset is used in. (<a href="#character.sets" title="Character Sets">Section&nbsp;3.1</a>)
    15011492      </p>
    1502       <p id="rfc.section.C.2.p.2">Remove reference to non-existant identity transfer-coding value tokens. (<a href="#no.content-transfer-encoding" title="No Content-Transfer-Encoding">Appendix&nbsp;A.4</a>)
     1493      <p id="rfc.section.C.2.p.2">Remove character set defaulting for text media types. (<a href="#canonicalization.and.text.defaults" title="Canonicalization and Text Defaults">Section&nbsp;3.3.1</a>)
     1494      </p>
     1495      <p id="rfc.section.C.2.p.3">Remove reference to non-existant identity transfer-coding value tokens. (<a href="#no.content-transfer-encoding" title="No Content-Transfer-Encoding">Appendix&nbsp;A.4</a>)
    15031496      </p>
    15041497      <h1 id="rfc.section.D"><a href="#rfc.section.D">D.</a>&nbsp;Change Log (to be removed by RFC Editor before publication)
     
    15371530      <h2 id="rfc.section.D.3"><a href="#rfc.section.D.3">D.3</a>&nbsp;Since draft-ietf-httpbis-p3-payload-01
    15381531      </h2>
    1539       <p id="rfc.section.D.3.p.1">Ongoing work on ABNF conversion (&lt;<a href="http://www3.tools.ietf.org/wg/httpbis/trac/ticket/36">http://www3.tools.ietf.org/wg/httpbis/trac/ticket/36</a>&gt;):
     1532      <p id="rfc.section.D.3.p.1">Ongoing work on text media type charset defaults (&lt;<a href="http://www3.tools.ietf.org/wg/httpbis/trac/ticket/20">http://www3.tools.ietf.org/wg/httpbis/trac/ticket/20</a>&gt;):
     1533      </p>
     1534      <ul>
     1535         <li>Remove the ISO-8859-1 default.</li>
     1536      </ul>
     1537      <p id="rfc.section.D.3.p.2">Ongoing work on ABNF conversion (&lt;<a href="http://www3.tools.ietf.org/wg/httpbis/trac/ticket/36">http://www3.tools.ietf.org/wg/httpbis/trac/ticket/36</a>&gt;):
    15401538      </p>
    15411539      <ul>
     
    16701668            <li class="indline0"><a id="rfc.index.I" href="#rfc.index.I"><b>I</b></a><ul class="ind">
    16711669                  <li class="indline1">identity&nbsp;&nbsp;<a class="iref" href="#rfc.iref.i.1">3.2</a></li>
    1672                   <li class="indline1"><em>ISO-8859-1</em>&nbsp;&nbsp;<a class="iref" href="#rfc.xref.ISO-8859-1.1">3.1.1</a>, <a class="iref" href="#ISO-8859-1"><b>10.1</b></a></li>
     1670                  <li class="indline1"><em>ISO-8859-1</em>&nbsp;&nbsp;<a class="iref" href="#ISO-8859-1"><b>10.1</b></a></li>
    16731671               </ul>
    16741672            </li>
  • draft-ietf-httpbis/latest/p3-payload.xml

    r207 r209  
    348348   <xref target="RFC2277"/>.
    349349</t>
    350 
    351 <section title="Missing Charset" anchor="missing.charset">
    352 <t>
    353    Some HTTP/1.0 software has interpreted a Content-Type header without
    354    charset parameter incorrectly to mean "recipient should guess."
    355    Senders wishing to defeat this behavior &MAY; include a charset
    356    parameter even when the charset is ISO-8859-1 (<xref target="ISO-8859-1"/>) and &SHOULD; do so when
    357    it is known that it will not confuse the recipient.
    358 </t>
    359 <t>
    360    Unfortunately, some older HTTP/1.0 clients did not deal properly with
    361    an explicit charset parameter. HTTP/1.1 recipients &MUST; respect the
    362    charset label provided by the sender; and those user agents that have
    363    a provision to "guess" a charset &MUST; use the charset from the
    364    content-type field if they support that charset, rather than the
    365    recipient's preference, when initially displaying a document. See
    366    <xref target="canonicalization.and.text.defaults"/>.
    367 </t>
    368 </section>
    369350</section>
    370351
     
    514495</t>
    515496<t>
    516    The "charset" parameter is used with some media types to define the
    517    character set (<xref target="character.sets"/>) of the data. When no explicit charset
    518    parameter is provided by the sender, media subtypes of the "text"
    519    type are defined to have a default charset value of "ISO-8859-1" when
    520    received via HTTP. Data in character sets other than "ISO-8859-1" or
    521    its subsets &MUST; be labeled with an appropriate charset value. See
    522    <xref target="missing.charset"/> for compatibility problems.
     497   HTTP/1.1 recipients &MUST; respect the    charset label provided by the
     498   sender; and those user agents that have a provision to "guess" a charset
     499   &MUST; use the charset from the content-type field if they support that
     500   charset, rather than the recipient's preference, when initially displaying
     501   a document.
    523502</t>
    524503</section>
     
    14541433   some suggestions for reducing security risks.
    14551434</t>
     1435<t>
     1436  <cref anchor="sec.charset.sniffing">
     1437    Point out the risks related to character set sniffing, in particular for
     1438    UTF-7. See <eref target="http://tools.ietf.org/wg/httpbis/trac/ticket/20#comment:4"/>.
     1439  </cref>
     1440</t>
    14561441
    14571442<section title="Privacy Issues Connected to Accept Headers" anchor="privacy.issues.connected.to.accept.headers">
     
    23672352  Clarify contexts that charset is used in.
    23682353  (<xref target="character.sets"/>)
     2354</t>
     2355<t>
     2356  Remove character set defaulting for text media types.
     2357  (<xref target="canonicalization.and.text.defaults"/>)
    23692358</t>
    23702359<t>
     
    24412430<section title="Since draft-ietf-httpbis-p3-payload-01">
    24422431<t>
     2432  Ongoing work on text media type charset defaults (<eref target="http://www3.tools.ietf.org/wg/httpbis/trac/ticket/20"/>):
     2433  <list style="symbols">
     2434    <t>
     2435      Remove the ISO-8859-1 default.
     2436    </t>
     2437  </list>
     2438</t>
     2439<t>
    24432440  Ongoing work on ABNF conversion (<eref target="http://www3.tools.ietf.org/wg/httpbis/trac/ticket/36"/>):
    24442441  <list style="symbols">