Re: TICKET 259: 'treat as invalid' not defined from Adam Barth on 2010-12-11 (ietf-http-wg@w3.org from October to December 2010)

From: Adam Barth <ietf@adambarth.com>
Date: Sat, 11 Dec 2010 11:42:46 -0800
To: Julian Reschke <julian.reschke@gmx.de>
Cc: Mark Nottingham <mnot@mnot.net>, httpbis <ietf-http-wg@w3.org>
Message-ID: <AANLkTimhebkqJbjcci1h4p=qCMEZ0+jmeDU3pk+EMOzQ@mail.gmail.com>
Thanks for making a counter-proposal.  A few notes below.

On Fri, Dec 10, 2010 at 1:59 AM, Julian Reschke <julian.reschke@gmx.de> wrote:
> On 12.11.2010 08:53, Julian Reschke wrote:
>> On 12.11.2010 05:58, Mark Nottingham wrote:
>>>
>>> I'm confused. I thought that we were going to talk about error
>>> handling in an appendix, but it appears you're starting to talk about
>>> it here.
>>
>> 1) Yes, it should be an appendix.
>>
>> 2) Well, it's parsing advice. It appears that some readers have trouble
>> understanding how to derive a parsing strategy from the way how we
>> currently write specs, so this is an attempt to describe just that.
>
> Here's an updated proposal (see also
> <http://trac.tools.ietf.org/wg/httpbis/trac/attachment/ticket/259/i259.diff>):
>
> -- snip --
>
> Appendix D.  Parsing
>
>   This document does not require any specific handling of invalid
>   header field values.  With this in mind, the text below describes a
>   simple strategy for parsing the header field and detecting problems
>   in general, or in specific parameters.
>
> D.1.  Combine Multiple Instances of Content-Disposition
>
>   If the HTTP message contains multiple instances of the Content-
>   Disposition header field, combine all field values into a single one
>   as specified in Section 4.2 of [RFC2616].
>
> D.2.  Parsing for Disposition Type and Parameters
>
>   Using the simplified grammar below:
>
>     field-value = disp-type *( ";" param )
>     disp-type   = token
>     param       = token "=" value
>
>   ...parse the field value into a disp-type (disposition type) and a
>   sequence of parameters (pairs of name (token) and value).  Lower-case
>   all disposition types and parameter names.
>
>   If the field value does not conform to the grammar (such as when not
>   exactly one disposition type is specified), ignore the whole header
>   field.

This doesn't cover cases like the following:

Content-Disposition: attachment; inline; filename=foo.exe

We want to treat those as an attachment.  Another grammar we could use
might be the following:

     field-value = item *( ";" item )
     item        = disp-type / param
     disp-type   = <OCTET, except ";" and "=">
     param       = param-name "=" param-value
     param-name  = <OCTET, except "=">
     param-value = <OCTET, except ";">

We could then say that the first disp-type and the first param are the
ones that matter.  (I'm not sure this grammar handles <"> correctly,
but I'm sure we can sort that out.)
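
A rough Python sketch of how a parser for the looser grammar above
might behave.  The function name parse_lenient, the whitespace
trimming, and the skipping of empty items are assumptions made for
illustration; quoted strings are deliberately not handled, which is
the open <"> question mentioned above.

```python
from typing import List, Optional, Tuple


def parse_lenient(field_value: str) -> Tuple[Optional[str], List[Tuple[str, str]]]:
    """Split a Content-Disposition value into items per the looser grammar.

    Returns (first disp-type or None, list of (name, value) pairs in order).
    Quoted strings are NOT handled: a ";" inside quotes still splits.
    """
    disp_type: Optional[str] = None
    params: List[Tuple[str, str]] = []
    for item in field_value.split(";"):
        item = item.strip()  # assumption: surrounding whitespace is ignored
        if not item:
            continue  # assumption: tolerate empty items such as a trailing ";"
        if "=" in item:
            # An item containing "=" is a param; split on the first "=" only.
            name, value = item.split("=", 1)
            params.append((name.strip().lower(), value.strip()))
        elif disp_type is None:
            # Only the first disp-type counts; later ones are dropped.
            disp_type = item.lower()
    return disp_type, params
```

With this sketch, "attachment; inline; filename=foo.exe" yields the
disposition type "attachment" and a single filename parameter, rather
than causing the whole header field to be ignored.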

> D.3.  Checking Cardinality Constraints
>
>   If the parameter sequence contains multiple instances of the same
>   parameter name, ignore the whole header field.

We'd prefer to use the first one rather than ignore the header field.
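
To make the difference between the two policies concrete, here is a
small Python sketch.  Both function names are hypothetical, and the
input is assumed to be a list of (name, value) pairs whose names have
already been lower-cased.

```python
from typing import Dict, Iterable, Optional, Tuple

Pairs = Iterable[Tuple[str, str]]


def first_wins(params: Pairs) -> Dict[str, str]:
    """Keep the first value seen for each parameter name."""
    result: Dict[str, str] = {}
    for name, value in params:
        result.setdefault(name, value)  # later duplicates are dropped
    return result


def reject_duplicates(params: Pairs) -> Optional[Dict[str, str]]:
    """Draft D.3 behaviour: None means 'ignore the whole header field'."""
    result: Dict[str, str] = {}
    for name, value in params:
        if name in result:
            return None
        result[name] = value
    return result
```

For a header carrying filename=a.txt followed by filename=b.exe,
first_wins keeps a.txt, while reject_duplicates discards the header.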

> D.4.  Post-Process Parameter Values
>
>   For each parameter, post-process the associated value part according
>   to the grammar:
>
>   o  According to Section 3.2.1 of [RFC5987] for parameters using the
>      RFC 5987 syntax (such as "filename*").  If this fails, just ignore
>      this parameter.
>
>   o  According to the grammar for quoted-string (Section 2.2 of
>      [RFC2616]) for values starting with a double quote character (").

Does this imply \-decoding?  We don't want to do \-decoding.

>   o  Verbatim otherwise.

We'd like to do %-decoding both for the quoted and unquoted cases.

>   Note that this step starts with an octet sequence obtained from the
>   HTTP message, and results in a sequence of Unicode characters.

Somewhere we want to say what character set we're using.
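
A sketch of the post-processing preferred in the comments above, for
plain (non-RFC 5987) parameters: strip surrounding quotes without
\-decoding, then %-decode in both the quoted and unquoted cases.  The
function name and, in particular, the UTF-8 default are assumptions;
the character set is exactly the point this thread leaves open.

```python
from urllib.parse import unquote_to_bytes

ASSUMED_CHARSET = "utf-8"  # assumption: the thread has not chosen a charset


def post_process_value(raw: bytes, charset: str = ASSUMED_CHARSET) -> str:
    """Turn the octets of a parameter value into Unicode text."""
    # Strip one pair of surrounding double quotes; backslashes are left
    # untouched, i.e. no \-decoding is performed.
    if len(raw) >= 2 and raw.startswith(b'"') and raw.endswith(b'"'):
        raw = raw[1:-1]
    # %-decode in both the quoted and the unquoted case, then decode the
    # resulting octets with the assumed character set.
    return unquote_to_bytes(raw).decode(charset, errors="replace")
```

Replacing undecodable octets (errors="replace") is also a choice made
only for this illustration; a UA might prefer to drop the parameter.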

> D.5.  Extracting the Disposition Type
>
>   The parsing step (Appendix D.2) has returned the disposition type (to
>   be matched case-insensitively), which can be "attachment", "inline",
>   or an extension type.  If the type is unknown, treat it like
>   "attachment" (see Section 3.2).

What if there's no disposition type?

Content-Disposition: filename=foo.exe
Content-Disposition: foo=bar

If I remember correctly, we're supposed to treat the former as inline
and the latter as attachment.
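
For illustration, the D.5 rule could be sketched in Python as below.
The function name is hypothetical, every extension type is treated as
unknown for simplicity, and the behaviour for a missing disposition
type is left as a caller-supplied argument because it is not settled
here.

```python
from typing import Optional


def effective_disposition(disp_type: Optional[str], when_missing: str) -> str:
    """Map a parsed disposition type to "inline" or "attachment".

    `when_missing` is what to return when no disposition type was found;
    the right value is an open question and may depend on the parameters.
    """
    if disp_type is None:
        return when_missing
    if disp_type.lower() == "inline":  # matched case-insensitively
        return "inline"
    # "attachment" itself, plus any unknown type, is treated as attachment.
    return "attachment"
```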

> D.6.  Determining the File Name
>
>   The parsing and post-processing steps resulted in a set of parameters
>   (name/value pairs).  The suggested file name is the value of the
>   "filename*" parameter (when present), otherwise the value of the
>   "filename" parameter.
>
>   If neither is given, the UA can determine a name based on the
>   associated URI; for instance based on the last path segment.
>
>   Otherwise, the UA ought to post-process the suggested filename
>   according following Section 3.3. [[anchor10: We could say here that
>   UAs may reject filenames for security reasons, such as those with a
>   path separator character.]]
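
The D.6 selection order quoted above can be sketched in Python as
follows.  The function name is hypothetical, an empty parameter value
is treated as absent, and rejecting names that contain a path
separator is only one possible reading of the anchor10 note, not
something the draft requires.

```python
import posixpath
from typing import Mapping, Optional
from urllib.parse import unquote, urlsplit


def suggested_filename(params: Mapping[str, str], url: str) -> Optional[str]:
    """Pick a file name: "filename*", then "filename", then the URI."""
    name = params.get("filename*") or params.get("filename")
    if not name:
        # Fall back to the last path segment of the associated URI.
        name = unquote(posixpath.basename(urlsplit(url).path))
    # Hypothetical safety policy: refuse empty names and names that
    # contain a path separator character.
    if not name or "/" in name or "\\" in name:
        return None
    return name
```

For example, with no filename parameters and the URI
http://example.com/dir/report.pdf?x=1, the sketch suggests
report.pdf.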

I'll update the wiki shortly to respond to your previous feedback and
with information from this message.

Thanks,
Adam
Received on Saturday, 11 December 2010 19:43:53 UTC