Re: SpeechRecognitionAlternative.interpretation when interpretation can't be provided from Satish S on 2012-08-17 (public-speech-api@w3.org from August 2012)

From: Satish S <satish@google.com>
Date: Fri, 17 Aug 2012 15:41:58 +0100
To: Deborah Dahl <dahl@conversational-technologies.com>
Cc: Bjorn Bringert <bringert@google.com>, Hans Wennborg <hwennborg@google.com>, public-speech-api@w3.org
Message-ID: <CAHZf7RkQ6gGvjdiEyV8MQJ9Gccfc6-7J8HDCinuUSaWO0Xk==A@mail.gmail.com>
> I may have missed something, but I don't see in the spec where it says
> that "interpretation" is optional.

Developers specify the interpretation value with SISR and if they don't
specify there is no 'default' interpretation available. In that sense it is
optional because grammars don't mandate it. So I think this API shouldn't
mandate providing a default value if the engine did not provide one, and
return null in such cases.
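
A minimal sketch of how a page might consume the attribute under this proposal, assuming 'interpretation' is null whenever the engine supplied no value. The Alternative interface and the interpretationOrTranscript helper are hypothetical names used only for illustration; they are not part of the draft.

```typescript
// Hypothetical shape mirroring the two attributes discussed in this thread.
interface Alternative {
  transcript: string;
  // Assumed to be null when the engine did not provide an interpretation.
  interpretation: unknown;
}

// Hypothetical helper: the page, not the UA, chooses the fallback value.
function interpretationOrTranscript(alt: Alternative): unknown {
  return alt.interpretation !== null ? alt.interpretation : alt.transcript;
}

// A grammar with SISR tags might yield a structured or normalized value.
const withSisr: Alternative = { transcript: "three", interpretation: 3 };
// Without SISR tags there is no value, so the attribute would be null.
const withoutSisr: Alternative = { transcript: "three", interpretation: null };

console.log(interpretationOrTranscript(withSisr));    // 3
console.log(interpretationOrTranscript(withoutSisr)); // three
```

Returning null keeps the fallback decision in the page: code that wants the raw text can substitute 'transcript' itself, while code that needs to know whether a real interpretation exists can test for null.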

Cheers
Satish


On Fri, Aug 17, 2012 at 1:57 PM, Deborah Dahl <
dahl@conversational-technologies.com> wrote:

> I may have missed something, but I don't see in the spec where it says
> that "interpretation" is optional.
>
> *From:* Satish S [mailto:satish@google.com]
> *Sent:* Thursday, August 16, 2012 7:38 PM
> *To:* Deborah Dahl
> *Cc:* Bjorn Bringert; Hans Wennborg; public-speech-api@w3.org
>
> *Subject:* Re: SpeechRecognitionAlternative.interpretation when
> interpretation can't be provided
>
> 'interpretation' is an optional attribute because engines are not required
> to provide an interpretation on their own (unlike 'transcript'). As such I
> think it should return null when there isn't a value to be returned as that
> is the convention for optional attributes, not 'undefined' or a copy of
> some other attribute.
>
> If an engine chooses to return the same value for 'transcript' and
> 'interpretation' or do textnorm of the value and return in 'interpretation'
> that will be an implementation detail of the engine. But in the absence of
> any such value for 'interpretation' from the engine I think the UA should
> return null.
>
>
> Cheers
> Satish
>
> On Thu, Aug 16, 2012 at 2:52 PM, Deborah Dahl <
> dahl@conversational-technologies.com> wrote:
>
> That's a good point. There are lots of use cases where some simple
> normalization is extremely useful, as in your example, or collapsing all
> the ways that the user might say "yes" or "no". However, you could say that
> once the implementation has modified or normalized the transcript that
> means it has some kind of interpretation, so putting a normalized value in
> the interpretation slot should be fine. Nothing says that the
> "interpretation" has to be a particularly fine-grained interpretation, or
> one with a lot of structure.
>
>
>
> > -----Original Message-----
> > From: Bjorn Bringert [mailto:bringert@google.com]
> > Sent: Thursday, August 16, 2012 9:09 AM
> > To: Hans Wennborg
> > Cc: Conversational; public-speech-api@w3.org
> > Subject: Re: SpeechRecognitionAlternative.interpretation when
> > interpretation can't be provided
> >
> > I'm not sure that it has to be that strict in requiring that the value
> > is the same as the "transcript" attribute. For example, an engine
> > might return the words recognized in "transcript" and apply some extra
> > textnorm to the text that it returns in "interpretation", e.g.
> > converting digit words to digits ("three" -> "3"). Not sure if that's
> > useful though.
> >
> > On Thu, Aug 16, 2012 at 1:58 PM, Hans Wennborg
> > <hwennborg@google.com> wrote:
> > > Yes, the raw text is in the 'transcript' attribute.
> > >
> > > The description of 'interpretation' is currently: "The interpretation
> > > represents the semantic meaning from what the user said. This might be
> > > determined, for instance, through the SISR specification of semantics
> > > in a grammar."
> > >
> > > I propose that we change it to "The interpretation represents the
> > > semantic meaning from what the user said. This might be determined,
> > > for instance, through the SISR specification of semantics in a
> > > grammar. If no semantic meaning can be determined, the attribute must
> > > be a string with the same value as the 'transcript' attribute."
> > >
> > > Does that sound good to everyone? If there are no objections, I'll
> > > make the change to the draft next week.
> > >
> > > Thanks,
> > > Hans
> > >
> > > On Wed, Aug 15, 2012 at 5:29 PM, Conversational
> > > <dahl@conversational-technologies.com> wrote:
> > > >> I can't check the spec right now, but I assume there's already an
> > > >> attribute that currently is defined to contain the raw text. So I
> > > >> think we could say that if there's no interpretation the value of
> > > >> the interpretation attribute would be the same as the value of the
> > > >> "raw string" attribute.
> > >>
> > >> Sent from my iPhone
> > >>
> > > >> On Aug 15, 2012, at 9:57 AM, Hans Wennborg <hwennborg@google.com> wrote:
> > >>
> > >>> OK, that would work I suppose.
> > >>>
> > >>> What would the spec text look like? Something like "[...] If no
> > > >>> semantic meaning can be determined, the attribute will be a string
> > >>> representing the raw words that the user spoke."?
> > >>>
> > > >>> On Wed, Aug 15, 2012 at 2:24 PM, Bjorn Bringert <bringert@google.com> wrote:
> > >>>> Yeah, that would be my preference too.
> > >>>>
> > >>>> On Wed, Aug 15, 2012 at 2:19 PM, Conversational
> > >>>> <dahl@conversational-technologies.com> wrote:
> > > >>>>> If there isn't an interpretation I think it would make the most
> > > >>>>> sense for the attribute to contain the literal string result. I
> > > >>>>> believe this is what happens in VoiceXML.
> > >>>>>
> > >>>>>> My question is: for implementations that cannot provide an
> > > >>>>>> interpretation, what should the attribute's value be? null? undefined?
> >
> >
> >
> > --
> > Bjorn Bringert
> > Google UK Limited, Registered Office: Belgrave House, 76 Buckingham
> > Palace Road, London, SW1W 9TQ
> > Registered in England Number: 3977902
>
>
Received on Friday, 17 August 2012 14:42:27 UTC