- From: AUDRAIN LUC <LAUDRAIN@hachette-livre.fr>
- Date: Wed, 8 Apr 2015 18:41:35 +0200
- To: Ivan Herman <ivan@w3.org>, Bill Kasdorf <bkasdorf@apexcovantage.com>
- CC: "Stein, Ayla" <astein@illinois.edu>, Thierry Michel <tmichel@w3.org>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
Dear Ivan,

In EPUB3 files, HTML content is tagged with empty anchors like:

"Ils n'en veulent pas, ils n'en <a id="page_182"/>veulent pas, elle lâche dans un soupir en attrapant encore une lettre."

This means that a new paper page starts at the word « veulent ».

In parallel, the EPUB3 nav document contains an ordered list of navigation points in a <nav epub:type="page-list"> element:

<li> <a href="chap22.html#page_182">Page 182</a> </li>

The label of this paper page 182 is then « Page 182 ».

In terms of workflow, as good practice, we produce a new EPUB file as soon as text corrections have been inserted in the reprinted book.

Best,
Luc

Luc Audrain
Hachette Livre
Direction Innovation et Technologie Numérique
11, rue Paul Bert, 92240 Malakoff
Landline: +33 (0) 1 4123 6370
Mobile: +33 (0) 6 48 38 21 41

On 08/04/2015 18:27, « Ivan Herman » <ivan@w3.org> wrote:

> In spite of being a W3C digerati, ie, the worst possible sort:-), I do understand...
>
> It is good that this came up: it must be recorded as a requirement...
>
> But how does it work, eg, in EDUPUB? Does it mean that the, say, HTML file contains some non-visible spans with ID-s, where the ID somehow reflects the page number of the print version? And what happens if there is a new printed version (but no new digital version)?
>
> Ivan
>
>> On 08 Apr 2015, at 16:55 , Bill Kasdorf <bkasdorf@apexcovantage.com> wrote:
>>
>> This issue is mainly pertinent to publications originally published in print and only later provided in digital form. There are of course millions of such publications in libraries, which is the main domain of the HathiTrust.
>>
>> The reason this is important is that there are four primary use cases characteristic of this "print is the version of record" situation:
>>
>> --The indexes in print books typically (though not universally) point to arbitrary points in the content: the print page breaks.
>> --Cross-references in the text of print books typically refer to print page breaks.
>> --Citations in the literature (very important in scholarship) point to print page breaks.
>> --The accessibility community strongly advocates the recording of print page breaks in digital versions of print publications, particularly textbooks, so that when the teacher says "turn to page 53" the print-disabled user can find that spot (as can any user of the digital version).
>>
>> While most W3C folks would argue that this is a relic of print-based publishing (and it is), and would argue that these should be replaced with real links to meaningful points in the content, not to something as arbitrary as a print page break (which is indisputably better), it unfortunately happens to be a real need when we are in this transitional phase; and all of those millions of old books, and the citations to their pages, do actually exist. So it really does turn out to be useful to have "markers" in a digital file designating where the print page breaks are--accompanied, btw, with an ability to designate _which_ print edition the markers refer to.
>>
>> As distasteful as that is to digerati like us. ;-)
>>
>> And btw, in the context of EPUB-WEB, for these very reasons (especially the accessibility issue), providing such print page break markers is recommended in the EDUPUB spec, which provides a recommended syntax for the marker. It doesn't attempt to contain the page with a start-and-end-tag pair, because you run into well-formedness issues; instead, it just provides an empty element that says, in effect, "page 53 in the print book starts here."
>>
>> --Bill K
>>
>> -----Original Message-----
>> From: Ivan Herman [mailto:ivan@w3.org]
>> Sent: Wednesday, April 08, 2015 4:30 AM
>> To: Stein, Ayla
>> Cc: Thierry Michel; Bill Kasdorf; W3C Digital Publishing IG
>> Subject: Re: [dpub identifiers] Please review updated Identifiers TF wiki
>>
>> Thank you Ayla.
>>
>> Without going into the details of the proposal, the question it raises to me, as part of the EPUB-WEB discussion, is what is the role (if any) of an identifier that identifies a *page*. Indeed, depending on the style of the online document, a page is
>>
>> * a very ephemeral entity and thereby it is not really a suitable target for an identifier (a flowing book, whose pagination is based on user interaction, is the obvious example)
>> * a fixed entity, ie, for fixed layout document
>>
>> it strikes me that an identifier approach for an EPUB-WEB document needs to cover the second item, too. AFAIK, CFI can do that only if the fixed layout document is organized in terms of a series of separate files within the package, but that may not cover all the cases (e.g., if a presentation slide show is stored as a portable document, and the 'pagination' is the result of a javascript running on one single source).
>>
>> Whether the approach taken by the HathiTrust document is the right one (as far as I could understand from a cursory look it assigns a UDDI type URN to each page, which is then combined with the identifier of a 'volume') is a different question. I am not sure this is a general solution but I guess the more general questions are certainly valid!
>>
>> Thanks again
>>
>> Ivan
>>
>>> On 07 Apr 2015, at 20:21 , Stein, Ayla <astein@illinois.edu> wrote:
>>>
>>> Matt's comment about content version reminded me of some ongoing work at the HathiTrust Research Center. One of the problems they're looking into is identifying an object at a specific point in time.
>>> Their initial proposal document discusses several different issues regarding identifiers in HTRC and can be accessed here: https://www.ideals.illinois.edu/handle/2142/73147. I've also added it as an attachment to this email.
>>>
>>> I know there's also been some work on a prototype for identifying versions, but the draft of that document is not yet available for circulation. While these aren't necessarily solutions that can be implemented here, I think it's of interest and relevance to this discussion.
>>>
>>> Thanks,
>>>
>>> Ayla
>>>
>>> -----Original Message-----
>>> From: Ivan Herman [mailto:ivan@w3.org]
>>> Sent: Tuesday, March 24, 2015 3:32 AM
>>> To: Thierry Michel
>>> Cc: Bill Kasdorf; W3C Digital Publishing IG
>>> Subject: Re: [dpub identifiers] Please review updated Identifiers TF wiki
>>>
>>>> On 24 Mar 2015, at 09:30 , Ivan Herman <ivan@w3.org> wrote:
>>>>
>>>> I have added the media fragment URI to the wiki with few examples. Thierry, if you want to add something, please do at:
>>>
>>> Sorry, pushed the send button too soon:
>>>
>>> https://www.w3.org/dpub/IG/wiki/Task_Forces/identifiers#W3C.E2.80.99s_Media_Fragment
>>>
>>> Thanks
>>>
>>> ivan
>>>
>>>>> On 23 Mar 2015, at 08:20 , Thierry MICHEL <tmichel@w3.org> wrote:
>>>>>
>>>>> Bill,
>>>>>
>>>>> I would also suggest Media Fragments URI 1.0. It specifies the syntax for constructing media fragment URIs and explains how to handle them when used over the HTTP protocol.
>>>>>
>>>>> http://www.w3.org/TR/2012/REC-media-frags-20120925/
>>>>> a W3C Recommendation 25 September 2012.
>>>>>
>>>>> Best,
>>>>>
>>>>> thierry.
>>>>>
>>>>> On 22/03/2015 17:51, Bill Kasdorf wrote:
>>>>>> Thanks to Tzviya, we have some substantive content for review on the Identifiers TF wiki at [1].
>>>>>>
>>>>>> This initial draft of background information gives brief descriptions, links, discussion, and examples of three possible options for consideration as the basis for our initial work on a Fragment Identifier:
>>>>>>
>>>>>> --EPUB CFI
>>>>>> --W3C Packaging for the Web Fragment Identifiers
>>>>>> --The Open Annotations Fragment Selector
>>>>>>
>>>>>> In addition, there's a placeholder for XPath, and we need to collect suggestions for other relevant specs or technologies to take into account, e.g. XPointer.
>>>>>>
>>>>>> Please take a look at this before the Monday IG call and suggest any others we should add. Feel free to add a placeholder (ideally with a link) if you aren't prepared to add the prose.
>>>>>>
>>>>>> And although we now have a good list of participants in this TF, please add your name if you'd like to participate as well. We will discuss next steps on the call Monday, which will probably involve a TF conference call later this week if we can find a time that works for everybody.
>>>>>>
>>>>>> --Bill K
>>>>>>
>>>>>> [1] https://www.w3.org/dpub/IG/wiki/Task_Forces/identifiers#Background
>>>>>>
>>>>>> Bill Kasdorf
>>>>>> Vice President, Apex Content Solutions
>>>>>> Apex CoVantage
>>>>>> W: +1 734-904-6252
>>>>>> M: +1 734-904-6252
>>>>>> @BillKasdorf <http://twitter.com/#!/BillKasdorf>
>>>>>> bkasdorf@apexcovantage.com
>>>>>> ISNI: 0000 0001 1649 0786
>>>>>> https://orcid.org/0000-0001-7002-4786
>>>>>> www.apexcovantage.com <http://www.apexcovantage.com/>
>>>>
>>>> ----
>>>> Ivan Herman, W3C
>>>> Digital Publishing Activity Lead
>>>> Home: http://www.w3.org/People/Ivan/
>>>> mobile: +31-641044153
>>>> ORCID ID: http://orcid.org/0000-0003-0782-2704
>>>
>>> <IdentifiersProposal.pdf>
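
For readers who want to try the markup described at the top of this message, here is a minimal sketch in Python of how a tool might read the page-list nav and check that each entry points at an existing page-break anchor. It is only an illustration under stated assumptions: the sample documents are made up from the snippets in the message (the "Page 181" entry is invented to show a dangling link), the function names are hypothetical, and it assumes a single content file and well-formed XHTML.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
OPS = "http://www.idpf.org/2007/ops"  # namespace of the epub:type attribute

# Hypothetical sample nav document; "Page 181" is invented for the example.
NAV_DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops">
<body>
<nav epub:type="page-list">
<ol>
<li><a href="chap22.html#page_181">Page 181</a></li>
<li><a href="chap22.html#page_182">Page 182</a></li>
</ol>
</nav>
</body>
</html>"""

# Hypothetical content file with one empty page-break anchor.
CHAPTER = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<p>Ils n'en veulent pas, ils n'en <a id="page_182"/>veulent pas.</p>
</body></html>"""


def page_list(nav_xml):
    """Map each page label to its href, reading only the page-list nav."""
    root = ET.fromstring(nav_xml)
    pages = {}
    for nav in root.iter(f"{{{XHTML}}}nav"):
        # epub:type may hold several space-separated values.
        if "page-list" not in nav.get(f"{{{OPS}}}type", "").split():
            continue
        for a in nav.iter(f"{{{XHTML}}}a"):
            label = "".join(a.itertext()).strip()
            pages[label] = a.get("href")
    return pages


def anchor_ids(content_xml):
    """Collect every id attribute found in a content document."""
    root = ET.fromstring(content_xml)
    return {el.get("id") for el in root.iter() if el.get("id")}


def missing_targets(pages, ids):
    """Return labels whose fragment has no matching id (file name ignored)."""
    missing = []
    for label, href in pages.items():
        _, _, fragment = href.partition("#")
        if fragment not in ids:
            missing.append(label)
    return missing
```

Calling missing_targets(page_list(NAV_DOC), anchor_ids(CHAPTER)) lists the page labels whose anchors are absent from the content. A real checker would also have to resolve the file part of each href against the package, which this sketch skips.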
Received on Wednesday, 8 April 2015 16:42:14 UTC