Despite wide agreement on the importance of the EPUB 3 standard for ebooks, its implementation by reading systems and its use by publishers are currently incomplete and inconsistent. This situation is compounded by the fact that certain publishers are reluctant to use EPUB 3 features that can’t reliably be expected to work across reading systems, and the fact that reading systems developers appear reluctant to implement features they seldom find in the EPUBs provided by publishers.
In an effort to help improve this situation, the Digital Issues Working Group of the Association of American Publishers (AAP DIWG) launched the “AAP EPUB 3 Implementation Project” in July 2013. The goal was to bring together a group of people who could provide perspectives from a variety of publishers, reading system developers, retailers, service providers, and the accessibility community to jointly articulate priorities for the implementation of EPUB 3 features by reading systems and best practices for the creation of EPUBs, with a special emphasis on enabling accessibility. This is particularly urgent because many publishers plan to distribute large numbers of EPUB 3 files in the first quarter of 2014 and are already finding the need to prepare those files with workarounds due to the inconsistencies in the ecosystem.
It is important to note the near-term focus of this initiative. It is generally agreed that all features of EPUB 3 are important and useful. Complete implementation of all features, together with best practices for their use in the EPUBs created by publishers and their service providers, is a widely shared goal. The purpose of the AAP EPUB 3 Implementation Initiative was not to reassess the EPUB 3 specification, but to focus on the most important priorities and best practices in the near term. This initiative was intended to initiate a process, not to conclude one. It is hoped that it will help stimulate and support further collaborative work by other organizations to advance the development of the EPUB ecosystem by developing preflighting and conformance testing tools, model EPUB files, and other resources. It is also hoped that this initiative will encourage publishers, reading system developers, service providers, and retailers to contribute to such efforts as well, whether through financial support, participation, or technical contributions.
People from many related industry organizations participated in this initiative, and several of those organizations have indicated support and a willingness to help further this effort.
A group of volunteers from across the industry, coordinated by the AAP, was divided into “work streams” focusing, respectively, on assessing priorities for EPUB 3 feature implementation, articulating best practices for accessibility in EPUBs, addressing metadata issues, and developing use cases. These work streams conducted intensive virtual meetings through August in order to provide a framework for discussion in a face-to-face meeting held in New York City on September 10, attended by 90 publishing, technology, and accessibility professionals.
This White Paper summarizes the results of that meeting. The issues that the meeting participants deemed most critical for both publishers and reading systems are in the following categories (note that these are not in priority order):
Most Critical Issues as Identified in the Workshop
These categories were developed in the afternoon session of the workshop, which followed morning presentations by each of the work streams. Those presentations are provided as supplements to this White Paper, along with links to relevant resources. The findings of the Features and Accessibility Work Streams are also documented in the following section of this paper. A summary is provided here.
The Features Work Stream systematically assessed the relative priorities of 36 specific features of the EPUB 3.0 specification, with the publishers anonymously ranking them as critical, important, or nice to have, and the reading systems anonymously assessing whether they would be easy, medium, or hard to implement. That process, described in more detail below, resulted in the following ten categories of features (listed in order of relative priority):
The Accessibility Work Stream systematically assessed the 129 issues documented in the IDPF’s EPUB 3 Accessibility Guidelines and developed a list of “Top Tips for Accessible EPUB 3,” for publishers to consider as they implement:
The Metadata Work Stream addressed the following general issues regarding the use of metadata in, or in support of, EPUB 3 publications:
The Use Cases Work Stream presentation provided examples of issues currently encountered by publishers in attempting to provide consistent EPUBs to reading systems and the variation in rendering and behavior currently encountered. Key issues in EPUB 3 implementation articulated by this work stream were:
In conclusion, the participants in the workshop agreed that it will be critical to continue the work of this initiative by involving other organizations, especially those that are already making meaningful contributions to various aspects of the issues articulated by this initiative. A willingness to support and extend this work was expressed by the IDPF, the Readium Foundation, the BISG, Benetech, the American Printing House for the Blind, the National Federation of the Blind, the DAISY Consortium, and other organizations committed to the successful implementation of EPUB 3.
While it is understood that both reading systems developers and publishers will need to make their own decisions in regard to their systems and publications, it was clear to all participants in this initiative that improvements in both reading system feature implementation and practices for creating EPUBs on the part of publishers are not just important, they are urgent. As mentioned above, the current situation is one of great inconsistency, requiring workarounds and multiple variant files, a situation that must be improved by the better implementation of this important standard.
The Features Work Stream articulated the following key items for implementation of EPUB 3 features by reading systems developers and their proper use by publishers.
Publishers should consider including alternate files and proper encoding so that reading systems that do not support specific functionality can “fall back” to simpler functionality. For example, reading system support for video is optional, but a reading system that does not support video should be able to fall back to an image provided in the EPUB by the publisher. In addition, there are fallbacks inherent to the HTML5 and CSS specifications, such as specifying a color with one of the CSS3 extended color keywords while also providing an RGB value as a fallback.
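As an illustration of the video-to-image fallback described above, the following is a minimal preflight sketch in Python. It assumes that the fallback image is supplied as an <img> nested inside the <video> element, which is one common HTML5 pattern and not the only possible one. The file names, ids, and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"

# Hypothetical content document: one video with a nested image fallback,
# and one without any fallback content.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <video id="with-fallback" src="clip.mp4" controls="controls">
      <img src="clip-still.jpg" alt="Still frame from the clip"/>
    </video>
    <video id="no-fallback" src="other.mp4" controls="controls"/>
  </body>
</html>"""


def videos_without_image_fallback(xhtml_text):
    """Return the id of each <video> that has no nested <img> fallback."""
    root = ET.fromstring(xhtml_text)
    missing = []
    for video in root.iter(XHTML + "video"):
        # Assumption: the fallback is an <img> somewhere inside the <video>.
        if video.find(".//" + XHTML + "img") is None:
            missing.append(video.get("id"))
    return missing


print(videos_without_image_fallback(SAMPLE))
```

A check of this kind could run before distribution so that a single file can be sent to retailers whose reading systems do not support video.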
EPUB relies on XHTML5 structure, both in content files and in reading system processing. Without proper structure and nesting, accessibility is compromised, many features do not work, and many best practices cannot be implemented by the reading system or publisher. The proper use of and support for such tags as headings (<h1>-<h6>), <aside>, the @hidden attribute, and tags for lists and tables are considered particularly crucial.
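One aspect of proper structure that lends itself to automated checking is heading order. The sketch below is an illustrative example, not part of the work stream's findings. It reports headings that skip a level, such as an <h4> directly after an <h2>. The sample markup and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"
# Map each namespaced heading tag to its numeric level (h1 -> 1 ... h6 -> 6).
HEADINGS = {XHTML + "h%d" % n: n for n in range(1, 7)}

SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <h1>Chapter 1</h1>
    <h2>Background</h2>
    <h4>Details</h4>
  </body>
</html>"""


def skipped_heading_levels(xhtml_text):
    """Return (previous_level, level, text) for each heading that skips a level."""
    root = ET.fromstring(xhtml_text)
    problems = []
    previous = 0
    for el in root.iter():  # iter() walks elements in document order
        level = HEADINGS.get(el.tag)
        if level is None:
            continue
        if level > previous + 1:
            problems.append((previous, level, "".join(el.itertext()).strip()))
        previous = level
    return problems


print(skipped_heading_levels(SAMPLE))
```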
One of the goals of EPUB 3 was creating a simplified navigation document that meets the needs of publishers and the accessibility community. The HTML5 <nav> document uses HTML structure to meet the needs of accessibility while providing publishers the option of incorporating design elements. Many publishers have expressed that they dislike creating multiple TOCs, which, in their view, can lead to error and confusion. Many publishers expressed that they would like the navigation document to be the only TOC in the EPUB and for its features (including the use of @hidden) to be fully and properly implemented by reading systems.
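To show how a single navigation document can serve as the only TOC, the following sketch reads the nested list inside a <nav epub:type="toc"> element and returns its entries with their nesting depth. It is a simplified illustration: the sample document, file names, and function names are hypothetical, and a real reading system handles many more cases.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"
EPUB_TYPE = "{http://www.idpf.org/2007/ops}type"

# Hypothetical navigation document with a visible TOC and a hidden page list.
NAV_DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops">
  <body>
    <nav epub:type="toc" id="toc">
      <h1>Contents</h1>
      <ol>
        <li><a href="ch01.xhtml">Chapter 1</a>
          <ol><li><a href="ch01.xhtml#s1">Section 1.1</a></li></ol>
        </li>
        <li><a href="ch02.xhtml">Chapter 2</a></li>
      </ol>
    </nav>
    <nav epub:type="page-list" hidden="hidden">
      <ol><li><a href="ch01.xhtml#page1">1</a></li></ol>
    </nav>
  </body>
</html>"""


def _walk(ol, depth):
    """Yield (depth, label, href) for each entry in a nested <ol>."""
    for li in ol.findall(XHTML + "li"):
        link = li.find(XHTML + "a")
        if link is not None:
            yield depth, "".join(link.itertext()).strip(), link.get("href")
        child = li.find(XHTML + "ol")
        if child is not None:
            yield from _walk(child, depth + 1)


def toc_entries(nav_text):
    """Return the entries of the first <nav> whose epub:type includes 'toc'."""
    root = ET.fromstring(nav_text)
    for nav in root.iter(XHTML + "nav"):
        if "toc" in nav.get(EPUB_TYPE, "").split():
            top = nav.find(XHTML + "ol")
            return list(_walk(top, 1)) if top is not None else []
    return []


for depth, label, href in toc_entries(NAV_DOC):
    print("  " * (depth - 1) + label, "->", href)
```

The second <nav> in the sample carries the hidden attribute, so it is available to the reading system without being displayed in the content flow.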
Several tools are being developed that will lead to improvements in EPUB files and reading systems. The IDPF and BISG are developing a second-generation EPUB 3 Support Grid based on the IDPF's Conformance Test Suite. Each test file will test specific functions of a reading system and document whether that system passes or fails. The DAISY Consortium has developed an “Accessibility Screening Methodology” designed primarily for testing the accessibility of a reading system's user interface and functionality. The Workshop stressed the need for financial and technical support of initiatives like these.
1. Navigation
2. Audio: The EPUB spec requires support for one of two audio formats (this is a prerequisite for media overlays):
3. SVG: Scalable Vector Graphics are XML-based vector images that can be zoomed infinitely without loss of clarity. This is important to publishers, which can use SVG images without grappling with responsive design. Further, SVG is accessible in that the content is machine readable and can work with Braille readers and even 3D printers.
4. Fonts: It is important to see support for fonts and information about fonts in EPUBs and EPUB 3 reading systems. Fonts are not just for design purposes; they are often needed in order to render characters that are not part of default character sets. This is common in non-English words and for special characters, such as those used in phonetics, physics, or math. Important font issues are:
5. Media Overlays: Representation of audio synchronized with text content. This can be a visual display, such as a bouncing ball highlighting words in an early reading book, or it can be an audio book/ebook that keeps one’s place between reading and listening. Use of Media Overlays is particularly important for publishers in segments such as children’s books as well as for the accessibility community.
6. Semantic inflection (e.g., page-break, part, chapter, index, glossary, sidebar, footnote): an attribute that allows for precise statements to be made about markup within HTML (e.g., "noteref" and "footnote"). This allows for precise and uniform markup and rendering across publications and reading systems, allowing for contextual search and improved navigation.
7. MathML: An XML vocabulary for describing mathematical notation and capturing both its structure and its content. MathML allows for reflowable, resizable, searchable math equations. It is considered crucial for STEM and education publishers, because without it math must be displayed as static images. It is also critical for the accessibility community, as images are meaningless to assistive technology and descriptions of images do not convey the same meaning as the MathML.
8. Video: The EPUB specification offers two options for video support. Neither is a requirement. However, if video is not supported, the reading system should support a fallback to an image. Being able to “not support” video properly is as important as supporting video, so that each publisher can create one file for all retailers. The two video formats supported by EPUB 3 are:
9. Float: The CSS property that enables elements to be positioned independently of the linear narrative flow. This is often a property of the underlying system or browser upon which the reading system is built.
10. Fixed Layout (FXL): A method for creating static, fixed e-book “pages”. Some content is better suited for a “fixed” instead of reflowable display. This is important to some segments of the publishing industry.
The following are considered best practices for providing content that is accessible to users who are visually impaired or have other print disabilities (such as dyslexia, etc.). They are based on the DIAGRAM Center’s “Top Tips for Creating Accessible EPUB 3 Files,” which is available at http://diagramcenter.org/resources/diagram-related-links/54-tips-for-creating-accessible-epub-3-files.html . Note that more detailed best practices are provided in the International Digital Publishing Forum (IDPF) EPUB 3 Accessibility Guidelines at http://www.idpf.org/accessibility/guidelines/. Additional links are provided below to locations within those guidelines and to other resources that address specific issues.
Visual reading is only one way of accessing content. It is not a good practice to use visual-only cues such as colored text, font size or positioning as the only clue to the meaning or importance of a word or section, or to use tables or pictures of text to control the appearance of the content. The meaning of the content should be the same both with and without any styles or formatting applied. It is important for all the text of the publication to be available in a logical reading order. It is not a good practice to use inline CSS or the @style attribute. See http://www.idpf.org/accessibility/guidelines/content/semantics/separation.php for more detail.
It is best practice to include a complete table of contents in the front matter and consider including smaller tables of contents at the start of each section. It is also best practice to use <section>, <figure>, and <aside> tags in the content to define a logical reading order. This is particularly important for academic, educational, and other texts with complicated visual layouts like many children’s books.
It’s best to create a structure by using numbered headings in a logical structure. For other tagged structures, it's best to specify their content with the epub:type attribute. For example, the tag that contains the preface of a book might look like <section epub:type="preface">. Specific tags are for specific content only (i.e., the <cite> tag is only for citations) and should be used according to the HTML5 standard. Use the most specific tag available and do not automatically wrap <div> or <span> tags around everything.
Consider including semantic information to describe the content of a tag. A section tag for the table of contents would look like <section epub:type="toc"> or a list of definitions in a glossary would be tagged with <dl epub:type="glossary">. The EPUB 3 Structural Semantics Vocabulary as defined at (http://idpf.org/epub/vocab/structure/) can help to identify content.
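The use of epub:type can be checked mechanically. The sketch below lists epub:type values that do not appear in a known set of terms, which can catch simple misspellings. The set used here is an assumption: it is a small subset limited to terms named in this paper, not the full EPUB 3 Structural Semantics Vocabulary. The sample markup and function name are hypothetical.

```python
import xml.etree.ElementTree as ET

EPUB_TYPE = "{http://www.idpf.org/2007/ops}type"

# Assumption: a small illustrative subset of the vocabulary, not the full list.
KNOWN_TERMS = {"toc", "preface", "glossary", "chapter", "part", "index",
               "footnote", "noteref", "pagebreak"}

SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops">
  <body>
    <section epub:type="preface"><h1>Preface</h1></section>
    <dl epub:type="glossary"><dt>EPUB</dt><dd>An ebook format.</dd></dl>
    <section epub:type="prefase"><h1>Mistyped</h1></section>
  </body>
</html>"""


def unknown_epub_types(xhtml_text, known=KNOWN_TERMS):
    """Return (tag, term) for each epub:type term that is not in `known`."""
    root = ET.fromstring(xhtml_text)
    unknown = []
    for el in root.iter():
        # epub:type may hold several space-separated terms.
        for term in el.get(EPUB_TYPE, "").split():
            if term not in known:
                unknown.append((el.tag.split("}")[-1], term))
    return unknown


print(unknown_epub_types(SAMPLE))
```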
Any content embedded in an image is not available to visually impaired readers. If the textual contents of a table or image are required for comprehension of the document, it's important to use proper and complete markup for text and tabular data, including headers and scope attributes for tables. If images of tables are unavoidable, it's best to provide a link to a separate page containing the properly marked up tabular data. If images of text are unavoidable, then it's important to provide a description and transcription of the text and to use accessible SVG (http://www.w3.org/2000/10/wcag2-svg-techs-020318). Accessible SVG graphics allow text in images to be rendered in an accessible way. They can also make it possible to deliver tactile images electronically to blind users with appropriate devices or to help automate the creation of tactile images that can be mailed to the reader with minimum human intervention.
In order to be accessible, every image should have a description, caption or alt text unless it is solely decorative. See specific mark-up recommendations in the DIAGRAM Center Image Description Guidelines, including special mark up for decorative images: http://diagramcenter.org/standards-and-practices/59-image-guidelines-for-epub-3.html. See also the IDPF Accessibility for EPUB 3 Guidelines for Images at http://www.idpf.org/accessibility/guidelines/content/xhtml/images. and the Described and Captioned Media Program Captioning Key at http://www.dcmp.org/captioningkey/.
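A simple preflight check for image descriptions might look like the sketch below. It reports images with no alt attribute at all. It assumes, following common HTML practice, that an empty alt attribute marks an image as decorative. The DIAGRAM Center guidelines cited above should be consulted for the recommended markup. The file names and function name are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"

SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <img src="map.png" alt="Map of the river delta"/>
    <img src="flourish.png" alt=""/>
    <img src="chart.png"/>
  </body>
</html>"""


def images_missing_alt(xhtml_text):
    """Return the src of each <img> that has no alt attribute."""
    root = ET.fromstring(xhtml_text)
    # Assumption: alt="" is treated as an intentional "decorative" marker,
    # so only a completely absent alt attribute is reported.
    return [img.get("src")
            for img in root.iter(XHTML + "img")
            if img.get("alt") is None]


print(images_missing_alt(SAMPLE))
```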
Page numbers are the way many people navigate within a book. For any book with a print equivalent, the epub:type="pagebreak" attribute is used to designate page numbers. It is best to include the ISBN of the source of the page numbers in the package metadata for the book. A tag for a page number might look like <span id="page361" epub:type="pagebreak">361</span>. See IDPF EPUB 3 Accessibility Guidelines and Examples for Page Numbers at http://www.idpf.org/accessibility/guidelines/content/xhtml/pagenum.php.
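Because page-break markers are ordinary markup, they can be collected by a tool, for example to build a page list. The following sketch gathers every element whose epub:type includes "pagebreak". The sample markup, which identifies each marker with a plain id attribute, and the function name are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

EPUB_TYPE = "{http://www.idpf.org/2007/ops}type"

SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops">
  <body>
    <p>End of one page <span id="page361" epub:type="pagebreak">361</span>
       and the start of the next.</p>
    <p><span id="page362" epub:type="pagebreak">362</span> More text.</p>
  </body>
</html>"""


def page_markers(xhtml_text):
    """Return (id, page_number) for each element marked as a page break."""
    root = ET.fromstring(xhtml_text)
    markers = []
    for el in root.iter():
        if "pagebreak" in el.get(EPUB_TYPE, "").split():
            markers.append((el.get("id"), (el.text or "").strip()))
    return markers


print(page_markers(SAMPLE))
```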
Providing the default language of the content in the root html tag can ensure that each word will be rendered correctly by assistive technology. Likewise, the @xml:lang attribute is used to indicate any words, phrases or passages in a different language, e.g., <span xml:lang="fr" lang="fr">rue Saint-Andre-des-Arts</span>. See IDPF EPUB 3 Accessibility Guidelines and Examples for Language at http://www.idpf.org/accessibility/guidelines/content/xhtml/lang.php.
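The language markup described above can be inspected with a short script. The sketch below reports the default language declared on the root html element and any other languages declared on elements within the document. The sample sentence and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <body>
    <p>The shop is on
      <span xml:lang="fr" lang="fr">rue Saint-Andre-des-Arts</span>.</p>
  </body>
</html>"""


def language_report(xhtml_text):
    """Return (default_language, sorted list of inline languages)."""
    root = ET.fromstring(xhtml_text)
    default = root.get(XML_LANG) or root.get("lang")
    inline = sorted({el.get(XML_LANG)
                     for el in root.iter()
                     if el is not root and el.get(XML_LANG)})
    return default, inline


print(language_report(SAMPLE))
```

A default language of None in the result would indicate that the root element does not declare a language.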
MathML makes mathematical equations accessible to everyone by eliminating the ambiguity of a verbal description of a picture. There are many tools available to support MathML creation. See the IDPF EPUB 3 Accessibility Guidelines and Examples for MathML at http://www.idpf.org/accessibility/guidelines/content/mathml/desc.php.
When the native controls for video and audio content in HTML5 are enabled by default, media content is much more accessible. Fallback options such as captions or descriptions for video and transcripts for audio are also important. Sign language is also important to many deaf users. See the IDPF EPUB 3 Accessibility Guidelines for Audio and Video at http://www.idpf.org/accessibility/guidelines/content/xhtml/audio.php and http://www.idpf.org/accessibility/guidelines/content/xhtml/video.php and Described and Captioned Media Program Captioning Key at http://www.dcmp.org/captioningkey/.
In order to make interactive content using JavaScript or SVG accessible, all custom controls should fully implement ARIA roles, states and properties, as appropriate. Native controls do not usually require ARIA. See also the IDPF EPUB 3 Accessibility Guidelines for Scripted Interactivity sections covering Progressive Enhancement, Content Validity, WAI-ARIA & Custom Controls, Forms, Live Regions, and Canvas at http://www.idpf.org/accessibility/guidelines/, as well as the IDPF EPUB 3 Accessibility Guidelines for SVG Interactivity at http://www.idpf.org/accessibility/guidelines/content/svg/script.php and the W3C Web Accessibility Initiative: SVG Techniques for Web Content Accessibility Guidelines at http://www.w3.org/2000/10/wcag2-svg-techs-020318.
As part of a general good practice of documenting the accessibility of your content, providing accessibility metadata in your files lets end users know what features are present and lets search engines discover your accessible materials. See the IDPF EPUB 3 Accessibility Guidelines for Metadata at http://www.idpf.org/accessibility/guidelines/content/meta/onix.php.
The Accessibility Work Stream articulated the following best practices for publisher workflow processes:
(For further guidance on these points, see Accessible Publishing, Best Practice Guidelines for Publishers at http://www.editeur.org/109/Enabling-Technologies-Framework-Guidelines/.)
A fundamental purpose of the AAP EPUB 3 Implementation Project was to help stimulate dialog and collaboration among a wide variety of participants in the EPUB ecosystem—publishers, service providers, retailers, reading system developers, accessibility services and advocates, and others—who share a common interest in the success and wide adoption of the EPUB 3 standard.
The workshop summarized by this White Paper (and the work leading up to it) revealed an encouraging degree of consensus. While the inconsistencies and deficiencies of the current EPUB 3 ecosystem are widely acknowledged, the priorities for feature implementation and the best practices for creating EPUBs are largely shared among participating publishers of many different types, and acknowledged by retailers and reading system developers as well.
And while it is understood that there are obstacles to the implementation of some types of features—especially by reading systems that are based on certain browsers or other rendering technologies, and thus are dependent on those features being implemented in those underlying systems—many of the issues surfaced by this initiative only require good markup on the part of publishers and their service providers and full implementation of fundamental HTML5 and CSS features by reading systems.
While the ecosystem will never be perfect—both the EPUB 3 standard and the reading systems that implement it will continue to evolve—the prospect of a well-functioning EPUB 3 ecosystem is actually quite close: an ecosystem in which a great many fundamental and important features can be used consistently by publishers with the expectation that they will be implemented in a wide range of reading systems and platforms.
A variety of organizations are already contributing in many ways to furthering this. The final section of this White Paper lists a number of resources that were identified in the course of this initiative as being relevant and helpful in furthering reading system development and best practices in the creation of EPUBs. In addition, many organizations have indicated a willingness to take responsibility for various aspects of the work that still needs to be done. The participants in the workshop encouraged organizations of all types—publishers, retailers, technology companies, service providers, and others—to contribute to these activities.
Activities identified by this initiative that are moving forward at organizations other than the AAP include the following:
The IDPF, BISG, and DAISY Consortium have all agreed to participate in this ongoing work.
Resources identified by the work streams in the course of this initiative included the following:
For additional information on implementing these coding practices, including examples, a checklist for quality assurance, and more, please consult the International Digital Publishing Forum EPUB 3 Accessibility Guidelines maintained here: http://www.idpf.org/accessibility/guidelines/.
The following additional resources will also be useful.