EPUB 3 Changes from EPUB 2.0.1

Working Group Draft 16 May 2011

This version
http://www.idpf.org/epub/30/spec/epub30-changes-20110516.html
Latest version
http://www.idpf.org/epub/30/spec/epub30-changes.html
Previous version
http://www.idpf.org/epub/30/spec/epub30-changes-20110506.html

A diff of changes from the previous Working Draft is available at this link.

Editors

William McCoy, International Digital Publishing Forum (IDPF)

Markus Gylling, DAISY Consortium

Table of Contents

1. Introduction
1.1. EPUB Revision History
2. Changes to EPUB Specification Documents
2.1. Changes in Document Organization
2.2. Changes in Terminology
3. New and Changed Functionality in EPUB 3
3.1. Content Documents
3.1.1. HTML5
3.1.2. SVG
3.1.3. MathML
3.1.4. Semantic Inflection
3.1.5. Content Switching
3.2. Navigation
3.3. Linking
3.4. Scripting and Interactivity
3.4.1. Scripting
3.4.2. Triggers
3.4.3. Bindings
3.5. Styling and Layout
3.5.1. CSS
3.5.2. Embedded Fonts
3.5.3. Font Obfuscation
3.6. Rich Media
3.6.1. Audio and Video
3.6.2. Media Overlays
3.7. Metadata
3.7.1. Publication Metadata and Identity
3.7.2. Resource Metadata
3.8. Speech
3.9. Manifest Fallbacks
3.10. Containment
3.10.1. Remote Resources
3.10.2. Whitespace in MIMETYPE file
3.10.3. Disallowed characters in OCF file names
3.11. XML and Unicode
4. EPUB 2.0.1 Features Replaced in EPUB 3
4.1. Features Removed from EPUB 3
4.1.1. DTBook
4.1.2. Out-of-Line XML Islands
4.1.3. Tours
4.1.4. Filesystem Container
4.2. Features Deprecated in EPUB 3
4.2.1. Guide
4.2.2. NCX
A. Acknowledgements and Contributors
References

 1 Introduction

EPUB® is an interchange and delivery format for digital publications, based on XML and Web Standards. An EPUB Publication can be thought of as a reliable packaging of Web content that represents a digital book, magazine, or other type of publication, and that can be distributed for online and offline consumption.

This document, EPUB 3 Changes from EPUB 2.0.1, describes changes made in the third major revision of EPUB, including some rationale for the changes, and some guidance for content authors and Reading System developers regarding backwards compatibility considerations.

This document is non-normative. The EPUB specification documents should be consulted for definitive information on EPUB 3:

  • EPUB Publications 3.0 [Publications30], which defines publication-level semantics and overarching conformance requirements for EPUB Publications.

  • EPUB Content Documents 3.0 [ContentDocs30], which defines profiles of XHTML, SVG and CSS for use in the context of EPUB Publications.

  • EPUB Open Container Format (OCF) 3.0 [OCF3], which defines a file format and processing model for encapsulating a set of related resources into a single-file (ZIP) Container.

  • EPUB Media Overlays 3.0 [MediaOverlays30], which defines a format and a processing model for synchronization of text and audio.

Unless otherwise specified, terms used herein have the meaning defined in these specifications.

 1.1 EPUB Revision History

EPUB had its roots in the interchange format known as the Open EBook Publication Structure (OEBPS). OEBPS 1.0 was approved in 1999 by the Open eBook Forum, an organization that later became the International Digital Publishing Forum (IDPF). Subsequent revisions 1.1 and 1.2 were approved by the IDPF in 2001 and 2002 respectively.

It was realized that a need existed for a format standard that could be used for delivery as well as interchange, and work began in late 2005 on a single-file container format for OEBPS, which was approved by the IDPF as the OEBPS Container Format (OCF) in 2006. Work on a 2.0 revision of OEBPS began in parallel. It was approved under the new name EPUB 2.0 in October 2007 and consisted of three specifications: Open Packaging Format (OPF), Open Publication Structure (OPS), and OCF. EPUB 2.0.1, a maintenance update to the 2.0 specification set primarily intended to clarify and correct errata in the specifications, was approved in September 2010. [OPF2] [OPS2] [OCF2]

 2 Changes to EPUB Specification Documents

In addition to significant changes in functionality, the EPUB 3 specifications are structured and named differently than EPUB 2.0.1, and certain terminology changes have been made to improve clarity. The following sections describe these changes.

 2.1 Changes in Document Organization

In order to help those familiar with EPUB 2.0.1 to understand the mapping of information in EPUB 3, the following table shows where information in EPUB 3 is located relative to the EPUB 2.0.1 specifications.

 

Specification Document Organization

Area | EPUB 3 Specification | EPUB 2.0.1 Specification
-----|----------------------|-------------------------
Overview | EPUB 3 Overview | (throughout)
Publication-level Specification & Package Docs | EPUB Publications 3.0 | Open Packaging Format 2.0.1
Content-level Specification | EPUB Content Documents 3.0 | Open Publication Structure 2.0.1
EPUB Navigation Documents | EPUB Content Documents 3.0 | N/A (NCX referenced as DAISY specification)
Media Overlays | EPUB Media Overlays 3.0 | N/A
Container packaging | EPUB Open Container Format 3.0 | Open Container Format 2.0.1
Changes from previous version | EPUB 3 Changes from EPUB 2.0.1 | (throughout)

 2.2 Changes in Terminology

Maintaining consistent use of terminology from EPUB 2.0.1 to EPUB 3 was a consideration during development, but changes in document organization, feature set and conformance requirements inevitably resulted in a number of changes.

Each specification contains a Terminology section near the top that defines and explains the new terms (e.g., Terminology [Publications30]).

 3 New and Changed Functionality in EPUB 3

This section describes the major new and changed functionality and constructs present in EPUB 3.

 3.1 Content Documents

 3.1.1 HTML5

EPUB 3's base content format is now the XML serialization of HTML5 (XHTML5) [ContentDocs30], whereas EPUB 2 supported two basic content types: a profile of XHTML 1.1 and DTBook (a semantically enhanced markup focused on accessibility concerns) [OPS2].

The EPUB 3 XHTML Content Document definition includes both extensions to and restrictions on its HTML5 base, many of which are discussed below. Refer to HTML5 Extensions and Enhancements [ContentDocs30] and HTML5 Deviations and Constraints [ContentDocs30] for complete information.

 3.1.2 SVG

SVG documents can now appear in the spine in EPUB 3 (i.e., SVG no longer needs to be nested within an XHTML document).

 3.1.3 MathML

Support for MathML [ContentDocs30] is new in EPUB 3.

 3.1.4 Semantic Inflection

A method for inflecting domain-specific semantics in XHTML Content Documents using attributes has been added. Refer to Semantic Inflection [ContentDocs30] for more information.

 3.1.5 Content Switching

The switch element, initially introduced in [OPS2], has been simplified by having its processing model defined so that it does not require document preprocessing, and by removing the requiredModules attribute. This simplification is backwards compatible with existing EPUB 2 Reading System implementations. Refer to Content Switching [ContentDocs30] for more information.

 3.2 Navigation

EPUB 3 defines a new human- and machine-readable grammar for publication-wide navigation information via a specialized adaptation of the general EPUB XHTML Content Document. EPUB Navigation Documents [ContentDocs30] supersedes the NCX grammar used in EPUB 2.

While NCX support was optional for EPUB 2 Reading Systems, inclusion of and support for EPUB Navigation Documents is required in EPUB 3.

As noted in NCX Superseded [Publications30], EPUB 3 Publications may include the EPUB 2 NCX for EPUB 2 Reading System forward compatibility purposes.
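Because the EPUB Navigation Document is an XHTML Content Document, it can be read with an ordinary XML parser. The following is a minimal sketch, assuming the table of contents is carried in a nav element marked with epub:type="toc" in the http://www.idpf.org/2007/ops namespace; the sample document and the function name read_toc are hypothetical and chosen only for illustration. [ContentDocs30] remains the definitive source for the grammar.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
EPUB_NS = "http://www.idpf.org/2007/ops"

# Hypothetical minimal Navigation Document, for illustration only.
NAV_DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>Contents</title></head>
  <body>
    <nav epub:type="toc">
      <ol>
        <li><a href="chapter1.xhtml">Chapter 1</a></li>
        <li><a href="chapter2.xhtml">Chapter 2</a></li>
      </ol>
    </nav>
  </body>
</html>"""


def read_toc(nav_xml: str) -> list:
    """Return (label, href) pairs from the first nav element typed as "toc"."""
    root = ET.fromstring(nav_xml)
    for nav in root.iter(f"{{{XHTML_NS}}}nav"):
        # epub:type holds a whitespace-separated list of values.
        types = (nav.get(f"{{{EPUB_NS}}}type") or "").split()
        if "toc" in types:
            return [
                ("".join(a.itertext()).strip(), a.get("href", ""))
                for a in nav.iter(f"{{{XHTML_NS}}}a")
            ]
    return []  # no toc nav found
```

The same markup is readable by a person when rendered and by a program when parsed, which is the dual role the NCX could not fill.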

 3.3 Linking

The IDPF has established a registry of linking schemes.

[EPUBCFI] is the first scheme added to the registry, and can be used for linking into, between and within Publications. Reading System support for this scheme is required.

 3.4 Scripting and Interactivity

 3.4.1 Scripting

EPUB 3 Reading Systems may optionally support scripting, which was explicitly discouraged in EPUB 2. Scripted content must be identified as such in the package manifest [Publications30] and is subject to other restrictions and limitations as further described in Scripted Content Documents [ContentDocs30].

The new custom epubReadingSystem JavaScript object [ContentDocs30] provides scripts a means of querying a Reading System to determine its capabilities.

 3.4.2 Triggers

To facilitate content-specific user experiences for audio and video controls without requiring scripting, a new trigger element is defined in the EPUB profile of HTML5 [ContentDocs30] that allows declarative binding of activation events from image or textual elements to properties of audio and video players (e.g., play, stop, pause).

 3.4.3 Bindings

The new bindings [Publications30] element provides a means to define script-based handlers for non-standard media types.

 3.5 Styling and Layout

 3.5.1 CSS

EPUB 3 defines a profile of CSS based on CSS 2.1 with added modules from CSS3, whereas EPUB 2 was based on a specific subset of CSS 2. Refer to EPUB Style Sheets [ContentDocs30] for more information.

Support for Alternate Style Tags [ContentDocs30] has been added, allowing Users to switch between predefined alternate viewing modes, such as day/night and horizontal/vertical modes.

 3.5.2 Embedded Fonts

EPUB 3 requires Reading Systems to support the OpenType and WOFF font formats for embedded fonts in conjunction with the CSS @font-face rules. Refer to CSS Fonts Level 3 [ContentDocs30] for more information.

 3.5.3 Font Obfuscation

A new normative section on Font Obfuscation [OCF3] has been added to the Open Container Format specification. This issue was previously outlined in an IDPF informational document.
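This document does not restate the algorithm, so the following is only a sketch under stated assumptions: that the key is the SHA-1 digest of the Publication's unique identifier with space, tab, carriage return and line feed characters removed, and that the first 1040 bytes of the font data are XORed with that key, repeated as needed. The function names are hypothetical, and the details should be checked against [OCF3] before being relied on.

```python
import hashlib

# Assumption: only this many leading bytes of the font are obfuscated.
OBFUSCATED_PREFIX_LENGTH = 1040


def obfuscation_key(unique_identifier: str) -> bytes:
    """Derive a 20-byte key: SHA-1 of the identifier with whitespace removed."""
    stripped = "".join(ch for ch in unique_identifier if ch not in " \t\r\n")
    return hashlib.sha1(stripped.encode("utf-8")).digest()


def obfuscate(font_data: bytes, unique_identifier: str) -> bytes:
    """XOR the leading bytes of the font with the repeating key.

    XOR is its own inverse, so the same call also restores the font.
    """
    key = obfuscation_key(unique_identifier)
    head = bytes(
        b ^ key[i % len(key)]
        for i, b in enumerate(font_data[:OBFUSCATED_PREFIX_LENGTH])
    )
    return head + font_data[OBFUSCATED_PREFIX_LENGTH:]
```

Because the key is derived from the Publication's own identifier, a font extracted from one Publication does not work unmodified in another. That is the modest goal of obfuscation, as opposed to encryption.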

 3.6 Rich Media

 3.6.1 Audio and Video

EPUB 3 inherits support for the HTML5 audio and video elements.

EPUB 3 further specifies in its definition of support for Core Media Types [Publications30] that all Reading Systems that support audio playback must support MP3 audio and should support MP4 AAC LC audio. While no video Core Media Types are defined in this version of EPUB, an informative recommendation on codec support is provided as guidance to publishers and Reading System developers.

 3.6.2 Media Overlays

The EPUB Media Overlays 3.0 [MediaOverlays30] specification defines a format and a processing model for publication-wide synchronization of text and audio.

 3.7 Metadata

 3.7.1 Publication Metadata and Identity

The minimally required Package metadata as defined in EPUB 2.0.1 remains fundamentally unchanged. Only one new required metadata property, dcterms:modified, has been added. This new property contributes to the new solution to persistence in Publication Identifiers, as further discussed in Publication Identifiers [Publications30].
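As an illustration of how the two pieces of identity information sit together in the Package Document, the sketch below reads the unique identifier and the dcterms:modified value with a standard XML parser. It assumes the modification date is expressed as a meta element with property="dcterms:modified"; the sample Package Document and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical Package Document fragment, for illustration only.
PACKAGE_DOC = """<package xmlns="http://www.idpf.org/2007/opf"
         version="3.0" unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:uuid:A1B0D67E-2E81-4DF5-9E67-A64CBE366809</dc:identifier>
    <dc:title>Example</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2011-01-01T12:00:00Z</meta>
  </metadata>
</package>"""


def identifier_and_modified(package_xml: str):
    """Return (unique identifier, modification date); either may be None."""
    root = ET.fromstring(package_xml)
    uid_ref = root.get("unique-identifier")

    identifier = None
    for el in root.iter(f"{{{DC_NS}}}identifier"):
        # The package element points at one dc:identifier by its id.
        if uid_ref is not None and el.get("id") == uid_ref:
            identifier = (el.text or "").strip()

    modified = None
    for el in root.iter(f"{{{OPF_NS}}}meta"):
        if el.get("property") == "dcterms:modified":
            modified = (el.text or "").strip()

    return identifier, modified
```

The identifier can stay constant across revisions of a Publication while the modification date distinguishes one revision from the next.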

The generic Package Document meta element has been enhanced [Publications30] with a declarative term vocabulary association mechanism, as well as the ability to describe not only the Publication as a whole, but also individual resources and/or fragments within it. A set of EPUB-specific metadata properties has been added, allowing for example the identification of the Publication cover image, and sorting of related titles in the bookshelf.

A new metadata link [Publications30] element has been added to the Package Document, allowing the association of external supplementary metadata resources with the publication (e.g., ONIX or XMP records).

 3.7.2 Resource Metadata

The new properties attribute on the Package Document manifest item and spine itemref elements allows for the declaration of metadata about individual Publication Resources.

These declarations are required in EPUB 3 in certain defined circumstances (e.g., for declaring that a Content Document contains scripting). Refer to Manifest item Properties [Publications30] and Spine itemref Properties [Publications30] for more information.
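A short sketch of how a tool might consult these declarations follows. It assumes the properties attribute holds a whitespace-separated list of values and uses "scripted", "mathml" and "nav" as example values; the sample manifest and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"

# Hypothetical manifest, for illustration only.
MANIFEST_DOC = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml"
          properties="nav"/>
    <item id="c1" href="c1.xhtml" media-type="application/xhtml+xml"
          properties="scripted mathml"/>
    <item id="c2" href="c2.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
</package>"""


def items_with_property(package_xml: str, prop: str) -> list:
    """Return the ids of manifest items that declare the given property."""
    root = ET.fromstring(package_xml)
    return [
        item.get("id")
        for item in root.iter(f"{{{OPF_NS}}}item")
        if prop in (item.get("properties") or "").split()
    ]
```

With declarations like these, a Reading System that does not support scripting can identify scripted Content Documents from the manifest alone, without opening each resource.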

 3.8 Speech

Multiple features to assist Text-to-Speech (TTS) engines have been added. These include Package-level Pronunciation Lexicons, SSML attributes in XHTML Content Documents, and support for the CSS3 Speech Module. [ContentDocs30]

 3.9 Manifest Fallbacks

The manifest fallback mechanism has been restricted to only apply to documents in the spine. Publication Resources referenced from XHTML and SVG Content Documents and CSS must now be Core Media Types unless referenced in a context that provides native intrinsic fallback capabilities.
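For spine items that still rely on the mechanism, resolving a fallback amounts to following a chain of manifest items until a supported media type is reached. The sketch below assumes a simplified in-memory manifest that maps each item id to a (media type, fallback id) pair; this data structure, the function name and the sample media types are hypothetical and are not part of any specification.

```python
def resolve_fallback(manifest: dict, item_id: str, supported: set):
    """Follow fallback links from item_id to the first supported item.

    manifest maps an item id to a (media_type, fallback_id_or_None) pair.
    Returns the id of the first item whose media type is in `supported`,
    or None if the chain ends, dangles, or loops without reaching one.
    """
    seen = set()
    current = item_id
    while current is not None and current not in seen:
        seen.add(current)  # guard against circular fallback chains
        entry = manifest.get(current)
        if entry is None:
            return None  # fallback points at an item that does not exist
        media_type, fallback_id = entry
        if media_type in supported:
            return current
        current = fallback_id
    return None
```

For example, given an item "page" of type application/x-demo whose fallback is an SVG item "page-svg", and a supported set containing image/svg+xml, the function returns "page-svg".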

 3.10 Containment

 3.10.1 Remote Resources

There are new restrictions on references [Publications30] to remote resources (i.e., Publication Resources not located in the OCF Container). The implications of this change are more fully described in Filesystem Container (section 4.1.4).

 3.10.2 Whitespace in MIMETYPE file

[OCF2] restricted the required MIMETYPE file from having any leading or trailing spaces; in [OCF3] the restriction against trailing whitespace has been removed.
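The difference between the two rules can be sketched as a single check with a switch. The function below assumes the file's content must otherwise equal the string application/epub+zip exactly; the function name and the allow_trailing_whitespace parameter are hypothetical and only illustrate the change described above.

```python
EPUB_MIMETYPE = b"application/epub+zip"


def mimetype_ok(content: bytes, allow_trailing_whitespace: bool) -> bool:
    """Check the bytes of the mimetype file.

    allow_trailing_whitespace=False models the stricter [OCF2] rule;
    True models the relaxed [OCF3] rule. Leading whitespace is rejected
    in both cases.
    """
    if allow_trailing_whitespace:
        content = content.rstrip(b" \t\r\n")
    return content == EPUB_MIMETYPE
```

A file ending in a newline, as many text editors produce, fails the check in the stricter mode and passes in the relaxed one.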

 3.10.3 Disallowed characters in OCF file names

The list of characters disallowed in OCF file names has been extended.

 3.11 XML and Unicode

Support for XML 1.1, which was deprecated in OPS 2.0.1, has been removed. All XML documents must now be conformant to XML 1.0.

The referenced version of XML 1.0 is the fifth edition, which means that Unicode version 5.0.0 is now supported (OPS 2.0.1, via its use of XML 1.0 fourth edition, supported Unicode 2.0).

 4 EPUB 2.0.1 Features Replaced in EPUB 3

A number of features in EPUB 3 and its building-block Web Standards replace existing features in EPUB 2.0.1, and several features in EPUB 2.0.1 that were not widely adopted by content authors or Reading Systems are discontinued. Such features, from a content conformance perspective, are either removed (which means that conformant EPUB 3 content may not use the construct) or deprecated (which means that use of the construct in EPUB 3 is allowed but not recommended). Note that, in most cases, Reading Systems are still required to support these constructs for backwards compatibility reasons (as normatively stated in the relevant specifications).

The following sections list the EPUB 2.0.1 features removed and deprecated in EPUB 3.

 4.1 Features Removed from EPUB 3

 4.1.1 DTBook

DAISY DTBook [Z3986-2005] was an alternative syntax to XHTML 1.1 for Content Documents in OPS 2.0.1 [OPS2] in order to provide an option for more semantic, and thus more accessible, content. As HTML5 includes intrinsic semantic markup capabilities of a similar nature to DTBook, DTBook is no longer an alternative syntax in EPUB 3.

 4.1.2 Out-of-Line XML Islands

OPF 2.0.1 specified an optional extension mechanism enabling a spine item to be a "custom module" XHTML or arbitrary XML styled with CSS. This feature was not widely adopted by content authors or Reading Systems, and has been removed from EPUB 3. As a result, the item element no longer has an optional fallback-style attribute.

 4.1.3 Tours

The Package Document schema no longer includes the tours element (which was deprecated in OPF 2.0.1).

 4.1.4 Filesystem Container

OCF 3.0 [OCF3] only defines a single-file (ZIP-based) container, and no longer defines a "Filesystem Container" abstraction. This change was made in conjunction with new provisions in Publications 3.0 that restrict references to remote resources in EPUB Publications to specific media types and contexts. Taken together, these changes mean that the only instantiation of an EPUB Publication defined at this time is the EPUB ZIP Container, and that EPUB files must in general contain all constituent parts of the Publication, with certain well-defined exceptions.

These changes may seem counter-intuitive given that online consumption of content is increasingly prevalent, as are browser-based Reading System implementations. The Working Group recognizes this, and understands that in an online environment, particularly when browser-based, it will often be inefficient or even impractical to download an entire EPUB file to a client system before reading can occur. A number of significant issues exist for browser-based Reading Systems, however, including cross-domain resource loading restrictions in the browser security model and the potential for inadvertent interaction between script-based interactivity within EPUB 3 content and script-based Reading System implementations. Publishers and content distributors providing EPUB content are presently utilizing server-based software to manage these issues, in effect creating a distributed client-server Reading System in which a packaged EPUB file is ingested on a server and may be transformed en route to client software into whatever set of resources is convenient for that implementation. Consequently, there was no pressing requirement to define an interoperable distributed form of an EPUB Publication in order to meet the requirements of the Working Group charter.

EPUB 2.x (via the OCF Filesystem container and by being relatively vague in the OPF specification about where absolute URLs were legal) can be considered to have incompletely described distributed publications without specifying conformance requirements for them. This was a combination of historical (with respect to OPF which was a revision to a predecessor specification that pre-dated any ZIP-based container) and aspirational (with respect to OCF) factors.

Since the Working Group did have a goal to improve the interoperability of the EPUB ecosystem by increasing the clarity and rigor of its conformance requirements, it was decided that these partial definitions were unhelpful and should be removed from the EPUB 3 base specifications. The Working Group understands that networked Publications will be increasingly important, and expects future work to include development of robust interoperable conformance definitions for distributed EPUB Publications based on emerging content publisher and Reading System requirements.

 4.2 Features Deprecated in EPUB 3

 4.2.1 Guide

Use of the optional guide element in the Package Document has been deprecated in favor of the EPUB Navigation Document landmarks feature. Refer to EPUB Navigation Documents [ContentDocs30] for more information.

 4.2.2 NCX

As described in more detail in Navigation above, the NCX has been superseded by EPUB Navigation Documents [ContentDocs30].

 Appendix A. Acknowledgements and Contributors

This appendix is informative.

EPUB has been developed by the International Digital Publishing Forum in a cooperative effort, bringing together publishers, vendors, software developers, and experts in the relevant standards.

The EPUB 3 specifications were prepared by the International Digital Publishing Forum’s EPUB Maintenance Working Group, operating under a charter approved by the membership in May, 2010 under the leadership of:

Active members of the working group included:

IDPF Members

Invited Experts/Observers

For more detailed acknowledgements and information about contributors to each version of EPUB, refer to Acknowledgements and Contributors [EPUB3Overview].

 References

Informative References

[EPUB3Overview] EPUB 3 Overview. Garth Conboy, et al.