EPUB Distributable Objects 1.0

Draft Specification 23 July 2015

This version:

http://www.idpf.org/epub/do/epub-do-20150723.html

Latest version:

http://www.idpf.org/epub/do/

Previous version:

http://www.idpf.org/epub/do/epub-do-20150706.html

Copyright © 2014-2015 International Digital Publishing Forum™

All rights reserved. This work is protected under Title 17 of the United States Code. Reproduction and dissemination of this work with changes is prohibited except with the written permission of the International Digital Publishing Forum (IDPF).

EPUB® is a registered trademark of the International Digital Publishing Forum.

Editors

Darryl Lehmann, Imagineeringart.com

Bill Kasdorf, Apex CoVantage

Markus Gylling, International Digital Publishing Forum (IDPF)

Matt Garrish, Invited Expert

Status of this Document

This document is a work in progress Draft Specification, produced by the IDPF EPUB 3 Working Group. This document may be updated, replaced, or rendered obsolete by other documents at any time.

Table of Contents

1. Overview

1.1 Purpose and Scope

1.2 Terminology

1.3 Typographic Conventions

1.4 Conformance Statements

2. Object States

2.1 Introduction

2.2 Embedded Objects

2.2.1 Conformance

2.2.2 The distributable-object Collection

2.2.2.1 Package Document Structure

2.2.2.2 Nested Objects

2.2.3 Metadata

2.2.3.1 General

2.2.3.2 Release Identifier

2.2.3.3 Fixed Layout

2.2.3.4 Accessibility

2.2.3.5 The local-manifest-designator Property

2.2.3.6 Publication Type Conformance

2.2.4 Discrete Entities

2.2.5 Resource Location

2.3 Packaged Objects

2.3.1 Introduction

2.3.2 Conformance

2.3.3 Metadata

2.3.4 EPUB Content Documents

2.3.5 Embedded Objects

2.4 Identifiers

2.5 Rights

2.6 Multi-Component Distribution

3. Translating Objects

3.1 Introduction

3.2 Embedded to Packaged

3.2.1 Translation Process

3.2.2 Create Package Document

3.2.3 Filter EPUB Content Documents

3.2.4 Resource Renaming

3.2.5 EPUB Navigation Document

3.3 Packaged to Embedded

3.3.1 Translation Process

3.3.2 Create the distributable-object Collection

3.3.3 Migrate Resources

3.3.4 Incorporate Content Fragments

3.3.5 EPUB Navigation Document

3.3.6 Media Overlays

3.3.7 Non-Rendering Resources

3.4 Encryption and Obfuscation

3.5 EPUB Canonical Fragment Identifiers

Appendix A ‒ Example

A.1 Side-by-Side Comparison

A.2 Package Document

A.3 Collection

References

Normative References

Informative References

1. Overview

1.1 Purpose and Scope

This section is informative

This specification, EPUB Distributable Objects, defines a method for the encapsulation, transportation, and integration of Distributable Objects in EPUB® Publications.

Distributable Objects are components of an EPUB Publication that can be reused in other contexts. A Distributable Object can be a complete EPUB Content Document (e.g., a chapter of a book), a section of such a document (e.g., an exercise or a promotional excerpt), a media resource (e.g., a video or interactive feature), or a combination of such resources that are not necessarily contiguous within the parent EPUB Publication but are intended to be distributed as a unit.

This specification provides a framework for identifying these Distributable Objects within an EPUB Publication, extracting them for transport, and integrating them again into a Destination EPUB. Note, however, that both extraction and integration are optional stages in the lifecycle of a Distributable Object; it can be born and live only in its packaged state, be born packaged and only integrated, or be embedded in a source and extracted.

The specification does not address the nature of Distributable Objects or how they are exchanged. This omission is intentional, as the framework for addressing and sharing objects is designed to facilitate many different workflows: exchanging Distributable Objects in an open textbook environment, integrating Distributable Objects into content and learning management systems, identifying components of an EPUB Publication for sale separately, and so on. In that vein, only re-integration in an EPUB Publication is defined, as there are many possible non-EPUB integration scenarios.

In effect, this framework enables the exchange of Distributable Objects between any two parties, while at the same time providing a consistent framework for exchange standards to be developed without continually having to reinvent the markup and processing.

1.2 Terminology

Refer to the EPUB Specifications for definitions of EPUB-specific terminology used in this document.

Destination EPUB

The EPUB Publication into which a Distributable Object is integrated.

Distributable Object

A collection of discrete content entities and associated resources that comprise a single logical unit of content, usable in other contexts.

Embedded Object

A Distributable Object as defined in a collection element in the Package Document of a Source or Destination EPUB.

Packaged Object

An EPUB Publication whose sole content is a single Distributable Object for distribution.

Processing Agent

The system, or human, responsible for translating a Distributable Object from one state to another.

Source EPUB

The EPUB Publication from which a Distributable Object is extracted.

1.3 Typographic Conventions

This section is informative

The following typographic conventions are used in this specification:

markup

All markup (elements, attributes, properties), code (JavaScript, pseudo-code), machine processable values (string, characters, media types) and file names are in red-orange monospace font.

markup

Links to markup and code definitions are underlined and in red-orange monospace font. Only the first instance in each section is linked.

http://www.idpf.org/

URIs are in navy blue monospace font.

hyperlink

Hyperlinks are underlined and in blue.

[reference]

Normative and informative references are enclosed in square brackets.

Term

Terms defined in the Terminology are in capital case.

Informative markup examples are in monospace font.

NOTE

Informative notes are preceded by a "Note" header.

1.4 Conformance Statements

The keywords must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this document are to be interpreted as described in [RFC2119].

All sections of this specification are normative except for examples, or except for sections identified by the informative status label "This section is informative". The application of informative status to sections and appendices applies to all child content and subsections they may contain.

2. Object States

2.1 Introduction

This section is informative

There are two states in which Distributable Objects exist at any point in time: embedded and packaged. The embedded state occurs when the Distributable Object is integrated into an EPUB Publication intended for Users, whether the initial source it was created for or a destination it has been reused or integrated in. In this state, the Distributable Object is inseparable from the rest of the content it has been integrated with, and only a collection element in the Package Document identifies the metadata and resources necessary to extract it.

To create the packaged version of the Distributable Object, the information in the embedded collection is used to extract all the needed resources and create a Package Document for the new Packaged Object ‒ an EPUB Publication that contains only the Distributable Object. (Note that nothing in this specification prevents a Distributable Object from being "born" directly in its packaged state.)

The Packaged Object might be intended for Users, or could exist only to get the Distributable Object from one location to another. There is fundamentally no difference between an Embedded Object and a Packaged Object; all that differs is the manner in which the information is serialized. That the Distributable Object can be moved around in a consistent manner is the key.

The specification cannot guarantee that the Distributable Object will pass validation in this state, however, as the entities that compose the Distributable Object might not constitute valid Content Documents on their own (e.g., shared SVG components, or fragments of HTML not valid as body markup).

2.2 Embedded Objects

2.2.1 Conformance

A conformant Embedded Object must meet all of the following criteria:

It must conform to all structural requirements in 2.2.2 The distributable-object Collection.

It must include all required metadata as defined in 2.2.3 Metadata.

It must reference one or more discrete content entities as defined in 2.2.4 Discrete Entities.

2.2.2 The distributable-object Collection

2.2.2.1 Package Document Structure

The Package Document collection element [Publications301] is used to identify a Distributable Object embedded within an EPUB Rendition, enumerate its resources and indicate its reading order.

The collection must have a role attribute value specified. The generic identifier "distributable-object" is required for any collection that can be extracted or integrated using this framework, unless overridden by other specifications that make use of this framework for transporting specialized types.

The collection must include the following components, which constitute a minimal Package Document:

The following example shows the minimal collection structure for an extractable chapter.

<collection role="distributable-object">

   <metadata>

      <dc:title>Phantom Textbook - Chapter 1</dc:title>

     

   </metadata>

   <collection role="manifest">

      <link href="css/epub.css" media-type="text/css"/>

      <link href="xhtml/chapter01.xhtml" media-type="application/xhtml+xml"/>

      …

   </collection>

   <link href="xhtml/chapter01.xhtml"/>

</collection>

The collection may also nest one or more collection elements (e.g., Nested Objects).

NOTE

For more information about the use of the collection element as a Package Document, refer to [CollectionPkgInfo].

2.2.2.2 Nested Objects

A Distributable Object may be composed of one or more child Distributable Objects ‒ called Nested Objects. These Nested Objects may, in turn, nest other Distributable Objects, and so on.

In such cases, the Embedded Object must include a child distributable-object collection for each nested Distributable Object.

The following example shows a chapter with a nested shared sidebar on the monarchy. Note that the sidebar is a fragment of the chapter's XHTML Content Document.

<collection role="distributable-object">

   <metadata>

      …

   </metadata>

   <collection role="manifest">

      …

   </collection>

   <collection role="distributable-object">

      <metadata>

         …

      </metadata>

      <collection role="manifest">

         …

      </collection>

      <link href="chapter01.xhtml#history-monarchy" media-type="application/xhtml+xml"/>

   </collection>

   <link href="chapter01.xhtml" media-type="application/xhtml+xml"/>

</collection>

The manifest collection for the Embedded Object must list all unique resources necessary to render all Nested Objects, including all those nested inside Nested Objects. The manifest collection for each Nested Object, in turn, must list only those resources required to render it and any Nested Objects inside it.

Consequently, resources may be listed in the parent manifest and in the manifests of one or more Nested Objects.

The following example shows the manifests for successive nestings of Distributable Objects. Notice that chapter01.xhtml is listed in all the manifests, as it also contains the necessary fragments for each Nested Object.

<collection role="distributable-object">

   …

   <collection role="manifest">

      <!-- components needed for the assessment -->

      <link href="css/assessment.css"/>

      <!-- components needed for the scripted component -->

      <link href="css/component.css"/>

      <link href="css/component.js"/>

      <!-- components needed for the chapter -->

      <link href="css/book.css"/>

      <link href="img/c01img01.jpg"/>

      <link href="img/c01img02.jpg"/>

      <link href="img/c01img03.jpg"/>

      <link href="audio/c01clip01.mp3"/>

      <link href="xhtml/chapter01.xhtml"/>

   </collection>

   <!-- nested object for a testing component -->

   <collection role="distributable-object">

      …

      <collection role="manifest">

         <link href="css/assessment.css"/>

         <link href="css/component.css"/>

         <link href="css/component.js"/>

         <link href="xhtml/chapter01.xhtml"/>

      </collection>

     

      <!-- nested object for a scripted component -->

      <collection role="distributable-object">

         …

         <collection role="manifest">

            <link href="css/component.css"/>

            <link href="css/component.js"/>

            <link href="xhtml/chapter01.xhtml"/>

         </collection>

         <link href="xhtml/chapter01.xhtml#interactive"/>

      </collection>

      <link href="xhtml/chapter01.xhtml#test01"/>

   </collection>

   <link href="xhtml/chapter01.xhtml"/>

</collection>

NOTE

Resources in Nested Objects have to be listed in their ancestor manifests as each Nested Object could be extracted using this framework, not just the top-most Embedded Object. Failure to list all resources complicates such extraction.
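The manifest rule for Nested Objects can be checked mechanically. The following is a minimal sketch in Python, not part of this specification: the function names are hypothetical, the collection is assumed to be in the Package Document namespace, and href values are compared as plain strings (a fuller check would normalize paths first).

```python
import xml.etree.ElementTree as ET

OPF = "http://www.idpf.org/2007/opf"  # Package Document namespace


def manifest_hrefs(obj):
    """Return the set of hrefs listed in the manifest collection of one object."""
    hrefs = set()
    for child in obj.findall(f"{{{OPF}}}collection"):
        if child.get("role") == "manifest":
            hrefs.update(link.get("href") for link in child.findall(f"{{{OPF}}}link"))
    return hrefs


def find_unlisted(obj, path="object"):
    """Return (nested object path, href) pairs for resources that a Nested
    Object lists but its parent manifest does not."""
    problems = []
    parent = manifest_hrefs(obj)
    nested = [c for c in obj.findall(f"{{{OPF}}}collection")
              if c.get("role") == "distributable-object"]
    for index, child in enumerate(nested, start=1):
        child_path = f"{path}/nested[{index}]"
        for href in sorted(manifest_hrefs(child) - parent):
            problems.append((child_path, href))
        # Checking each level against its direct parent covers all ancestors.
        problems.extend(find_unlisted(child, child_path))
    return problems
```

An empty list means every Nested Object's resources also appear in each of its ancestor manifests.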

2.2.3 Metadata

2.2.3.1 General

Embedded Objects must include the required Dublin Core metadata properties defined in [Publications301].

The Distributable Object must include a dc:type element with the identifier "distributable-object" for any generic Distributable Object that can be extracted or integrated using this framework. Other specifications that use this framework may override this requirement and use their own identifiers. In either case, the role attribute and dc:type element values must match.

NOTE

Although the role attribute identifies the nature of the collection, including a dc:type element simplifies moving a Distributable Object from its embedded state to its packaged state.
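The requirement that the role attribute and a dc:type value match lends itself to a simple check. The sketch below is illustrative only; the function name is hypothetical, and it treats the requirement as satisfied when any dc:type value equals the role, since additional dc:type elements can be present for other purposes.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def role_matches_type(collection):
    """Return True if the collection's role is repeated in one of its dc:type elements."""
    role = collection.get("role")
    types = {
        (el.text or "").strip()
        for el in collection.findall("opf:metadata/dc:type", NS)
    }
    return role is not None and role in types
```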

The following [DCMES] elements are additionally recommended for describing the Distributable Object:

If the Embedded Object includes Media Overlays [MediaOverlays301], a media:duration metadata property must be included that identifies the total time for playback of the object.

Other metadata may be used to describe the Distributable Object, and conformance with specific EPUB Publication types [Publications301] may introduce other metadata requirements.

The following example shows an extended metadata set for an extractable chapter.

<collection role="distributable-object">

   <metadata>

      <dc:type>distributable-object</dc:type>

      <dc:title>Phantom Textbook - Chapter 1</dc:title>

      <dc:language>en</dc:language>

      <dc:identifier id="e6c30273-7d23-4b8f-90f8-9fab5c1cdadc">urn:isbn:9781234567890</dc:identifier>

      <meta property="identifier-type" refines="#e6c30273-7d23-4b8f-90f8-9fab5c1cdadc">unique-identifier</meta>

      <meta property="dcterms:modified">2014-11-11T12:34:56Z</meta>

      <dc:creator>Jane Doe</dc:creator>

      <dc:description>Introduction to the history of phantasm. For sale separately.</dc:description>

      <dc:source>urn:isbn:9780987654321</dc:source>

      <dc:date>2014-10-31</dc:date>

      <dc:rights>All rights reserved. Not available for use or sale except by authorized vendors.</dc:rights>

     

   </metadata>

   …

</collection>

2.2.3.2 Release Identifier

A dc:identifier element and dcterms:modified property serve as the release identifier [Publications301] for the Distributable Object.

To ensure the creation of this identifier, Embedded Objects must have exactly one primary expression of the dcterms:modified property. As collections lack a unique-identifier attribute [Publications301], the dc:identifier element that contains the unique identifier [Publications301] must be identified by attaching an identifier-type property [Publications301] with the value "unique-identifier" to it.

The following example shows how the unique identifier is identified.

<dc:identifier id="uid">urn:isbn:9780000000001</dc:identifier>

<meta property="identifier-type" refines="#uid">unique-identifier</meta>
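A Processing Agent could derive the release identifier from this markup along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, a "primary expression" is taken to mean a meta element without a refines attribute, and the "@" separator follows the string form suggested for release identifiers in [Publications301].

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def release_identifier(metadata):
    """Combine the marked dc:identifier and dcterms:modified of a metadata element."""
    metas = metadata.findall("opf:meta", NS)
    # ids of the elements refined as the unique identifier
    marked = {
        meta.get("refines", "").lstrip("#")
        for meta in metas
        if meta.get("property") == "identifier-type"
        and (meta.text or "").strip() == "unique-identifier"
    }
    identifiers = [
        el for el in metadata.findall("dc:identifier", NS) if el.get("id") in marked
    ]
    # primary expressions only: no refines attribute
    modified = [
        meta for meta in metas
        if meta.get("property") == "dcterms:modified" and meta.get("refines") is None
    ]
    if len(identifiers) != 1 or len(modified) != 1:
        raise ValueError("expected one unique identifier and one dcterms:modified")
    return f"{identifiers[0].text.strip()}@{modified[0].text.strip()}"
```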

The dcterms:modified value must not be updated when only a change in the state of the Distributable Object occurs (i.e., when it is repackaged as a Packaged Object or integrated into a Destination EPUB). In order to identify the Distributable Object as it moves around, changes to the identifier and last modified time only occur when making changes to the content and metadata. For example:

2.2.3.3 Fixed Layout

Embedded Objects must not rely on fixed-layout metadata expressions set using meta element properties [Publications301] of the Source EPUB, as such expressions are not easily translated to the Packaged Object (and vice versa). Authors should use spine itemref overrides [Publications301] to indicate the fixed-layout properties of EPUB Content Documents, as these are more directly tied to the documents (i.e., they can be looked up and translated without a complex system of comparisons, additions, subtractions and overrides).

For example, if the XHTML Content Documents in a Packaged Object are identified as fixed layout using a global meta expression, that information is lost as the meta element ends up in the Embedded Collection when integrated. As a result, the XHTML Content Documents will take on the nature of the Destination EPUB by default, which could be reflowable.

If, on the other hand, the properties are defined on each XHTML Content Document's spine itemref entry, the properties are easily copied between states and remain local to the documents.

2.2.3.4 Accessibility

The accessible nature of a Distributable Object is identified using the accessibility metadata properties [A11YProperties] from the [schema.org] CreativeWork type.

Each Distributable Object must identify all applicable accessibility features using the accessibilityFeature property. The property must be repeated for each applicable value.

If the given Rendition contains no accessible features, or the Author is unable to state which apply, a single accessibilityFeature declaration specifying the value "none" must be included.

It is recommended that the other properties be specified whenever applicable.

The vocabulary of recommended terms to use with these properties is maintained at the W3C Web Schemas Wiki [A11YProperties].

For more information on using these properties in the Package Document, refer to the Schema.org Integration Guide [SchemaGuide].
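The accessibilityFeature requirements above can be expressed as a small check. The sketch below is an illustration only: the function name is hypothetical, and the "schema:" prefix on the property name is an assumption based on the metadata examples in this specification.

```python
import xml.etree.ElementTree as ET

OPF = "http://www.idpf.org/2007/opf"


def accessibility_features(metadata):
    """Return the declared accessibilityFeature values, or raise if the
    declarations do not satisfy the requirements."""
    values = [
        (meta.text or "").strip()
        for meta in metadata.findall(f"{{{OPF}}}meta")
        if meta.get("property") == "schema:accessibilityFeature"
    ]
    if not values:
        raise ValueError("no accessibilityFeature declared; use 'none' if none apply")
    if "none" in values and len(values) > 1:
        raise ValueError("'none' must be the only accessibilityFeature value")
    return values
```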

2.2.3.5 The local-manifest-designator Property

Warning: This property is subject to removal before this specification reaches recommendation status. Please see issue 484.

Description

The local-manifest-designator property allows manifest properties [Publications301] that are restricted to one instance in an EPUB Rendition ‒ such as "nav" and "cover-image" ‒ to be specified for resources in a collection. Authors can then create these special resources specifically for collections, without their use being lost during packaging into a standalone EPUB Publication.

The property must be specified on meta elements whose refines attributes reference the resource the property applies to.

This property should not be used to specify manifest properties that can be carried directly on an item without restriction (e.g., "scripted" and "mathml").

Allowed Value(s)

Any value allowed in the properties attribute [Publications301].

No default value.

Cardinality

In the metadata section: Zero or more

Attached to resources: Zero or more

Extends

Publication Resources in a collection.

Example

<meta property="local-manifest-designator" refines="object01/nav.xhtml">nav</meta>

2.2.3.6 Publication Type Conformance

A Distributable Object may conform to a particular type of EPUB Publication [Publications301], in which case it must include a dc:type element declaration for each type to which it conforms.

2.2.4 Discrete Entities

Objects are composed of one or more discrete content entities, each of which must be identified in a child link element of the Embedded Object collection.

Each link element must point to an EPUB Content Document, or element fragment thereof. The sequence of child link elements represents the default linear reading order of the Distributable Object (i.e., its spine).

If the IRI [RFC3987] in a link element href attribute includes a fragment identifier, only the referenced element is part of the Distributable Object. The link element href attribute must not include EPUB canonical fragment identifiers [EPUBCFI].

This specification places no restrictions on what markup a discrete entity can reference, but see 2.3.4 EPUB Content Documents for additional consideration.

The Author of the Distributable Object must ensure that all resources necessary to render the Distributable Object are identified in the manifest and child link elements, including all linked entities that belong to the Distributable Object (e.g., via [HTML5] a elements).

Conformance with specific EPUB Publication types may introduce other requirements.

The following example shows a Distributable Object declaration for a learning object that contains MathML markup, a fallback image and two style sheets.

<collection role="distributable-object">

   <metadata>

      <dc:type>distributable-object</dc:type>

      <dc:title>Learning Object 1</dc:title>

      <dc:identifier id="e6c30273-7d23-4b8f-90f8-9fab5c1cdadc">urn:uuid:5856b480-e1be-11e3-8b68-0800200c9a66</dc:identifier>

      <meta property="identifier-type" refines="#e6c30273-7d23-4b8f-90f8-9fab5c1cdadc">unique-identifier</meta>

      <dc:creator>Object Author</dc:creator>

      <dc:language>en</dc:language>

      <meta property="schema:typicalAgeRange">9-11</meta>

      <meta property="schema:accessibilityFeature">mathml</meta>

      …

   </metadata>

   <collection role="manifest">

      <link href="images/mathml01.gif" media-type="image/gif"/>

      <link href="css/epub.css" media-type="text/css"/>

      <link href="css/mathml.css" media-type="text/css"/>

      <link href="fonts/STIXGeneral.woff" media-type="application/font-woff"/>

      …

   </collection>

   <link href="xhtml/chapter01.xhtml#learningobject01"/>

</collection>
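When processing child link elements, it helps to separate the document path from the fragment identifier early and to reject EPUB canonical fragment identifiers. The following is a minimal sketch; the function name is hypothetical.

```python
from urllib.parse import urldefrag


def split_entity(href):
    """Split a link href into (document, element id or None).

    Raises ValueError for EPUB canonical fragment identifiers, which are
    not allowed in the href of a discrete entity.
    """
    document, fragment = urldefrag(href)
    if fragment.startswith("epubcfi("):
        raise ValueError(f"EPUB CFI not allowed in link href: {href}")
    return document, fragment or None
```

For the example above, the href "xhtml/chapter01.xhtml#learningobject01" splits into the document "xhtml/chapter01.xhtml" and the element id "learningobject01".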

2.2.5 Resource Location

Although the local resources needed to render the Distributable Object can be hosted anywhere in the EPUB Container, storing them all in a directory with a unique name is recommended (e.g., a directory named using a universally unique identifier [RFC4122] or the dc:identifier value). This practice will help avoid naming collisions as the Distributable Object moves around.

As this framework allows fragments to be transported, and resources necessary for the rendering are often shared, encapsulation in a single directory will not always be possible.

Specifications that make use of this framework and can ensure their Distributable Objects are completely self-contained should enforce encapsulation.

2.3 Packaged Objects

2.3.1 Introduction

This section outlines the requirements for a Packaged Object ‒ a Distributable Object that has been bundled separately as an EPUB Publication for transport. As a Packaged Object is just a different serialization of an Embedded Object, the requirements are largely similar.

References to the appropriate sections in Embedded Objects are provided, with only requirements particular to packaging as an EPUB Publication detailed.

2.3.2 Conformance

A conformant Packaged Object must meet all of the following criteria:

It must follow all requirements for EPUB Publications in [Publications301], except that its EPUB Content Documents may contain document fragments (see 2.3.4 EPUB Content Documents).

It must include all required metadata as defined in 2.3.3 Metadata.

2.3.3 Metadata

The Package Document metadata must include all metadata for Embedded Objects defined in 2.2.3 Metadata.

2.3.4 EPUB Content Documents

EPUB Content Documents in a Packaged Object may contain markup fragments that invalidate the document. As a result, a Packaged Object is not always a valid EPUB Publication.

Authors should avoid creating invalid EPUB Content Documents, however, reserving such fragment transmission to private-use exchange where there is agreement between the distributor and recipient.

2.3.5 Embedded Objects

A Packaged Object may contain Embedded Object definitions in its Package Document.

These Embedded Objects become Nested Objects when the Packaged Object is translated into an Embedded Object.

2.4 Identifiers

If a Distributable Object is intended to be integrated into a Destination EPUB, all id attribute values in the Package Document related to the Distributable Object (e.g., on metadata elements and manifest item elements) should be universally unique to avoid naming collisions. For example, the values could be UUIDs [RFC4122] that begin with alphabetic characters.
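One way to produce such values is sketched below. It is an illustration rather than a requirement: the function name is hypothetical, and the sketch simply regenerates random UUIDs until one starts with a letter, because XML id values cannot begin with a digit.

```python
import uuid


def unique_xml_id():
    """Return a random UUID string whose first character is a letter (a-f)."""
    while True:
        candidate = str(uuid.uuid4())
        if candidate[0].isalpha():
            return candidate
```

Roughly six in sixteen random UUIDs start with a letter, so the loop normally ends after a few iterations.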

2.5 Rights

The presence of a Distributable Object, whether embedded or packaged, does not, in itself, confer any rights to reuse the Distributable Object in any other context.

Usage is only permitted according to the terms of an included rights statement, or a link to where such information can be found. In the absence of such a statement, no rights can be inferred.

2.6 Multi-Component Distribution

This specification does not define a method for defining more than one Distributable Object in an EPUB Container. Each Container must include only a single Distributable Object.

Multiple Distributable Objects can be zipped together when it is necessary to bundle more than one together for distribution.
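Bundling several Packaged Objects into one archive might look like the following sketch. The function name and file layout are hypothetical; entries are stored without compression because each Packaged Object is already a compressed container.

```python
import os
import zipfile


def bundle_objects(epub_paths, bundle_path):
    """Write each Packaged Object file into a single ZIP archive for distribution."""
    names = [os.path.basename(path) for path in epub_paths]
    if len(set(names)) != len(names):
        raise ValueError("Packaged Object file names must be unique within the bundle")
    with zipfile.ZipFile(bundle_path, "w", zipfile.ZIP_STORED) as bundle:
        for path, name in zip(epub_paths, names):
            bundle.write(path, arcname=name)
```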

3. Translating Objects

3.1 Introduction

This section is informative

This section details the general steps necessary to translate a Distributable Object from its embedded state to its packaged state, and vice versa.

An exact method of translating content is not defined, as many technologies could be used in the translations (XSLT, procedural languages, etc.). It is also possible for humans to manually translate Distributable Objects from one state to another.

Depending on the complexity of the Distributable Object being translated, and the state into which it is being translated, the translation process might include a mix of automated and manual steps. In the case of content fragments, for example, a machine can integrate the EPUB Content Document containing the fragments into an EPUB Publication, but a human might have to determine whether the fragments need to be subsequently moved to a more appropriate location in another EPUB Content Document, and where that location is.

As a result, a general understanding of the translation process is necessary for anyone looking to work with Distributable Objects, as manual modifications can lead to changes to the Embedded Object definition (e.g., path and file name changes).

3.2 Embedded to Packaged

3.2.1 Translation Process

This section is informative

The process of creating a Packaged Object from an Embedded Object involves the following general steps:

Although these steps are broken out logically in this document in order to analyze the necessary transformation requirements, in real-world practice they could happen in different order or even simultaneously. There are also many technologies that could be used in the translation process, and different intended end uses of the Distributable Object can impact on processing.

As a result, this section avoids strict requirements about how the content is transformed and details only what needs to happen to create a valid Packaged Object.

Also omitted are all standard requirements for packaging an EPUB Publication, such as creating the container.xml file and the OCF Container.

NOTE

The translation process detailed in this section only deals with the core EPUB 3.0.1 specifications. Distributable Objects that employ features defined in other IDPF specifications might require additional handling to ensure proper translation.

3.2.2 Create Package Document

The following list provides the general set of steps necessary to translate the elements of an Embedded Object's distributable-object collection to a Package Document.

  1. Copy any internationalization attributes (dir, xml:lang) on the collection element to the package element. If these attributes are specified on the package element of the Source EPUB, copy them from that element, with precedence given to declarations on the collection.
  2. Copy the metadata element unchanged to the package element, ensuring any namespace declarations are preserved.
  3. If any non-reserved prefixes are bound to the collection (whether declared in the collection or on related item/itemref entries), copy their declarations to a prefix attribute on the package element.
  4. Add a unique-identifier attribute to the package element that references the id of the dc:identifier element marked as the unique identifier for the collection (see 2.2.3.2 Release Identifier).
  5. For each link element in the manifest collection:
     1. Use its href attribute value to look up the corresponding manifest item element and copy that item to the new Package Document's manifest. Avoid simple string comparisons of the attribute values, as different paths may refer to the same resource (e.g., "./foo.xhtml" and "foo.xhtml"). Also, avoid translating the link elements directly into item elements unless the nature of the Distributable Object can always be predicted, as important information could be lost ‒ e.g., the manifest properties and fallback attribute values.
     2. Copy the referenced resource to the same location in the Packaged Object's directory structure. (See also 3.2.3 Filter EPUB Content Documents and 3.2.4 Resource Renaming.)
  6. Translate the collection element's child link elements to spine itemref elements as follows:
     1. If the URL specified in the link href attribute includes a fragment identifier:
        1. If the EPUB Content Document containing the fragment has not been encountered in a preceding link element, add an itemref for it to the spine.
        2. If a fragment from the EPUB Content Document has already been encountered, ignore the link.
     2. If no fragment identifier is included, the itemref must reference the manifest entry for the specified EPUB Content Document.
  7. For each spine itemref created in the previous step, check whether the referenced EPUB Content Document is linear by inspecting the spine of the Source EPUB, and copy the linear attribute accordingly. Also copy the properties attribute, if present.
  8. If the Source EPUB makes use of bindings [Publications301], check the Package Document bindings element for any mediaType rules that match a resource type in the Distributable Object's manifest and copy them to the new Package Document.

The distributable-object collection must not be copied into the Packaged Object to avoid problems reintegrating the Distributable Object into a Destination EPUB. If the collection contains Nested Objects, each of those Distributable Object collection elements becomes a top-level distributable-object collection in the resulting Package Document. (This requirement applies only to direct children; any Nested Objects in the child collections remain nested.)

The translation process may require additional steps not outlined above, particularly in the case of other specifications that implement this framework, and for features defined outside the core EPUB 3 specifications.
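The manifest lookup and spine construction described above can be sketched as follows. This is an illustration, not a prescribed method: the function names and the simple dictionary inputs are hypothetical, paths are assumed to be relative to the same base, and percent-encoding is not handled. Paths are normalized before comparison so that "./foo.xhtml" and "foo.xhtml" match.

```python
import posixpath
from urllib.parse import urldefrag


def normalize(href):
    """Drop any fragment and normalize the path for comparison."""
    return posixpath.normpath(urldefrag(href)[0])


def build_manifest_and_spine(source_items, manifest_links, entity_links):
    """Derive the new manifest and spine from a collection.

    source_items:   dict of Source EPUB manifest item href -> item id
    manifest_links: hrefs of the link elements in the manifest collection
    entity_links:   hrefs of the collection's child link elements, in order
    Returns (items, spine): items maps item id -> href, spine lists idrefs.
    """
    by_path = {normalize(href): (item_id, href)
               for href, item_id in source_items.items()}
    items = {}
    for href in manifest_links:
        # A KeyError here means the link has no matching manifest item.
        item_id, item_href = by_path[normalize(href)]
        items[item_id] = item_href
    spine = []
    for href in entity_links:
        item_id, _ = by_path[normalize(href)]
        # Further fragments of a document already in the spine are ignored.
        if item_id not in spine:
            spine.append(item_id)
    return items, spine
```

Looking items up rather than synthesizing them from link elements leaves room to copy the original item's properties and fallback attributes, which this sketch does not model.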

3.2.3 Filter EPUB Content Documents

As a Distributable Object may reference one or more markup fragments in an EPUB Content Document, a Processing Agent must be able to filter documents to remove all non-distributable markup when generating the Packaged Object.

The process for determining which documents to filter, and what content to preserve is as follows:

  1. Create two lists: one for complete EPUB Content Documents to keep and another for documents containing fragments. The fragment list should allow a list of identifiers to be preserved for each entry (e.g., a hashtable) or else a separate list will need to be maintained.
  2. Inspect each child link element in the distributable-object collection:
     1. If the reference includes a fragment identifier and the EPUB Content Document has not already been encountered, add the document to the list to filter. Append the identifier to the list of fragments to keep from that document.
     2. If the reference does not include a fragment identifier, add the document to the list of complete documents to keep.
  3. Iterate over the list of EPUB Content Documents to filter that was generated in step 2, and strip all non-distributable markup from each, leaving only the referenced fragments. (Filtering might result in the document no longer being valid; see 2.3.4 EPUB Content Documents for more information.)

NOTE

If any EPUB Content Documents are present in both lists after step 2, the collection is invalid. The Processing Agent can continue by keeping the entire EPUB Content Document and ignoring any fragments, but the source of the invalidity needs to be investigated.
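Steps 1 and 2, together with the recovery behaviour described in the preceding note, can be sketched in Python as follows. This is a minimal, non-normative sketch. The function name plan_filtering and its return shape are hypothetical, and the actual stripping of markup in step 3 is not shown.

```python
from urllib.parse import urldefrag


def plan_filtering(link_hrefs):
    """Split link targets into whole documents and documents to filter.

    Returns (complete, fragments, conflicts):
      complete  - documents to keep whole, in first-seen order
      fragments - dict mapping a document to the fragment IDs to keep
      conflicts - documents referenced both whole and by fragment
    """
    complete = []
    fragments = {}
    for href in link_hrefs:
        document, fragment = urldefrag(href)
        if fragment:
            # First sighting creates the entry; later ones append to it.
            fragments.setdefault(document, []).append(fragment)
        elif document not in complete:
            complete.append(document)

    # A document present in both lists means the collection is invalid.
    # Continue by keeping the entire document and ignoring its fragments.
    conflicts = [doc for doc in complete if doc in fragments]
    for doc in conflicts:
        del fragments[doc]
    return complete, fragments, conflicts
```

A non-empty conflicts list signals that the source of the invalidity needs to be investigated, even though processing can continue.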

NOTE

Rather than rely on automated processing, Authors might prefer to prepare Distributable Object-specific versions of files that contain only partially relevant content, particularly if the Distributable Object is destined for Users. For example, relying on fragment filtering of indexes, bibliographies, notes and other backmatter could lead to suboptimal rendering.

3.2.4 Resource Renaming

Resources may be renamed when moving them from the Embedded Object to the Packaged Object.

This specification does not impose any requirements on the renaming of resources except that the names must be valid file names as defined in [OCF301].

If a Processing Agent renames files, it is responsible for also ensuring that all references to the original names are updated in all resources (all href and src attributes in XHTML Content Documents, any linked CSS and @import rules, etc.).
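As a rough illustration of the reference updating described above, the following Python sketch rewrites href and src attributes in a single XHTML Content Document. It is not a complete solution. The function name update_references and the renames mapping are hypothetical, the mapping keys are assumed to be the relative references exactly as they appear in the document, and CSS url() values and @import rules are not handled.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

XHTML_NS = "http://www.w3.org/1999/xhtml"


def update_references(xhtml_text, renames):
    """Return xhtml_text with renamed href/src targets substituted.

    renames: hypothetical mapping of old reference -> new reference.
    Fragment identifiers on a reference are preserved.
    """
    # Serialize XHTML elements without a generated "ns0:" prefix.
    ET.register_namespace("", XHTML_NS)
    root = ET.fromstring(xhtml_text)
    for element in root.iter():
        for attr in ("href", "src"):
            value = element.get(attr)
            if value is None:
                continue
            target, fragment = urldefrag(value)
            if target in renames:
                new_value = renames[target]
                if fragment:
                    new_value += "#" + fragment
                element.set(attr, new_value)
    return ET.tostring(root, encoding="unicode")
```

A Processing Agent would need to run an equivalent pass over every resource type that can carry references, not only XHTML.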

3.2.5 EPUB Navigation Document

The Packaged Object must include an EPUB Navigation Document, which must be generated by the Processing Agent.

The Processing Agent may choose any method to generate this file. The following suggestions are provided only as guidance:

  1. If the Packaged Object is intended for direct User consumption (e.g., a chapter of a book for sale separately), ensure a complete table of contents is provided. The required toc nav [ContentDocs301] can be generated in these cases by traversing all EPUB Content Documents that belong to the Distributable Object and adding links to all structural headings.
  2. If the Packaged Object is being created only for transport purposes, the Processing Agent can either:
     1. create the required toc nav by adding generic links to each EPUB Content Document in the spine, and leave it to the Author implementing the Distributable Object to add the necessary structure to their Publication.
     2. create structured navigation by inspecting each EPUB Content Document and adding links to all structural headings to aid later integration.
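The transport-only option of adding generic links to each spine document might look like the following Python sketch. It is illustrative only. The function name generic_toc_nav is hypothetical, and using each document's path as its link text is an assumption made for brevity.

```python
from xml.sax.saxutils import escape, quoteattr


def generic_toc_nav(spine_hrefs):
    """Build a minimal EPUB Navigation Document with one link per spine entry."""
    items = "\n".join(
        # quoteattr() adds the surrounding quotes and escapes the value.
        f"        <li><a href={quoteattr(href)}>{escape(href)}</a></li>"
        for href in spine_hrefs
    )
    return (
        '<html xmlns="http://www.w3.org/1999/xhtml" '
        'xmlns:epub="http://www.idpf.org/2007/ops">\n'
        "  <head><title>Contents</title></head>\n"
        "  <body>\n"
        '    <nav epub:type="toc">\n'
        "      <ol>\n"
        f"{items}\n"
        "      </ol>\n"
        "    </nav>\n"
        "  </body>\n"
        "</html>\n"
    )
```

The Author integrating the Distributable Object would be expected to replace these generic entries with meaningful headings.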

How to handle other components in the EPUB Navigation Document is at the discretion of the Processing Agent (e.g., extracting landmarks, page lists and lists of tables or figures specific to the Distributable Object).

Specifications that implement this framework may provide stricter requirements on how to generate the EPUB Navigation Document.

3.3 Packaged to Embedded

3.3.1 Translation Process

This section is informative

The process of creating an Embedded Object from a Packaged Object is effectively a reversal of the steps needed to translate from embedded to packaged:

As the process of integrating content into an existing EPUB Publication is not trivial, a greater degree of manual intervention is likely in this process. The EPUB Navigation Document has to be updated with more precision than packaging requires, fragments integrated into their correct destination document, collection links updated to reflect the new fragment locations, the spine reordered, and so on.

Similar to the process detailed in the previous section, the following subsections do not necessarily represent a linear order in which translation events have to occur. The intent is only to provide an overview of the essential steps involved.

NOTE

The translation process detailed in this section only deals with the core EPUB 3.0.1 specifications. Distributable Objects that employ features defined in other IDPF specifications might require additional handling to ensure proper translation.

3.3.2 Create the distributable-object Collection

The following list provides the general set of steps necessary to create the distributable-object collection in the Destination EPUB:

  1. Create a new collection element in the Destination EPUB and assign it the role attribute value "distributable-object". (If the Distributable Object's metadata contains any other dc:type values, additional roles might be required depending on the requirements of the publication type to which the Distributable Object conforms.)
  2. Copy any internationalization attributes (dir, xml:lang) on the package element to the collection element.
  3. If present, add any prefix attribute declarations to the Destination EPUB's package element.
  4. Copy the Packaged Object's metadata section into the collection as its required first child element.
  5. Create a new manifest collection and translate all of the Packaged Object's manifest item elements into child link elements.
  6. If the Packaged Object contains any distributable-object collections, copy each after the manifest collection.
  7. Translate each spine itemref in the Packaged Object to a child link element, replacing the ID reference in the idref attribute with the location of the resource when creating the link element's href attribute.
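The steps above can be sketched in Python using the standard library's ElementTree. This is a non-normative sketch under stated assumptions. The function name package_to_collection is hypothetical, the metadata element is assumed to be present, and the EPUB Navigation Document entry is left out of the manifest collection to mirror the example in Appendix A. Step 3 (prefix declarations) concerns the Destination EPUB's package element and is not shown.

```python
import copy
import xml.etree.ElementTree as ET

OPF = "http://www.idpf.org/2007/opf"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def _q(name):
    """Qualify a local name with the OPF namespace."""
    return f"{{{OPF}}}{name}"


def package_to_collection(package):
    """Build a distributable-object collection from a parsed package element."""
    # Step 1: new collection with the distributable-object role.
    collection = ET.Element(_q("collection"), {"role": "distributable-object"})
    # Step 2: copy internationalization attributes.
    for attr in ("dir", XML_LANG):
        if package.get(attr) is not None:
            collection.set(attr, package.get(attr))
    # Step 4: metadata becomes the first child.
    collection.append(copy.deepcopy(package.find(_q("metadata"))))
    # Step 5: manifest items become links in a manifest collection.
    manifest = ET.SubElement(collection, _q("collection"), {"role": "manifest"})
    href_by_id = {}
    for item in package.find(_q("manifest")):
        href_by_id[item.get("id")] = item.get("href")
        if "nav" in (item.get("properties") or "").split():
            continue  # assumption: navigation document entry is omitted
        ET.SubElement(manifest, _q("link"), {
            "href": item.get("href"),
            "media-type": item.get("media-type"),
        })
    # Step 6: copy any distributable-object collections after the manifest.
    for nested in package.findall(_q("collection")):
        if nested.get("role") == "distributable-object":
            collection.append(copy.deepcopy(nested))
    # Step 7: spine itemrefs become links that point at resource locations.
    for itemref in package.find(_q("spine")):
        ET.SubElement(collection, _q("link"),
                      {"href": href_by_id[itemref.get("idref")]})
    return collection
```

Because the Package Document spine does not record fragments, the resulting child link elements reference whole documents only.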

3.3.3 Migrate Resources

The following list provides the general set of steps necessary to merge the Packaged Object's resources into the Destination EPUB:

  1. Copy all of the Packaged Object's manifest item elements into the manifest of the Destination EPUB, excluding the entry for the EPUB Navigation Document.
  2. Copy all resources listed in the Packaged Object's manifest to the same location in the Destination EPUB, again excluding the EPUB Navigation Document. If a resource with the same name already exists at the specified location, the Processing Agent is responsible for assigning it a new name and modifying all references to it.
  3. Copy each spine itemref in the Packaged Object to the Destination EPUB's spine, retaining the order in which they are listed. (The position of these entries will typically have to be manually corrected to reflect the correct reading order.)

Note that the Package Document must not be copied into the Destination EPUB. A new Package Document can be created from the collection information during translation from embedded to packaged states.
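Name collisions during step 2 could be resolved along the lines of the following Python sketch. It is only one possible approach. The function name unique_name and the numeric suffix scheme are hypothetical, and the sketch assumes it is the incoming resource that is renamed. All references to the renamed resource would still have to be updated separately.

```python
import posixpath


def unique_name(path, existing):
    """Return path, or a suffixed variant that is not in existing.

    path: container-relative path of the incoming resource.
    existing: set of paths already present in the Destination EPUB.
    """
    if path not in existing:
        return path
    stem, ext = posixpath.splitext(path)
    counter = 1
    # Try name-1.ext, name-2.ext, ... until a free name is found.
    while f"{stem}-{counter}{ext}" in existing:
        counter += 1
    return f"{stem}-{counter}{ext}"
```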

3.3.4 Incorporate Content Fragments

Although all EPUB Content Documents from the Packaged Object are copied into the Destination EPUB, some may contain fragments that are to be integrated into the existing content.

This specification does not prescribe how such fragments are to be integrated. A tool could be used that allows the Author to select the location, or a person might have to copy and paste the fragments manually.

Whatever the scenario, if fragments are moved into another EPUB Content Document, the Processing Agent must update all references in the distributable-object collection to the new locations. The child link element set will also have to be updated to add references to the fragments.

3.3.5 EPUB Navigation Document

The EPUB Navigation Document from the Packaged Object must not be copied to the Destination EPUB.

This specification does not prescribe how the Destination EPUB's Navigation Document is to be updated to reflect the added content. The document could be regenerated automatically, or a person might have to update it manually.

3.3.6 Media Overlays

If the Packaged Object includes Media Overlays [MediaOverlays301], in addition to integrating the Media Overlay Documents, the media:duration metadata value for the Packaged Object has to be added to the Destination EPUB's total duration.

If the Packaged Object contains content fragments, the duration of each fragment has to be added to the total for the EPUB Content Document it is integrated into.
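The duration arithmetic can be sketched in Python as follows. This is a simplified illustration that handles only the common SMIL clock value forms (h:mm:ss, mm:ss and a number with an optional h, min, s or ms unit). The function names are hypothetical, and the h:mm:ss.mmm output format is an arbitrary choice.

```python
import re

# Seconds per unit for SMIL timecount values; seconds are the default.
_UNITS = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 0.001}


def clock_to_seconds(value):
    """Convert a SMIL clock value such as "0:32:29" or "45s" to seconds."""
    value = value.strip()
    if ":" in value:
        seconds = 0.0
        for part in value.split(":"):
            seconds = seconds * 60 + float(part)
        return seconds
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(h|min|s|ms)?", value)
    if match is None:
        raise ValueError(f"unsupported clock value: {value!r}")
    number, unit = match.groups()
    return float(number) * _UNITS[unit or "s"]


def add_durations(*values):
    """Sum clock values and return the total as h:mm:ss.mmm."""
    total_ms = round(sum(clock_to_seconds(v) for v in values) * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours}:{minutes:02d}:{seconds:02d}.{millis:03d}"
```

For instance, add_durations could be called with the Destination EPUB's existing total and the Packaged Object's media:duration value to obtain the new total.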

3.3.7 Non-Rendering Resources

This section is informative

It is possible to include resources in an EPUB Container that are not intended for use when rendering any of the given Renditions. As these resources are typically not listed in the Package Document manifest, however, there is no way to automatically detect their presence.

Consequently, although such resources can be bundled with a Packaged Object, their translation has to be done manually.

3.4 Encryption and Obfuscation

The translation processes defined in this section assume that the resources in a Distributable Object are not encrypted or obfuscated, or that the translation process occurs before, or after, decryption or deobfuscation has occurred.

Whether resources have to be re-encrypted or re-obfuscated each time the Distributable Object changes state is outside the scope of this framework. A Processing Agent might inspect the META-INF/encryption.xml file to determine this information, but that will only reveal that the Distributable Object is currently encrypted/obfuscated, not whether it always has to be.

If resources can only be distributed provided they are encrypted/obfuscated, the Author of the Distributable Object must specify which resources and what method to use (see 2.5 Rights).
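Inspecting META-INF/encryption.xml, as mentioned above, could be done along the lines of the following Python sketch. It is illustrative only. The function name encrypted_resources is hypothetical, and the sketch reports only what is currently declared as encrypted or obfuscated, with the algorithm URI where one is given.

```python
import xml.etree.ElementTree as ET

# XML Encryption namespace used by EncryptedData elements.
ENC = "http://www.w3.org/2001/04/xmlenc#"


def encrypted_resources(encryption_xml):
    """Map each encrypted resource URI to its declared algorithm (or None)."""
    root = ET.fromstring(encryption_xml)
    result = {}
    for data in root.iter(f"{{{ENC}}}EncryptedData"):
        method = data.find(f"{{{ENC}}}EncryptionMethod")
        reference = data.find(f"{{{ENC}}}CipherData/{{{ENC}}}CipherReference")
        if reference is None:
            continue  # no resource is referenced by this entry
        algorithm = method.get("Algorithm") if method is not None else None
        result[reference.get("URI")] = algorithm
    return result
```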

3.5 EPUB Canonical Fragment Identifiers

Although canonical fragment identifiers [EPUBCFI] are not permitted for defining discrete entities (see 2.2.4 Discrete Entities), they can appear in the resources included in the Embedded Object. When translating states, these identifiers can become invalid, as the spine locations of the referenced EPUB Content Documents will often change.

It is therefore recommended that [EPUBCFI] be avoided unless there is no alternative. Processing Agents need to be aware that these identifiers may exist and may need to be updated.


Appendix A ‒ Example

This appendix is informative

A.1 Side-by-Side Comparison

This section provides a side-by-side comparison of a distributable-object collection (embedded state) and Package Document (transport state) for the same Distributable Object. To see the markup for each separately, refer to the following subsections.

1
   Package Document: <package version="3.0" xmlns="http://www.idpf.org/2007/opf" unique-identifier="c5bba4b9-3d3c-4da4-bafd-8e185f7b35b2">
   Collection:       <collection role="distributable-object">

2
   Package Document: <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
   Collection:       <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">

3
   Package Document: <dc:type>distributable-object</dc:type>
   Collection:       <dc:type>distributable-object</dc:type>

4
   Package Document: <dc:identifier id="c5bba4b9-3d3c-4da4-bafd-8e185f7b35b2">urn:uuid:a46825d1-e796-4cc3-a633-5160f529a1e0</dc:identifier>
   Collection:       <dc:identifier id="c5bba4b9-3d3c-4da4-bafd-8e185f7b35b2">urn:uuid:a46825d1-e796-4cc3-a633-5160f529a1e0</dc:identifier>

5
   Package Document: <meta property="identifier-type" refines="#c5bba4b9-3d3c-4da4-bafd-8e185f7b35b2">unique-identifier</meta>
   Collection:       <meta property="identifier-type" refines="#c5bba4b9-3d3c-4da4-bafd-8e185f7b35b2">unique-identifier</meta>

6
   Package Document: <meta property="dcterms:modified">2014-11-10T19:30:22Z</meta>
   Collection:       <meta property="dcterms:modified">2014-11-10T19:30:22Z</meta>

7
   Both:             <!-- all metadata is identical, so additional elements have been omitted for brevity -->

8
   Package Document: </metadata>
   Collection:       </metadata>

9
   Package Document: <manifest>
   Collection:       <collection role="manifest">

10
   Package Document: <item id="e89d1b63-7ba8-4089-bd36-71a5a7e39c90" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
   Collection:       <!-- omitted -->

11
   Package Document: <item id="b82e04c7-cdf2-42ad-b363-c5dda9e68103" href="xhtml/chapter01.xhtml" media-type="application/xhtml+xml" properties="scripted"/>
   Collection:       <link href="xhtml/chapter01.xhtml" media-type="application/xhtml+xml"/>

12
   Package Document: <item id="e911ca8f-674e-4033-aaba-51fc72332cbc" href="xhtml/notes.xhtml" media-type="application/xhtml+xml"/>
   Collection:       <link href="xhtml/notes.xhtml" media-type="application/xhtml+xml"/>

13
   Package Document: <item id="ca0d932e-bfca-4ab6-9985-bfe1227db33a" href="xhtml/biblio.xhtml" media-type="application/xhtml+xml"/>
   Collection:       <link href="xhtml/biblio.xhtml" media-type="application/xhtml+xml"/>

14
   Package Document: <item id="eaae0272-3277-446e-a93d-b74ac43e539c" href="css/epub.css" media-type="text/css"/>
   Collection:       <link href="css/epub.css" media-type="text/css"/>

15
   Package Document: </manifest>
   Collection:       </collection>

16
   Package Document: <spine>
   Collection:       <!-- the spine is inferred -->

17
   Package Document: <itemref idref="b82e04c7-cdf2-42ad-b363-c5dda9e68103"/>
   Collection:       <link href="xhtml/chapter01.xhtml"/>

18
   Package Document: <itemref idref="e911ca8f-674e-4033-aaba-51fc72332cbc"/>
   Collection:       <link href="xhtml/notes.xhtml#c01"/>

19
   Package Document: <itemref idref="ca0d932e-bfca-4ab6-9985-bfe1227db33a"/>
   Collection:       <link href="xhtml/biblio.xhtml#b001"/>

20
   Package Document: <!-- repeated fragments are not listed in the package document spine -->
   Collection:       <link href="xhtml/biblio.xhtml#b023"/>

21
   Package Document: (none)
   Collection:       <link href="xhtml/biblio.xhtml#b029"/>

22
   Package Document: </spine>
   Collection:       (none)

23
   Package Document: </package>
   Collection:       </collection>

A.2 Package Document

<package version="3.0" xmlns="http://www.idpf.org/2007/opf" unique-identifier="c5bba4b9-3d3c-4da4-bafd-8e185f7b35b2">
   <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:type>distributable-object</dc:type>
      <dc:identifier id="c5bba4b9-3d3c-4da4-bafd-8e185f7b35b2">urn:uuid:a46825d1-e796-4cc3-a633-5160f529a1e0</dc:identifier>
      <meta property="identifier-type" refines="#c5bba4b9-3d3c-4da4-bafd-8e185f7b35b2">unique-identifier</meta>
      <meta property="dcterms:modified">2014-11-10T19:30:22Z</meta>
      <dc:title>Phantom Textbook - Chapter 1</dc:title>
      <dc:language>en</dc:language>
      <dc:creator>Jane Doe</dc:creator>
      <dc:description>Introduction to the history of phantasms. For sale separately.</dc:description>
      <dc:source>urn:isbn:9780987654321</dc:source>
      <dc:date>2014-10-31</dc:date>
      <dc:rights>All rights reserved. Not available for use or sale except by authorized vendors.</dc:rights>
   </metadata>
   <manifest>
      <item id="e89d1b63-7ba8-4089-bd36-71a5a7e39c90" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
      <item id="b82e04c7-cdf2-42ad-b363-c5dda9e68103" href="xhtml/chapter01.xhtml" media-type="application/xhtml+xml" properties="scripted"/>
      <item id="e911ca8f-674e-4033-aaba-51fc72332cbc" href="xhtml/notes.xhtml" media-type="application/xhtml+xml"/>
      <item id="ca0d932e-bfca-4ab6-9985-bfe1227db33a" href="xhtml/biblio.xhtml" media-type="application/xhtml+xml"/>
      <item id="eaae0272-3277-446e-a93d-b74ac43e539c" href="css/epub.css" media-type="text/css"/>
   </manifest>
   <spine>
      <itemref idref="b82e04c7-cdf2-42ad-b363-c5dda9e68103"/>
      <itemref idref="e911ca8f-674e-4033-aaba-51fc72332cbc"/>
      <itemref idref="ca0d932e-bfca-4ab6-9985-bfe1227db33a"/>
      <!-- repeated fragments are not listed in the package document spine -->
   </spine>
</package>

A.3 Collection

<collection role="distributable-object">
   <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:type>distributable-object</dc:type>
      <dc:identifier id="c5bba4b9-3d3c-4da4-bafd-8e185f7b35b2">urn:uuid:a46825d1-e796-4cc3-a633-5160f529a1e0</dc:identifier>
      <meta property="identifier-type" refines="#c5bba4b9-3d3c-4da4-bafd-8e185f7b35b2">unique-identifier</meta>
      <meta property="dcterms:modified">2014-11-10T19:30:22Z</meta>
      <dc:title>Phantom Textbook - Chapter 1</dc:title>
      <dc:language>en</dc:language>
      <dc:creator>Jane Doe</dc:creator>
      <dc:description>Introduction to the history of phantasms. For sale separately.</dc:description>
      <dc:source>urn:isbn:9780987654321</dc:source>
      <dc:date>2014-10-31</dc:date>
      <dc:rights>All rights reserved. Not available for use or sale except by authorized vendors.</dc:rights>
   </metadata>
   <collection role="manifest">
      <link href="xhtml/chapter01.xhtml" media-type="application/xhtml+xml"/>
      <link href="xhtml/notes.xhtml" media-type="application/xhtml+xml"/>
      <link href="xhtml/biblio.xhtml" media-type="application/xhtml+xml"/>
      <link href="css/epub.css" media-type="text/css"/>
   </collection>
   <link href="xhtml/chapter01.xhtml"/>
   <link href="xhtml/notes.xhtml#c01"/>
   <link href="xhtml/biblio.xhtml#b001"/>
   <link href="xhtml/biblio.xhtml#b023"/>
   <link href="xhtml/biblio.xhtml#b029"/>
</collection>

Appendix B. Acknowledgements and Contributors

This appendix is informative

EPUB has been developed by the International Digital Publishing Forum in a cooperative effort, bringing together publishers, vendors, software developers, and experts in the relevant standards.

The EPUB Distributable Objects 1.0 specification was prepared by the International Digital Publishing Forum's EPUB Working Group, operating under the leadership of:

Active members of the working group at the time of publication were:

IDPF Members

Invited Experts/Observers

References

Normative References

[A11YProperties] Schema.org Accessibility Metadata Properties.

[ContentDocs301] EPUB Content Documents 3.0.1.

[DCMES] Dublin Core Metadata Element Set, Version 1.1.

[DCMI] DCMI Metadata Terms.

[EPUBCFI] EPUB Canonical Fragment Identifier (epubcfi) Specification.

[HTML5] HTML5: A vocabulary and associated APIs for HTML and XHTML.

[Manifest] EPUB Manifest Role.

[MediaOverlays301] EPUB Media Overlays 3.0.1.

[OCF301] Open Container Format 3.0.1.

[Publications301] EPUB Publications 3.0.1.

[RFC2119] Key words for use in RFCs to Indicate Requirement Levels (RFC 2119). March 1997.

[RFC3987] Internationalized Resource Identifiers (IRIs) (RFC 3987). M. Duerst, et al. January 2005.

[RFC4122] A Universally Unique IDentifier (UUID) URN Namespace (RFC 4122). P. Leach, et al. July 2005.

[schema.org] schema.org.

Informative References

[CollectionPkgInfo] Using collection Elements as Embedded Package Documents.

[SchemaGuide] Schema.org Metadata Integration Guide for EPUB 3.