3. EDUPUB Basics: Elements, Classes and ePub:type
Sample 3: paragraph and an unordered list
3.2 Adding Meaning Through the Class Attribute and ePub:type
Sample 1: differentiating sections
Sample 2: differentiating asides
Sample 1: Local_Semantic Class
Sample 2: list with literal text marker
Standard Object Attributes (http://www.w3.org/TR/html-markup/object.html#object)
9. File Naming and Folder Layout
9.2 package.opf File Structure
10. Image Conversion Specifications
Appendix A: Concepts and Definitions
QTI (IMS Question & Test Interoperability)
Appendix B: Accessibility Features
Separation of style with ability to adjust size/color
Images with textual descriptions
Scalable Vector Graphics (SVG)
Resizable text without loss of functionality
Escapable and skippable elements
Highlighted words during narration
Footnotes outside of text stream
As part of its business transformation strategy toward standards-based, digital-first content creation workflows, Pearson is developing EPUB3 output profiles for educational content. To encourage open standards across our industry, Pearson is submitting one of these output profiles, EDUPUB, to the IDPF as the basis for an educational profile for EPUB3. In doing so, Pearson hopes to provide a standard that any publisher, vendor, or content distributor can embrace and contribute to.
The EDUPUB output profile will provide the following for educational content:
Benefits include:
By reducing the number of variant formats for similar content, publishers, vendors, and content distributors can devote more of their resources to improving content, services, and end-user experience, and less to creating redundant output formats that provide no competitive advantage. This is a win-win for all involved.
The purpose of this document is to define EDUPUB concepts, terms and requirements, and to provide a general understanding of the markup and how it semantically describes a publisher's content. Together with the EDUPUB Content Model, this document forms the specification for EDUPUB.
An EDUPUB document will be immediately familiar to anyone who understands XHTML5/EPUB 3. All the content markup is XHTML5. The following three code snippets are all fully conformant to XHTML5.
<p>One of the most important factors in oil, gas, and coal exploration and extraction is technology. The ability to find and extract fossil fuels has changed dramatically with the development of new techniques...</p>
<blockquote><p>The law isn’t justice. It’s a very imperfect mechanism. If you press exactly the right buttons and are also lucky, justice may show up in the answer. A mechanism is all the law was ever intended to be.</p></blockquote>
<p>The advantages of a total patient care system include:</p>
<ul><li><p>Continuous, holistic, expert nursing care</p></li>
<li><p>Total accountability for the nursing care of the assigned patient(s) for that shift</p></li>
<li><p>Continuity of communication with the patient, family, physician(s), and staff from other departments</p></li></ul>
The XHTML5 class attribute enables users to describe content in their own terms. EDUPUB uses the class attribute to provide additional semantic meaning. For example, the section element is defined in the W3C specification as "a generic section of a document or application". EDUPUB defines specific classes such as "part", "chapter" and "section", which provide additional semantic information.
Primary class names are a fixed list of editorial terms for important publishing, navigation and accessibility structures (e.g., "summary", "sidebar", "objective", "nav", "longdesc").
The class names represent a superset of the semantics defined in the EPUB 3 Structural Semantics Vocabulary. Where possible, class names were derived from the corresponding ePub:type or DAISY accessibility standard; a few exceptions are listed below.
Class Names | ePub:Type |
index-xxxxx | index:xxxx |
name-index, subject-index | index:body |
EDUPUB requires that both the class and ePub:type attributes are provided as defined in the Content Model.
<section class="part" epub:type="part" id="...">...
<section class="chapter" epub:type="chapter" id="...">...
<section id="...">
</section>
</section>
</section>
<aside class="sidebar" epub:type="sidebar" id="...">...</aside>
<aside class="pullquote" id="...">...</aside>
<aside class="footnote" epub:type="footnote" id="...">...</aside>
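The pairing rule above lends itself to an automated check. The following is a minimal sketch in Python using only the standard library. The set `PAIRED_CLASSES` is an assumption that covers only the classes shown with an epub:type in the samples above (the full list belongs to the Content Model), and the function name and ids are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the epub:type attribute as defined by EPUB 3.
EPUB_NS = "http://www.idpf.org/2007/ops"

# Assumption: only the pairs shown in the samples above; the full list
# lives in the EDUPUB Content Model.
PAIRED_CLASSES = {"part", "chapter", "sidebar", "footnote"}


def missing_epub_types(xhtml: str) -> list[tuple[str, str]]:
    """Return (id, class) pairs whose element lacks the matching epub:type."""
    problems = []
    for element in ET.fromstring(xhtml).iter():
        classes = set(element.get("class", "").split())
        types = set(element.get(f"{{{EPUB_NS}}}type", "").split())
        for name in sorted(classes & PAIRED_CLASSES):
            if name not in types:
                problems.append((element.get("id", ""), name))
    return problems


sample = """
<body xmlns:epub="http://www.idpf.org/2007/ops">
  <section class="chapter" epub:type="chapter" id="ch01">
    <aside class="sidebar sidebar_1" id="sb01">...</aside>
    <aside class="pullquote" id="pq01">...</aside>
  </section>
</body>
"""
print(missing_epub_types(sample))  # [('sb01', 'sidebar')]
```

The pullquote is not reported because, as in the sample above, it is treated here as a class without an epub:type counterpart.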
In addition to Primary Classes, there are three more class types: Design, Local_Semantic and Literal_Style Classes which are discussed below.
Sometimes there may be multiple "types" of a given Primary Class (e.g., a single title might have three different "types" of sidebar that appear throughout the title: one referred to as "Case Study", another called "Concepts", and another called "Active Research"). EDUPUB allows these distinctions to be made by including both the Primary class and a Design class in the class attribute (class="sidebar sidebar_1"). When adding a Design class to an element that does not have a Primary class, use the element name, an underscore and a number (for example, <span class="span_1">).
Design class names follow the form xxxxxxxx_n, where xxxxxxxx is the element or base class (figure, sidebar, ...) and n is an integer value.
Design classes allow formatting distinctions to be made in a way that is often helpful to the reader, but is not essential to the understanding of the text. These distinctions will be lost when the content is aggregated with other content.
Design classes must be defined in design.css, which resides in the css folder. Any file that uses a Design class should include a link to design.css.
<link rel= "stylesheet" type= "text/css" href= "../css/design.css" />
...
<aside class= "sidebar sidebar_1" id="..." epub:type="sidebar" >...</aside>
In design.css
.sidebar_1 header h1 {
/* Green headings for sidebars about the web */
color: #005a30;
}
Local_Semantic classes are used for adding semantics to content where no Primary class exists. For example, a product discussing grammar may need a semantic class for proper nouns, or a product on programming may need a semantic class for methods.
Local_Semantic class names follow the form xxxxxxxx_lc_n, where xxxxxxxx is the element or base class (figure, sidebar, ...) and n is an integer value.
Local_Semantic classes allow formatting distinctions to be made where it is essential to the understanding of the text. The specific formatting used may be changed when this content is reused (colors might be used in one product, but some other treatment would be required in a black and white product), but the distinctions should be preserved.
If the product has text that describes a particular rendering, the paragraph(s) containing that rendering description are wrapped in <div class="rendering-notes">.
Local_Semantic classes must be defined in local.css, which resides in the css folder. Any file that uses a Local_Semantic class should include a link to local.css.
<link rel="stylesheet" type="text/css" title="local" href="../css/local.css" />
...
<div class="rendering-notes"><p>In the following section proper nouns are underlined...</p></div>
…
<p>give the balance sheet to <span class="span_lc_1">Melissa</span></p>
In local.css
.span_lc_1 {
text-decoration:underline;
}
In EDUPUB the markup indicates what objects are as opposed to how they look. However, in some cases the formatting of content is intrinsic to its meaning. For example, in content about how to write an effective letter, a sample letter may require a particular layout or style to convey the author's meaning. In these cases, classes will be created to describe the characteristics that are intrinsic to the author's intent. These classes will take precedence over any theme-provided CSS.
Literal_Style class names follow the form xxxxxxxx_ls_n, where xxxxxxxx is the element or base class (figure, sidebar, ...) and n is an integer value.
<link rel="stylesheet" type="text/css" title="literal" href="../css/literal.css" />
...
<div class="div_ls_1">
7 Fairlane Road
...
</div>
In literal.css
.div_ls_1 {
/* sample letter address appears flush right */
text-align:right;
}
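The three naming forms described above (xxxxxxxx_n, xxxxxxxx_lc_n and xxxxxxxx_ls_n) can be told apart mechanically. The sketch below is one possible way to do that in Python. The character set allowed in the base name is an assumption, since the text only gives the overall shape of each form, and `classify` is a hypothetical helper name.

```python
import re

# Assumption: base names use ASCII letters, digits and hyphens; the
# specification text only gives the overall shape of each form.
_BASE = r"[A-Za-z][A-Za-z0-9-]*"
_LOCAL_SEMANTIC = re.compile(rf"({_BASE})_lc_(\d+)")
_LITERAL_STYLE = re.compile(rf"({_BASE})_ls_(\d+)")
_DESIGN = re.compile(rf"({_BASE})_(\d+)")


def classify(class_name: str) -> str:
    """Name the class type implied by the form of a class name."""
    if _LOCAL_SEMANTIC.fullmatch(class_name):
        return "Local_Semantic"
    if _LITERAL_STYLE.fullmatch(class_name):
        return "Literal_Style"
    if _DESIGN.fullmatch(class_name):
        return "Design"
    # Anything else is a candidate Primary class or an unrecognized name.
    return "Primary or unknown"


for name in ("sidebar", "sidebar_1", "span_lc_1", "div_ls_1"):
    print(name, "->", classify(name))
```

Because the base name excludes underscores, a name such as span_lc_1 cannot be mistaken for a Design class with the base "span_lc".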
EDUPUB content is organized into 6 levels: product, volume, part, chapter, module and card. These levels are identified in the markup and will have consistent semantics across all products that are supported by the tools and documentation. Some of the levels are optional and some levels can be further subdivided as dictated by the content.
The levels and their definitions are:
<section class="volume" id="id">
Numbering in EDUPUB is handled using three methods depending on the object and numbering needs:
The lists are staticlists because the numbers are already part of the content. The staticlist class tells the CSS not to autonumber ol elements that carry this class value.
<ol>
<li><p>lakes and rivers</p></li>
<li><p>roads and bridges</p></li>
</ol>
<ol start="3">
<li><p>buildings and structures</p></li>
</ol>
<ol class= "staticlist" >
<li><p> <span class= "number" > 1.1 </span> lakes and rivers</p> </li>
<li><p> <span class= "number" > 1.2 </span> roads and bridges</p> </li>
</ol>
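Since a staticlist keeps its markers in the content, a tool can read them directly from the markup. The following is a small sketch under the assumption that each marker sits in a span with class "number", as in the sample above; `static_markers` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def static_markers(xhtml: str) -> list[str]:
    """Collect the literal markers stored in staticlist items."""
    markers = []
    for ol in ET.fromstring(xhtml).iter("ol"):
        if "staticlist" not in ol.get("class", "").split():
            continue  # ordinary lists carry no literal markers
        for span in ol.iter("span"):
            if "number" in span.get("class", "").split():
                markers.append((span.text or "").strip())
    return markers


sample = """
<body>
  <ol>
    <li><p>lakes and rivers</p></li>
  </ol>
  <ol class="staticlist">
    <li><p><span class="number"> 1.1 </span> lakes and rivers</p></li>
    <li><p><span class="number"> 1.2 </span> roads and bridges</p></li>
  </ol>
</body>
"""
print(static_markers(sample))  # ['1.1', '1.2']
```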
<figure><figcaption><header>
<h1><span class="label">Figure</span> <span class="number">9.2</span>
House Bill on Ethics Reform
</h1>
</header></figcaption>
</figure>
This section describes references to assets, elements within the current file and to elements within the current ePub but external to the current file.
EDUPUB references to content in the current ePub have three forms:
The id attribute within EDUPUB must be a valid XML ID (it begins with a letter or underscore, continues with letters, digits, hyphens, underscores or periods, and contains no spaces), and must be unique within the ePub.
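Both conditions on the id attribute, validity and uniqueness across the whole ePub, can be checked together. The sketch below assumes the content files are already available as strings keyed by file name. The pattern is a simplified ASCII form of the XML name rule, and the file names and ids are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Simplified ASCII form of the XML NCName rule; the full XML rule also
# admits many non-ASCII letters.
_XML_ID = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def check_ids(files: dict[str, str]) -> list[str]:
    """Report invalid ids and ids used more than once across all files."""
    seen = defaultdict(list)  # id value -> file names where it occurs
    problems = []
    for file_name, xhtml in files.items():
        for element in ET.fromstring(xhtml).iter():
            id_value = element.get("id")
            if id_value is None:
                continue
            if not _XML_ID.fullmatch(id_value):
                problems.append(f"{file_name}: invalid id {id_value!r}")
            seen[id_value].append(file_name)
    for id_value, where in seen.items():
        if len(where) > 1:
            problems.append(f"duplicate id {id_value!r} in {', '.join(where)}")
    return problems


# Hypothetical file names and ids, for illustration only.
epub_files = {
    "ch01.xhtml": '<body><section id="ch01"><p id="p1">...</p></section></body>',
    "ch02.xhtml": '<body><section id="ch02"><p id="p1">...</p>'
                  '<p id="2nd para"/></section></body>',
}
for problem in check_ids(epub_files):
    print(problem)
```

Here the second file is reported twice: once for an id that starts with a digit and contains a space, and once for reusing "p1" from the first file.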
<img src="../images/M03_SULL4546_i123.jpg" alt="..." />
For a digital version of a print product it is helpful but not required to provide markers for the start of each page. This allows navigation into the digital content using page numbers and support for classrooms where students and the professor could have either the print or the digital version.
When including print pagination references, the package document metadata must also include a dc:source element identifying the print source.
If an EDUPUB document contains page markers:
Page Markers are tagged as empty span elements and allowed only where PCDATA is supported (i.e. not between two list items or two chapters). They should be placed before the first visible content of the page they are defining.
Elements that are from namespaces other than the HTML namespace and that convey content but not metadata, are embedded content for the purposes of the content models defined in this specification. (For example, MathML, or SVG.)
Some embedded content elements can have fallback content: content that is to be used when the external resource cannot be used (e.g. because it is of an unsupported format). The element definitions state what the fallback is, if any.
Embedded content is content that imports another resource (e.g., xml, swf, ...) into the document, or content from another vocabulary (e.g., CML) that is inserted into the document. The embed element is used for this purpose.
8.1 Video Recommendations
Video content is specified using the <video> element.
Best practice is to provide two formats of video: mp4 and webm.
<div class="fallback"> is required and should include an error message for platforms that are unable to play video.
Best practice is to provide the track element for captions.
<video controls="controls" poster="../images/fraser.jpg">
<source src="../video/fraser_amrev_720480.mp4" type="video/mp4"/>
<source src="../video/fraser_amrev_720480.webm" type="video/webm"/>
<track src="../video/fraser_amrev_720480.vtt" kind="captions" srclang="en" label="English"/>
<div class="fallback">
<p>
Sorry, it appears your system either does not support video playback or
cannot play the MP4 format or WebM format provided.
</p>
</div>
</video>
8.2 Audio Content
Audio content is specified using the <audio> element.
Best practice is to provide two formats of audio: ogg and mp3.
<div class="fallback"> is required and should include an error message for platforms that are unable to play audio.
<audio controls="controls">
<source src="audio/04_01.ogg" type="audio/ogg" />
<source src="audio/04_01.mp3" type="audio/mpeg" />
<div class="fallback">
<p>
Sorry, it appears your system either does not support audio playback or
cannot play the MP3 format or OGG format provided.
</p>
</div>
</audio>
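The fallback requirement and the two-format best practice for video and audio can be verified with a short script. The sketch below assumes the markup is parsed without the XHTML default namespace (with it, tag names would carry a namespace prefix and the comparisons would need adjusting); `media_problems` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def media_problems(xhtml: str) -> list[str]:
    """List video/audio elements that miss the fallback div or a second format."""
    problems = []
    for media in ET.fromstring(xhtml).iter():
        if media.tag not in ("video", "audio"):
            continue
        has_fallback = any(
            child.tag == "div" and "fallback" in child.get("class", "").split()
            for child in media
        )
        if not has_fallback:
            problems.append(f'<{media.tag}> has no <div class="fallback">')
        # Best practice rather than a requirement: two alternative formats.
        if len(media.findall("source")) < 2:
            problems.append(f"<{media.tag}> offers fewer than two formats")
    return problems


sample = """
<body>
  <audio controls="controls">
    <source src="audio/04_01.ogg" type="audio/ogg" />
    <source src="audio/04_01.mp3" type="audio/mpeg" />
    <div class="fallback"><p>Sorry, audio playback is not supported.</p></div>
  </audio>
  <video controls="controls">
    <source src="../video/fraser_amrev_720480.mp4" type="video/mp4" />
  </video>
</body>
"""
for problem in media_problems(sample):
    print(problem)
```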
The object element is used to reference external resources in EDUPUB such as "widgets/gadgets". This enables authoring tools and browsers to display the content natively if needed (vs. using a div, which would require special processing to display the gadget). On output, the object can be transformed (if needed) to other elements, such as a div for embedded display or a hyperlink to launch in a new window.
A fallback can be specified by using flow content. Fallbacks are commonly an image with some text.
In order to capture all the necessary information about a gadget, a mix of standard object attributes, custom "data" attributes, and param elements will be used. If parameters are needed to initialize a gadget, the object can contain one or more param elements.
The object can contain fallback content
<object class="gadget gadget_dcat" data="#URI#" type="#Text#" height=""
width="#Text#" lang="#Text#" title="#Text#"
data-responsivedesigned="#yes/no#" data-minwidth="#Text#"
data-minheight="#Text#" data-lmsrequired="#yes/no#"
data-offlinesupport="#yes/no#"
data-displaytarget="#embed/new_window#" data-icon="#URI#"
data-iconwidth="#Text#" data-iconheight="#Text#">
<!-- gadget params required to initialize the gadget -->
<param name="#CDATA#" value="#CDATA#"/>
<!-- fallback could be an image and/or flow content -->
<span class="fallback"><img src="#URI#" alt="#Text#" /></span>
</object>
The object can contain fallback content. It should follow any param elements (if they exist). Fallback content is typically an image and/or some text, but can be any elements classified as "flow content" by the HTML5 spec.
The EDUPUB specification does not add any new requirements beyond what is documented in the ePub3 specification.
The EDUPUB specification does not have requirements on how parts/chapters/sections should be "chunked" into files within the ePub, but chunking at the first (A-Head) section within a chapter is considered a Best Practice.
The EDUPUB specification does not add any new requirements beyond what is documented in the HTML5 specification. The following guidelines are recommended as best-practices.
For digital media the colorspace is sRGB.
Concepts and definitions were developed during the creation of this document. Not all of them are discussed in this document; some are defined here for future reference in related documents.
Chemical Markup Language (CML) is still under evaluation for inclusion in the EDUPUB Spec.
CML has been developed by Peter Murray-Rust and Henry Rzepa since 1995. It is the de facto XML for chemistry, accepted by publishers and with more than 1 million lines of Open Source code supporting it. CML can be validated and built into authoring tools (for example the Chemistry Add-in for Microsoft Word).[2]
A learning object is "a collection of content items, practice items, and assessment items that are combined based on a single learning objective" [Cisco Systems, Reusable Information Object Strategy].
EDUPUB embodies the notion of Learning Objects (LO). Content is created and presented as self-contained textual, audio-visual, interactive and assessment components that combine to satisfy learning objectives.
Math equations will be authored and stored in Presentation MathML or Content MathML.
MathML is the industry standard XML markup for displaying and processing mathematical equations and notations. One of the benefits of MathML is its ability to facilitate accessibility options. Tools will be provided to facilitate the authoring and transformation of content to MathML.
Assessment content will use the IMS Global Learning Consortium's Question and Test Interoperability version 2.1 (QTI v2.1) Specification as an exchange format. EDUPUB can reference assessment content using the embed element.
In Q1 2013 we will add support for authoring "low stakes" assessment in XHTML5 markup so that it can be included in the EDUPUB source.
Scalable Vector Graphics (SVG) is a family of specifications of an XML-based file format for two-dimensional vector graphics, both static and dynamic (i.e., interactive or animated). The SVG specification is an open standard that has been under development by the World Wide Web Consortium (W3C) since 1999.
SVG images and their behaviors are defined in XML text files. This means that they can be searched, indexed, scripted, and, if need be, compressed.[3]
Description: Visual appearance of content is not the only way to convey meaning to readers.
Benefit: The meaning behind the text formatting won’t be lost when displayed or transformed. Also allows for adjustable font face, font size, and background/foreground color without loss of meaning or functionality.
Requirement: Avoid text formatting (e.g. italic, underline, bold, font size) as the only way to provide information that goes beyond emphasis. An example of this issue would be a quiz question that asks: "What is the significance of the italic text in the following paragraph?" There must be a second way to locate the text in this type of situation.
Content will be marked up semantically using XHTML5 element and class names as well as leveraging EPUB3 conventions (e.g., epub:type) and QTI/APIP for assessment. Authoring guidelines and validation will be used to avoid specific references to formatting.
Description: Roles will be identified (e.g. heading, numbered list, bulleted list, data table, paragraph, emphasized text)
Benefit: When spoken text features are used, it is possible to skip entire lists and continue reading the main text. Also provides the ability to meaningfully skim the content and return to sections of the page without sight.
Requirement: Ensure that semantic tagging practices identify chunks of textual content with the appropriate role. Content will be marked-up semantically using element and class names as well as leveraging EPUB3 conventions (e.g., epub:type) and QTI/APIP for assessment.
Description/Benefit: Content follows a logical and sequential order. Allows the content to be automatically transformed to other visual formats and to audio formats without loss of meaning.
Requirement: Establish standards and best practices to define proper reading order.
Content will be authored in the logical reading order and preserved throughout. The aside element introduces content that can be read out of order (for example a sidebar) and is authored where the reader might consider reading it.
Description/Benefit: Accessible descriptions will be available for images (photographs and rendered art) except where the image is purely decorative.
Requirement: Content expert will create a text alternative that effectively conveys the instructional intent of the image and provides the same information that the image provides.
The alt text attribute will be empty when the image is purely decorative, or will contain the text alternative. This will be enforced via validation rules.
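A validation rule of this kind could look like the following sketch. It treats a missing alt attribute as an error and an empty one as a deliberate marker for a decorative image. The helper name and image paths are hypothetical, and the markup is assumed to be parsed without a default namespace.

```python
import xml.etree.ElementTree as ET


def images_missing_alt(xhtml: str) -> list[str]:
    """Return the src of every img that has no alt attribute at all."""
    return [
        img.get("src", "")
        for img in ET.fromstring(xhtml).iter("img")
        if img.get("alt") is None  # alt="" is allowed for decorative images
    ]


sample = """
<body>
  <img src="../images/rule.png" alt="" />
  <img src="../images/fig01.jpg" alt="Bar chart of coal output by decade" />
  <img src="../images/fig02.jpg" />
</body>
"""
print(images_missing_alt(sample))  # ['../images/fig02.jpg']
```

A check like this can only confirm that the attribute is present; whether the text conveys the instructional intent of the image still needs review by a content expert.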
Description: SVG assets look great at any size. Scaled object will not appear pixelated. These assets can also have titles and descriptions.
Benefit: Ability to scale images without the need for specialized zoom software. Textual data (title & description) can be accessed via accessible technology.
Requirement: Production to create and/or convert line art images to SVG. Content expert will create a text alternative for the SVG image.
Referencing of SVG images is supported and will always appear in a switch/case architecture allowing for HTML, alternative text or an image alternative if SVG cannot be natively supported.
Description/Benefit: Allows reader to quickly determine what they are reading at any given point in the table.
Requirement: Apply industry standard markup to indicate proper order and identify column/row headers.
This markup is supported but can be very labor intensive based on the complexity and size of the table.
Description/Benefit: MathML eliminates the need for content experts to create alternative descriptive text of math structures because MathML is machine readable and can be understood by screen readers and other assistive technologies. Also eliminates the production and quality assurance cost of creating images of math structures.
Requirement: Production to ensure the composition of the book supports MathML.
MathML is supported and will always appear with an alternative text or image alternative if MathML cannot be natively supported.
Description: Text will be zoomable up to at least 200%.
Benefit: No left or right swiping needed for basic text reading. Minimal up/down swiping needed when zoomed at or above 200%. This capability is part of the spec for the NexText platform.
All textual content is in XHTML5. Zooming is a feature of the device/platform.
Description/Benefit: Allows the user to jump to the end of a chunk of content in a list or table.
Requirement: Production will use the "seq" element in lists and tables to define smaller chunks of related data.
Description: Video with: captions, transcripts, audio descriptions, and graceful fallbacks.
Benefit: Using text to describe a video, or communicate the audio portion of a video, allows people with hearing disabilities to have a similar experience compared to people who can hear. This also helps in noisy environments.
Requirement: Create captions, a transcript, and an audio description for the video. Include a "poster image" of the video as well as several renditions at different resolutions and/or codecs.
All images and media are required to have alt text unless purely decorative. This will be enforced via validation rules. Native XHTML5 track element will be used for subtitle and caption information within the video element.
Description: Timed Tracks allow for properly synchronized text and audio.
Benefit: Captions and subtitles will be in sync with video.
All images and media are required to have alt text unless purely decorative. This will be enforced via validation rules. Native XHTML5 track element will be used for subtitle and caption information within the audio element.
Description: Speech synthesized with proper pronunciation for the chosen language.
Benefit: An alternative to human narration.
Requirement: Production to provide code to explicitly declare the language. Spoken text is part of the spec for the NexText platform.
SMIL and Media Overlay capabilities of EPUB3 will be supported.
Description: Media Overlays provide reading/listening options and the ability to easily switch between them.
Benefits: Simultaneous audio-only, text-only, and eBook production. Compliance with the SMIL standard.
Requirement: Production to link structured audio narration to its corresponding text or timestamp within a video.
SMIL and media overlay capabilities of EPUB3 will be supported.
Description: Properly tagged footnotes do not commingle with the text stream.
Requirement: Production to compose the book with properly tagged footnotes.
Use of aside element with epub:type="footnote" will uniquely identify footnotes.
Draft Page 11/15/2013
[1] Additional classing/subclassing of gadgets will occur once a complete matrix of widgets and gadgets is assembled and ready for analysis.
[2] "Chemical Markup Language | CML." 12 Jul. 2012 <http://www.xml-cml.org/>
[3] "Scalable Vector Graphics - Wikipedia, the free encyclopedia." 2003. 18 Jul. 2012 <http://en.wikipedia.org/wiki/Scalable_Vector_Graphics>