Strange validation messages

Hello
I have written and published two books in EPUB format with Smashwords (see http://mewila.co.uk).

The first, called "mewila", was written with tools including LaTeX, htlatex, Linux zip. It can be read with Calibre and the Firefox ePub plugin, which give no error messages for it. The official ePub validator reports a range of errors which make no sense: "missing" labels which exist and an "unclosed" tag which is never opened.

The second, called "bub", was written with OpenOffice in MS ".doc" format and then translated by the Smashwords "meatgrinder". This process produced a version 2 EPUB file, using an XML namespace called http://www.idpf.org/2007/opf which seems to have disappeared from the World Wide Web. However, it works with no errors.

I am trying to edit the mewila.OPF and mewila.NCX files, using the bub files as a model, so that mewila will pass the ePub validator. Any guidance will be very welcome.

My latest attempt is accepted by Calibre with no errors. The Firefox plugin can read the text but it cannot read the .NCX file. This is strange, as most of the changes were made to the .OPF file. The sequence of tags linking from the .OPF file to the .NCX file seems to be intact.

HTML 4 is not a problem. For my sins, a boss once told me to teach Internet protocols including an introduction to XML when I had no qualifications to do it. I am sorry for the class who had to take the course, but I hope it was not a complete fiasco. I know just enough about the subject to recognise major XML features. Markus Gylling's comments in 2009 on XML reinforce my inclination not to try to learn any more about it.
Regards and thanks
Alan Hutchinson

In what follows, opening angle brackets have been changed to "+" symbols.

--------

mewila WORKING VERSION
first few lines of .OPF file:

+?xml version="1.0" encoding="UTF-8"?>
+package version="2.0" unique-identifier="PrimaryID" xmlns="http://www.idpf.org/2007/opf">
+metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">

+dc:title id="t1">Beauty and Pleasure and Moderately Energetic Walking in the Lower Southern Alps+/dc:title>

+dc:language>en+/dc:language>

+dc:subject>Non-fiction, health, alpine, walking, beauty, pleasure, children, risk, science, thinking, weather+/dc:subject>

+dc:creator id="creator">Alan Hutchinson+/dc:creator>

+dc:publisher id="publisher">Alan Hutchinson+/dc:publisher>

+dc:date>2013+/dc:date>

+dc:identifier id="PrimaryID">Darwinian_Ev-_Southern_Alps+/dc:identifier>

+/metadata>

+manifest>

--------

mewila WORKING VERSION
first few lines of .NCX file:

+?xml version="1.0"?>
+!DOCTYPE ncx PUBLIC "-//NISO//DTD ncx 2005-1//EN"
"http://www.daisy.org/z3986/2005/ncx-2005-1.dtd">

+ncx version="2005-1" xmlns="http://www.daisy.org/z3986/2005/ncx/">

+head>
+meta name="dtb:uid" content="Darwinian_Ev-_Southern_Alps"/>
+meta name="dtb:depth" content="2"/>
+meta name="dtb:totalPageCount" content="0"/>
+meta name="dtb:maxPageNumber" content="0"/>
+/head>

+docTitle>
+text>Beauty and Pleasure and Moderately Energetic Walking in the Lower Southern Alps+/text>
+/docTitle>

+navMap>

--------

mewila LATEST TEST VERSION
first few lines of .OPF file:

+?xml version="1.0" encoding="UTF-8"?>

+package version="2.0" unique-identifier="PrimaryID" xmlns="http://www.idpf.org/2007/opf">

+metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:opf="http://www.idpf.org/2007/opf" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:calibre="http://calibre.kovidgoyal.net/2009/metadata" xmlns:dc="http://purl.org/dc/elements/1.1/">

+dc:description>This is a study of beauty and pleasure. It starts with two examples: Alpine walking, which takes up more than half the book, and a little simple science. The last two chapters explore how the sensations of beauty and pleasure might have arisen, and how they influence behaviour. Two of the greatest pleasures are rational thought and understanding, and communication and good company. They can lead in very different directions. There are about 90 pictures, 150 anecdotes and 360 references.+/dc:description>

+dc:language>en+/dc:language>

+dc:creator opf:role="aut">Alan Hutchinson+/dc:creator>

+dc:title>Beauty and Pleasure and Moderately Energetic Walking in the Lower Southern Alps+/dc:title>

+dc:contributor opf:role="bkp">Smashwords, Inc.+/dc:contributor>

+dc:subject>Non-fiction+/dc:subject>
+dc:subject>health+/dc:subject>
+dc:subject>alpine+/dc:subject>
+dc:subject>walking+/dc:subject>
+dc:subject>beauty+/dc:subject>
+dc:subject>pleasure+/dc:subject>
+dc:subject>children+/dc:subject>
+dc:subject>risk+/dc:subject>
+dc:subject>science+/dc:subject>
+dc:subject>thinking+/dc:subject>
+dc:subject>weather+/dc:subject>

+dc:publisher id="publisher">Alan Hutchinson+/dc:publisher>

+dc:date>2013+/dc:date>

+dc:identifier id="PrimaryID">Darwinian_Ev-_Southern_Alps+/dc:identifier>

+/metadata>

+manifest>

--------

mewila LATEST TEST VERSION
first few lines of .NCX file:

+?xml version="1.0" encoding="utf-8?>
+!DOCTYPE ncx PUBLIC "-//NISO//DTD ncx 2005-1//EN"
"http://www.daisy.org/z3986/2005/ncx-2005-1.dtd">

+ncx version="2005-1" xmlns="http://www.daisy.org/z3986/2005/ncx/">

+head>
+meta name="dtb:uid" content="Darwinian_Ev-_Southern_Alps"/>
+meta name="dtb:depth" content="2"/>
+meta name="dtb:totalPageCount" content="0"/>
+meta name="dtb:maxPageNumber" content="0"/>
+/head>

+docTitle>
+text>Beauty and Pleasure and Moderately Energetic Walking in the Lower Southern Alps+/text>
+/docTitle>

+navMap>

Hi Alan,

I downloaded the Southern Alps book, and ran it through epubcheck. I think the major problem is with the mewila.html file. Book content in EPUB2 is required to be XHTML 1.1, and that file is HTML 4.01. Even after converting the file to XHTML (using Tidy) there were hundreds of errors relating to duplicate ID values. epubcheck couldn't parse the HTML file at all, so none of the links from the OPF and NCX had anywhere to go, so to speak.
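Since XHTML, OPF and NCX files all have to be well-formed XML before anything else can be checked, a quick parse test can narrow things down. The following is only a rough sketch using Python's standard library; the function name xml_error and the two sample strings are made up for illustration. The "bad" sample copies the XML declaration as quoted above for the latest test NCX, which appears to lack a closing quote after utf-8. If that is in the real file and not just a typo in the post, it could explain why a strict reader rejects the NCX.

```python
import xml.etree.ElementTree as ET


def xml_error(data):
    """Return None if data (bytes) is well-formed XML, else the parser's message."""
    try:
        ET.fromstring(data)
    except ET.ParseError as err:
        return str(err)
    return None


# Hypothetical minimal samples, not the real mewila files.
good = (b'<?xml version="1.0" encoding="utf-8"?>\n'
        b'<ncx version="2005-1"><head/></ncx>')
# Same content, but the encoding value is missing its closing quote.
bad = (b'<?xml version="1.0" encoding="utf-8?>\n'
       b'<ncx version="2005-1"><head/></ncx>')

print(xml_error(good))  # None means the sample parsed
print(xml_error(bad))   # a parser message describing the first problem

# To check a real file (the path is an assumption):
# with open("mewila.ncx", "rb") as f:
#     print(xml_error(f.read()))
```

This only tests well-formedness. It does not validate against the EPUB schemas the way epubcheck does.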

I'm not sure there's an easy solution. I did a bunch of GREP searches to fix some of the ID issues, but there were still 30 or 40 duplicate IDs after that.
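As an alternative to GREP, the duplicate IDs can be listed with a few lines of Python. This is a minimal sketch, assuming Python 3 is available; the names IdCollector and duplicate_ids are invented for this example, and the encoding in the usage comment is a guess. It uses the standard library's tolerant HTML parser, so it should cope with HTML 4 input as well as XHTML.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collect every id attribute value seen in start tags."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lower-cased.
        for name, value in attrs:
            if name == "id" and value is not None:
                self.ids.append(value)


def duplicate_ids(markup):
    """Return a dict mapping each repeated id value to its number of uses."""
    collector = IdCollector()
    collector.feed(markup)
    collector.close()
    counts = Counter(collector.ids)
    return {value: n for value, n in counts.items() if n > 1}


# Usage on the real file (encoding is an assumption):
# with open("mewila.html", encoding="utf-8", errors="replace") as f:
#     for value, n in sorted(duplicate_ids(f.read()).items()):
#         print(n, value)
```

It only reports the repeats. Deciding which copy of each ID the links should point to still has to be done by hand.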

Thanks,

Dave

Dave
Thanks a lot. I shall think about this.
Alan.
