Duplicate metadata usage


Dear forum users,

My team is working on a parser for EPUB 3.0 metadata.
I understand that the EPUB 3.0 specification says three metadata elements from DCMES must be used, and that some properties from DCTERMS, such as modified, can be used together with them.
However, if you look at the O'Reilly Accessible EPUB 3 sample here:
http://code.google.com/p/epub-samples/downloads/detail?name=accessible_e...
The sample includes a lot of duplicate metadata, such as title, language, contributor, rights, and publisher, in both dc and dcterms form.
---------------------------------------------------------------------------------------------------
<metadata>
<dc:identifier id="pub-identifier">urn:isbn:9781449328030</dc:identifier>
<meta id="meta-identifier" property="dcterms:identifier">urn:isbn:9781449328030</meta>
<dc:title id="pub-title">Accessible EPUB 3</dc:title>
<meta property="dcterms:title" id="meta-title">Accessible EPUB 3</meta>
<dc:language id="pub-language">en</dc:language>
<meta property="dcterms:language" id="meta-language">en</meta>
<meta property="dcterms:modified">2012-02-20T22:17:24Z</meta>
<!--The preceding date value is actually local time (not UTC) in UTC format because there is no function in XSLT 1.0 to generate a correct UTC time-->
<meta property="dcterms:contributor">O’Reilly Production Services</meta>
<dc:contributor>O’Reilly Production Services</dc:contributor>
<meta property="dcterms:contributor">DavidFutato</meta>
<dc:contributor>DavidFutato</dc:contributor>
<meta property="dcterms:contributor">RobertRomano</meta>
<dc:contributor>RobertRomano</dc:contributor>
<meta property="dcterms:publisher">O’Reilly Media, Inc.</meta>
<dc:publisher>O’Reilly Media, Inc.</dc:publisher>
<meta id="meta-creator12" property="dcterms:creator">Matt Garrish</meta>
<dc:creator id="pub-creator12">Matt Garrish</dc:creator>
<meta property="dcterms:contributor">BrianSawyer</meta>
<dc:contributor>BrianSawyer</dc:contributor>
<meta property="dcterms:date">2012</meta>
<dc:date>2012</dc:date>
<meta property="dcterms:rights">Copyright © 2012 O’Reilly Media, Inc</meta>
<dc:rights>Copyright © 2012 O’Reilly Media, Inc</dc:rights>
<meta property="dcterms:rightsHolder">O’Reilly Media, Inc</meta>
<meta property="dcterms:contributor">DanFauxsmith</meta>
<dc:contributor>DanFauxsmith</dc:contributor>
<meta property="dcterms:contributor">KarenMontgomery</meta>
<dc:contributor>KarenMontgomery</dc:contributor>
</metadata>
-----------------------------------------------------------------------------
Is there a specific reason why the sample duplicates this information? Is it perhaps for legacy ebooks?
Do we need to parse both the dc and the dcterms versions to conform to the EPUB 3 standard?

Thank you.

As I recall, the sample was generated using the DocBook XSLT stylesheets, so I expect the duplication is just a feature of that transform. A lot of the dcterms metadata is also technically wrong. dcterms:publisher, dcterms:creator, dcterms:contributor and dcterms:rightsHolder are represented as literal values, but to be conformant their values would have to fall within the Agent range, which isn't technically possible as EPUB 3 doesn't allow nesting meta tags. I believe dcterms:rights is also invalid. There are also spaces missing from the names in the DC elements.

The Dublin Core elements are recognized by EPUB 2 reading systems and the meta elements are not, which is one reason why their use was carried over into the new specification. But since the dcterms properties largely duplicate the DC elements, their inclusion doesn't add a lot of value here.

The meta/@property approach is more forward-looking, and allows extensibility of metadata in EPUB 3. Beyond the three required DC elements, dcterms:modified, and a few notes on the optional DC elements, the specification doesn't say what must be done with metadata. The listing of the optional DC elements doesn't technically give them more weight than the dcterms equivalents (they're listed because the specification notes all allowed tags), but I expect that the DC elements will get better support simply because of the legacy they have, and because much of dcterms can't be validly used. That said, I'm not sure how to stop people from using the dcterms properties inaccurately, as I don't think the hasRange field in the property definitions is widely understood.

So although the duplication isn't technically necessary, it's also not harmful (beyond the issues noted above).
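
For the parser side of the question, here is a minimal sketch in Python of one way to cope with the duplication: read both forms and merge identical values under one key. It is only an illustration, not something the specification prescribes. The function name collect_metadata and the SAMPLE string are hypothetical, and the sketch assumes the dcterms: prefix is used undeclared (as the reserved prefix) and that meta elements carrying a refines attribute describe other elements and can be skipped.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DC_NS = "http://purl.org/dc/elements/1.1/"
OPF_NS = "http://www.idpf.org/2007/opf"

# Hypothetical cut-down version of the metadata block quoted above.
SAMPLE = """<metadata xmlns="http://www.idpf.org/2007/opf"
          xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title id="pub-title">Accessible EPUB 3</dc:title>
  <meta property="dcterms:title" id="meta-title">Accessible EPUB 3</meta>
  <dc:language id="pub-language">en</dc:language>
  <meta property="dcterms:language" id="meta-language">en</meta>
  <meta property="dcterms:modified">2012-02-20T22:17:24Z</meta>
  <meta id="meta-creator12" property="dcterms:creator">Matt Garrish</meta>
  <dc:creator id="pub-creator12">Matt Garrish</dc:creator>
</metadata>"""


def collect_metadata(metadata_xml: str) -> dict[str, list[str]]:
    """Merge dc:* elements and dcterms:* meta properties into one mapping."""
    root = ET.fromstring(metadata_xml)
    merged = defaultdict(list)
    for el in root.iter():
        text = (el.text or "").strip()
        if not text:
            continue
        if el.tag.startswith("{" + DC_NS + "}"):
            # Strip "{namespace}" to get the local name, e.g. "title".
            name = el.tag[len(DC_NS) + 2:]
        elif el.tag == "{" + OPF_NS + "}meta":
            prop = el.get("property", "")
            # Assumption: skip refinements and non-dcterms properties.
            if el.get("refines") or not prop.startswith("dcterms:"):
                continue
            name = prop[len("dcterms:"):]
        else:
            continue
        # Keep each distinct value once, in document order.
        if text not in merged[name]:
            merged[name].append(text)
    return dict(merged)


result = collect_metadata(SAMPLE)
print(result["title"])
```

With this approach a book that carries only the DC elements, only the dcterms properties, or both yields the same mapping, so the parser does not need to care which form the publisher chose.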

I'll try to fix the sample in the next little while.

Oops, I shouldn't have said it's not technically possible. I was thinking of defining it like this:

<meta property="dcterms:creator">
   <meta property="foaf:Person">
      <meta property="foaf:name">Matt Garrish</meta>
   </meta>
</meta>

I believe you could use dcterms:creator et al., but to be valid they would have to include a URI pointing to where the agent is defined:


<meta property="dcterms:creator">http://example.com/MattGarrish</meta>
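
If a parser wanted to flag literal agent values like the ones in the sample, a rough check could look like the sketch below. The helper names looks_like_uri and literal_agents are hypothetical, and the test is only a heuristic (a scheme is present and the value contains no whitespace); it does not verify that the URI actually identifies an agent.

```python
from urllib.parse import urlparse

# dcterms properties named earlier in this thread as having an Agent range.
AGENT_PROPERTIES = {"creator", "contributor", "publisher", "rightsHolder"}


def looks_like_uri(value: str) -> bool:
    """Heuristic: non-empty, no whitespace, and a URI scheme is present."""
    value = value.strip()
    if not value or any(ch.isspace() for ch in value):
        return False
    return bool(urlparse(value).scheme)


def literal_agents(properties: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (property, value) pairs where an agent property holds a literal."""
    return [
        (name, value)
        for name, values in properties.items()
        if name in AGENT_PROPERTIES
        for value in values
        if not looks_like_uri(value)
    ]


print(literal_agents({
    "creator": ["Matt Garrish", "http://example.com/MattGarrish"],
    "title": ["Accessible EPUB 3"],
}))
```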


 
