Embedding non-unicode legacy font in ePub


Hi,

Recently I have noticed that a lot of publishers are embedding non-Unicode legacy fonts in their ebooks, mainly books in non-English languages. I had assumed that ePub only works with Unicode fonts. I have checked a few of these books in different readers and they render perfectly. If an ePub renders perfectly with an embedded non-Unicode legacy font, then there is no need for the publisher to spend a huge amount of money converting the content to Unicode first and then to ePub.
My question is: what are the problems with using non-Unicode legacy fonts in ePub, and will such a book meet the EPUB 3 standard?

Please suggest.

@surojit : as per http://www.idpf.org/epub/30/spec/epub30-publications.html#sec-xml-constr... 5.4 XML Conformance:

Any Publication Resource that is an XML-Based Media Type must meet the following constraints: [...] It must be encoded in UTF-8 or UTF-16
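A rough way to test that constraint on a single file is sketched below. It is not a conformance check: the function name is made up for illustration, it only tests whether the bytes decode as UTF-8 or BOM-marked UTF-16, and it does not look at the encoding declared in the XML declaration.

```python
import codecs


def looks_like_utf8_or_utf16(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-16 (with BOM) or UTF-8.

    Hypothetical helper for illustration; a real check should use an
    EPUB validator rather than this heuristic.
    """
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        candidate = "utf-16"  # the codec reads the BOM to pick the byte order
    else:
        candidate = "utf-8"
    try:
        raw.decode(candidate)
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    print(looks_like_utf8_or_utf16("<p>é</p>".encode("utf-8")))
    print(looks_like_utf8_or_utf16("<p>é</p>".encode("latin-1")))
```

Note that a file can pass this test and still contain meaningless text: if the content is plain Latin letters that only look right through a legacy font, the bytes are valid UTF-8 even though the characters are not the ones the reader sees.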

Plus, there are other constraints on the use of Unicode characters in file names (e.g.: http://www.idpf.org/epub/30/spec/epub30-ocf.html#sec-container-filenames ) and so on.

Hence, I strongly suggest switching to Unicode. If you know the original encoding, converting the materials to Unicode costs virtually nothing.
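For the simple case where the source really is in a known, standard legacy encoding, the conversion can be as small as this sketch. The function name and the cp1251 sample are assumptions chosen for illustration; substitute whatever encoding your files actually use.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode bytes from a known legacy encoding and re-encode them as UTF-8.

    Raises UnicodeDecodeError if the bytes are not valid in source_encoding,
    which is usually a sign that the assumed encoding is wrong.
    """
    text = raw.decode(source_encoding)
    return text.encode("utf-8")


if __name__ == "__main__":
    # Sample data only: Russian text stored in Windows-1251.
    legacy_bytes = "Привет".encode("cp1251")
    utf8_bytes = to_utf8(legacy_bytes, "cp1251")
    print(utf8_bytes.decode("utf-8"))
```

If the file is XML or XHTML, any encoding named in its XML declaration or meta element has to be updated to match as well.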

The issue typically doesn't just involve flipping the original encoding, but dealing with the use of private ranges in the legacy document to create characters that are missing from the font. So while it's true that switching the encoding can be simple, finding and fixing all the custom characters used in each file can be costly.
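To get a feel for how much of that cleanup a file needs, one option is to count the private-use characters that remain after decoding. This is only a sketch (the function name is made up); it relies on Python's unicodedata module, which reports the general category "Co" for private-use code points.

```python
import unicodedata
from collections import Counter


def private_use_report(text: str) -> Counter:
    """Count each private-use character in text, keyed as 'U+XXXX'."""
    return Counter(
        f"U+{ord(ch):04X}"
        for ch in text
        if unicodedata.category(ch) == "Co"  # "Co" = private use
    )


if __name__ == "__main__":
    sample = "abc\ue001\ue001\uf8ff"
    for code_point, count in private_use_report(sample).most_common():
        print(code_point, count)
```

Each code point in the report still has to be matched by hand to the real Unicode character it stands for, which is where the cost comes from.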

But I'm not sure how non-Unicode fonts solve this problem, as text content has to be in UTF-8 or UTF-16 to be a compliant EPUB 3. Even if the font remaps Unicode code points to different characters, there's no guarantee that any reading system will honor your embedded fonts, which will potentially leave readers with a book full of gibberish. Any use of text-to-speech will also come out as gibberish, regardless of whether the font is applied or not.
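To make the remapping problem concrete, here is a toy illustration. The mapping table is hypothetical and does not describe any particular font; it only shows that the stored text is ordinary Latin letters until a font-specific table is applied.

```python
# Hypothetical table: which Unicode character each stored Latin letter
# is *drawn as* by an imaginary legacy font. Not taken from a real font.
LEGACY_FONT_MAP = str.maketrans({
    "H": "\u092d",  # DEVANAGARI LETTER BHA
    "k": "\u093e",  # DEVANAGARI VOWEL SIGN AA
    "j": "\u0930",  # DEVANAGARI LETTER RA
    "r": "\u0924",  # DEVANAGARI LETTER TA
})


def remap(stored_text: str) -> str:
    """Replace each stored character with the character the font displays."""
    return stored_text.translate(LEGACY_FONT_MAP)


if __name__ == "__main__":
    stored = "Hkjr"  # what search, TTS and fallback fonts actually see
    print(stored, "->", remap(stored))
```

Without the embedded font, a reading system shows and speaks the stored letters, not the intended word. A real conversion is usually harder than a one-to-one table, since legacy fonts often store some signs in a different order than Unicode expects.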

But it sounds from the original question like it's not a font remapping issue here so much as people potentially disobeying the requirement for content documents to be UTF-encoded, in addition to using non-Unicode fonts. "Packaged like an EPUB 3" does not make EPUB 3.
