Hyphenation

I'm composing my first ebook in EPUB format. I'm trying to turn on automatic hyphenation, but it's not really working. Could someone tell me:

1) what exactly a hyphenation resource file is (is it a text file, and how does it interact with CSS)?

2) where you can find these files for different languages (for free, preferably)?

3) what your CSS should look like if you're using auto-hyphens and hyphenation resource files?

4) what you should add to an XHTML file when you're doing this?

5) what your content.opf file should look like (specifically, what is the media type for hyphenation resource files)?

6) what other things you need to take into consideration?

A link to a sample book, in which automatic hyphenation is used, could also help.

Hyphenation resources are part of the browser (or, in the case of EPUB, the reading system); they're not something you can control. You can only ask the reading system (RS) to apply the language-specific hyphenation rules it already has, using the CSS3 hyphens property. It checks the language declared in the file to determine which rules to use.

I think an old draft of the CSS3 Text module made reference to importing your own @hyphenation-resource file, but I don't believe it was ever fully defined, and it has since been dropped.

You may not get very widespread support for the property, as the CSS3 Text module is only at last call status (you may need some vendor-prefixed versions, too).

If you want to set your own hyphenation points, you can use the Unicode soft-hyphen character (U+00AD), but that would take some automation.
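As a rough idea of what that automation could look like, here is a minimal Python sketch. It is not a real hyphenation algorithm: it only inserts soft hyphens into words listed in a hand-made table, and the `BREAKS` table and the `soften` function are made-up names for illustration. It also works on plain text only, so real XHTML would need the markup skipped first.

```python
import re

SOFT_HYPHEN = "\u00ad"  # Unicode soft hyphen (U+00AD)

# Hypothetical hand-made table: "|" marks each allowed break point.
BREAKS = {
    "hyphenation": "hy|phen|ation",
    "typography": "ty|pog|ra|phy",
}


def soften(text, breaks=BREAKS):
    """Insert soft hyphens into every word that has an entry in `breaks`."""

    def replace(match):
        word = match.group(0)
        pattern = breaks.get(word.lower())
        if pattern is None:
            return word  # unknown word: leave it untouched
        out = []
        i = 0
        for ch in pattern:
            if ch == "|":
                out.append(SOFT_HYPHEN)
            else:
                # Copy from the original word so its casing is preserved.
                out.append(word[i])
                i += 1
        return "".join(out)

    # [^\W\d_]+ matches runs of letters, including non-ASCII ones.
    return re.sub(r"[^\W\d_]+", replace, text)
```

Calling `soften("Hyphenation is critical for typography")` returns the same sentence with invisible U+00AD characters at the listed break points; words missing from the table pass through unchanged.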

Thanks Matt! Well, it's a pity if you can't import your own hyphenation dictionaries. I guess I'll have to use soft hyphens then... What would be the best automation tool for that? Do soft hyphens cause problems if you're using the word search in a book?

Support in EPUB 2 for soft hyphens is probably going to be pretty lousy. Support in EPUB 3 should be better, at least if you take the view that what works well in modern browsers will work well in reading systems (but that theory sometimes explodes).

I haven't tested hyphenation support to any great extent, to be honest. It would be interesting to get it into the EPUB 3 test suite so that it could be reported in the support grid, since hyphenation is a critical bit of typography.

And when I mentioned automation, I was thinking rolling up the sleeves and programming something. If that's not your cup of tea, I don't know of any programs that will auto-hyphenate an HTML file, but there are scripts like hyphenator.js that might be useful. (Again only for EPUB 3, as you're not likely to get scripting support in EPUB 2.)

Thanks again, Matt. I'm not going to program anything at this point, so I'll try hyphenator.js (if I can find language files for Finnish).

I agree that hyphenation is critical for typography. Finnish, for example, has a lot of long words and even some megawords, and without hyphenation the text looks either ragged or full of holes.

I used hyphenator.js as a "program" to produce a static XHTML document (through a couple of steps). I tested this hyphenated document with different viewers, but at least FBReader and Calibre had difficulties displaying the text correctly (FBReader showed all soft hyphens as dashes, and Calibre showed soft hyphens as vertical dashes in italicised words).

This lack of support for hyphenation, both in the EPUB standard and in viewers, is quite surprising. After all, as mentioned, hyphenation is critical for typography, and it shouldn't be too hard to support this functionality better. The current state makes EPUB look pretty naive compared to PDF and paper books.

Neither of those is an EPUB 3 reading system, so it's not surprising you're getting lousy results.

I did a quick test in Readium, iBooks and a few other newer RSes, and you get better results visually, but searching is affected on some.
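To illustrate why search can break: an embedded soft hyphen is a real character in the text, so a plain substring match for the unhyphenated word fails unless the soft hyphens are stripped first. The Python sketch below only demonstrates that effect; the function names are made up, and it says nothing about how any particular reading system actually implements search.

```python
SOFT_HYPHEN = "\u00ad"  # Unicode soft hyphen (U+00AD)


def naive_find(haystack, needle):
    """Plain substring search; embedded soft hyphens get in the way."""
    return needle in haystack


def tolerant_find(haystack, needle):
    """Strip soft hyphens from both strings before comparing."""
    clean_haystack = haystack.replace(SOFT_HYPHEN, "")
    clean_needle = needle.replace(SOFT_HYPHEN, "")
    return clean_needle in clean_haystack


text = "hy\u00adphen\u00adation is critical"
print(naive_find(text, "hyphenation"))     # False
print(tolerant_find(text, "hyphenation"))  # True
```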

If you call the script from your pages instead of embedding the characters, you can set it to use the CSS3 hyphens property automatically once there is support, which would at least future-proof your content. The fallback in EPUB 2 reading systems with no support for scripting is that nothing happens.

And the situation for reading system developers is complex. EPUB 3 reading systems are typically built on the open browser cores (WebKit and Gecko), so support develops in parallel with browsers generally. Asking for hyphenation support in advance of what has been built into the cores is asking a lot. The spec could try to mandate support, but that's not going to make it happen any faster in reality.

I know it can make for a frustrating experience developing content, but it's just the dynamic of working on the ever evolving open web platform.
