On the wording for character and numeric "entities"

http://www.idpf.org/accessibility/guidelines/content/xhtml/entities.php says:

> Named character entities, such as &nbsp; for non-breaking spaces and &mdash; for em dashes, are no longer supported in XHTML5 documents. Numeric character entities should be used instead.

Sources differ subtly in how they refer to each of these, but the common denominator is usually that "entity" denotes a human-readable name for a character, used in place of the character's numeric index in a character set. In XML, the term "entity" (http://www.w3.org/TR/xml/#dt-entity) means precisely that: a readable name, declared in a DTD, that can be used to refer to some content, i.e. an alias. So "numeric entities" is something of an oxymoron.
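To make the distinction concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`, which parses without any DTD. The helper name `parses` and the sample strings are my own, purely for illustration. It suggests that numeric character references need no declaration, while a named reference such as &nbsp; is treated as an undeclared entity.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if xml_text is well-formed for a parser given no DTD."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Numeric character references: resolved by the parser itself, no DTD needed.
numeric = "<p>a&#160;b&#x2014;c</p>"
# Named references: entity references that only work if a DTD declares them.
named = "<p>a&nbsp;b&mdash;c</p>"
# The five names predefined by XML (amp, lt, gt, quot, apos) are the exception.
predefined = "<p>a &amp; b</p>"

print(parses(numeric))     # True
print(parses(named))       # False
print(parses(predefined))  # True
```

In other words, the numeric form is not an "entity" in the XML sense at all, which is why it survives the loss of the DTD.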

HTML4, being based on a DTD, called them "character entity references" and "numeric character references" respectively at http://www.w3.org/TR/1999/REC-html401-19991224/charset.html#entities.

Both HTML5 (W3C) at http://www.w3.org/TR/html5/syntax.html#character-references and HTML: The Living Standard (WHATWG) at http://developers.whatwg.org/syntax.html#character-references call them "named character references" and "numeric character references" respectively, dropping the "entity" moniker altogether (understandable, I guess, since they dropped DTDs too).

EPUB 3 specifications and guidelines should thus be calling them either:

* "named character references" and "numeric character references", as the format is based on HTML5, or
* "character entity references" and "numeric character references", as the format is based on XML.
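Whatever they end up being called, the guideline's practical advice is to swap the named form for the numeric one. A rough sketch of that conversion, assuming Python's standard `html.entities.html5` table as the source of names; the function name `to_numeric` and the regular expression are hypothetical, and the sketch makes no attempt to skip comments, CDATA sections, or script content:

```python
import re
from html.entities import html5  # maps names like "nbsp;" to their characters

# The five names XML predefines remain valid without a DTD, so keep them.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
_NAMED_REFERENCE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_numeric(markup):
    """Rewrite HTML5 named character references as hexadecimal numeric ones."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED:
            return match.group(0)
        characters = html5.get(name + ";")
        if characters is None:
            # Unknown name: leave it untouched rather than guess.
            return match.group(0)
        # A few names expand to more than one code point, hence the join.
        return "".join(f"&#x{ord(ch):X};" for ch in characters)

    return _NAMED_REFERENCE.sub(replace, markup)


print(to_numeric("a&nbsp;b &mdash; c &amp; d"))
# a&#xA0;b &#x2014; c &amp; d
```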

Thanks, but the links to that page and to the one on DTD declarations were dropped some time back, and both pages were removed from the download, since they weren't specifically about accessibility. I'm going to delete the pages from the server shortly.
