Version 2.0.1; May 9, 2010

Font Embedding for Open Container Format Files

This version
http://www.idpf.org/doc_library/informationaldocs/FontManglingSpec_2.0.1_draft.htm
Latest version
http://www.idpf.org/doc_library/informationaldocs/FontManglingSpec_2.0_latest.htm
Previous version
http://www.idpf.org/doc_library/informationaldocs/FontManglingSpec.html
Diffs to previous version
http://www.idpf.org/doc_library/informationaldocs/FontManglingSpec_2.0.1_diffs_to_2.0.htm

OCF [1] is a technology well suited to packaging OPF/OPS-compliant [2] electronic publications. Since OCF is fundamentally a zip file, commonly available zip tools can be used to extract any unencrypted content stream from the package. On some systems, the contents of the zip file may appear like any other native container (e.g. a folder). While the ability to do this is quite useful, it can pose a problem for an author of the publication who wishes to include a third-party font. Many commercial fonts allow embedding, but embedding a font implies making it an integral part of the publication, not providing the original font file along with the content. Since integrated zip support is so ubiquitous in modern operating systems, simply placing the font in the zip archive is insufficient to signify that the font is not intended to be reused in other contexts. This uncertainty can undermine the otherwise very useful font embedding capability that OPF/OPS provides.


In order to discourage reuse of the font, some font vendors may allow use of their fonts in OCF containers only if those fonts are bound in some way to the publication. That is, the font file cannot be installed directly for use on an operating system with the built-in tools of that computing device, and it cannot be used directly by other OPF/OPS publications. It is beyond the scope of this document to provide a digital rights management or enforcement system for font files. Instead, it proposes a method of obfuscation that requires additional work on the part of the final OCF recipient to gain general access to any included fonts. It is the hope of the IDPF that this will meet the requirements of most font vendors. However, no claim is made in this document or by the IDPF that this constitutes encryption, nor does it guarantee that the font file will be secure from copyright infringement. The proposed mechanism simply provides a stumbling block for those who are unaware of the license details of the supplied font. It will not prevent a determined user from gaining full access to the font. Given the original OCF publication, it is possible to apply the algorithms described in this document to extract the raw font file. Whether this satisfies the requirements of individual font licenses remains a question for the licensor and licensee.


Obfuscation Algorithm


The algorithm employed to obfuscate the font file consists of modifying the first 1040 bytes (~1KB) of the font file. In the unlikely event that the file is shorter than 1040 bytes, the entire file is modified. The key for the algorithm must be a 20-byte (160-bit) SHA-1 digest [3] of the publication's unique identifier. Details on generating this key are given in the section "Generating the Obfuscation Key". To obfuscate the original data, the result of performing a logical exclusive or (XOR) on the first byte of the raw file and the first byte of the key is stored as the first byte of the embedded font file. This process is repeated with the next byte of source and key, until all 20 bytes of the key have been used. At this point, the process continues with the first byte of the key and the 21st byte of the source, so that the key is applied 52 times in total (52 x 20 = 1040). Once 1040 bytes have been encoded in this way (or the end of the source is reached), any remaining data in the source is copied directly to the destination. In pseudo-code, this is the algorithm:

set source to font file
set destination to obfuscated file
set keyData to key for font
set outer to 0
while outer < 52 and not (source at EOF)
    set inner to 0
    while inner < 20 and not (source at EOF)
        read 1 byte from source     //Assumes read advances file position
        set sourceByte to result of read
        set keyByte to byte inner of keyData
        set obfuscatedByte to (sourceByte XOR keyByte)
        write obfuscatedByte to destination
        increment inner
    end while
    increment outer
end while
if not (source at EOF) then
    read source to EOF
    write result of read to destination
end if 

To get the original font data back, the process is simply reversed. That is, the source file becomes the obfuscated data and the destination file will contain the raw font data.
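The pseudo-code above, and the fact that the same operation also restores the original data, can be sketched in Python as follows. This is an illustrative sketch rather than a normative implementation; the function name obfuscate_font and the constant OBFUSCATED_LENGTH are assumptions introduced here, and the sketch works on in-memory bytes instead of file streams.

```python
from itertools import cycle

# 52 passes over the 20-byte key, as in the pseudo-code (hypothetical name).
OBFUSCATED_LENGTH = 1040


def obfuscate_font(data: bytes, key: bytes) -> bytes:
    """XOR the first 1040 bytes of data with the repeating 20-byte key.

    Bytes past offset 1040 are copied unchanged. Because XOR is its own
    inverse, calling this function again with the same key restores the input.
    """
    if len(key) != 20:
        raise ValueError("key must be a 20-byte SHA-1 digest")
    head = data[:OBFUSCATED_LENGTH]
    # cycle() restarts the key after its last byte, like the inner loop.
    mangled = bytes(b ^ k for b, k in zip(head, cycle(key)))
    return mangled + data[OBFUSCATED_LENGTH:]


if __name__ == "__main__":
    placeholder_key = bytes(range(20))   # illustration only, not a real digest
    raw_font = bytes(2000)               # stand-in for real font data
    embedded = obfuscate_font(raw_font, placeholder_key)
    assert obfuscate_font(embedded, placeholder_key) == raw_font
```

Using a single function for both directions follows from (b XOR k) XOR k = b for any byte b and key byte k.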

Generating the Obfuscation Key

To tie a font to a particular publication, it is necessary to bind it to a unique property of that publication. Such a value is required by the OPF 2.0 specification, as detailed in its section 2.1, "Package Identity". Every compliant OPF file has a dc:identifier element which uniquely identifies the publication. The OPF 2.0 specification details finding this element by examining the unique-identifier attribute of the package file's package element. This element provides the required characteristic of being unique to a publication; however, it is not suitable for use directly as the obfuscation key (for instance, its length is not defined).
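As an illustration of the lookup described above, the following Python sketch reads the unique-identifier attribute of the package element and returns the text of the dc:identifier element whose id matches it. The function name find_unique_identifier is hypothetical, and the Dublin Core namespace URI is taken from common OPF 2.0 usage rather than from this document.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace as commonly declared in OPF 2.0 package files
# (an assumption of this sketch, not stated in this document).
DC_NS = "http://purl.org/dc/elements/1.1/"


def find_unique_identifier(opf_xml: bytes) -> str:
    """Return the text of the dc:identifier named by unique-identifier."""
    root = ET.fromstring(opf_xml)          # root is the package element
    wanted_id = root.get("unique-identifier")
    if wanted_id is None:
        raise ValueError("package element has no unique-identifier attribute")
    for element in root.iter("{%s}identifier" % DC_NS):
        if element.get("id") == wanted_id:
            return element.text or ""
    raise ValueError("no dc:identifier with id %r" % wanted_id)
```

A publication may carry several dc:identifier elements, which is why the sketch matches on the id attribute instead of taking the first one it finds.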


In order to create a suitable key that is tied to the publication, a SHA-1 digest of the UTF-8 encoded unique identifier should be generated as specified by the Secure Hash Standard [3]. Before generating the digest, all white space characters as defined by the XML 1.0 specification [4], section 2.3, are removed. Specifically, the Unicode code points 0x20, 0x09, 0x0D and 0x0A are stripped from the string before the digest is computed. This digest is then used directly as the key for the algorithm described in the "Obfuscation Algorithm" section.
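A minimal Python sketch of this key derivation, using the standard hashlib module, might look like the following. The function name make_obfuscation_key is hypothetical.

```python
import hashlib

# The four XML 1.0 white space code points: 0x20, 0x09, 0x0D, 0x0A.
XML_WHITESPACE = "\u0020\u0009\u000D\u000A"


def make_obfuscation_key(unique_identifier: str) -> bytes:
    """Derive the 20-byte key from a publication's unique identifier."""
    # Remove every white space character, not only leading and trailing ones.
    strip_table = {ord(ch): None for ch in XML_WHITESPACE}
    stripped = unique_identifier.translate(strip_table)
    # SHA-1 over the UTF-8 bytes; digest() returns the raw 20 bytes.
    return hashlib.sha1(stripped.encode("utf-8")).digest()
```

Note that the raw digest bytes are used as the key, not their hexadecimal text representation.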


Specifying Obfuscated Resources

All encrypted data in an OCF must have an entry in the encryption.xml file accompanying the publication, per section 3.5.5 of the OCF specification. The EncryptionMethod element child of the EncryptedData must have an Algorithm attribute with the value "http://www.idpf.org/2008/embedding". The presence of this attribute signals the use of the algorithm described in this specification. All resources that have been obfuscated using this approach must be listed in the CipherData element.


An example encryption file might look like this:


<encryption xmlns="urn:oasis:names:tc:opendocument:xmlns:container" xmlns:enc="http://www.w3.org/2001/04/xmlenc#">

<enc:EncryptedData> 
<enc:EncryptionMethod Algorithm="http://www.idpf.org/2008/embedding"/>
<enc:CipherData> 
<enc:CipherReference URI="OEBPS/Fonts/BKANT.TTF"/> 
</enc:CipherData>
</enc:EncryptedData> 

</encryption>
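A reading system could locate the obfuscated resources by scanning encryption.xml for EncryptedData entries that use the algorithm URI defined in this document. The Python sketch below shows one way to do that; the function name obfuscated_resources is hypothetical, and the sketch assumes a well-formed file using the namespaces shown in the example.

```python
import xml.etree.ElementTree as ET
from typing import List

ENC_NS = "http://www.w3.org/2001/04/xmlenc#"
FONT_ALGORITHM = "http://www.idpf.org/2008/embedding"


def obfuscated_resources(encryption_xml: bytes) -> List[str]:
    """List the URIs of resources obfuscated with the font algorithm."""
    root = ET.fromstring(encryption_xml)
    uris = []
    for data in root.iter("{%s}EncryptedData" % ENC_NS):
        method = data.find("{%s}EncryptionMethod" % ENC_NS)
        # Skip entries that use some other encryption algorithm.
        if method is None or method.get("Algorithm") != FONT_ALGORITHM:
            continue
        for ref in data.iter("{%s}CipherReference" % ENC_NS):
            uri = ref.get("URI")
            if uri:
                uris.append(uri)
    return uris
```

For the example file above, the function would be expected to return a list containing only "OEBPS/Fonts/BKANT.TTF".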

To prevent trivial copying of the embedded font to other publications, the explicit key must not be provided in the encryption.xml file. Reading systems that implement this specification must derive the key from the package's unique identifier.

References

[1] OCF: http://www.idpf.org/doc_library/epub/OCF_2.0.1_draft.htm


[2] OPF: latest draft is available at http://www.idpf.org/doc_library/epub/OPF_2.0.1_draft.htm


[3] SHA-1: http://csrc.nist.gov/publications/fips/fips180-2/fips180-2withchangenotice.pdf  


[4] XML 1.0:  http://www.w3.org/TR/REC-xml/