Unicode Errors When Building Django Documentation

Each of the few times I've built Django's documentation from scratch, I have run into LaTeX Unicode errors.

Unrelatedly, I've found that when I print Django's documentation on a B&W printer, some of the documentation, particularly code, is difficult, if not impossible, to read.

This note describes a brute-force approach to eliminate the unicode errors, as well as a rather churlish way to force Django to produce black and white PDF documentation.

Resolving Unicode Errors

When building the Django 2.1.3 documentation in PDF format (i.e., make latexpdf), I experienced two different Unicode-related errors. For example:

make latexpdf

...
[1400] [1401] [1402] [1403] [1404] [1405] [1406] [1407]
Underfull \hbox (badness 5924) in paragraph at lines 122862--122864
[]\T1/ptm/m/n/10 For build-ing up frag-ments of HTML, you should nor-mally be u
s-ing [][]\T1/pcr/m/sl/10 django.utils.html.
[1408]

! Package inputenc Error: Unicode char 你 (U+4F60)
(inputenc)                not set up for use with LaTeX.

See the inputenc package documentation for explanation.
Type  H <return>  for immediate help.
 ...                                              

l.122949 ...e}\PYG{o}{=}\PYG{k+kc}{True}\PYG{p}{)}

? x
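The log names only one offending character per failure, so it can take several builds to find them all. As a rough aid, the sketch below lists every non-ASCII character in a piece of LaTeX source. The function names are my own, and the default path in report_file is an assumption about where the build writes django.tex. Note that inputenc already handles many non-ASCII characters, so the output is a list of candidates, not a list of guaranteed failures.

```python
from collections import Counter
import unicodedata


def non_ascii_characters(tex_source):
    """Count every non-ASCII character in a string of LaTeX source."""
    return Counter(ch for ch in tex_source if ord(ch) > 127)


def report(tex_source):
    """Return one 'U+XXXX NAME (Nx)' line per distinct non-ASCII character."""
    lines = []
    for ch, count in sorted(non_ascii_characters(tex_source).items()):
        name = unicodedata.name(ch, 'UNKNOWN')
        lines.append('U+{:04X} {} ({}x)'.format(ord(ch), name, count))
    return lines


def report_file(path='_build/latex/django.tex'):
    """Run report() on a file; the default path is an assumption, adjust as needed."""
    with open(path, encoding='utf-8') as handle:
        return report(handle.read())
```

Printing the lines returned by report_file() from the docs directory should show U+4F60 among the candidates, assuming the generated file lives at that path.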

My intuition tells me that the root cause is somewhere in my environment. However, I did not want to spend a lot of time tracking it down. One way to solve it is to modify line 10 of django.tex:

\usepackage[utf8]{inputenc}

to look like this:

\usepackage[utf8x]{inputenc}

If this is done, the PDF file is created correctly and the (apparently Chinese) characters are present. However, as django.tex is regenerated every time make latexpdf runs, the change does not persist, so this isn't a great approach. Therefore, I chose to modify docs/conf.py instead. It's not a great approach either, but at least it survives repeated invocations of make.

Prior to my modification, the relevant section of conf.py looked like this:

# -- Options for LaTeX output --------------------------------------------------

latex_elements = {
    'preamble': (
        '\\DeclareUnicodeCharacter{2264}{\\ensuremath{\\le}}'
        '\\DeclareUnicodeCharacter{2265}{\\ensuremath{\\ge}}'
        '\\DeclareUnicodeCharacter{2665}{[unicode-heart]}'
        '\\DeclareUnicodeCharacter{2713}{[unicode-checkmark]}'
    ),
}

And after modification:

latex_elements = {
    'preamble': (
        '\\DeclareUnicodeCharacter{2264}{\\ensuremath{\\le}}'
        '\\DeclareUnicodeCharacter{2265}{\\ensuremath{\\ge}}'
        '\\DeclareUnicodeCharacter{2665}{[unicode-heart]}'
        '\\DeclareUnicodeCharacter{2713}{[unicode-checkmark]}'

        # Added by KHE
        '\\DeclareUnicodeCharacter{4F60}{\\textquestiondown}'
        '\\DeclareUnicodeCharacter{597D}{\\textquestiondown}'
    ),
}

This approach prints an upside-down question mark instead of the problematic unicode characters. It is not optimal, but it only affects page 1409 of the documentation; I can live with that.
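If more characters turn up later, writing each declaration by hand gets tedious. The following is a minimal sketch of a helper that builds the same kind of lines from a list of code points. The function name and its replacement parameter are hypothetical and not part of Django's conf.py or Sphinx.

```python
def declare_unicode_fallbacks(code_points, replacement='\\textquestiondown'):
    """Build \\DeclareUnicodeCharacter lines mapping each code point to one replacement."""
    return ''.join(
        # Doubled braces are literal braces in str.format; {:04X} prints the hex code point.
        '\\DeclareUnicodeCharacter{{{:04X}}}{{{}}}'.format(cp, replacement)
        for cp in code_points
    )


# The two code points handled in this note.
FALLBACKS = declare_unicode_fallbacks([0x4F60, 0x597D])
```

In conf.py, FALLBACKS could then be concatenated onto the existing 'preamble' string instead of adding the two lines manually.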

Black & White PDF

To create a B&W PDF when make latexpdf is run, I made the following two changes to conf.py.

First, I changed the pygments style from trac to bw. This takes care of code snippets.

# The name of the Pygments (syntax highlighting) style to use.
# pygments_style = 'trac'
pygments_style = 'bw'

Then, I added code to latex_elements that redefines the non-white colors set in sphinx.sty as black.

latex_elements = {
    'preamble': (
        '\\DeclareUnicodeCharacter{2264}{\\ensuremath{\\le}}'
        '\\DeclareUnicodeCharacter{2265}{\\ensuremath{\\ge}}'
        '\\DeclareUnicodeCharacter{2665}{[unicode-heart]}'
        '\\DeclareUnicodeCharacter{2713}{[unicode-checkmark]}'

        # Added by KHE
        '\\DeclareUnicodeCharacter{4F60}{\\textquestiondown}'
        '\\DeclareUnicodeCharacter{597D}{\\textquestiondown}'
        '\\sphinxDeclareColorOption{TitleColor}{{rgb}{0,0,0}}'
        '\\sphinxDeclareColorOption{InnerLinkColor}{{rgb}{0,0,0}}'
        '\\sphinxDeclareColorOption{OuterLinkColor}{{rgb}{0,0,0}}'
        '\\sphinxDeclareSphinxColorOption{VerbatimHighlightColor}{{rgb}{0,0,0}}'
    ),
}
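Backslashes in Python string literals are easy to get wrong, so the color lines can also be generated from a table. This is only a sketch: the macro and option names are copied from the snippet above, while BLACK, COLOR_OPTIONS and black_color_preamble are names I made up for illustration.

```python
# xcolor-style color specification for black, as used in the snippet above.
BLACK = '{rgb}{0,0,0}'

# (macro, option name) pairs taken from the conf.py snippet above.
COLOR_OPTIONS = [
    ('sphinxDeclareColorOption', 'TitleColor'),
    ('sphinxDeclareColorOption', 'InnerLinkColor'),
    ('sphinxDeclareColorOption', 'OuterLinkColor'),
    ('sphinxDeclareSphinxColorOption', 'VerbatimHighlightColor'),
]


def black_color_preamble(options=COLOR_OPTIONS):
    """Return LaTeX lines that redefine each listed color option as black."""
    return ''.join(
        # '\\' is one literal backslash; doubled braces are literal braces.
        '\\{}{{{}}}{{{}}}'.format(macro, name, BLACK)
        for macro, name in options
    )
```

The returned string can be appended to the 'preamble' entry of latex_elements in place of the four hand-written lines.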

Result

Running make latexpdf creates a B&W PDF without any unicode errors.