LaΤeΧ symbols, fonts and character sets

LaTeX has been around for a few decades now, and a lot of weird stuff has happened in that time, e.g. they invented the € symbol. Here is how one keeps up with the typographical times.

First, be aware that there are many systems for encoding characters and information about them that can be used by TeX. I am no expert, but there are encodings with names like T1, OT1, T2C and what-have-you. There is also the modern solution of using Unicode (which in practice means UTF-8 encoding), supported by modern desktop fonts. Encodings have to map to fonts, which means that fonts must support the desired encodings. The systems that make this work are routine in modern times for European roman-script text but not so much for non-English or mathematical text. In particular, when combining LaTeX’s ancient recondite specialty with popular modern cosmopolitan text handling, friction ensues.

Unicode

And unicode-happy fonts! This is more-or-less smooth for the text (as opposed to mathematical) part of documents these days.

Be warned, mathematics can use a whole other system and has some addenda. Aside from the usual fragile academic code quality nonsense, non-ASCII character support is not so painful in modern LaTeX as such. However, the 💩 hits the fän if I try to use non-ascii ©haracters in BibTeX. I use BibLaTeX/biber instead, which works fine without any further effort. Sometimes a journal will advise against it, but they don’t seem to notice if I ignore them, and it saves me much time. I have not yet received the complaint “Oh no! there weren’t enough glitches in your bibliography! I was offended that umlauts rendered correctly!” although that would not be the weirdest reviewer comment I have ever received.

pdfLaTeX

pdfLaTeX can import modern input encoding but not modern fonts, AFAICS. I put this at the start of every file I touch.

% !TEX encoding = UTF-8 Unicode

then after the documentclass

\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
% fontspec, xltxtra and xunicode need XeTeX or LuaTeX, so they are not loaded here
\newcommand{\euro}{€}

XeTeX

\usepackage{mathspec}
\usepackage{xltxtra,xunicode}
% Fix incommensurability of font sizes which per default is awful
% this line must come after mathspec or fontspec
\defaultfontfeatures{Mapping=tex-text,Scale=MatchLowercase}
\newcommand{\euro}{€}

Everything just works AFAICT. It’s incompatible with a couple of primordial packages which do various fake Unicode hacks (like the ucs package) but I don’t think those are actually needed anyway.
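
If the same source has to compile under both engines, one option is to branch on the engine in the preamble. The following is a minimal sketch, assuming the iftex package is installed (it ships with current TeX Live); the sample sentence is only an illustration.

\documentclass{article}
\usepackage{iftex}
\ifPDFTeX
  % pdfLaTeX: declare the input encoding and use 8-bit T1 fonts
  \usepackage[utf8]{inputenc}
  \usepackage[T1]{fontenc}
\else
  % XeLaTeX or LuaLaTeX: UTF-8 input is native and fonts come from fontspec
  \usepackage{fontspec}
\fi
\begin{document}
Füße, naïve, Ærøskøbing.
\end{document}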

You can change fonts using fontspec/mathspec:

\setmainfont{Times New Roman}

As a bonus complication, amsmath must be loaded before anything which uses mathspec, because mathspec is flaky.

You can do cleverer tricks such as turning italic green.
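
For instance, here is a sketch of the green-italic trick, assuming XeLaTeX or LuaLaTeX and that the ItalicFeatures and Color options behave as described in the fontspec manual. The colour value is an arbitrary choice, and the font must be installed on your system.

\documentclass{article}
\usepackage{fontspec}
% Color takes a hexadecimal RRGGBB value; it is applied to the italic shape only
\setmainfont{Times New Roman}[ItalicFeatures={Color=007700}]
\begin{document}
Upright text stays black, \emph{but italic text comes out green}.
\end{document}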

But which fonts do you have? Find out manually.

fc-list : family

or, even more precise

fc-list :outline -f "%{family}\n"

PRO TIP: load only one of mathspec and fontspec. mathspec already loads fontspec, and loading both causes errors.

A tonne may be installed, but start with something useful.

tlmgr install \
    baskervillef \
    mathdesign

unicode-math

If I decided to go hardcore unicode I could use unicode even for mathematics, via unicode-math.

With this package, changing maths fonts is as easy as changing text fonts — and there are more and more maths fonts appearing now. Maths input can also be simplified with Unicode since literal glyphs may be entered instead of control sequences in your document source.

This only works with XeTeX or LuaTeX.

Also, if I were to copy-paste equations from a PDF generated by such means to LaTeX, they would be somewhat less mangled. The price is that it has some quirks, e.g. missing some curly letters. On the other hand, it has symbols that are in Unicode but not in elderly TeX math fonts, such as ⫫. Also, I usually do not have a choice of maths fonts because versions are stipulated in the style guide for the journal/conference/thesis I am writing. Therefore, while this is nifty, the benefits are insufficiently many compared to the burdens. The logic of collective action dictates I ignore it for now.

Anyway, it goes like this

\usepackage{unicode-math}
\setmathfont{texgyrepagella-math.otf}
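
As a fuller sketch of what that buys you, the document below types the same formula twice, once with literal Unicode glyphs and once with the classic control sequences. It assumes XeLaTeX or LuaLaTeX and that the TeX Gyre Pagella Math font is installed; the formula itself is just an illustration.

% Compile with xelatex or lualatex
\documentclass{article}
\usepackage{unicode-math}
\setmathfont{texgyrepagella-math.otf}
\begin{document}
% Literal glyphs on the left, control sequences on the right
\[ ∀ x ∈ ℝ,\quad x^2 ≥ 0 \qquad \forall x \in \mathbb{R},\quad x^2 \geq 0 \]
\end{document}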

Modern fonts

XITS is a scientific Times-like font that seems to soothe, for example, academics, who have a great fetish for cargo cult graphic design. It is not obvious how to set these fonts, and weird things happen with maths sometimes, but it mostly seems to work in the end.
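
One way to set it up might look like the following sketch. It assumes XeLaTeX or LuaLaTeX, and that the fonts are installed where the engine can find them under the family names "XITS" and "XITS Math"; if they are not found by name, you would need to refer to the font files instead.

% Compile with xelatex or lualatex
\documentclass{article}
\usepackage{unicode-math} % also loads fontspec
\setmainfont{XITS}        % text font (assumed family name)
\setmathfont{XITS Math}   % maths font (assumed family name)
\begin{document}
Text in XITS, and mathematics in XITS Math: $e^{i\pi} + 1 = 0$.
\end{document}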

Special symbols

Particular characters/dingbats/emoji/etc?

90% of questions on this theme can be answered by the LaTeX Math Symbols Cheat Sheet.

Emoji

There are two dominant ways to insert emoji into LaTeX.

The dirty-yet-shiny hack is to include color emoji as images. (This needs a Mac computer lying around to raid for the images, as a one-off.)

\documentclass{article}
\usepackage{coloremoji}
\begin{document}
Hello, 🌎.
\end{document}

Elegant but less colourful, XeTeX has native monochrome emoji via DejaVu fonts.

\documentclass{article}
\usepackage{fontspec}
\usepackage{xcolor} % provides \color, used in \todo below

% these lines must come after fontspec
\newfontfamily\DejaSans{DejaVu Sans}
\newcommand\todo{{\color{red}\DejaSans 🚧}}

\begin{document}
  \todo mention {\DejaSans 😁😂😃😇😉😈😋😍😱}
\end{document}

Google’s Noto font famously supports emoji. I wonder if that would be a good alternative? One could use the system-installed Noto font via XeTeX in the usual manner. There is also a noto package for TeX Live, but I think that is not Unicode? 🤷‍♂

Stochastic independence symbol

A case study in doing typography right. The probabilistic independence symbol ⫫, unicode U+2AEB, (“double up tack”) does not ship in normal LaTeX maths systems for some reason. So how do you fake it? In one of many slightly unsatisfactory ways!

Jason Blevins suggests the following hack:

\newcommand{\indep}{\perp \! \! \! \perp}

It has some shortcomings, such as not setting the symbol up as a proper relation, which probably means something bad in the complicated world of LaTeX spacing. Perhaps this would be better:

\newcommand{\indep}{\mathrel{\perp \! \! \! \perp}}

Alternatively, the following does more specific space management.

\newcommand\indep{\protect\mathpalette{\protect\independenT}{\perp}}
\def\independenT#1#2{\mathrel{\rlap{$#1#2$}\mkern2mu{#1#2}}}

Fibo Kowalsky adds the alternative:

% \rotatebox needs \usepackage{graphicx} in the preamble
\newcommand{\indep}{\raisebox{0.05em}{\rotatebox[origin=c]{90}{$\models$}}}

Ashwin Khadke notes that for classic LaTeX one can import the symbol in one of the massive math symbol fonts, e.g. mdsymbol:

\usepackage{mdsymbol}
\newcommand{\indep}{\upvDash}

All of these would probably benefit from declaring the created symbol to be a relation via \mathrel.

Note that mdsymbol is incompatible with amssymb and amsfonts although notionally it renders them unneeded. Also it is a sans serif math font, so may not fit with your aesthetic. And it redefines various useful characters and is generally a mess. It would probably be better to import a single character via pifont or yagusylo although that is onerous in its own right.

unicode-math should work via:

\usepackage{unicode-math}
\newcommand{\indep}{⫫}

Any of the above should result in the independence symbol being available for use as

$X \indep Y $
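
To see how a definition behaves at different maths sizes, a small test document helps. This sketch reuses the \mathpalette definition quoted earlier; the three test formulas are only illustrations.

\documentclass{article}
% \mathpalette passes the current math style to \independenT,
% so the symbol shrinks correctly in subscripts
\newcommand\indep{\protect\mathpalette{\protect\independenT}{\perp}}
\def\independenT#1#2{\mathrel{\rlap{$#1#2$}\mkern2mu{#1#2}}}
\begin{document}
Inline: $X \indep Y \mid Z$.
Display:
\[ X \indep Y \mid Z \]
Subscript: $\sum_{X \indep Y} p(X, Y)$.
\end{document}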

AFAIK only the Jason Blevins double \perp trick works for js mathematics, although in that case I believe you can just type ⫫ since js mathematics is happy with unicode.

Conditional |

The stochastic conditional symbol is also fiddly to type. Jason Blevins observes that spacing is differently preserved in normal and big sizes.

Normal size:

\Pr( A \mid B )

will get us

\[\Pr(A \mid B)\]

Big size needs a \; spacer set around a \middle\vert, e.g.:

\Pr\left( A \;\middle\vert\; \sum_{i=1}^N B_i \right)

for

\[\Pr\left( A \;\middle\vert\; \sum_{i=1}^N B_i \right)\]
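
If you type this often, the big-size pattern can be wrapped in a macro. This is a sketch only: \CondPr is a hypothetical helper name, not a standard command, and it always uses the auto-sized delimiters, so it is best kept for displayed formulas.

\documentclass{article}
% \CondPr{event}{condition}: hypothetical helper wrapping the \middle\vert pattern
\newcommand{\CondPr}[2]{\Pr\left( #1 \;\middle\vert\; #2 \right)}
\begin{document}
\[ \CondPr{A}{\sum_{i=1}^N B_i} \]
\end{document}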

Every moment spent thinking about this nonsense is a moment spent not saving the world.

Changing case

tl;dr

These are different in traditional LaTeX but identical in Unicode XeLaTeX:

\uppercase{ab\"{u}ë} % AB\"{u}Ë
\MakeUppercase{ab\"{u}ë} % AB\"{U}Ë