
Section 10 Font Tests

We place various blocks of Unicode characters here to determine the minimum configuration necessary to make them render. Alan Wood’s Unicode Resources site has been helpful in formulating these tests.

Basic Latin, U+0000–U+007F

These 95 characters are the most basic, and should all render using xelatex with no special setup. U+0000 to U+001F are control codes and are not used here. U+007F is also a control code and so is excluded. We have also replaced reserved characters by their equivalent MathBook XML empty elements.

0 1 2 3 4 5 6 7 8 9 A B C D E F
002_ ! " # $ % & ' ( ) * + , - . /
003_ 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
004_ @ A B C D E F G H I J K L M N O
005_ P Q R S T U V W X Y Z [ \ ] ^ _
006_ ` a b c d e f g h i j k l m n o
007_ p q r s t u v w x y z { | } ~
Table 10.1 Basic Latin, Regular
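
A chart like the one above can be generated rather than typed by hand. The following is only a sketch in Python, not the method used to build this document: the function names `code_chart_rows` and `printable_rows` are hypothetical, and the choice to leave control codes (Unicode general category Cc) as empty cells is an assumption modeled on the exclusions described above.

```python
import unicodedata


def code_chart_rows(first, last):
    """Build 16-column chart rows covering code points first..last (inclusive)."""
    rows = []
    for base in range(first & ~0xF, last + 1, 16):
        cells = []
        for cp in range(base, base + 16):
            ch = chr(cp)
            # Leave the cell empty for control codes and out-of-range positions.
            if cp < first or cp > last or unicodedata.category(ch) == "Cc":
                cells.append("")
            else:
                cells.append(ch)
        # Row label in the style used above, e.g. "002_" for U+0020..U+002F.
        label = f"{base >> 4:03X}_"
        rows.append((label, cells))
    return rows


def printable_rows(first, last):
    """Drop rows made up entirely of empty cells (e.g. U+0000..U+001F)."""
    return [(label, cells)
            for label, cells in code_chart_rows(first, last)
            if any(cells)]


if __name__ == "__main__":
    for label, cells in printable_rows(0x0000, 0x007F):
        print(label, " ".join(cells))
```

The same functions can be pointed at any other block in this section by changing the two code points passed in.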
Monospace, Basic Latin, U+0000–U+007F

These are exactly the same characters as above, but now we wrap them in the <c> element intended for inline use. This does not test all verbatim situations but is a good simple first test.

0 1 2 3 4 5 6 7 8 9 A B C D E F
002_ ! " # $ % & ' ( ) * + , - . /
003_ 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
004_ @ A B C D E F G H I J K L M N O
005_ P Q R S T U V W X Y Z [ \ ] ^ _
006_ ` a b c d e f g h i j k l m n o
007_ p q r s t u v w x y z { | } ~
Table 10.2 Basic Latin, Monospace

Note that the single and double quotes are upright and dumb, not curly and smart: " ' " ' " '. The zero is distinguished from the capital “oh”: 0 O 0 O 0 O. And the numeral one is slightly different from the lower-case “ell”: 1 l 1 l 1 l. The hyphen should be short and not expanded into some other kind of dash: - - -. These characters should all cut/paste out of a PDF into a text editor with no conversion to other characters.
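
The cut/paste check described above can be partly automated once the text has been pasted out of the PDF. This is a minimal sketch in Python under stated assumptions: the table `SUBSTITUTES` and the function `find_substitutions` are hypothetical names, and the list of "smartened" characters is only a plausible starting set, not an exhaustive one.

```python
# Characters that typographic "smartening" commonly substitutes for ASCII ones.
# This set is an assumption for illustration; extend it as needed.
SUBSTITUTES = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201C": '"', "\u201D": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en dash and em dash
}


def find_substitutions(pasted):
    """Return (index, found, expected) for each smartened character in the text."""
    return [(i, ch, SUBSTITUTES[ch])
            for i, ch in enumerate(pasted)
            if ch in SUBSTITUTES]
```

An empty result for the pasted monospace table suggests that the quotes and hyphen survived unchanged. It says nothing about visually similar glyphs such as zero and capital "oh", which still need to be checked by eye.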

Note also that we have entered all these characters into the source with the &#x00NN; XML notation, but needed to “protect” the question mark, since we use it in the verbatim routine to mark the beginning and end.
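
As an illustration of the &#x00NN; notation, the sketch below builds a run of hexadecimal character references in Python and confirms that a standard XML parser turns them back into the literal characters. The helper names `char_ref` and `c_element` are hypothetical, and the sketch does not reproduce the special handling of reserved characters or of the question mark mentioned above.

```python
import xml.etree.ElementTree as ET


def char_ref(cp):
    """Hexadecimal XML character reference for a code point, e.g. &#x0021;."""
    return f"&#x{cp:04X};"


def c_element(first, last):
    """Wrap references for code points first..last in a <c> element (as a string)."""
    body = "".join(char_ref(cp) for cp in range(first, last + 1))
    return f"<c>{body}</c>"


if __name__ == "__main__":
    source = c_element(0x21, 0x2F)
    print(source)
    # Parsing resolves each reference to the character it names.
    print(ET.fromstring(source).text)
```

Because every character is written as a reference, the source stays pure ASCII regardless of the editor or file encoding in use.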

Latin-1 Supplement, U+0080–U+00FF

These 94 characters should all render using either pdflatex or xelatex with no special setup. U+0080 to U+009F are control codes and not used here. U+00A0 (non-breaking space) and U+00AD (soft hyphen) are also excluded.

0 1 2 3 4 5 6 7 8 9 A B C D E F
00A_   ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ ® ¯
00B_ ° ± ² ³ ´ µ · ¸ ¹ º » ¼ ½ ¾ ¿
00C_ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï
00D_ Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß
00E_ à á â ã ä å æ ç è é ê ë ì í î ï
00F_ ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ
Table 10.3 Latin-1 Supplement, Regular
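
The counts of 95 and 94 characters quoted for the first two blocks can be cross-checked against the Unicode general categories. The sketch below uses Python's standard `unicodedata` module; the function name `renderable` is hypothetical, and mapping the exclusions to the categories Cc (control), Zs (space separator) and Cf (format) is an assumption that happens to match the exclusions listed in the text.

```python
import unicodedata


def renderable(first, last, excluded=("Cc",)):
    """Characters in first..last whose general category is not excluded."""
    return [chr(cp)
            for cp in range(first, last + 1)
            if unicodedata.category(chr(cp)) not in excluded]


# Basic Latin: only the control codes are dropped (the space is kept).
basic_latin = renderable(0x0000, 0x007F)

# Latin-1 Supplement: also drop U+00A0 (category Zs) and U+00AD (category Cf).
latin1_supplement = renderable(0x0080, 0x00FF, excluded=("Cc", "Zs", "Cf"))

if __name__ == "__main__":
    print(len(basic_latin), len(latin1_supplement))
```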
Monospace, Latin-1 Supplement, U+0080–U+00FF

The same 94 characters as above, wrapped in a <c> element as if being used inside a sentence. These will all render with xelatex and none will render with pdflatex (so there is just blank space below). If we improve the latter, then these will get duplicated into the sample article.

0 1 2 3 4 5 6 7 8 9 A B C D E F
00A_ ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ ® ¯
00B_ ° ± ² ³ ´ µ · ¸ ¹ º » ¼ ½ ¾ ¿
00C_ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï
00D_ Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß
00E_ à á â ã ä å æ ç è é ê ë ì í î ï
00F_ ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ
Table 10.4 Latin-1 Supplement, Monospace
Latin Extended-A, U+0100–U+017F

These render well with xelatex and no extra setup. About 25% of them are missing when rendered with pdflatex.

0 1 2 3 4 5 6 7 8 9 A B C D E F
010_ Ā ā Ă ă Ą ą Ć ć Ĉ ĉ Ċ ċ Č č Ď ď
011_ Đ đ Ē ē Ĕ ĕ Ė ė Ę ę Ě ě Ĝ ĝ Ğ ğ
012_ Ġ ġ Ģ ģ Ĥ ĥ Ħ ħ Ĩ ĩ Ī ī Ĭ ĭ Į į
013_ İ ı Ĳ ĳ Ĵ ĵ Ķ ķ ĸ Ĺ ĺ Ļ ļ Ľ ľ Ŀ
014_ ŀ Ł ł Ń ń Ņ ņ Ň ň ŉ Ŋ ŋ Ō ō Ŏ ŏ
015_ Ő ő Œ œ Ŕ ŕ Ŗ ŗ Ř ř Ś ś Ŝ ŝ Ş ş
016_ Š š Ţ ţ Ť ť Ŧ ŧ Ũ ũ Ū ū Ŭ ŭ Ů ů
017_ Ű ű Ų ų Ŵ ŵ Ŷ ŷ Ÿ Ź ź Ż ż Ž ž ſ
Table 10.5 Latin Extended-A

Rendered with xelatex and no special setup, I seem to be missing only four characters:

  • U+0138 (LATIN SMALL LETTER KRA, Greenlandic, removed 1973)
  • U+0149 (LATIN SMALL LETTER N PRECEDED BY APOSTROPHE, Afrikaans, deprecated as of Unicode version 5.2.0)
  • U+0166 (LATIN CAPITAL LETTER T WITH STROKE, Northern Sámi alphabet, used in northern parts of Norway, Sweden and Finland)
  • U+0167 (LATIN SMALL LETTER T WITH STROKE, Northern Sámi alphabet, used in northern parts of Norway, Sweden and Finland)
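
The official names in a list like this one can be looked up rather than copied by hand. The following Python sketch uses the standard `unicodedata` module; the list `MISSING` and the function `describe` are hypothetical names introduced only for illustration, and the sketch reports names only, not whether a given font actually contains the glyphs.

```python
import unicodedata

# Code points reported above as missing when rendering with xelatex.
MISSING = [0x0138, 0x0149, 0x0166, 0x0167]


def describe(cp):
    """Format a code point with its official Unicode character name."""
    return f"U+{cp:04X} ({unicodedata.name(chr(cp))})"


if __name__ == "__main__":
    for cp in MISSING:
        print(describe(cp))
```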
Latin Extended-B, U+0180–U+024F

Rendered with xelatex and no extra setup, perhaps 50% of these are missing, and some accent constructions are clearly wrong. Almost none of these appear when rendered with pdflatex. (When processed with lualatex the incorrectly accented characters are not even visible.)

0 1 2 3 4 5 6 7 8 9 A B C D E F
018_ ƀ Ɓ Ƃ ƃ Ƅ ƅ Ɔ Ƈ ƈ Ɖ Ɗ Ƌ ƌ ƍ Ǝ Ə
019_ Ɛ Ƒ ƒ Ɠ Ɣ ƕ Ɩ Ɨ Ƙ ƙ ƚ ƛ Ɯ Ɲ ƞ Ɵ
01A_ Ơ ơ Ƣ ƣ Ƥ ƥ Ʀ Ƨ ƨ Ʃ ƪ ƫ Ƭ ƭ Ʈ Ư
01B_ ư Ʊ Ʋ Ƴ ƴ Ƶ ƶ Ʒ Ƹ ƹ ƺ ƻ Ƽ ƽ ƾ ƿ
01C_ ǀ ǁ ǂ ǃ Ǆ ǅ ǆ Ǉ ǈ ǉ Ǌ ǋ ǌ Ǎ ǎ Ǐ
01D_ ǐ Ǒ ǒ Ǔ ǔ Ǖ ǖ Ǘ ǘ Ǚ ǚ Ǜ ǜ ǝ Ǟ ǟ
01E_ Ǡ ǡ Ǣ ǣ Ǥ ǥ Ǧ ǧ Ǩ ǩ Ǫ ǫ Ǭ ǭ Ǯ ǯ
01F_ ǰ Ǳ ǲ ǳ Ǵ ǵ Ƕ Ƿ Ǹ ǹ Ǻ ǻ Ǽ ǽ Ǿ ǿ
020_ Ȁ ȁ Ȃ ȃ Ȅ ȅ Ȇ ȇ Ȉ ȉ Ȋ ȋ Ȍ ȍ Ȏ ȏ
021_ Ȑ ȑ Ȓ ȓ Ȕ ȕ Ȗ ȗ Ș ș Ț ț Ȝ ȝ Ȟ ȟ
022_ Ƞ ȡ Ȣ ȣ Ȥ ȥ Ȧ ȧ Ȩ ȩ Ȫ ȫ Ȭ ȭ Ȯ ȯ
023_ Ȱ ȱ Ȳ ȳ ȴ ȵ ȶ ȷ ȸ ȹ Ⱥ Ȼ ȼ Ƚ Ⱦ ȿ
Table 10.6 Latin Extended-B

This table is left over and will perhaps be redone.

Name Range Samples LaTeX
Latin Extended-A U+0100–U+017F Ą ą Ĳ ĳ TeXLive
Latin Extended Additional U+1E00–U+1EFF Ḁ Ẁ Ặ ỳ \(\approx\) TeXLive
Cyrillic U+0400–U+04FF Љ Щ щ Ӄ polyglossia
Arabic U+0600–U+06FF ؟ ب حٍ ۳ polyglossia
General Punctuation U+2000–U+206F — “ ‰ ※ TeXLive
Letterlike Symbols U+2100–U+214F ℀ ℃ № ™ \(\approx\) TeXLive
Table 10.7 Sample Unicode Characters