Parse a Font
Parsing TrueType (TTF) Font Files — A Reference
A practical, implementation-agnostic guide to reading the TTF binary format. It is organized around the act of parsing: the spine of the document is section 5 (How to parse), which walks the file in order and points each step to that table’s byte layout in section 6 (Table reference). Read the parse flow top to bottom; jump to a table’s layout only when you need the exact fields.
1. Introduction
The goal is to take a .ttf file as a flat array of bytes and turn it into
structured data — the font header, the glyph outlines, the metrics, the
character map. This document covers how to do that: the file’s shape, the
order to read things in, how to move the read cursor through the bytes, and the
byte-level layout of every core table. Field layouts are given as plain
Type | Name | Notes tables so they map directly to a struct in any language.
It is a format reference to build code from, not a finished implementation.
2. Mental model
A TTF file is not a document you read front to back. It is a small archive (the format is called “sfnt” — scalable font). Its bytes are organized as:
- A 12-byte offset table (the sfnt header): what kind of font this is and how many tables it holds.
- A table directory: one fixed-size record per table, each saying “the table with this tag lives at this byte offset and is this long.”
- The table data, located only by the offsets in the directory.
The crucial consequence: you never assume where anything is. You read a table’s offset and length from the directory, then jump there.
It helps to know what the tables are for before parsing them. This is how a character becomes a drawn glyph:
character code ──(cmap)──► glyph index ──(loca)──► glyf (the outline)
└─(hmtx)──► advance width (the spacing)
head and maxp are the configuration that makes loca interpretable. This
chain is also why the parse order in section 5 is what it is.
3. Conventions
Endianness
Everything is big-endian (most significant byte first). The two bytes
0x00 0x09 mean 9, not 2304. There are no little-endian fields anywhere.
Data types
The spec uses named types. These are the ones you need:
| Type | Bytes | Meaning |
|---|---|---|
uint8 | 1 | unsigned byte |
int8 | 1 | signed byte |
uint16 | 2 | unsigned short |
int16 | 2 | signed short |
uint24 | 3 | unsigned 24-bit |
uint32 | 4 | unsigned long |
int32 | 4 | signed long |
Fixed | 4 | 16.16 signed fixed-point |
FWORD | 2 | int16 in font design units |
UFWORD | 2 | uint16 in font design units |
F2Dot14 | 2 | 2.14 signed fixed-point (scales/variations) |
LONGDATETIME | 8 | signed int64, seconds since 1904-01-01 00:00 UTC |
Tag | 4 | four uint8 ASCII characters, e.g. glyf |
Offset16 | 2 | uint16 offset |
Offset32 | 4 | uint32 offset |
Version16Dot16 | 4 | packed major/minor version |
Signedness matters. Most binary helpers decode only unsigned integers. For
signed fields (int16, int64, …), decode the unsigned value of the same width
and reinterpret the bits as two’s complement. Get this wrong and a small
negative (e.g. a descender of -200) becomes a huge positive — and bounding
boxes routinely go negative, so it bites.
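The reinterpretation described above can be sketched in a few lines of Python. The helper names (`to_signed`, `read_u16`, `read_i16`, `f2dot14`, `fixed_16_16`) are illustrative assumptions, not names from the spec or from any library.

```python
def to_signed(value: int, bits: int) -> int:
    """Reinterpret an unsigned integer of the given width as two's complement."""
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def read_u16(data: bytes, pos: int) -> int:
    """Big-endian uint16 at byte offset pos (most significant byte first)."""
    return (data[pos] << 8) | data[pos + 1]


def read_i16(data: bytes, pos: int) -> int:
    """Big-endian int16: decode the unsigned value, then reinterpret the bits."""
    return to_signed(read_u16(data, pos), 16)


def f2dot14(raw: int) -> float:
    """F2Dot14: signed 16-bit value with 14 fractional bits."""
    return to_signed(raw, 16) / 16384.0


def fixed_16_16(raw: int) -> float:
    """Fixed: signed 32-bit value with 16 fractional bits."""
    return to_signed(raw, 32) / 65536.0
```

With these helpers the bytes `0xFF 0x38` decode to -200 through `read_i16`, where an unsigned read would give 65336.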
4. File structure
The top of every TTF is a fixed header followed by the directory. These two are the only parts you read linearly; everything else is reached by offset.
4.1 Offset table (sfnt header) — 12 bytes, at byte 0
| Type | Name | Notes |
|---|---|---|
uint32 | sfntVersion | 0x00010000 = TrueType outlines; 0x4F54544F (OTTO) = CFF/PostScript; 0x74727565 (true) on Apple. Your “is this a TTF?” check. |
uint16 | numTables | number of table records that follow |
uint16 | searchRange | (largest power of 2 ≤ numTables) × 16 |
uint16 | entrySelector | log2(largest power of 2 ≤ numTables) |
uint16 | rangeShift | numTables × 16 − searchRange |
searchRange, entrySelector, and rangeShift are a legacy binary-search
optimization. You can ignore their meaning, but for a byte-identical
round-trip you must read and re-emit them verbatim (or recompute with the
formulas above).
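As a rough sketch, the header read and the three formulas might look like this in Python. The function names and the returned dictionary shape are assumptions made for illustration; the `record_size` parameter defaults to the 16-byte table record.

```python
import struct

KNOWN_SFNT_VERSIONS = {0x00010000, 0x4F54544F, 0x74727565}


def search_fields(count: int, record_size: int = 16) -> tuple[int, int, int]:
    """Return (searchRange, entrySelector, rangeShift) for `count` records."""
    if count <= 0:
        return (0, 0, 0)  # assumption: an empty directory gets all zeros
    entry_selector = count.bit_length() - 1          # floor(log2(count))
    search_range = (1 << entry_selector) * record_size
    range_shift = count * record_size - search_range
    return (search_range, entry_selector, range_shift)


def parse_offset_table(data: bytes) -> dict:
    """Decode the 12-byte sfnt header at byte 0."""
    if len(data) < 12:
        raise ValueError("file is shorter than the 12-byte offset table")
    version, num_tables, s_range, e_sel, r_shift = struct.unpack_from(">IHHHH", data, 0)
    if version not in KNOWN_SFNT_VERSIONS:
        raise ValueError(f"unknown sfntVersion 0x{version:08X}")
    return {
        "sfntVersion": version,
        "numTables": num_tables,
        "searchRange": s_range,
        "entrySelector": e_sel,
        "rangeShift": r_shift,
    }
```

For example, a font with 9 tables yields `search_fields(9) == (128, 3, 16)`.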
4.2 Table directory — numTables records of 16 bytes each, at byte 12
| Type | Name | Notes |
|---|---|---|
Tag | tableTag | 4 ASCII bytes, e.g. head, glyf |
uint32 | checksum | table checksum |
Offset32 | offset | from the beginning of the file |
uint32 | length | the table’s actual length, excluding pad bytes |
Record i begins at byte 12 + i × 16.
- In well-formed fonts the records are sorted ascending by tag, but the table data they point at can appear in any physical order.
- Each table’s data is padded with zero bytes to a 4-byte boundary. length is the real (unpadded) length; the next table’s offset accounts for padding.
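The directory read described in this section could be sketched as follows. The function names and the `tag -> (offset, length)` dictionary are illustrative choices; the bounds check reflects the "never assume where anything is" rule rather than anything the format itself enforces.

```python
import struct


def parse_table_directory(data: bytes, num_tables: int) -> dict[str, tuple[int, int]]:
    """Read num_tables 16-byte records starting at byte 12."""
    tables = {}
    for i in range(num_tables):
        pos = 12 + i * 16
        if pos + 16 > len(data):
            raise ValueError("table directory runs past the end of the file")
        tag, _checksum, offset, length = struct.unpack_from(">4sIII", data, pos)
        if offset + length > len(data):
            raise ValueError(f"table {tag!r} extends past the end of the file")
        # Tags are 4 ASCII bytes and may contain spaces or '/', e.g. 'cvt ', 'OS/2'.
        tables[tag.decode("latin-1")] = (offset, length)
    return tables


def padded_length(length: int) -> int:
    """Round a table length up to the next 4-byte boundary."""
    return (length + 3) & ~3
```

A table's bytes are then `data[offset:offset + length]`; the pad bytes are never part of the table.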
5. How to parse
This is the core of the document. The diagram above shows the full parse flow and table dependencies. Below is the exact sequence to follow and the strategy for decoding each table.
5.1 The parse sequence
Where a table sits (physical) is independent of when you decode it (logical). Decode order is forced by data dependencies.
Follow this sequence. Each step points to that table’s byte layout in section 4 or section 6; the note is only what the step needs and yields, not the full definition.
- Offset table (layout in 4.1). Read the first 12 bytes; capture numTables.
- Table directory (layout in 4.2). Read numTables × 16 bytes into a tag → (offset, length) map. After this, switch from linear reading to seeking by offset.
- head (layout in 6.1). Yields unitsPerEm, indexToLocFormat, the font bounding box.
- maxp (layout in 6.2). Yields numGlyphs.
- hhea (layout in 6.3). Yields numberOfHMetrics.
- hmtx (layout in 6.4). Needs numberOfHMetrics (hhea) and numGlyphs (maxp).
- loca (layout in 6.5). Needs indexToLocFormat (head) and numGlyphs (maxp).
- glyf (layout in 6.6). Needs loca to bracket each glyph.
- cmap (layout in 6.7). Independent; the character → glyph map.
- name (6.8), post (6.9), OS/2 (6.10), kern (6.11). Independent / optional; parse as needed.
Validate as you go: sfntVersion is a known value;
head.magicNumber == 0x5F0F3CF5; every offset + length stays within the file;
no loca entry exceeds the glyf table length, and the last one normally equals it.
These checks turn corrupt input into clear errors instead of out-of-range reads.
5.2 Fixed- vs variable-layout tables — the decode strategy
How you decode a table depends on its shape:
- Fixed-layout tables have a fixed set of fixed-width fields in a fixed order, no embedded arrays or strings. Size is known in advance; you can decode one in a single positional pass — provided your struct’s field order matches the on-disk order exactly.
- Variable-layout tables contain a count, version, or offset that determines how much follows — arrays sized by another field, strings, sub-tables. Size is not known until you read; these need hand-written, conditional parsing.
| Table | Class | Why |
|---|---|---|
| Offset table | fixed | 12 bytes, fixed fields |
| Table record | fixed | 16 bytes, fixed fields |
head | fixed | 54 bytes, all fixed-width |
maxp | fixed* | fixed per version (0.5 vs 1.0) |
hhea | fixed | 36 bytes |
OS/2 | fixed* | fixed per version (0–5) |
hmtx | variable | array sized by hhea.numberOfHMetrics + trailing array |
loca | variable | numGlyphs + 1 entries; entry width set by head |
glyf | variable | per-glyph variable length, two glyph formats |
cmap | variable | sub-tables in several formats, offset-linked |
name | variable | record array + string storage block |
post | variable* | header fixed; v2.0 adds a glyph-name array |
kern | variable | optional; multiple sub-table formats |
* = fixed once you know the version; treat the version word as a branch.
6. Table reference
The byte layouts the parse sequence in 5.1 links into. Skim only what you need.
6.1 head — font header (54 bytes, fixed)
The global font header. It contains the design grid size (unitsPerEm), the
bounding box that encloses every glyph in the font, creation and modification
timestamps, and style flags. The most critical field for parsing is
indexToLocFormat: it controls whether the loca table uses 2-byte or 4-byte
offsets, so head must be decoded before loca can be read.
| Type | Name | Notes |
|---|---|---|
uint16 | majorVersion | 1 |
uint16 | minorVersion | 0 |
Fixed | fontRevision | designer’s version; store raw if not interpreting |
uint32 | checksumAdjustment | whole-file checksum; see spec |
uint32 | magicNumber | always 0x5F0F3CF5 (validation hook) |
uint16 | flags | bit field |
uint16 | unitsPerEm | design grid; 16–16384, power of 2 for TrueType |
LONGDATETIME | created | int64, seconds since 1904 |
LONGDATETIME | modified | int64 |
int16 | xMin | font bounding box (signed!) |
int16 | yMin | |
int16 | xMax | |
int16 | yMax | |
uint16 | macStyle | bit field |
uint16 | lowestRecPPEM | smallest readable size in pixels |
int16 | fontDirectionHint | |
int16 | indexToLocFormat | 0 = short loca, 1 = long loca |
int16 | glyphDataFormat | 0 |
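
Because head is a fixed-layout table, it can be decoded in one positional pass. The following is a minimal sketch using Python's `struct` module; the format string mirrors the field order above, and the names `HEAD_FORMAT`, `HEAD_FIELDS`, `parse_head`, and `longdatetime_to_utc` are assumptions made for this example.

```python
import struct
from datetime import datetime, timedelta, timezone

# One struct code per field, in on-disk order; ">" means big-endian, no padding.
HEAD_FORMAT = ">HHiIIHHqqhhhhHHhhh"
HEAD_FIELDS = (
    "majorVersion", "minorVersion", "fontRevision", "checksumAdjustment",
    "magicNumber", "flags", "unitsPerEm", "created", "modified",
    "xMin", "yMin", "xMax", "yMax", "macStyle", "lowestRecPPEM",
    "fontDirectionHint", "indexToLocFormat", "glyphDataFormat",
)
MAC_EPOCH = datetime(1904, 1, 1, tzinfo=timezone.utc)


def parse_head(table: bytes) -> dict:
    """Decode the 54-byte head table and run the two cheap sanity checks."""
    if len(table) < struct.calcsize(HEAD_FORMAT):
        raise ValueError("head table is shorter than 54 bytes")
    head = dict(zip(HEAD_FIELDS, struct.unpack_from(HEAD_FORMAT, table, 0)))
    if head["magicNumber"] != 0x5F0F3CF5:
        raise ValueError("bad head.magicNumber")
    if head["indexToLocFormat"] not in (0, 1):
        raise ValueError("indexToLocFormat must be 0 or 1")
    return head


def longdatetime_to_utc(seconds: int) -> datetime:
    """Convert a LONGDATETIME (seconds since 1904-01-01 UTC) to a datetime."""
    return MAC_EPOCH + timedelta(seconds=seconds)
```

fontRevision is left as the raw 32-bit value here, matching the "store raw if not interpreting" note in the table.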
6.2 maxp — maximum profile
Establishes the memory requirements for the font. It records the worst-case
counts of points, contours, and interpreter stack depth across all glyphs so
the rasterizer can pre-allocate exactly what it needs. The only field you
strictly need for parsing is numGlyphs — it sizes the loca and hmtx
arrays. Version 0.5 (CFF fonts) carries only that field; version 1.0
(TrueType) carries the full set.
Version 0.5 (0x00005000, CFF fonts — 6 bytes):
| Type | Name | Notes |
|---|---|---|
uint32 | version | 0x00005000 |
uint16 | numGlyphs | the only field you need |
Version 1.0 adds (0x00010000, TrueType — 32 bytes total):
| Type | Name | Notes |
|---|---|---|
uint16 | maxPoints | max points in a non-composite glyph |
uint16 | maxContours | max contours in a non-composite glyph |
uint16 | maxCompositePoints | max points in a composite glyph |
uint16 | maxCompositeContours | max contours in a composite glyph |
uint16 | maxZones | 1 = no twilight zone; 2 = twilight zone used |
uint16 | maxTwilightPoints | max points in the twilight zone |
uint16 | maxStorage | max storage area locations |
uint16 | maxFunctionDefs | max function definitions |
uint16 | maxInstructionDefs | max instruction definitions |
uint16 | maxStackElements | max stack depth |
uint16 | maxSizeOfInstructions | max byte count for glyph instructions |
uint16 | maxComponentElements | max number of components in a composite glyph |
uint16 | maxComponentDepth | max nesting depth of composites |
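
The "version word as a branch" idea can be sketched as below. `parse_maxp` and `MAXP_V1_FIELDS` are hypothetical names; the function takes the table's own bytes, already sliced out of the file.

```python
import struct

MAXP_V1_FIELDS = (
    "maxPoints", "maxContours", "maxCompositePoints", "maxCompositeContours",
    "maxZones", "maxTwilightPoints", "maxStorage", "maxFunctionDefs",
    "maxInstructionDefs", "maxStackElements", "maxSizeOfInstructions",
    "maxComponentElements", "maxComponentDepth",
)


def parse_maxp(table: bytes) -> dict:
    """Read version and numGlyphs, then the 13 extra fields if version is 1.0."""
    if len(table) < 6:
        raise ValueError("maxp table is shorter than 6 bytes")
    version, num_glyphs = struct.unpack_from(">IH", table, 0)
    result = {"version": version, "numGlyphs": num_glyphs}
    if version == 0x00010000:
        if len(table) < 32:
            raise ValueError("maxp version 1.0 needs 32 bytes")
        result.update(zip(MAXP_V1_FIELDS, struct.unpack_from(">13H", table, 6)))
    elif version != 0x00005000:
        raise ValueError(f"unknown maxp version 0x{version:08X}")
    return result
```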
6.3 hhea — horizontal header (36 bytes, fixed)
Contains the metrics needed to lay out glyphs on a horizontal baseline: the
ascender, descender, and line gap that renderers use for line spacing, and the
maximum advance width. All values are in font design units (FUnits). The key
field for parsing is numberOfHMetrics, which tells the hmtx parser how
many full (advance width + LSB) pairs the table contains.
| Type | Name | Notes |
|---|---|---|
uint16 | majorVersion | 1 |
uint16 | minorVersion | 0 |
FWORD | ascender | (int16) |
FWORD | descender | (int16, usually negative) |
FWORD | lineGap | (int16) |
UFWORD | advanceWidthMax | (uint16) |
FWORD | minLeftSideBearing | (int16) |
FWORD | minRightSideBearing | (int16) |
FWORD | xMaxExtent | (int16) |
int16 | caretSlopeRise | slope of the cursor (rise/run); 1 for vertical |
int16 | caretSlopeRun | 0 for vertical |
int16 | caretOffset | shift for slanted highlight; 0 for non-slanted |
[4]int16 | (reserved) | set to 0 |
int16 | metricDataFormat | 0 |
uint16 | numberOfHMetrics | sizes the hmtx table |
6.4 hmtx — horizontal metrics (variable)
Stores the advance width and left side bearing (LSB) for every glyph. The
advance width is how far the cursor moves after drawing the glyph; the LSB is
the distance from the glyph’s origin to the left edge of its bounding box.
The table has no header — it is two back-to-back arrays whose sizes come from
hhea.numberOfHMetrics and maxp.numGlyphs.
Two back-to-back arrays:
- hMetrics: numberOfHMetrics entries (from hhea), each a uint16 advanceWidth followed by an int16 lsb (left side bearing).
- leftSideBearings: numGlyphs − numberOfHMetrics entries of int16.
The trailing array lets monospaced runs share one advance width: glyphs past
numberOfHMetrics reuse the last advance width and store only their side
bearing.
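A minimal sketch of reading both arrays and expanding the shared advance width, assuming the table bytes have already been sliced out of the file. `parse_hmtx` is an illustrative name, and returning one `(advanceWidth, lsb)` pair per glyph is a design choice of this example.

```python
import struct


def parse_hmtx(table: bytes, number_of_h_metrics: int, num_glyphs: int) -> list[tuple[int, int]]:
    """Return one (advanceWidth, lsb) pair per glyph index."""
    if not 1 <= number_of_h_metrics <= num_glyphs:
        raise ValueError("numberOfHMetrics must be between 1 and numGlyphs")
    metrics = []
    pos = 0
    # Full records: uint16 advanceWidth + int16 lsb.
    for _ in range(number_of_h_metrics):
        advance, lsb = struct.unpack_from(">Hh", table, pos)
        metrics.append((advance, lsb))
        pos += 4
    # Trailing glyphs store only an lsb and reuse the last advance width.
    last_advance = metrics[-1][0]
    for _ in range(num_glyphs - number_of_h_metrics):
        (lsb,) = struct.unpack_from(">h", table, pos)
        metrics.append((last_advance, lsb))
        pos += 2
    return metrics
```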
6.5 loca — index to location (variable)
Maps each glyph index to the byte offset of its outline data inside the glyf
table. Without loca you cannot find where any glyph starts. It stores
numGlyphs + 1 offsets — the extra entry marks the end of the last glyph, so
each glyph’s length is simply loca[i+1] − loca[i]. The entry width (2 or 4
bytes) is set by head.indexToLocFormat.
The two entry formats:
- Short (indexToLocFormat == 0): (numGlyphs + 1) × Offset16 (uint16). The stored value is the real offset ÷ 2; multiply by 2 when reading. This works because glyph data is 2-byte aligned (every real offset is even), and it doubles a uint16’s reach to ~128 KB.
- Long (indexToLocFormat == 1): (numGlyphs + 1) × Offset32 (uint32), the real offset directly.
If loca[i] == loca[i+1], glyph i has no outline (e.g. the space). This
is common — handle it.
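Both loca formats and the empty-glyph case can be sketched as follows. `parse_loca` and `glyph_range` are hypothetical helper names; returning `None` for an empty glyph is a convention chosen for this example.

```python
import struct
from typing import Optional


def parse_loca(table: bytes, index_to_loc_format: int, num_glyphs: int) -> list[int]:
    """Return numGlyphs + 1 real byte offsets into glyf."""
    count = num_glyphs + 1
    if index_to_loc_format == 0:
        # Short format stores offset / 2 in a uint16.
        return [v * 2 for v in struct.unpack_from(f">{count}H", table, 0)]
    if index_to_loc_format == 1:
        return list(struct.unpack_from(f">{count}I", table, 0))
    raise ValueError("indexToLocFormat must be 0 or 1")


def glyph_range(loca: list[int], glyph_id: int) -> Optional[tuple[int, int]]:
    """Return (start, end) within glyf, or None when the glyph has no outline."""
    start, end = loca[glyph_id], loca[glyph_id + 1]
    if end < start:
        raise ValueError("loca offsets must not decrease")
    return None if start == end else (start, end)
```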
6.6 glyf — glyph data (variable, the hard one)
Contains the actual outline data for every glyph — the contours and control
points that the rasterizer turns into pixels. It has no table header; it is
just glyph blocks packed back to back, located entirely by loca. Each block
starts with a common 10-byte header that tells you whether the glyph is simple
(contours) or composite (references to other glyphs), followed by format-specific
data. This is the most complex table to parse.
The common 10-byte header at the start of each block:
| Type | Name | Notes |
|---|---|---|
int16 | numberOfContours | ≥ 0 → simple glyph; < 0 (use −1) → composite glyph |
int16 | xMin | glyph bounding box |
int16 | yMin | |
int16 | xMax | |
int16 | yMax |
Simple glyph (numberOfContours ≥ 0)
| Type | Name | Notes |
|---|---|---|
uint16 | endPtsOfContours[numberOfContours] | last value + 1 = total point count |
uint16 | instructionLength | bytes of hinting that follow |
uint8 | instructions[instructionLength] | TrueType hinting bytecode (can pass through opaquely) |
uint8 | flags[…] | one logical flag per point, compressed (below) |
| — | xCoordinates[…] | delta-encoded, width per flags |
| — | yCoordinates[…] | delta-encoded, width per flags |
Total point count = endPtsOfContours[last] + 1. You need it to know how
many flags and coordinates to read.
Flag bits (one uint8 per point, logically):
| Bit | Name | Meaning |
|---|---|---|
0x01 | ON_CURVE_POINT | point is on the curve (vs an off-curve Bézier control point) |
0x02 | X_SHORT_VECTOR | x delta is 1 byte (else 2 bytes or 0) |
0x04 | Y_SHORT_VECTOR | y delta is 1 byte |
0x08 | REPEAT_FLAG | the next byte is a repeat count; repeat this flag that many additional times |
0x10 | X_IS_SAME_OR_POSITIVE_X_SHORT_VECTOR | dual meaning (below) |
0x20 | Y_IS_SAME_OR_POSITIVE_Y_SHORT_VECTOR | dual meaning |
0x40 | OVERLAP_SIMPLE | contours may overlap |
0x80 | reserved | set to 0 |
The flag array is compressed: when REPEAT_FLAG (0x08) is set, the following
byte gives how many extra points share that flag. Read flags in a loop until
you have accumulated pointCount of them, expanding repeats as you go.
Coordinate decoding (x shown; y identical with 0x04/0x20). Coordinates
are stored as deltas from the previous point (the first point’s delta is
from 0), and all x deltas come first, then all y deltas:
- X_SHORT_VECTOR (0x02) set → x delta is 1 byte (uint8); its sign is given by 0x10: set ⇒ positive, clear ⇒ negative.
- X_SHORT_VECTOR clear:
  - 0x10 set ⇒ delta is 0 (this x equals the previous x).
  - 0x10 clear ⇒ x delta is a 2-byte int16.
Accumulate deltas to get absolute coordinates.
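The flag expansion and delta decoding for a simple glyph could be sketched like this. It assumes `data` is one glyph block (as bracketed by loca) and skips the hinting instructions instead of interpreting them. `parse_simple_glyph` and the returned dictionary shape are illustrative.

```python
import struct

ON_CURVE, X_SHORT, Y_SHORT, REPEAT = 0x01, 0x02, 0x04, 0x08
X_SAME_OR_POS, Y_SAME_OR_POS = 0x10, 0x20


def parse_simple_glyph(data: bytes) -> dict:
    """Decode one simple glyph block into absolute (x, y, on_curve) points."""
    num_contours, x_min, y_min, x_max, y_max = struct.unpack_from(">5h", data, 0)
    if num_contours < 0:
        raise ValueError("composite glyph; not handled by this sketch")
    if num_contours == 0:
        return {"endPts": [], "points": []}
    pos = 10
    end_pts = list(struct.unpack_from(f">{num_contours}H", data, pos))
    pos += 2 * num_contours
    (instruction_length,) = struct.unpack_from(">H", data, pos)
    pos += 2 + instruction_length          # skip hinting bytecode
    point_count = end_pts[-1] + 1

    # Expand the run-length-compressed flag array.
    flags = []
    while len(flags) < point_count:
        flag = data[pos]
        pos += 1
        flags.append(flag)
        if flag & REPEAT:
            flags.extend([flag] * data[pos])
            pos += 1
    if len(flags) > point_count:
        raise ValueError("flag repeat count overruns the point count")

    def read_axis(short_bit: int, same_or_pos_bit: int) -> list[int]:
        nonlocal pos
        coords, value = [], 0
        for flag in flags:
            if flag & short_bit:                   # 1-byte magnitude, sign in flag
                delta = data[pos]
                pos += 1
                if not flag & same_or_pos_bit:
                    delta = -delta
            elif flag & same_or_pos_bit:           # same as previous coordinate
                delta = 0
            else:                                  # 2-byte signed delta
                (delta,) = struct.unpack_from(">h", data, pos)
                pos += 2
            value += delta
            coords.append(value)
        return coords

    xs = read_axis(X_SHORT, X_SAME_OR_POS)         # all x deltas come first
    ys = read_axis(Y_SHORT, Y_SAME_OR_POS)
    points = [(x, y, bool(f & ON_CURVE)) for x, y, f in zip(xs, ys, flags)]
    return {"endPts": end_pts, "points": points}
```

Contour boundaries are recovered afterwards from `endPts`: contour k runs from `endPts[k-1] + 1` through `endPts[k]`.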
Composite glyph (numberOfContours < 0)
A loop of components, each referencing another glyph:
| Type | Name | Notes |
|---|---|---|
uint16 | flags | component flags (below) |
uint16 | glyphIndex | the component glyph’s id |
| — | arg1, arg2 | int8/uint8 or int16/uint16 per ARG_1_AND_2_ARE_WORDS; if ARGS_ARE_XY_VALUES, signed placement offsets, else point-matching indices |
| — | transform | 0, 1, 2, or 4 × F2Dot14 depending on the scale flags |
Component flag bits:
| Bit | Name | Meaning |
|---|---|---|
0x0001 | ARG_1_AND_2_ARE_WORDS | args are 16-bit (else 8-bit) |
0x0002 | ARGS_ARE_XY_VALUES | args are offsets (else point indices) |
0x0004 | ROUND_XY_TO_GRID | |
0x0008 | WE_HAVE_A_SCALE | one F2Dot14 uniform scale follows |
0x0020 | MORE_COMPONENTS | another component follows; loop |
0x0040 | WE_HAVE_AN_X_AND_Y_SCALE | two F2Dot14 follow |
0x0080 | WE_HAVE_A_TWO_BY_TWO | four F2Dot14 (2×2 matrix) follow |
0x0100 | WE_HAVE_INSTRUCTIONS | after the last component: uint16 length + instruction bytes |
0x0200 | USE_MY_METRICS | |
0x0400 | OVERLAP_COMPOUND |
Loop while MORE_COMPONENTS is set.
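The component loop might be sketched as follows, starting just past the 10-byte glyph header. `parse_components` and the dictionary keys are illustrative names. The sketch stops after the last component and does not read the trailing instructions that WE_HAVE_INSTRUCTIONS announces.

```python
import struct

ARG_1_AND_2_ARE_WORDS = 0x0001
ARGS_ARE_XY_VALUES = 0x0002
WE_HAVE_A_SCALE = 0x0008
MORE_COMPONENTS = 0x0020
WE_HAVE_AN_X_AND_Y_SCALE = 0x0040
WE_HAVE_A_TWO_BY_TWO = 0x0080


def parse_components(data: bytes, pos: int = 10) -> list[dict]:
    """Read component records until one lacks MORE_COMPONENTS."""
    components = []
    while True:
        flags, glyph_index = struct.unpack_from(">HH", data, pos)
        pos += 4
        # Width comes from ARG_1_AND_2_ARE_WORDS, signedness from ARGS_ARE_XY_VALUES.
        if flags & ARG_1_AND_2_ARE_WORDS:
            fmt = ">hh" if flags & ARGS_ARE_XY_VALUES else ">HH"
        else:
            fmt = ">bb" if flags & ARGS_ARE_XY_VALUES else ">BB"
        arg1, arg2 = struct.unpack_from(fmt, data, pos)
        pos += struct.calcsize(fmt)
        if flags & WE_HAVE_A_SCALE:
            n = 1
        elif flags & WE_HAVE_AN_X_AND_Y_SCALE:
            n = 2
        elif flags & WE_HAVE_A_TWO_BY_TWO:
            n = 4
        else:
            n = 0
        raw = struct.unpack_from(f">{n}h", data, pos)
        pos += 2 * n
        components.append({
            "flags": flags,
            "glyphIndex": glyph_index,
            "arg1": arg1,
            "arg2": arg2,
            "transform": [v / 16384.0 for v in raw],   # F2Dot14 to float
        })
        if not flags & MORE_COMPONENTS:
            return components
```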
6.7 cmap — character-to-glyph mapping (variable)
Maps Unicode character codes (or other encodings) to glyph indices. It is the bridge between text and outlines — without it you cannot know which glyph to draw for a given character. The table contains multiple subtables for different platform/encoding combinations; you pick the best one for your use case (typically the Windows Unicode BMP subtable, format 4) and use only that.
Header:
| Type | Name |
|---|---|
uint16 | version (0) |
uint16 | numTables |
Then numTables encoding records:
| Type | Name | Notes |
|---|---|---|
uint16 | platformID | 0=Unicode, 1=Mac, 3=Windows |
uint16 | encodingID | platform-specific |
Offset32 | subtableOffset | from the start of the cmap table |
Common pairings: (3,1) Windows BMP Unicode → usually format 4; (3,10)
Windows full Unicode → format 12; (0,*) Unicode; (1,0) Mac Roman →
format 0. Pick the best subtable, then parse by its format word:
- Format 0 (byte encoding): format, length, language, then uint8 glyphIdArray[256].
- Format 4 (segment mapping; BMP, the workhorse): format, length, language, segCountX2, searchRange, entrySelector, rangeShift, then parallel arrays endCode[segCount], a reserved pad, startCode[segCount], idDelta[segCount], idRangeOffset[segCount], and a trailing glyphIdArray.
- Format 6 (trimmed table): a contiguous range of codes.
- Format 12 (segmented coverage; full Unicode): format (12), reserved, uint32 length, uint32 language, uint32 numGroups, then groups of { uint32 startCharCode; uint32 endCharCode; uint32 startGlyphID }.
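A hedged sketch of a format 4 lookup follows. The field list above gives the layout but not the lookup rule; the rule used here (idDelta added modulo 65536, idRangeOffset measured from its own position in the table) is taken from the OpenType cmap specification. `cmap4_lookup` is a hypothetical name, and the linear scan stands in for the binary search a production parser would likely use.

```python
import struct


def cmap4_lookup(sub: bytes, code: int) -> int:
    """Map a character code to a glyph index using a format 4 subtable."""
    fmt, _length, _language, seg_count_x2 = struct.unpack_from(">4H", sub, 0)
    if fmt != 4:
        raise ValueError("not a format 4 subtable")
    seg_count = seg_count_x2 // 2
    end_base = 14                                   # after the 7 header words
    start_base = end_base + seg_count_x2 + 2        # +2 skips the reserved pad
    delta_base = start_base + seg_count_x2
    range_base = delta_base + seg_count_x2
    for i in range(seg_count):
        (end,) = struct.unpack_from(">H", sub, end_base + 2 * i)
        if code > end:
            continue                                # segments are sorted by endCode
        (start,) = struct.unpack_from(">H", sub, start_base + 2 * i)
        if code < start:
            return 0                                # falls in a gap: missing glyph
        (delta,) = struct.unpack_from(">h", sub, delta_base + 2 * i)
        range_pos = range_base + 2 * i
        (range_offset,) = struct.unpack_from(">H", sub, range_pos)
        if range_offset == 0:
            return (code + delta) & 0xFFFF
        # idRangeOffset is relative to the location of the idRangeOffset entry itself.
        glyph_pos = range_pos + range_offset + 2 * (code - start)
        (glyph,) = struct.unpack_from(">H", sub, glyph_pos)
        return (glyph + delta) & 0xFFFF if glyph else 0
    return 0
```

Glyph index 0 is the conventional "missing glyph", so a return value of 0 means the character is not mapped.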
6.8 name — human-readable strings (variable)
Stores all the font’s human-readable text strings: family name, subfamily, full name, version string, copyright notice, trademark, and more. Each string is stored once in a raw byte pool at the end of the table, and a list of records points into that pool with platform, encoding, language, and name ID metadata. For most use cases you want name ID 1 (family), 2 (subfamily), and 4 (full name), platform 3 (Windows), decoded as UTF-16BE.
| Type | Name | Notes |
|---|---|---|
uint16 | version | 0 or 1 |
uint16 | count | number of name records |
Offset16 | storageOffset | start of string storage, from the table start |
Then count name records:
| Type | Name | Notes |
|---|---|---|
uint16 | platformID | |
uint16 | encodingID | |
uint16 | languageID | |
uint16 | nameID | what the string is (1=family, 2=subfamily, 4=full name, 5=version, …) |
uint16 | length | string length in bytes |
Offset16 | stringOffset | from storageOffset |
(Version 1 adds a langTagCount + language-tag records before the storage.)
Then the raw string bytes, encoded per platform — usually UTF-16BE for
platform 3.
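Pulling the Windows-platform strings out of the name table might look like the sketch below. `read_windows_names` is an illustrative name. Keeping the first platform 3 record per name ID, regardless of language, is a simplifying assumption of this example.

```python
import struct


def read_windows_names(table: bytes) -> dict[int, str]:
    """Return {nameID: string} for platform 3 records, decoded as UTF-16BE."""
    _version, count, storage_offset = struct.unpack_from(">HHH", table, 0)
    names = {}
    for i in range(count):
        # Name records are 12 bytes each and start right after the 6-byte header.
        platform_id, _encoding_id, _language_id, name_id, length, string_offset = (
            struct.unpack_from(">6H", table, 6 + i * 12)
        )
        if platform_id != 3:
            continue
        start = storage_offset + string_offset
        raw = table[start:start + length]
        if len(raw) != length:
            raise ValueError("name string runs past the end of the table")
        names.setdefault(name_id, raw.decode("utf-16-be", errors="replace"))
    return names
```

`names.get(1)` then gives the family name and `names.get(4)` the full name, when the font provides them.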
6.9 post — PostScript data (variable, version-dependent)
Contains PostScript-specific metadata: the italic angle, whether the font is
monospaced (isFixedPitch), and optionally a mapping from glyph index to
PostScript glyph name. The 32-byte header is always present; the glyph name
array only appears in version 2.0. For most transformation work you only need
isFixedPitch and italicAngle.
Fixed 32-byte header:
| Type | Name |
|---|---|
Version16Dot16 | version |
Fixed | italicAngle |
FWORD | underlinePosition |
FWORD | underlineThickness |
uint32 | isFixedPitch |
uint32 | minMemType42 |
uint32 | maxMemType42 |
uint32 | minMemType1 |
uint32 | maxMemType1 |
- 1.0 (0x00010000): header only (standard Mac glyph names assumed).
- 2.0 (0x00020000): header + uint16 numGlyphs + uint16 glyphNameIndex[numGlyphs] + Pascal-string names for indices ≥ 258.
- 2.5: deprecated.
- 3.0 (0x00030000): header only, no names.
6.10 OS/2 — OS/2 and Windows metrics (fixed per version)
The primary source of font classification metadata for Windows and cross-platform renderers. It holds the weight class (thin → black), width class (condensed → expanded), typographic ascender/descender/line gap, Windows-specific ascender/descender, Unicode and code-page coverage bitmaps, the PANOSE classification, and embedding permission flags. This is the table that bold and width transformations modify. The version word determines how many trailing fields exist. Zero-pad to the full struct size when reading older versions so missing fields default to 0.
Version 0 (78 bytes, all versions):
| Type | Name | Notes |
|---|---|---|
uint16 | version | 0–5 |
int16 | xAvgCharWidth | weighted average advance width of lower-case chars |
uint16 | usWeightClass | 100–900 (matches CSS font-weight) |
uint16 | usWidthClass | 1–9, condensed → expanded |
uint16 | fsType | embedding permission flags |
int16 | ySubscriptXSize | |
int16 | ySubscriptYSize | |
int16 | ySubscriptXOffset | |
int16 | ySubscriptYOffset | |
int16 | ySuperscriptXSize | |
int16 | ySuperscriptYSize | |
int16 | ySuperscriptXOffset | |
int16 | ySuperscriptYOffset | |
int16 | yStrikeoutSize | |
int16 | yStrikeoutPosition | |
int16 | sFamilyClass | IBM font-family classification |
uint8[10] | panose | 10-byte PANOSE classification |
uint32 | ulUnicodeRange1 | Unicode block coverage bits 0–31 |
uint32 | ulUnicodeRange2 | bits 32–63 |
uint32 | ulUnicodeRange3 | bits 64–95 |
uint32 | ulUnicodeRange4 | bits 96–127 |
Tag | achVendID | 4-char vendor identifier |
uint16 | fsSelection | style flags (italic, bold, regular, …) |
uint16 | usFirstCharIndex | lowest Unicode codepoint in the font |
uint16 | usLastCharIndex | highest Unicode codepoint in the font |
int16 | sTypoAscender | typographic ascender (FUnits) |
int16 | sTypoDescender | typographic descender (FUnits, usually negative) |
int16 | sTypoLineGap | typographic line gap (FUnits) |
uint16 | usWinAscent | Windows ascender metric |
uint16 | usWinDescent | Windows descender metric (positive value) |
Version 1 adds (86 bytes total):
| Type | Name | Notes |
|---|---|---|
uint32 | ulCodePageRange1 | code-page coverage bits 0–31 |
uint32 | ulCodePageRange2 | bits 32–63 |
Version 2 / 3 / 4 add (96 bytes total):
| Type | Name | Notes |
|---|---|---|
int16 | sxHeight | height of lowercase ‘x’ (FUnits) |
int16 | sCapHeight | height of uppercase ‘H’ (FUnits) |
uint16 | usDefaultChar | Unicode code point of the default character (0 means glyph 0 is used) |
uint16 | usBreakChar | Unicode code point of the word-break character (typically U+0020, space) |
uint16 | usMaxContext | max length of target glyph context |
Version 5 adds (100 bytes total):
| Type | Name | Notes |
|---|---|---|
uint16 | usLowerOpticalPointSize | lower optical size, ×20 |
uint16 | usUpperOpticalPointSize | upper optical size, ×20 |
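
The zero-padding approach described at the top of this section can be sketched by always unpacking the full 100-byte version 5 layout. `OS2_FORMAT`, `OS2_FIELDS`, and `parse_os2` are names invented for this example; the format string follows the field order of the four tables above.

```python
import struct

OS2_FORMAT = ">Hh3H11h10s4I4s3H3h2H2I2h3H2H"      # 100 bytes, version 5 layout
OS2_FIELDS = (
    "version xAvgCharWidth usWeightClass usWidthClass fsType "
    "ySubscriptXSize ySubscriptYSize ySubscriptXOffset ySubscriptYOffset "
    "ySuperscriptXSize ySuperscriptYSize ySuperscriptXOffset ySuperscriptYOffset "
    "yStrikeoutSize yStrikeoutPosition sFamilyClass panose "
    "ulUnicodeRange1 ulUnicodeRange2 ulUnicodeRange3 ulUnicodeRange4 "
    "achVendID fsSelection usFirstCharIndex usLastCharIndex "
    "sTypoAscender sTypoDescender sTypoLineGap usWinAscent usWinDescent "
    "ulCodePageRange1 ulCodePageRange2 "
    "sxHeight sCapHeight usDefaultChar usBreakChar usMaxContext "
    "usLowerOpticalPointSize usUpperOpticalPointSize"
).split()
OS2_SIZE = struct.calcsize(OS2_FORMAT)


def parse_os2(table: bytes) -> dict:
    """Decode any OS/2 version; fields the version lacks come back as 0."""
    if len(table) < 78:
        raise ValueError("OS/2 table is shorter than the 78-byte version 0 layout")
    padded = table[:OS2_SIZE].ljust(OS2_SIZE, b"\x00")
    return dict(zip(OS2_FIELDS, struct.unpack(OS2_FORMAT, padded)))
```

Callers should still consult `version` before trusting a trailing field, since a padded 0 is indistinguishable from a stored 0.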
6.11 kern — kerning (optional, variable)
Stores spacing adjustments between specific pairs of glyphs. Where hmtx
gives every glyph a single advance width, kern lets you say “when an ‘A’
is followed by a ‘V’, pull them 40 units closer together.” It is optional —
many modern fonts omit it and use GPOS instead for more powerful contextual
kerning. When present, it is a sequence of subtables each covering a set of
pairs.
Optional and historically messy — Apple and Microsoft define incompatible
headers. Many fonts omit kern entirely (modern kerning lives in GPOS).
The Microsoft format (the one to target for OpenType-era fonts):
Table header:
| Type | Name | Notes |
|---|---|---|
uint16 | version | 0 |
uint16 | nTables | number of subtables that follow |
Per subtable header:
| Type | Name | Notes |
|---|---|---|
uint16 | version | subtable version (0) |
uint16 | length | total subtable length in bytes (including this header) |
uint16 | coverage | high byte = format (0 or 2); low byte = flags (see below) |
Coverage flags (low byte): bit 0 = horizontal, bit 1 = minimum, bit 2 = cross-stream, bit 3 = override.
Format 0 (sorted kern pairs — the common case):
| Type | Name | Notes |
|---|---|---|
uint16 | nPairs | number of kern pairs |
uint16 | searchRange | (largest power of 2 ≤ nPairs) × 6 |
uint16 | entrySelector | log2(largest power of 2 ≤ nPairs) |
uint16 | rangeShift | nPairs × 6 − searchRange |
Then nPairs entries, each 6 bytes:
| Type | Name | Notes |
|---|---|---|
uint16 | left | left glyph index |
uint16 | right | right glyph index |
int16 | value | kern adjustment in FUnits |
Pairs are sorted by (left << 16) | right for binary search. Read nTables
subtables, advancing by each subtable’s length field.
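A minimal sketch of reading the Microsoft-format table into a pair dictionary. `parse_kern` is an illustrative name. Two simplifications are assumptions of this example: only horizontal format 0 subtables without the minimum or cross-stream flags are read, and a later subtable simply overwrites an earlier value for the same pair.

```python
import struct


def parse_kern(table: bytes) -> dict[tuple[int, int], int]:
    """Return {(left, right): value} from horizontal format 0 subtables."""
    version, n_tables = struct.unpack_from(">HH", table, 0)
    if version != 0:
        raise ValueError("not a Microsoft-format (version 0) kern table")
    pairs = {}
    pos = 4
    for _ in range(n_tables):
        if pos + 6 > len(table):
            raise ValueError("kern subtable header runs past the table")
        _sub_version, length, coverage = struct.unpack_from(">HHH", table, pos)
        if length < 6:
            raise ValueError("kern subtable length is smaller than its header")
        sub_format = coverage >> 8                  # high byte
        # Low bits: horizontal set, minimum and cross-stream clear.
        if sub_format == 0 and (coverage & 0x07) == 0x01:
            (n_pairs,) = struct.unpack_from(">H", table, pos + 6)
            for i in range(n_pairs):
                left, right, value = struct.unpack_from(">HHh", table, pos + 14 + i * 6)
                pairs[(left, right)] = value
        pos += length
    return pairs
```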
7. Sources
- Microsoft OpenType specification — the definitive modern reference; it incorporates and supersedes the original TrueType spec: https://learn.microsoft.com/en-us/typography/opentype/spec/
- Apple TrueType Reference Manual — the original format, authoritative for TrueType outlines: https://developer.apple.com/fonts/TrueType-Reference-Manual/
- ISO/IEC 14496-22 “Open Font Format” — the ISO standardization, freely downloadable from ISO; equivalent to the Microsoft spec.
When the Apple and Microsoft documents disagree on edge cases, treat the Microsoft spec as canonical for OpenType-era fonts.
Per-table pages, all under
https://learn.microsoft.com/en-us/typography/opentype/spec/:
| Table | Page | Table | Page |
|---|---|---|---|
| structure | otff | cmap | cmap |
| head | head | name | name |
| maxp | maxp | post | post |
| hhea | hhea | OS/2 | os2 |
| hmtx | hmtx | kern | kern |
| loca | loca | glyf | glyf |