At the conclusion of the fourth article, we mentioned that we would explore text layout using Pango and its associated libraries: GNU FriBidi and CairoGraphics. In the fifth post, we described how to build and install these libraries on both Ubuntu and Windows. In this article, we finally explore them in practice. The focus is on true justification — that is, both the left and right margins are flush, not ragged, much like the TeX typesetting system. We use the Vietnamese text file from the fourth article to generate a PDF of more than 150 pages.

🦀 Index of the Complete Series.

156-feature-image.png
Rust: PDFs — Exploring Layout with Pango and Cairo

🚀 The code for this post is in the following GitHub repository: pdf_03_pango.

We Are Not Using the lopdf Crate

The Pango and Cairo libraries, along with their Rust bindings, are fully capable of creating multilingual PDFs — including font subsetting and preserving text for copy‑and‑paste. We do not need to implement any of that ourselves. However, PDFs generated by these libraries often contain more internal objects than those created with lopdf, resulting in larger file sizes. For the text file we are using, the resulting PDF is nearly 600 KB, almost twice the size of the PDF produced with lopdf.

The Pango and CairoGraphics Crates

The code in this article uses the following crates: cairo-rs, pango-sys, pango, and pangocairo. No single crate covers everything we need, so the final code uses every one of them. The crates with the -sys suffix expose the low-level unsafe FFI APIs, while the others provide safe-Rust abstractions.

Units and Coordinates

When using Pango and CairoGraphics, we work with different units and coordinate systems. It is essential to understand these differences; otherwise the code will look confusing.

PDF Units and Coordinates

We covered this in detail in an earlier article. Here is a quick recap:

● A PDF unit is a PostScript point: 1 point = 1/72 inch exactly, and 1 inch = 25.4 mm.

● The origin (0, 0) is the bottom-left corner of the page.

● The top-right corner of an A4 page (210 mm × 297 mm) is at approximately (595.28, 841.89).

We will not work with PDF coordinates directly, but it is important to keep them in mind.

Pango Units and Coordinates

Pango uses its own unit system. To convert a PDF unit to a Pango unit, multiply by pango::SCALE. To convert from Pango units back to PDF units, divide by the same constant.

Pango uses screen-style coordinates:

0, 0 — the top‑left corner.
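The conversions described above are simple enough to sketch in a few lines. The snippet below is a minimal illustration, not code from the repository: PANGO_SCALE is a local constant that mirrors the value of pango::SCALE (1024), and the function names are made up for this example.

```rust
/// Local stand-in that mirrors the value of `pango::SCALE`
/// (the number of Pango units per point).
const PANGO_SCALE: i32 = 1024;

/// Converts millimetres to PostScript points (1 inch = 25.4 mm = 72 points).
fn mm_to_points(mm: f64) -> f64 {
    mm / 25.4 * 72.0
}

/// Converts PostScript points to Pango units.
/// The `as i32` cast truncates any fractional Pango unit.
fn points_to_pango(points: f64) -> i32 {
    (points * PANGO_SCALE as f64) as i32
}

/// Converts Pango units back to PostScript points.
fn pango_to_points(units: i32) -> f64 {
    units as f64 / PANGO_SCALE as f64
}
```

For example, a 12-point font size corresponds to 12 × 1024 = 12288 Pango units, and 18432 Pango units correspond to 18 points.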

Cairo Units and Coordinates

Cairo uses device units, which for PDF surfaces are the same as PDF units (PostScript points).

Regarding Cairo's coordinate system:

● In Cairo's default user space the origin is the top-left corner and y grows downward. This holds for image surfaces and PDF surfaces alike.

● The PDF file itself uses a bottom-left origin, but Cairo's PDF backend applies the necessary transformation when it writes the file, so we never deal with it in our code.

We can therefore treat Cairo's coordinate system as matching Pango's, with a top-left origin:

0, 0 — the top‑left corner.

Repository Layout

💡 Please note: on both Windows and Ubuntu, I’m running Rust version rustc 1.90.0 (1159e78c4 2025-09-14).

This is once again a one-off project: I do not plan to update it later, because I want to keep a log of progress exactly as it occurred. Future code may copy it and build on it. I have placed the project under the pdf_03_pango directory. The structure is:

.
├── build.rs
├── Cargo.toml
├── set_env.bat
├── src
│   ├── main_start.rs
│   ├── main_fully_justified.rs
│   ├── main_essay.rs
│   ├── main.rs
│   └── page_geometry.rs
└── text
    └── essay.txt

All four main*.rs modules under src/ are self‑contained Rust programs written in the listed order to help me understand Pango and Cairo layout. The text/essay.txt file is the Vietnamese input text, which the last two main*.rs modules convert into a PDF document. We also used this text file in the fourth article. We discuss these modules in the listed order.

The src/page_geometry.rs module is also copied from the fourth post, with some minor refactoring: all f32 values are now f64. Please refer to this discussion for more detail. 👉 Changing the value of any margin in the A4_DEFAULT_MARGINS constant changes the layout of the text in the PDF.

💡 The code requires the Pango, HarfBuzz, Cairo, etc. libraries. 🐧 On Ubuntu, all required libraries are globally recognised. 🪟 On Windows, I haven’t added the paths for libraries’ DLLs to the PATH environment variable. In each new Windows terminal session, I run the following once:

set PATH=C:\PF\harfbuzz\dist\bin\;%PATH%
set PATH=C:\PF\vcpkg\installed\x64-windows\bin\;%PATH%
set PATH=C:\PF\pango\dist\bin;C:\PF\cairo-1.18.4\dist\bin;C:\PF\fribidi\dist\bin;%PATH%

Alternatively, you can simply run set_env.bat.
After that, cargo run works as expected.

The pdf_03_pango/build.rs Module and Visual Studio Code rust-analyzer On Windows 🪟

⓵ The pdf_03_pango/build.rs Module

The crates we are using already ship with FFI definitions, so we do not need to generate a bindings.rs module as we did for HarfBuzz in earlier articles in this series. We only need to ensure that the linker can find the required .lib files. On Linux 🐧 this works out of the box, but on Windows 🪟 we must add the locations of these .lib files to the search paths.
This is why the pdf_03_pango/build.rs module is so simple. Most of its code has appeared in earlier articles and should be self-explanatory.
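As a rough sketch of what such a build script boils down to: Cargo reads the lines that build.rs prints, and a cargo:rustc-link-search directive adds a directory to the linker search path. The helper names and the two directories below are assumptions for illustration (the directories are guessed from the installation paths used in this post). The actual build.rs in the repository may be organised differently.

```rust
/// Builds one `cargo:rustc-link-search` directive per library directory.
fn link_search_directives(lib_dirs: &[&str]) -> Vec<String> {
    lib_dirs
        .iter()
        .map(|dir| format!("cargo:rustc-link-search=native={dir}"))
        .collect()
}

/// What the `main` function of a build.rs might call.
/// The directories are hypothetical; adjust them to your installation.
fn emit_link_search_paths() {
    let lib_dirs = [r"C:\PF\pango\dist\lib", r"C:\PF\cairo-1.18.4\dist\lib"];

    // Cargo sets CARGO_CFG_TARGET_OS for build scripts; outside a build
    // script the variable is missing and nothing is printed.
    let target_is_windows = std::env::var("CARGO_CFG_TARGET_OS")
        .map(|os| os == "windows")
        .unwrap_or(false);

    if target_is_windows {
        for directive in link_search_directives(&lib_dirs) {
            println!("{directive}");
        }
    }
}
```

Keeping the directive formatting in its own function makes the Windows-only branch trivial and leaves Linux builds untouched.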

⓶ Visual Studio Code rust-analyzer On Windows 🪟

I am developing on Windows. The crates we are using are designed to discover native libraries automatically using pkg-config. If pkg-config cannot locate the .pc files for the relevant libraries, rust-analyzer reports many errors, becomes unresponsive, and IDE autocomplete stops working.

To fix this, open the Windows Environment Variables editor and, under User variables for <logged user>, add a new variable PKG_CONFIG_PATH whose value includes:

C:\PF\pango\dist\lib\pkgconfig
C:\PF\cairo-1.18.4\dist\lib\pkgconfig
C:\PF\vcpkg\installed\x64-windows\lib\pkgconfig
C:\PF\harfbuzz\dist\lib\pkgconfig
C:\PF\fribidi\dist\lib\pkgconfig

🙏 These paths reflect my installation. Your paths may differ, so adjust them accordingly.

After restarting Visual Studio Code, all errors disappear and rust-analyzer becomes responsive again.

The pdf_03_pango/src/main_start.rs Module

The code starts by creating a cairo::PdfSurface. We need to enable the pdf feature for the crate as follows:

cairo-rs = { version = "0.21.5", features = ["pdf"] }

A4 page width and height are in PostScript points, as discussed earlier.

Next, we create a cairo::Context. We then create a pango::Layout, the object in which the text is laid out, using the pangocairo function create_layout(cr: &Context) -> Layout.

Regarding the parameter passed to layout.set_width((a4_default_content_width() * pango::SCALE as f64) as i32);, please refer to Pango Units and Coordinates. For cr.move_to(A4_DEFAULT_MARGINS.left, A4_DEFAULT_MARGINS.top);, see Cairo Units and Coordinates.
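To make the width calculation concrete, here is a small sketch of the arithmetic behind the value passed to set_width(). The Margins and Page types and the numbers used with them are hypothetical stand-ins, not the contents of src/page_geometry.rs, and PANGO_SCALE is a local constant mirroring pango::SCALE.

```rust
/// Local stand-in that mirrors the value of `pango::SCALE`.
const PANGO_SCALE: i32 = 1024;

/// Page margins in PostScript points (hypothetical type).
struct Margins {
    top: f64,
    right: f64,
    bottom: f64,
    left: f64,
}

/// A page size plus its margins, all in PostScript points (hypothetical type).
struct Page {
    width: f64,
    height: f64,
    margins: Margins,
}

impl Page {
    /// Horizontal space left for text once both side margins are removed.
    fn content_width(&self) -> f64 {
        self.width - self.margins.left - self.margins.right
    }

    /// Vertical space left for text once top and bottom margins are removed.
    fn content_height(&self) -> f64 {
        self.height - self.margins.top - self.margins.bottom
    }

    /// The content width expressed in Pango units, the unit a layout width
    /// is given in.
    fn layout_width_units(&self) -> i32 {
        (self.content_width() * PANGO_SCALE as f64) as i32
    }
}
```

With a 600-point-wide page and side margins of 60 and 40 points, the content width is 500 points, or 512000 Pango units. The cr.move_to() call, by contrast, takes the left and top margins directly in points.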

For other pangocairo functions, please see the pangocairo API listing.

We must specify an appropriate font for pango::Layout using pango::FontDescription; otherwise Pango will choose a default font, which is unlikely to be correct for our text. We must also set alignment and wrap mode using pango::Alignment and pango::WrapMode.

In the code above, we can uncomment the longer multiline text and experiment with different alignments to observe how the layout changes. The text is right‑ragged with Alignment::Left and left‑ragged with Alignment::Right. To make both the left and right margins flush, we must use justification.

On Pango Justification

The Pango library implements the pango_layout_set_justify() function (https://docs.gtk.org/Pango/method.Layout.set_justify.html):

void pango_layout_set_justify (
  PangoLayout* layout,
  gboolean justify
)

I am reproducing its official documentation below:

Description

Sets whether each complete line should be stretched to fill the entire width of the layout.

Stretching is typically done by adding whitespace, but for some scripts (such as Arabic), the justification may be done in more complex ways, like extending the characters.

Note that this setting is not implemented and so is ignored in Pango older than 1.18.

Note that tabs and justification conflict with each other: Justification will move content away from its tab-aligned positions.

The default value is FALSE.

Also see pango_layout_set_justify_last_line().

Parameters

justify

Type: gboolean

Whether the lines in the layout should be justified.


The pango-sys crate declares this function as:

pub unsafe extern "C" fn pango_layout_set_justify(
    layout: *mut PangoLayout,
    justify: gboolean,
)

To call this function, we need to obtain the raw C pointer *mut PangoLayout from the safe-Rust pango::Layout wrapper.

The ToGlibPtr trait provides the necessary conversions from Rust wrapper types to C-compatible raw pointers for FFI calls. Since we do not want to take ownership of the underlying C object, the method fn to_glib_none(&'a self) -> Stash<'a, P, Self> is the appropriate choice. The returned Stash contains the raw pointer.

The following wrapper structs — thin layers around pointers to C objects — all implement ToGlibPtr: cairo::Context, pango::Layout, pango::FontDescription, and others. Therefore, to_glib_none() is available on all of them.

Calling pango::Layout::to_glib_none() returns Stash<*mut PangoLayout, Layout>.
The first element, *mut PangoLayout, is the raw pointer required by pango_layout_set_justify().
In the next module, we apply this function to achieve full justification.

The pdf_03_pango/src/main_fully_justified.rs Module

This module is a copy of pdf_03_pango/src/main_start.rs discussed above, with a few lines of full‑justification code added. There are two possible implementation approaches; both produce fully justified text.

The first approach is to use an unsafe FFI call, lines 44–46:

44
45
46
    unsafe {
        pango_layout_set_justify(layout.to_glib_none().0, 1); // 1 = TRUE
    }

The second approach is to define a new method fn set_justify(&self, justify: bool) on pango::Layout. Internally, this method calls the unsafe pango_layout_set_justify() function. This implementation appears in lines 17 to 27. We can then call the new set_justify() method as follows:

    layout.set_justify(true);
    /* unsafe {
        pango_layout_set_justify(layout.to_glib_none().0, 1); // 1 = TRUE
    } */
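The second approach follows a common Rust pattern: an extension trait adds a safe method to an existing wrapper type and hides the unsafe call inside it. The sketch below shows the shape of that pattern using stand-in types only. RawLayout, Layout, raw_layout_set_justify() and LayoutJustifyExt are all invented for this example so that it compiles without Pango installed; they are not the real pango or pango-sys items, and the repository's own implementation may differ.

```rust
/// Stand-in for the C struct behind a layout (NOT the real `PangoLayout`).
struct RawLayout {
    justify: i32,
}

/// Stand-in for an unsafe FFI function such as `pango_layout_set_justify()`.
unsafe fn raw_layout_set_justify(layout: *mut RawLayout, justify: i32) {
    // SAFETY: the caller guarantees that `layout` points to a live RawLayout.
    unsafe { (*layout).justify = justify };
}

/// Stand-in for a safe wrapper such as `pango::Layout`: a thin layer around
/// a raw pointer.
struct Layout {
    raw: *mut RawLayout,
}

impl Layout {
    fn new() -> Self {
        Layout { raw: Box::into_raw(Box::new(RawLayout { justify: 0 })) }
    }

    fn is_justified(&self) -> bool {
        // SAFETY: `raw` stays valid until the wrapper is dropped.
        unsafe { (*self.raw).justify != 0 }
    }
}

impl Drop for Layout {
    fn drop(&mut self) {
        // SAFETY: `raw` came from Box::into_raw and is freed exactly once.
        unsafe { drop(Box::from_raw(self.raw)) };
    }
}

/// Extension trait that adds a safe `set_justify()` to the wrapper.
trait LayoutJustifyExt {
    fn set_justify(&self, justify: bool);
}

impl LayoutJustifyExt for Layout {
    fn set_justify(&self, justify: bool) {
        // The bool becomes 1 (TRUE) or 0 (FALSE), as a gboolean would.
        // SAFETY: `self.raw` is valid for as long as `self` is alive.
        unsafe { raw_layout_set_justify(self.raw, justify as i32) };
    }
}
```

The benefit is that the unsafe block lives in one audited place, and call sites read as ordinary safe Rust: layout.set_justify(true).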

The screenshot below shows the output PDF:

156-01-full-justification.png

The pdf_03_pango/src/main_essay.rs and pdf_03_pango/src/main.rs Modules

In these two modules we read the Vietnamese text file and produce a PDF of more than 150 pages. The primary objective of the first module, pdf_03_pango/src/main_essay.rs, is to demonstrate page breaking. In the final module, pdf_03_pango/src/main.rs, we demonstrate adding page numbers.

💡 Important: Please keep the discussion on Units and Coordinates in mind while reading this section.

The pdf_03_pango/src/main_essay.rs Module

As before, this module is built upon pdf_03_pango/src/main_fully_justified.rs discussed above.
We begin by reading the entire text file into a string and passing the full text to the layout. Pango preserves all blank lines in the input.

Similar to the page‑generation loop in the fourth post, we iterate through the lines, calculate how many vertical PostScript points each line occupies, and increase the running vertical total accordingly. When the running total exceeds the effective page height — a4_default_content_height() — we flush the current page and start a new one. The code should be self‑explanatory.

Note also that, unlike the previous programs where we hardcoded the line height to 18 PostScript points, this module determines line_height as:

	let (_ink, logical) = line.extents();
	let line_height = logical.height() as f64 / pango::SCALE as f64;

This produces a more accurate result. It is also the correct approach when headers or other text use larger font sizes, since they naturally require more vertical space.
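The page-breaking idea can be sketched independently of Pango. The function below is a simplified illustration, not the repository's loop: it assumes the per-line heights have already been converted to PostScript points, and the name paginate() is made up for this example.

```rust
/// Groups line indices into pages. A new page starts whenever adding the
/// next line would push the running height past `content_height`.
fn paginate(line_heights: &[f64], content_height: f64) -> Vec<Vec<usize>> {
    let mut pages: Vec<Vec<usize>> = Vec::new();
    let mut current: Vec<usize> = Vec::new();
    let mut y = 0.0; // running vertical total on the current page, in points

    for (index, &height) in line_heights.iter().enumerate() {
        // Flush the current page if this line would overflow it. A line
        // taller than the page still gets a page of its own.
        if y + height > content_height && !current.is_empty() {
            pages.push(std::mem::take(&mut current));
            y = 0.0;
        }
        current.push(index);
        y += height;
    }

    // Flush the last, possibly partial, page.
    if !current.is_empty() {
        pages.push(current);
    }
    pages
}
```

Because each line contributes its own height, a taller line (a heading, say) simply uses up more of the page, which is the point of measuring lines instead of hardcoding 18 points.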

We observe the following about the PDF generated by this module:

● Visually, the font size appears larger than in the PDF generated in the fourth post, even though we use the same font program and the same size.

● All URLs are subjected to the same line-breaking calculations. In the fourth post we did not account for URLs, so they simply disappeared off the right side.

● The screenshot below shows a single morpheme (tiếng) being broken across pages:

156-02-bad-break.png

In the paper Breaking Paragraphs into Lines, Knuth and Plass discuss this issue at length: such breaks are undesirable. I am not aware of any Pango settings that address this yet.

Next, we look at adding page numbers…

The pdf_03_pango/src/main.rs Module

As before, we build this module on top of pdf_03_pango/src/main_essay.rs, discussed above. All the existing code remains intact. We add two new helper functions: total_pages() and page_number().

The page_number() function prints the current page number onto the current page. The text’s Y coordinate is fixed. The X coordinate is calculated dynamically and depends on the width of the page‑number text (e.g., 1 of 155 vs 150 of 155). The midpoint of the page‑number text is aligned with the midpoint of the effective page width. Changing A4_DEFAULT_MARGINS.left and/or A4_DEFAULT_MARGINS.right will also change the X coordinate.
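The centring arithmetic can be written down directly. This is only a sketch under assumptions: the function names are invented, all values are in PostScript points, and the text width is taken as a parameter, whereas in the real program it would have to be measured from the laid-out page-number text.

```rust
/// Formats the page-number text, e.g. "1 of 155".
fn page_label(current: usize, total: usize) -> String {
    format!("{current} of {total}")
}

/// X coordinate at which the label must start so that its midpoint lines up
/// with the midpoint of the content area.
fn page_number_x(left_margin: f64, content_width: f64, text_width: f64) -> f64 {
    left_margin + (content_width - text_width) / 2.0
}
```

With a left margin of 50, a content width of 500 and a label 40 points wide, the label starts at x = 50 + (500 - 40) / 2 = 280. A wider label such as "150 of 155" starts further left, and changing either side margin shifts the result.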

The total_pages() function calculates and returns the total number of pages after line breaking has been completed. This function is essentially a copy of the page‑generation loop; it should be self‑explanatory. I attempted to compute the total number of pages without iterating through the line collection, but the results were incorrect. This approach works, but it is slow. I am not yet sure whether a more efficient alternative exists.
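A dry run of the same page-breaking rule is enough to count pages without drawing anything. The sketch below is an assumption-laden illustration rather than the repository's total_pages(): it takes line heights that are already in PostScript points and only counts page starts.

```rust
/// Counts the pages needed for the given line heights by replaying the
/// page-breaking rule without rendering.
fn total_pages(line_heights: &[f64], content_height: f64) -> usize {
    let mut pages = 0;
    let mut y = 0.0; // running vertical total on the current page, in points

    for &height in line_heights {
        // Start a page for the very first line, or when a non-empty page
        // would overflow if this line were added to it.
        if pages == 0 || (y > 0.0 && y + height > content_height) {
            pages += 1;
            y = 0.0;
        }
        y += height;
    }
    pages
}
```

This is still a full pass over every line, so it does not remove the cost mentioned above; it only avoids the drawing work during the counting pass.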

The screenshot below shows the last page with a page number:

156-03-last-page-with-number.png

As mentioned previously, the size of these PDFs is much larger than the one generated using the HarfBuzz library and the lopdf crate. PDFXplorer shows that there are many more internal objects: for example, more metadata, more structure, and more glyph tables.

What’s Next

We have now completed an introductory exploration of Pango and CairoGraphics. I must admit, they are a joy to use. These libraries do not know anything about headers or similar structural elements; if we want to support those, we must implement them ourselves. I would like to explore this in the future.

I am not yet sure what my final goal with this study will be. One idea I find appealing is a reporting engine that generates business reports based on SQL queries, database connections, and a report template — though that is still a long way off. Another idea: Facebook often has information‑rich posts in both English and Vietnamese, and it would be useful to have a CLI tool that takes a Facebook post’s URL and produces a nicely formatted PDF. I may explore this as well. Having said all this, at this point I am still not sure what the next post will be.

Thanks for reading! I hope this post helps others who are looking to deepen their understanding of PDF technology. As always—stay curious, stay safe 🦊

✿✿✿
