I am attempting a Rust implementation of Robert Nystrom’s Lox language, as discussed in Crafting Interpreters. This post describes my Rust equivalent of the code in the Scanning chapter.

🦀 Index of the Complete Series.

139-feature-image.png
rlox: A Rust Implementation of “Crafting Interpreters” – Scanner

There is already a long list of existing Rust Lox implementations. I downloaded and ran the first two, but I did not look at their code. I would like to take on this project as a challenge, so that if I complete it, it reflects my own independent effort.

🚀 Please note, code for this post can be downloaded from GitHub with:

git clone -b v0.1.0 https://github.com/behai-nguyen/rlox.git

● To run interactively, first change to the rlox/ directory, then run the following command:

$ cargo run

Enter something like var str2 = "秋の終わり";, and press Enter — you will see the tokens printed out. Please refer to the screenshot below for an illustration.

139-01.png

At the moment, inputs are processed independently, meaning each new input does not retain any connection to previous inputs.

To exit, simply press Enter without entering anything.
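The interactive behaviour described above can be sketched as a small read-and-scan loop. This is only an illustration under stated assumptions: the name run_prompt(), the "> " prompt and the whitespace-splitting run() stand-in are hypothetical and are not taken from the repository. Each line is handled on its own with no state carried over, and an empty line ends the loop.

```rust
use std::io::{self, BufRead, Write};

/// Hypothetical stand-in for scanning: splits a line on whitespace.
fn run(line: &str) -> Vec<String> {
    line.split_whitespace().map(String::from).collect()
}

/// Reads lines until an empty line (or end of input) is seen.
/// Returns how many non-empty lines were processed.
pub fn run_prompt<R: BufRead, W: Write>(mut input: R, mut output: W) -> io::Result<usize> {
    let mut handled = 0;
    loop {
        write!(output, "> ")?;
        output.flush()?;

        let mut line = String::new();
        // Zero bytes read means end of input.
        if input.read_line(&mut line)? == 0 {
            break;
        }

        // Strip the trailing newline; an empty line means "exit".
        let line = line.trim_end_matches(|c: char| c == '\n' || c == '\r');
        if line.is_empty() {
            break;
        }

        // Each line is scanned independently; nothing is kept between iterations.
        for token in run(line) {
            writeln!(output, "{}", token)?;
        }
        handled += 1;
    }
    Ok(handled)
}
```

In a real program this would be called as run_prompt(io::stdin().lock(), io::stdout()). Taking generic reader and writer parameters is a design choice made here so that the loop can be exercised without a terminal.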

● To run with a Lox script file, first change to the rlox/ directory, then run the following command:

$ cargo run ./tests/data/scanning/numbers.lox

If there are no errors, you will see the tokens printed out.

● To run existing tests, first change to the rlox/ directory, then run the following command:

$ cargo test

❶ Repository Layout

.
├── Cargo.toml
├── README.md
├── src
│   ├── lib.rs
│   ├── lox_error.rs
│   ├── main.rs
│   ├── scanner_index.rs
│   ├── scanner.rs
│   ├── token.rs
│   └── token_type.rs
└── tests
    ├── data
    │   └── scanning
    │       ├── identifiers.lox
    │       ├── keywords.lox
    │       ├── numbers.lox
    │       ├── punctuators.lox
    │       ├── README.md
    │       ├── sample.lox
    │       ├── strings.lox
    │       ├── utf8_text.lox
    │       └── whitespace.lox
    ├── test_common.rs
    └── test_scanner.rs

❷ Let’s briefly describe the project.

● Identifier names follow Rust conventions. For example, the Scanning chapter's Java method names scanTokens() and peekNext() become scan_tokens() and peek_next() in Rust, respectively.

● Identifier names which are keywords in Rust will simply have an underscore (_) suffix appended. For example, match() becomes match_(), and type becomes type_.
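As a minimal sketch of this naming rule: only match_() and type_ are names taken from the post. The cut-down TokenType, Token, Scanner and the scan_bang() helper below are hypothetical and exist only to make the example compile.

```rust
// Illustrative only: a two-variant token type instead of the full Lox set.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TokenType {
    Bang,
    BangEqual,
}

#[derive(Debug)]
pub struct Token {
    // `type` is a Rust keyword, hence the underscore suffix.
    pub type_: TokenType,
}

pub struct Scanner {
    chars: Vec<char>,
    current: usize,
}

impl Scanner {
    pub fn new(source: &str) -> Self {
        Scanner { chars: source.chars().collect(), current: 0 }
    }

    // `match` is a Rust keyword, so the book's match() becomes match_().
    // Consumes the next character only if it equals `expected`.
    pub fn match_(&mut self, expected: char) -> bool {
        if self.chars.get(self.current) == Some(&expected) {
            self.current += 1;
            true
        } else {
            false
        }
    }
}

/// Hypothetical helper: scans `!` or `!=` at the start of `source`.
pub fn scan_bang(source: &str) -> Option<Token> {
    let mut scanner = Scanner::new(source);
    if !scanner.match_('!') {
        return None;
    }
    let type_ = if scanner.match_('=') { TokenType::BangEqual } else { TokenType::Bang };
    Some(Token { type_ })
}
```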

● The src/scanner_index.rs module is not in the original Java version. It holds the equivalents of the Java fields start, current and line, plus some additional fields that support UTF-8 text scanning and slicing; please refer to this post for a full discussion on supporting UTF-8 text slicing.
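To show why plain character counters are not enough for UTF-8 text, here is a small sketch of one possible index type. It is an assumption, not the contents of src/scanner_index.rs: the struct layout and the advance() and lexeme() methods are hypothetical, and the real module carries additional fields that are not reproduced here. The key idea is that offsets advance by each character's UTF-8 byte width, so slicing always lands on a character boundary.

```rust
/// Hypothetical sketch; the real `scanner_index.rs` may look different.
#[derive(Debug)]
pub struct ScannerIndex {
    pub start: usize,   // byte offset where the current lexeme begins
    pub current: usize, // byte offset of the next unread character
    pub line: usize,    // 1-based line number
}

impl ScannerIndex {
    pub fn new() -> Self {
        ScannerIndex { start: 0, current: 0, line: 1 }
    }

    /// Consumes one character, moving `current` forward by its UTF-8 width.
    /// Line counting is done here for brevity; the book does it in scanToken().
    pub fn advance(&mut self, source: &str) -> Option<char> {
        let c = source[self.current..].chars().next()?;
        self.current += c.len_utf8();
        if c == '\n' {
            self.line += 1;
        }
        Some(c)
    }

    /// Slices the current lexeme. Both offsets sit on character boundaries,
    /// so this cannot panic on multi-byte text.
    pub fn lexeme<'a>(&self, source: &'a str) -> &'a str {
        &source[self.start..self.current]
    }
}
```

With the sample text from the interactive example, each of the five characters occupies three bytes, so after two calls to advance() the value of current is 6 rather than 2.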

● In the src/token.rs module, I am not sure whether the literal field in the Token struct will be needed in the future. I am leaving it in for the time being.

● 💥 In the src/scanner.rs module, the scan_tokens() method returns a vector of Token (Vec<Token>), and the run() function in the src/main.rs module consumes this vector and drops it. The vector is local to the call. In the Java implementation, the token list is a field of the Scanner class. This part of the implementation might change in the future.
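The ownership flow described here can be sketched as follows. This is a simplified illustration, not the repository's code: the Token fields, the whitespace-splitting body of scan_tokens() and the usize return value of run() are all assumptions, and the real scan_tokens() may well take &mut self or return a Result.

```rust
#[derive(Debug, PartialEq)]
pub struct Token {
    pub lexeme: String,
    pub line: usize,
}

pub struct Scanner<'a> {
    source: &'a str,
}

impl<'a> Scanner<'a> {
    pub fn new(source: &'a str) -> Self {
        Scanner { source }
    }

    /// Returns an owned vector; the scanner keeps no token list of its own.
    /// Splitting on whitespace is a stand-in for real scanning.
    pub fn scan_tokens(&self) -> Vec<Token> {
        self.source
            .lines()
            .enumerate()
            .flat_map(|(i, line)| {
                line.split_whitespace()
                    .map(move |word| Token { lexeme: word.to_string(), line: i + 1 })
            })
            .collect()
    }
}

/// Hypothetical counterpart of `run()`: prints the tokens and reports the count.
pub fn run(source: &str) -> usize {
    let tokens = Scanner::new(source).scan_tokens();
    for token in &tokens {
        println!("{:?}", token);
    }
    tokens.len()
} // `tokens` goes out of scope here and is dropped
```

Because the caller owns the vector, nothing survives from one call of run() to the next.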

● The src/lox_error.rs module is also not in the original Java version. It implements a Rust specific error struct.
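For readers new to Rust error handling, a Rust-specific error struct typically implements the standard Display and Error traits. The sketch below is an assumption about what such a type could look like: the LoxError fields and the check_char() helper are hypothetical, and the message format simply follows the "[line N] Error: ..." style used by the book.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical shape; the real struct in `lox_error.rs` may differ.
#[derive(Debug, PartialEq)]
pub struct LoxError {
    pub line: usize,
    pub message: String,
}

impl fmt::Display for LoxError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "[line {}] Error: {}", self.line, self.message)
    }
}

// Implementing `Error` lets the type work with `?` and `Box<dyn Error>`.
impl Error for LoxError {}

/// Hypothetical helper: rejects `@`, which is not a valid Lox character.
pub fn check_char(c: char, line: usize) -> Result<char, LoxError> {
    if c == '@' {
        Err(LoxError { line, message: "Unexpected character.".to_string() })
    } else {
        Ok(c)
    }
}
```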

● In the tests/data/scanning/ directory, utf8_text.lox is my own file; for all other test data files, the README.md lists their original addresses.

● The tests/test_scanner.rs module implements a test for each of the test data files in the tests/data/scanning/ directory.

❸ The above points are specific to this implementation; otherwise, the code adheres to the Scanning chapter of Crafting Interpreters.

Thank you for reading. I hope you find this post helpful. Stay safe, as always.

✿✿✿
