rlox: A Rust Implementation of “Crafting Interpreters” – Global Variables, Assignment, and Scope | behai-nguyen software development learnings and documentation

I have completed Chapter 8: Statements and State. The following additional statements and expressions have been implemented: Stmt::Expression, Stmt::Print, Stmt::Var, Expr::Variable, Expr::Assign and Stmt::Block. We can now declare global variables, define scoped variables, and assign values to variables. This post discusses some implementation issues that deserve attention.

🦀 Index of the Complete Series.



🚀 Note: You can download the code for this post from GitHub with:

git clone -b v0.3.0 https://github.com/behai-nguyen/rlox.git

❶ Running the CLI Application

💥 The interactive mode is still available. However, valid expressions—such as ((4.5 / 2) * 2) == 4.50;—currently produce no output. I’m not sure when interactive mode will be fully restored, and it’s not a current priority.

For now, Lox scripts can be run using the CLI application. For example:

▶️Windows: cargo run .\tests\data\variable\shadow_global.lox
▶️Ubuntu: cargo run ./tests/data/variable/shadow_global.lox

Content of shadow_global.lox:

var a = "global";
{
  var a = "shadow";
  print a; // expect: shadow
}
print a; // expect: global

For more information, please refer to this section of the README.md.

❷ Updated Repository Layout

Legend: ★ = updated, ☆ = new.

💥 Unmodified files are omitted for brevity.

.
├── docs
│   └── RLoxGuide.md ★
├── README.md ★
├── src
│   ├── data_type.rs ☆
│   ├── environment.rs ☆
│   ├── expr.rs ★
│   ├── interpreter.rs ★
│   ├── lib.rs ★
│   ├── lox_error_helper.rs ☆
│   ├── lox_error.rs ★
│   ├── main.rs ★
│   ├── parser.rs ★
│   ├── scanner.rs ★
│   ├── stmt.rs ★
│   └── token.rs ★
├── tests
│   ├── data/ ☆ ➜ around 11 directories & 88 files
│   ├── README.md ☆
│   ├── test_common.rs ★
│   ├── test_interpreter.rs ★
│   ├── test_parser.rs ★
│   ├── test_scanner.rs ★
│   └── test_statements_state.rs ☆
└── tool
    └── generate_ast
        └── src
            └── main.rs ★

❸ Catching Up on the Author’s Test Scripts

As noted in the first post of this project, I initially used the author’s test scripts to verify the code after completing the scanner in Chapter 4: Scanning. However, during the next three chapters, I missed a large portion of the test scripts. There are quite a few of them, and here are some observations:

They are not organised by chapter.
Within each subdirectory, scripts can apply to multiple chapters.
Each script needs to be examined individually, and we must use our own judgment to assess its suitability for the current stage of implementation.

While working through Chapter 8: Statements and State, I realised I had missed many scripts that should have been used in Chapter 4, Chapter 6: Parsing Expressions, and Chapter 7: Evaluating Expressions.

I retrofitted tests for around 70 scripts. Please refer to the tests/data/ directory in the repository:

For each subdirectory, only the scripts actually used are checked in.
Each subdirectory contains a short README.md file that lists which scripts are used by which test modules.

As a result, the following test modules have been updated:

tests/test_common.rs: Writing a test method for every script would be overwhelming due to the large number of scripts. This helper module includes new types and utility methods to facilitate semi-automatic testing. The additions are straightforward and should be mostly self-explanatory. Some of them are discussed in later sections.
tests/test_scanner.rs: A single new method, test_scanner_generics(), has been added. It uses just one script, but still follows the semi-automatic approach. This module may be the easiest place to start examining the new testing logic.
tests/test_parser.rs: Three new test methods have been added.
tests/test_interpreter.rs: A single new method, test_interpreter_expr(), has been added, but it covers quite a few scripts.
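
To give a feel for what semi-automatic testing can look like, here is a minimal sketch. It assumes the expected output is read from the `// expect: ` comments that the author's scripts carry, as in shadow_global.lox above. The helper name expected_lines() is hypothetical and is not taken from tests/test_common.rs, whose actual types and methods may differ.

```rust
/// Hypothetical helper (not from the repository): collects the text that
/// follows each `// expect: ` marker in a Lox test script, in script order.
fn expected_lines(script: &str) -> Vec<String> {
    const MARKER: &str = "// expect: ";
    script
        .lines()
        .filter_map(|line| {
            line.find(MARKER)
                .map(|pos| line[pos + MARKER.len()..].trim_end().to_string())
        })
        .collect()
}

fn main() {
    // The content of shadow_global.lox, embedded as a string for the sketch.
    let script = "var a = \"global\";\n{\n  var a = \"shadow\";\n  print a; // expect: shadow\n}\nprint a; // expect: global\n";

    let expected = expected_lines(script);
    assert_eq!(expected, vec!["shadow", "global"]);
    println!("{:?}", expected);
}
```

A test can then run the script through the interpreter and compare the captured output lines with this list, so adding a script does not require hand-writing its expected values.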

💥 I assume that by the end of Part II of the book, I should have tests that exercise all of the author’s scripts. Otherwise, the test suite would be incomplete.

❹ Interpreter Refactoring

The implementation of the Interpreter struct is responsible for producing output. During testing, we often need to capture this output—sometimes spanning multiple lines—in order to compare it against the expected results. I prefer not to use any third-party crate, and at the same time, I don’t want the Interpreter to behave like a fully featured writer.

Instead, we want the caller to specify the desired output destination. These “destinations” are objects that implement the Write trait. The Interpreter simply delegates output writing to the specified destination. Relevant changes in the src/interpreter.rs module:

⓵ Refactoring the Interpreter Struct

...
pub struct Interpreter<W: Write> {
    output: W,
}

impl<W: Write> Interpreter<W> {
    pub fn new(output: W) -> Self {
        Interpreter { 
            output,
        }
    }

    pub fn get_output(&self) -> &W {
        &self.output
    }

    fn write_output(&mut self, value: &str) {
        writeln!(self.output, "{}", value).expect("Failed to write output");
    }
	...
}	

⓶ New Implementation of the Stmt::Print Statement

    fn visit_print_stmt(&mut self, stmt: &Print) -> Result<(), LoxError> {
        let value: Value = self.evaluate(stmt.get_expression())?;

        // Note from the author in the original Java version:
        //     Before discarding the expression’s value, we convert it to a 
        //     string using the stringify() method we introduced in the last 
        //     chapter and then dump it to stdout.
        self.write_output(&self.stringify(&value));
        Ok(())
    }

💥 Note that the first parameter — &mut self — must now be mutable, which is a significant refactoring that will be discussed further in the Expressions and Statements section.

⓷ Using the Interpreter in the CLI Application — Writing to Standard Output

In the CLI application, the src/main.rs run() method passes io::stdout() as the output destination:

let mut interpreter = Interpreter::new(io::stdout());

All output is written directly to the terminal.

⓸ Using the Interpreter in Tests — Writing to a Byte Stream

The tests/test_common.rs module defines the public function make_interpreter():

pub fn make_interpreter<W: std::io::Write>(writer: W) -> Interpreter<W> {
    Interpreter::new(writer)
}

In the existing tests/test_interpreter.rs and new tests/test_statements_state.rs modules, an Interpreter instance is created with a byte stream as the output destination:

let mut interpreter = make_interpreter(Cursor::new(Vec::new()));

After execution, the output can be extracted as a list of strings using the extract_output_lines() function in tests/test_common.rs:

pub fn extract_output_lines(interpreter: &Interpreter<Cursor<Vec<u8>>>) -> Vec<String> {
    let output = interpreter.get_output().clone().into_inner();
    String::from_utf8(output).unwrap()
        .lines()
        .map(|line| line.to_string())
        .collect()
}
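
The delegation idea can be tried in isolation. The sketch below is a minimal, self-contained illustration and not the repository's code: Printer is a hypothetical stand-in for the Interpreter, and capture() is a hypothetical helper. The same struct writes to the terminal or to an in-memory buffer, depending only on the Write implementor passed in.

```rust
use std::io::{self, Cursor, Write};

// Hypothetical stand-in for the Interpreter: it only knows that `output`
// implements `Write`, not where the bytes end up.
struct Printer<W: Write> {
    output: W,
}

impl<W: Write> Printer<W> {
    fn new(output: W) -> Self {
        Printer { output }
    }

    fn print_line(&mut self, value: &str) -> io::Result<()> {
        writeln!(self.output, "{}", value)
    }

    fn output(&self) -> &W {
        &self.output
    }
}

// Writes the given lines into an in-memory buffer and reads them back.
fn capture(lines: &[&str]) -> Vec<String> {
    let mut printer = Printer::new(Cursor::new(Vec::<u8>::new()));
    for line in lines {
        printer.print_line(line).expect("writing to a memory buffer");
    }
    // `get_ref()` borrows the underlying Vec<u8> without cloning the cursor.
    let text = String::from_utf8_lossy(printer.output().get_ref()).into_owned();
    let captured: Vec<String> = text.lines().map(str::to_string).collect();
    captured
}

fn main() -> io::Result<()> {
    // Destination 1: the terminal.
    let mut to_terminal = Printer::new(io::stdout());
    to_terminal.print_line("shadow")?;

    // Destination 2: a byte buffer that a test can inspect afterwards.
    assert_eq!(capture(&["shadow", "global"]), vec!["shadow", "global"]);
    Ok(())
}
```

Because the destination is a type parameter, no third-party crate is needed to capture output in tests, which matches the constraint stated at the start of this section.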

❺ Expressions and Statements Refactoring

Enabling output delegation requires the Interpreter instance to be mutable, as discussed earlier. This, in turn, necessitates the following changes in the implementation of expressions and statements:

The first parameter of the Visitor<T> trait's visit_*() methods must now be a mutable reference (&mut self) in both the expr.rs and stmt.rs modules.
The accept() method for both Expr and Stmt now requires both parameters to be mutable. This led to the introduction of accept_ref(), where &self is not mutable. Callers can invoke the appropriate method depending on the context. See the changes in expr.rs and stmt.rs.
As previously discussed, both expr.rs and stmt.rs are generated by a standalone tool. The relevant parts of this tool have been updated to produce the desired changes.
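
The effect of the &mut self change can be seen in a much smaller setting. The following is a sketch under stated assumptions: the two-variant Expr, the Visitor<T> trait and the Tracer visitor are all hypothetical miniatures, and the generated expr.rs and stmt.rs (including accept_ref()) will differ in detail. The point is only that a visitor which records state while visiting needs mutable access to itself.

```rust
// Hypothetical miniature AST; the generated expr.rs has many more variants.
enum Expr {
    Number(f64),
    Add(Box<Expr>, Box<Expr>),
}

// The visit methods take `&mut self` so a visitor may change its own state.
trait Visitor<T> {
    fn visit_number(&mut self, value: f64) -> T;
    fn visit_add(&mut self, left: &Expr, right: &Expr) -> T;
}

impl Expr {
    // The expression itself is only read; the visitor is borrowed mutably.
    fn accept<T, V: Visitor<T>>(&self, visitor: &mut V) -> T {
        match self {
            Expr::Number(n) => visitor.visit_number(*n),
            Expr::Add(left, right) => visitor.visit_add(left, right),
        }
    }
}

// A visitor that evaluates and keeps a log, much like an interpreter
// that writes output while evaluating.
struct Tracer {
    log: Vec<String>,
}

impl Visitor<f64> for Tracer {
    fn visit_number(&mut self, value: f64) -> f64 {
        self.log.push(format!("number {}", value));
        value
    }

    fn visit_add(&mut self, left: &Expr, right: &Expr) -> f64 {
        let l: f64 = left.accept(self);
        let r: f64 = right.accept(self);
        self.log.push(format!("add -> {}", l + r));
        l + r
    }
}

fn main() {
    let expr = Expr::Add(Box::new(Expr::Number(1.0)), Box::new(Expr::Number(2.0)));
    let mut tracer = Tracer { log: Vec::new() };
    let value = expr.accept(&mut tracer);
    assert_eq!(value, 3.0);
    assert_eq!(tracer.log.len(), 3);
}
```

With &self instead of &mut self in the trait, the two log.push() calls would not compile without resorting to interior mutability.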

❻ Environment Implementation, Usage, and Unit Tests

⓵ Multiple Ownership and Mutability

The initial part of the implementation, up to the Assignment Semantics section, is straightforward and consists mainly of struct definitions. However, starting from the Scope section, particularly Nesting and Shadowing, the Environment becomes a parent pointer tree. At this point, managing ownership of the global Environment instance, which is held by the Interpreter, becomes more complex: every nested scope must hold a reference to its parent while the parent remains usable and modifiable, so we require both multiple ownership and interior mutability.

To address this, Rc<T> and RefCell<T> are used to manage Environment instances. For more information, see the chapters on Rc<T> and RefCell<T> in The Book (The Rust Programming Language).

⓶ Implementation Overview

The full implementation can be found in the src/environment.rs module. Here are the struct declaration and constructors:

pub type EnvironmentRef = Rc<RefCell<Environment>>;

pub struct Environment {
    values: ValuesMap,
    enclosing: Option<EnvironmentRef>,
}

impl Environment {
    // https://craftinginterpreters.com/statements-and-state.html#nesting-and-shadowing
    // The global scope’s environment.
    pub fn new() -> Self {
        Environment {
            values: HashMap::new(),
            enclosing: None,
        }
    }

    // https://craftinginterpreters.com/statements-and-state.html#nesting-and-shadowing
    // Creates a new local scope nested inside the given outer one.
    pub fn new_local_scope(enclosing: EnvironmentRef) -> Self {
        Environment {
            values: HashMap::new(),
            enclosing: Some(enclosing),
        }
    }
    ...
}

The enclosing field represents the parent environment. If there is no parent (i.e. global scope), it is set to None, which is why its type is Option<EnvironmentRef>. The new_local_scope() constructor is used inside Interpreter to create nested environments for block scopes.
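
To illustrate how lookups and assignments can walk the chain of enclosing scopes, here is a simplified, self-contained sketch. It is not the code in src/environment.rs: the names Env, EnvRef, global(), nested(), define(), get() and assign() are hypothetical, values are plain strings rather than the interpreter's Value type, and a missing variable is reported with Option/bool instead of a LoxError.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

type EnvRef = Rc<RefCell<Env>>;

// Simplified environment: string values only (an assumption for brevity).
struct Env {
    values: HashMap<String, String>,
    enclosing: Option<EnvRef>,
}

impl Env {
    fn global() -> EnvRef {
        Rc::new(RefCell::new(Env { values: HashMap::new(), enclosing: None }))
    }

    // The new scope shares ownership of its parent via Rc::clone.
    fn nested(enclosing: &EnvRef) -> EnvRef {
        Rc::new(RefCell::new(Env {
            values: HashMap::new(),
            enclosing: Some(Rc::clone(enclosing)),
        }))
    }

    fn define(&mut self, name: &str, value: &str) {
        self.values.insert(name.to_string(), value.to_string());
    }

    // Looks in this scope first, then walks outwards through the parents.
    fn get(&self, name: &str) -> Option<String> {
        match self.values.get(name) {
            Some(value) => Some(value.clone()),
            None => self
                .enclosing
                .as_ref()
                .and_then(|outer| outer.borrow().get(name)),
        }
    }

    // Updates the nearest scope that already defines `name`.
    // Returns false when no scope in the chain defines it.
    fn assign(&mut self, name: &str, value: &str) -> bool {
        if let Some(slot) = self.values.get_mut(name) {
            *slot = value.to_string();
            return true;
        }
        match &self.enclosing {
            Some(outer) => outer.borrow_mut().assign(name, value),
            None => false,
        }
    }
}

fn main() {
    let global = Env::global();
    global.borrow_mut().define("a", "global");
    global.borrow_mut().define("b", "1");

    let block = Env::nested(&global);
    block.borrow_mut().define("a", "shadow");

    // Shadowing: the block sees its own `a`, the global one is untouched.
    assert_eq!(block.borrow().get("a"), Some("shadow".to_string()));
    assert_eq!(global.borrow().get("a"), Some("global".to_string()));

    // Assignment from the block reaches the variable defined in the parent.
    assert!(block.borrow_mut().assign("b", "2"));
    assert_eq!(global.borrow().get("b"), Some("2".to_string()));
}
```

Rc provides the shared ownership (the block and the interpreter both hold the global scope), while RefCell allows the parent to be mutated through that shared handle, with the borrow rules checked at run time.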

⓷ Usage

The Environment is used in src/interpreter.rs:

pub struct Interpreter<W: Write> {
    output: W,
    // The variable global scope.
    environment: EnvironmentRef,
}

impl<W: Write> Interpreter<W> {
    pub fn new(output: W) -> Self {
        Interpreter { 
            output,
            environment: Rc::new(RefCell::new(Environment::new())),
        }
    }
    ...

The Interpreter initialises the global environment. When encountering a Stmt::Block, it creates a new environment for the block scope:

impl<W: Write> stmt::Visitor<()> for Interpreter<W> {
    fn visit_block_stmt(&mut self, stmt: &Block) -> Result<(), LoxError> {
        
        let new_env = Rc::new(RefCell::new(
            Environment::new_local_scope(Rc::clone(&self.environment))
        ));
        self.execute_block(stmt.get_statements(), new_env)?;

        Ok(())
    }
    ...

Replacing and restoring the environment is implemented as:

    // See: https://craftinginterpreters.com/statements-and-state.html#scope
    pub fn execute_block(&mut self, statements: &[Stmt], 
        new_env: EnvironmentRef) -> Result<(), LoxError> {
        let previous = std::mem::replace(&mut self.environment, new_env);

        let result = (|| {
            for stmt in statements {
                self.execute(stmt)?;
            }
            Ok(())
        })();

        self.environment = previous;
        result
    }
	...

std::mem::replace() installs the new environment and hands back the previous one. If self.execute(stmt) returns an error, the closure returns Err(...) immediately; otherwise it returns Ok(()). Either way, control reaches self.environment = previous;, so the previous environment is restored before the result is returned. For errors propagated as Result values, this mimics a try...finally construct.
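
The swap-and-restore pattern can be exercised on its own. The sketch below is a hypothetical miniature (Machine, step() and run_in_scope() are not part of rlox), and it uses Iterator::try_for_each in place of the immediately invoked closure; both stop at the first error and yield a Result without returning from the enclosing function early.

```rust
// Hypothetical miniature of the interpreter's "current scope" state.
struct Machine {
    scope: String,
    trace: Vec<String>,
}

impl Machine {
    // Records the instruction, or fails on the special instruction "fail".
    fn step(&mut self, instruction: &str) -> Result<(), String> {
        if instruction == "fail" {
            return Err(format!("failed in scope '{}'", self.scope));
        }
        self.trace.push(format!("{}:{}", self.scope, instruction));
        Ok(())
    }

    fn run_in_scope(&mut self, scope: &str, instructions: &[&str]) -> Result<(), String> {
        // Install the new scope and keep the old one for later.
        let previous = std::mem::replace(&mut self.scope, scope.to_string());

        // Stops at the first Err, but does not leave this function yet.
        let result = instructions.iter().try_for_each(|i| self.step(i));

        // Reached on success and on error alike.
        self.scope = previous;
        result
    }
}

fn main() {
    let mut machine = Machine { scope: "global".to_string(), trace: Vec::new() };

    let outcome = machine.run_in_scope("block", &["a", "fail", "b"]);

    assert!(outcome.is_err());
    assert_eq!(machine.scope, "global"); // restored despite the error
    assert_eq!(machine.trace, vec!["block:a"]); // "b" never ran
}
```

Using the ? operator directly inside the loop of run_in_scope() would return before the restore line, which is exactly what the closure in execute_block() is there to prevent.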

⓸ Unit Tests

There is a comprehensive set of unit tests for this module. To run them, use:

cargo test environment::tests

❼ What’s Next

I apologise that this post is a bit long. I feel the need to document the problems encountered during development to explain the rationale behind the code. There are still five more chapters to go until Part II is complete. At this point, I’m a little more certain that I’m going to see this project through to the end of Part II.

There are still warnings about dead code—these I’m happy to ignore for now.

Thank you for reading! I hope this post helps others following the same journey. As always—stay curious, stay safe 🦊

✿✿✿
