Introduction

In this tutorial I plan to introduce a basic set of Rust tutorials for Data Science and Data Engineering.

To keep the focus on this applied material, this initial version doesn't go very deep into the Rust language itself in the next sections. Instead, it focuses on how to install Rust, how to set up your environment in a productive way that makes learning easier, and then presents some of the language basics. After that, in the next section, I point you to the great external resources out there in the wild where you can learn the language basics and then come back here for more applied knowledge.

Learning a new language or tool does not need to be a linear process, and I don't recommend sticking to a single source. Rather, go back and forth between many sources, experiment, play around, and keep it fun.

Note: this tutorial is a work in progress!

Installing Rust

So, now let's go through the first steps to start with Rust programming.

The recommended way to install Rust is to follow the instructions at https://rustup.rs/. On Linux (the best OS), this simply means running:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Configuring your editor

Rust is known for its good documentation and tooling, so taking advantage of that from the very beginning makes it a much more pleasant learning experience.

My recommended editor, not only for Rust but in general, is VSCode (snap install code --classic if using Ubuntu). Even better, there is also an open-source version (free from Microsoft's VSCode telemetry) called VSCodium: install it with snap install codium --classic if using Ubuntu, or see their official website for instructions for other OSes. Whichever flavor you choose, I'll call it VSCode from now on for simplicity.

After installation, the most important extension for it is Rust Analyzer, which will show you the type of every variable in your code and the compilation errors (just after saving the files, even before you compile them!). I also recommend the following extensions:

  • Error Lens: shows compile errors and warnings along the file instead of just at the bottom of the editor.

  • Crates: lets you check that the versions of your Rust crate (library) dependencies are up to date.

  • Even Better TOML: for validating, coloring, etc., the Cargo.toml files, where you store your Rust project definitions and dependencies.

  • CodeLLDB: for debugging Rust and other languages.

I also recommend setting up Rust Analyzer in VSCode to use clippy (which helps maintain better quality code) and enabling format on save:

  • Go to File > Preferences > Settings, search for rust analyzer check command, and change it from check to clippy.

  • Go to File > Preferences > Settings, search for format on save, and check the box. This will automatically format your files when you save them.
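
As a rough illustration of what clippy adds on top of a plain check, consider the sketch below. The function names are made up for this example. With the check command set to clippy, I'd expect the editor to flag the first function through clippy's len_zero lint and suggest is_empty(), while a plain cargo check stays silent about it.

```rust
/// Returns true when the slice has no elements.
/// Hypothetical helper, written the way clippy complains about.
fn has_no_items(items: &[i32]) -> bool {
    // Compiles fine, but clippy's `len_zero` lint suggests `items.is_empty()`.
    items.len() == 0
}

/// Same behavior, written in the form clippy suggests.
fn has_no_items_idiomatic(items: &[i32]) -> bool {
    items.is_empty()
}

fn main() {
    let numbers = vec![1, 2, 3];
    println!("has_no_items: {}", has_no_items(&numbers));
    println!("has_no_items_idiomatic: {}", has_no_items_idiomatic(&numbers));
}
```

Both functions behave the same; the point is only that clippy nudges you toward the more readable form.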

A classic Hello World program

To start, run:

cargo new rust-hello-world

Note that Cargo is Rust's package manager and build tool, which we use to compile basically everything in Rust.

Now open the folder rust-hello-world in VSCode (e.g. run code rust-hello-world). Your main file will be main.rs in the src folder, while Cargo.toml holds your project configuration (this is where you can add dependencies). By default, the initial code created there is already a Hello World program, so if you check main.rs, you will see:

fn main() {
    println!("Hello, world!");
}

And you can compile and run it with cargo run.
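
To see the inferred types that Rust Analyzer displays, you could try a small variation of main.rs like the sketch below. The greeting function and the variable names are my own additions for illustration, not part of what cargo new generates.

```rust
/// Builds the greeting text; kept separate from `main` so it is easy to check.
fn greeting(name: &str) -> String {
    format!("Hello, {name}!")
}

fn main() {
    // No type annotations here: the compiler infers the types,
    // and Rust Analyzer displays them inline in the editor.
    let name = "world"; // inferred as &str
    let answer = 40 + 2; // inferred as i32 (the default integer type)
    let ratio = 0.5; // inferred as f64 (the default float type)

    println!("{}", greeting(name));
    println!("answer = {answer}, ratio = {ratio}");
}
```

Save the file and hover over the variables to compare what the editor shows with the comments above.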

Some Cargo commands

Some other interesting Cargo commands worth mentioning already:

  • cargo build: compiles and creates your executable file but, unlike cargo run, won't execute it.

  • cargo check: just checks whether the code can compile, but does not build the executable (because compiling large projects can take a relatively long time). Rust Analyzer automatically does this for you in the background and displays the errors along the file (when using the Error Lens extension). It will also show you warnings for things in your code that don't prevent compilation but are bad practice (e.g. unused variables). cargo clippy is a superset of this with more extensive warnings.

  • cargo watch -x 'r': will automatically run cargo run every time your source code changes. A variation is cargo watch -c -x 'r', which clears the terminal screen before running cargo run. It doesn't come with Rust/Cargo by default, but you can easily install it with cargo install cargo-watch.

  • Which brings us to another useful Cargo command: cargo install installs a Rust CLI tool "globally" (actually, most likely only for your $USER in the OS).

  • cargo init provides another way to start a Rust project (crate): just create a folder, cd into it, and run cargo init. Your crate will be named after the folder.
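
To get a feel for the kind of warning cargo check reports, here is a minimal sketch that compiles but should trigger the unused variable warning mentioned above. The total function and its variable names are hypothetical, chosen just for this example.

```rust
/// Sums all values in the slice.
fn total(values: &[i32]) -> i32 {
    // This binding is never read, so the compiler emits an
    // "unused variable" warning while still compiling successfully.
    let unused_count = values.len();

    values.iter().sum()
}

fn main() {
    let values = [1, 2, 3];
    println!("total = {}", total(&values));
}
```

Removing the unused line (or prefixing the name with an underscore) makes the warning go away.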

A Rust program with a crate dependency

Now that all this nice tooling is set up and ready to fly, let's make a slightly more complex program.

Run cargo add rand. This will add the rand crate as a dependency of our current Rust project (i.e. our current crate!), so we will now be able to use it in our code. Take a look at the official crate documentation and at its page on crates.io.

Also, look at your Cargo.toml file now. It should look similar to this (the exact rand version depends on when you run cargo add):

[package]
name = "our-crate-name-here"
version = "0.1.0"
edition = "2021"

[dependencies]
rand = "0.8.5"

Instead of running cargo add, you could have edited this file manually; it has the same effect.

Now, let's change our src/main.rs file to:

fn main() {
    println!("Hello, world!");
    // rand::random::<i8>() returns a random value anywhere in the i8 range (-128..=127).
    println!("This is a random number XD: {}", rand::random::<i8>());
}

Now compile and run it with cargo run.

