Basics III
Let's continue with some more of the basics of Rust programming.
Crates and modules
As seen in the previous section, you can start a new Rust crate by running `cargo init` in a new empty folder (or `cargo new my-cool-crate`, which will create a new folder, roughly equivalent to `mkdir my-cool-crate && cd my-cool-crate && cargo init && cd ..`).
The new crate will contain the files:
- `Cargo.toml`
- `src/main.rs`

`Cargo.toml` is where your crate configuration is stored, and `src/main.rs` is the main file for your crate's binary.
You can, of course, create additional files, but they will be ignored by the compiler unless you add them as modules in your `src/main.rs`.
So let's do the following:
- Create a file called `src/utils.rs`, then add `mod utils;` to the top of your `src/main.rs`.
- Now create a function called `myfunc` in `src/utils.rs` and try to call it from `src/main.rs` with `utils::myfunc`.
- It won't work yet: first you need to prefix the function definition with `pub` to make the function accessible to its parent module (main).
- This will already work, but here's an optional extra: instead of calling `utils::myfunc`, you can add a `use utils::myfunc;` at the top of your `src/main.rs` and then use `myfunc` directly in your code.
Here's the outline of our files:
Cargo.toml
[package]
name = "my-cool-crate"
version = "0.1.0"
edition = "2021"
[dependencies]
src/main.rs
mod utils;
use utils::myfunc;
fn main() {
myfunc();
}
src/utils.rs
pub fn myfunc() {
println!("hi!");
}
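At this point, running `cargo run` should build the crate and print hi!.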
Submodules
Additionally, we can create a folder `src/utils` and add more files there; they will be modules of `utils`, and therefore submodules of main. In that case, however, it's recommended to move `src/utils.rs` to `src/utils/mod.rs` (this is optional; it has the same effect). So here's another example:
src/main.rs
mod utils;
use utils::myfunc;
fn main() {
myfunc();
}
src/utils/mod.rs
// pub is needed to allow main.rs
// to access reader
pub mod reader;
pub fn myfunc() {
println!("hi!");
}
src/utils/reader.rs
pub fn anotherfunc() {
println!("hello!");
}
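With that in place, here's a minimal sketch of how `src/main.rs` could also reach the submodule (the call to `anotherfunc` is an addition for illustration, not part of the layout above):
src/main.rs
mod utils;
use utils::myfunc;
// reachable because reader is declared as pub mod inside utils
use utils::reader::anotherfunc;
fn main() {
    myfunc();
    anotherfunc();
}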
Library crates
This setup is still not ideal, for two reasons:
- First, this is just a binary crate: it will generate an executable just fine, but it won't allow us to import functions into other crates (for other internal projects or for distributing the crate to other users).
- Second, it will create problems with `cargo test`.
To solve that, we can rename `src/main.rs` to `src/lib.rs` and make some tweaks:
src/lib.rs
mod utils;
pub use utils::myfunc;
pub use utils::reader::anotherfunc;
pub fn run() {
myfunc();
anotherfunc();
}
Now external crates will have access to `my_cool_crate::run`, `my_cool_crate::myfunc`, and `my_cool_crate::anotherfunc`, because we made them all public in `src/lib.rs`.
For that, all you need to do is add `my-cool-crate = { path = "path_to/my-cool-crate" }` to the `[dependencies]` of your other crate (assuming it's on the same machine). Alternatively, publishing it to a Git repository or to the official Rust crates registry is also a possibility.
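For instance, a hypothetical consumer crate (its name and the relative path below are just illustrative) could look like this:
Cargo.toml (of the consumer crate)
[package]
name = "another-crate"
version = "0.1.0"
edition = "2021"
[dependencies]
my-cool-crate = { path = "../my-cool-crate" }
src/main.rs (of the consumer crate)
fn main() {
    // call the items re-exported by my-cool-crate's src/lib.rs
    my_cool_crate::run();
    my_cool_crate::anotherfunc();
}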
However, we don't have a binary crate anymore (`cargo build` will still work, though). Getting that back is quite simple, because those functions are also accessible to `src/main.rs` out of the box, so the file can be recreated with the following content:
src/main.rs
use my_cool_crate::run;
fn main() {
run();
}
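And since `cargo test` was one of the motivations for this layout, here's a minimal sketch of a unit test that could be appended to `src/lib.rs` (the test name is just illustrative):
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn run_does_not_panic() {
        // run() only prints, so this merely checks it doesn't panic
        run();
    }
}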
Multi-threading
A simple example of multi-threading with `rayon`:
fn get_timestamp() -> Result<u128, Box<dyn std::error::Error>> {
let timestamp = std::time::SystemTime::now()
.duration_since(std::time::SystemTime::UNIX_EPOCH)
.map_err(|e| format!("Error: {:?}", e))?
.as_micros();
Ok(timestamp)
}
fn main() -> Result<(), Box<dyn std::error::Error>> {
let start = get_timestamp()?;
let mut var1 = 5;
for _ in 0..100_000_000 {
var1 += 1;
}
let mut var2 = 7;
for _ in 0..100_000_000 {
var2 += 1;
}
let end = get_timestamp()?;
println!("Time taken (single-threaded): {}", end - start);
let start = get_timestamp()?;
let (pvar1, pvar2) = rayon::join(
|| {
let mut pvar1 = 5;
for _ in 0..100_000_000 {
pvar1 += 1;
}
pvar1
},
|| {
let mut pvar2 = 7;
for _ in 0..100_000_000 {
pvar2 += 1;
}
pvar2
},
);
let end = get_timestamp()?;
println!("Time taken (multi-threaded): {}", end - start);
println!("var1 final value: {}", var1);
println!("var2 final value: {}", var2);
println!("pvar1 final value: {}", pvar1);
println!("pvar2 final value: {}", pvar2);
Ok(())
}
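For this to compile, `rayon` has to be listed in `Cargo.toml`; something along these lines (the exact version is just an example):
Cargo.toml
[package]
name = "my-cool-crate"
version = "0.1.0"
edition = "2021"
[dependencies]
rayon = "1.7"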
Which will output something like this:
Time taken (single-threaded): 1618570
Time taken (multi-threaded): 795160
var1 final value: 100000005
var2 final value: 100000007
pvar1 final value: 100000005
pvar2 final value: 100000007
Async
For better or for worse, a large portion of the Rust ecosystem is `async`, so it's useful to give a brief introduction to the matter. Imagine the following scenario:
- Your program is executing a very large number of parallel tasks that are I/O-bound, which means each of these tasks is idle (but blocking its thread) and not consuming CPU processing power most of the time. For example:
  - A process waiting for a slow HDD to read a fragmented file, or a client communicating with some server over the network and waiting for a response.
  - A web server waiting for client requests (and in this case, the number of tasks is not even fixed or determined locally, as traffic to the website varies).
In a traditional synchronous program, the number of parallel tasks is limited to the number of threads. So if you have a queue of 1000 tasks but only 10 allocated threads, even if each task uses on average 0.1% of the CPU and just a little RAM but takes around 5 seconds of waiting to complete (e.g., because it's waiting for a client over the network to process the previous request and send a response), it will take 500 seconds to complete all those tasks, by which time there might be thousands more in the queue, which might lead to slowness and request timeouts for the clients of our web service.
A solution would be to increase the number of threads, which is enough for many problems but has its limitations: potential overhead caused by an excessive number of threads being created/destroyed/managed, difficulty in predicting the necessary number of threads, and so on, to the point that you might end up recreating functionality similar to asynchronous task management.
Asynchronous programming, on the other hand, offers a simple solution to this problem: every time a task is waiting for I/O, it yields to the asynchronous runtime, which can hand control of the thread to other tasks that may also have been waiting for slow I/O. This way many tasks can share the same thread, and thousands of tasks can effectively run concurrently with just a few threads.
Nowadays, the most common asynchronous runtime for Rust is `tokio`, which is close to a de facto standard. Functions are defined as asynchronous by prefixing them with the `async` keyword. Calling such a function by itself won't cause anything to be executed at all; rather, it returns a `Future` (similar to a JavaScript `Promise`), which you can cause to be executed by calling `.await` on it, as seen in the following example:
src/main.rs
async fn some_operation(i: i32) {
// this yields to the tokio runtime which will be able to
// use the thread to make progress on other tasks and come
// back here later to check if the sleep task is completed
tokio::time::sleep(std::time::Duration::from_secs(3)).await;
println!("some_operation({i}) completed");
}
#[tokio::main]
async fn main() {
let task = some_operation(-2); // nothing happens yet
task.await; // starts task and blocks until it finishes
let task = some_operation(-1); // nothing happens yet
let task = tokio::spawn(task); // starts task in background
task.await.unwrap_or_default(); // blocks until the task finishes
// start 100 tasks in background
let tasks = (0..100)
.map(|i| tokio::spawn(some_operation(i)))
.collect::<Vec<_>>();
// join all tasks
// this will block until all tasks are finished
// but notice that the tasks are running concurrently
// regardless of the number of cores in your computer
// and this will take about 3 seconds to run
// even with a large number of tasks
// (e.g.: try increasing to 1000 tasks)
// by default, tokio will use as many worker threads
// as there are cores in your computer,
// so we are running a hundred tasks concurrently
// using just a few threads: whenever the function
// some_operation() calls .await (on the sleep), it yields
// to the tokio runtime, which can then reuse the
// freed thread to make progress on another task
for task in tasks {
task.await.unwrap_or_default();
}
}
Cargo.toml
[package]
name = "my-cool-crate"
version = "0.1.0"
edition = "2021"
[dependencies]
tokio = { version = "1.32", features = ["full"] }
Async HTTP requests with Reqwest
Let's now see an example of downloading and printing a JSON file with `reqwest`.
src/main.rs
// This attribute is necessary on the main() function
// to start the tokio asynchronous runtime
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let url = "https://xkcd.com/info.0.json";
let client = reqwest::Client::new();
let response = client
.get(url)
.send()
.await
.map_err(|e| format!("Failed to send request to {url}: {e}"))?;
if response.status().as_u16() != 200 {
let response_status = response.status();
Err(format!(
"Failed to fetch {url}. Status code: {response_status}"
))?;
}
let body = response
.text()
.await
.map_err(|e| format!("Failed to read response body: {e}"))?;
println!("Response body: {}", body);
Ok(())
}
Cargo.toml
[package]
name = "my-cool-crate"
version = "0.1.0"
edition = "2021"
[dependencies]
reqwest = "0.11"
tokio = { version = "1.32", features = ["full"] }
Which will output something like this:
Response body: {"month": "9", "num": 2829, "link": "", "year": "2023", "news": "", "safe_title": "Iceberg Efficiency", "transcript": "", "alt": "Our experimental aerogel iceberg with helium pockets manages true 100% efficiency, barely touching the water, and it can even lift off of the surface and fly to more efficiently pursue fleeing hubristic liners.", "img": "https://imgs.xkcd.com/comics/iceberg_efficiency.png", "title": "Iceberg Efficiency", "day": "15"}
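As a side note, `reqwest` also provides `Response::error_for_status`, which turns non-success status codes into errors; the manual status check above could be replaced with something like this sketch:
let body = client
    .get(url)
    .send()
    .await
    .map_err(|e| format!("Failed to send request to {url}: {e}"))?
    // turns 4xx/5xx responses into an Err instead of a manual check
    .error_for_status()
    .map_err(|e| format!("Failed to fetch {url}: {e}"))?
    .text()
    .await
    .map_err(|e| format!("Failed to read response body: {e}"))?;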
Deserialization with Serde
Building upon the previous section, let's additionally deserialize the downloaded JSON string into a Rust struct using `serde`:
src/main.rs
use serde::Deserialize;
async fn get_xkcd_json(
i: i32,
) -> Result<String, Box<dyn std::error::Error>> {
let url = format!("https://xkcd.com/{i}/info.0.json");
let url = url.as_str();
let client = reqwest::Client::new();
let response = client
.get(url)
.send()
.await
.map_err(|e| format!("Failed to send request to {url}: {e}"))?;
if response.status().as_u16() != 200 {
let response_status = response.status();
Err(format!(
"Failed to fetch {url}. Status code: {response_status}"
))?;
}
let body = response
.text()
.await
.map_err(|e| format!("Failed to read response body: {e}"))?;
Ok(body)
}
#[derive(Deserialize, Debug)]
pub struct Item {
month: String,
num: i64,
year: String,
title: String,
}
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut res = Vec::with_capacity(10);
for i in 1..=3 {
let body = get_xkcd_json(i).await?;
let item: Item = serde_json::from_str(&body)?;
res.push(item);
}
println!("{:?}", res);
Ok(())
}
Cargo.toml
[package]
name = "my-cool-crate"
version = "0.1.0"
edition = "2021"
[dependencies]
reqwest = "0.11"
tokio = { version = "1.32", features = ["full"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
Which will output something like this:
[Item { month: "1", num: 1, year: "2006", title: "Barrel - Part 1" }, Item { month: "1", num: 2, year: "2006", title: "Petit Trees (sketch)" }, Item { month: "1", num: 3, year: "2006", title: "Island (sketch)" }]
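Tying this back to the async section: since the three requests are independent, they could also be issued concurrently. Here's a minimal sketch (same Cargo.toml) of what could replace the sequential loop inside main():
// fire the three requests at the same time on the current task
let (b1, b2, b3) = tokio::join!(
    get_xkcd_json(1),
    get_xkcd_json(2),
    get_xkcd_json(3)
);
let mut res = Vec::with_capacity(3);
for body in [b1?, b2?, b3?] {
    let item: Item = serde_json::from_str(&body)?;
    res.push(item);
}
println!("{:?}", res);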
Additional resources
To go beyond the basics of the previous sections and have a deep dive into the Rust programming language, I recommend the following additional resources:
- Reddit r/learnrust for asking questions.
- The web development section of this tutorial as an initial applied project.
If you found this project helpful, please consider making a donation.