Rust is a fairly new programming language, but it already enjoys a great deal of love in the programming community. That enthusiasm has persisted for the last several years, so I wanted to see what all the hype was about. To do that, I had to write some Rust, but I did not want to stop at the generic “Hello, world!” I wanted to write something that would give me a good overview of the language and let me have some fun along the way.
I decided to write a utility that outputs a dump of a file in hex format; this is essentially a version of the tool known as xxd on Linux. For the sake of simplicity in this post, I decided to imitate only the basic functionality of the tool. I might save the rest for later posts :-). With that, let’s start writing some Rust and see where we can go with this language.
Setup
The first thing I need to do is download and install Rust on my local development machine. The recommended way to do this is through a tool called rustup, and the Rust website says to run the following command:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
I did all of this on a Linux machine, and the above command downloaded the Rust toolchain, including its package management tool, Cargo, and installed it on my machine. Now that I had the compiler, I wanted to check that the install was successful, and the easiest way to do that was to compile the famous “Hello, world!” in Rust. Below is what that program looks like:
fn main() {
    println!("Hello, world!");
}
Copy the above into a file called hello.rs. This is a Rust source file. To compile it into an executable, run the command rustc hello.rs in a terminal. This may take a little time depending on the setup, but the resulting executable, hello, will be produced in the same directory. Running ./hello prints the text Hello, world! to the terminal. Woo hoo! This confirms that the download and install were successful, so we can start working on the hex dumper.
Cargo
Cargo is Rust’s package manager, and it really is an awesome tool. Going into it is far beyond the scope of this article, but I will be using it to manage this build and project.
To initialize a Cargo project:
cargo new rhexdump
This will create a directory called rhexdump containing a Cargo.toml manifest and a src folder, and by default it also initializes a git repository there; this directory is referred to as the project root. All of the Rust source code will live in the rhexdump/src folder. In order to test out the program, run the command:
cargo run
Run it from a terminal inside the project root. Cargo will compile and run the program and log everything to the terminal. Let’s start writing some Rust!
The Big Picture
The GitHub gist above is the entirety of the program, and we will be dissecting it and stepping through the different parts and examining them.
The Breakdown
Let’s start with the main function:
let args: Vec<String> = env::args().collect();
if args.len() < 2 {
    panic!("Not enough arguments!");
}
Here, we start off by collecting all of the arguments passed to the program into a vector of String values. The program name counts as the first argument, so if the vector holds fewer than 2 entries, no file path was given, and we panic and cease program execution.
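As a minimal sketch of the same check, the lookup can be moved into a small helper that returns the path instead of panicking. The path_from_args name and the usage message are hypothetical and are not part of the original program.

```rust
use std::env;

// Hypothetical helper (not in the original program): returns the first
// user-supplied argument, or None when only the program name is present.
fn path_from_args(args: &[String]) -> Option<&String> {
    // args[0] is the program name, so the file path sits at index 1.
    args.get(1)
}

fn main() {
    let args: Vec<String> = env::args().collect();
    match path_from_args(&args) {
        Some(path) => println!("would dump: {}", path),
        // Illustrative message only; the article's program panics instead.
        None => eprintln!("usage: rhexdump <file>"),
    }
}
```

Using args.get(1) avoids indexing past the end of the vector, so the length check and the lookup happen in one step.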
let mut file_to_read = get_file(String::from(&args[1]));
Next, we look to get a File structure that we can read from, and we do this by calling the get_file function that I defined above:
fn get_file(path_to_file: String) -> File {
    match File::open(path_to_file) {
        Ok(f) => File::from(f),
        Err(e) => {
            // panic! expects a format string; a bare panic!(e) is rejected in the 2021 edition
            panic!("{}", e);
        }
    }
}
The get_file function is pretty straightforward; it takes a String argument that represents a path to a file. The -> File indicates that this function returns a File structure.
match File::open(path_to_file) {
    Ok(f) => File::from(f),
    Err(e) => {
        // panic! expects a format string; a bare panic!(e) is rejected in the 2021 edition
        panic!("{}", e);
    }
}
Here, we perform a match operation. For readers coming from other languages, this is similar to a switch statement, except that it matches on patterns and must cover every case. The operation attempts to open a file using the path_to_file argument provided, and this returns a Result. If the operation was successful, it returns an Ok(file_struct), and we hand back the file_struct as our File object (File::from(f) converts a File into a File, so returning f directly would work just as well). If the operation returned an Err, we panic once again.
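For comparison, here is a sketch of a variant that passes the error back to the caller instead of panicking. The try_get_file name and the path used in main are hypothetical; the original program keeps the panicking version shown above.

```rust
use std::fs::File;
use std::io;

// Hypothetical variant of get_file: instead of panicking, hand the
// io::Error back to the caller so it can decide what to do.
fn try_get_file(path_to_file: &str) -> io::Result<File> {
    // `?` returns early with the Err value; on Ok it unwraps the File.
    let file = File::open(path_to_file)?;
    Ok(file)
}

fn main() {
    // Assumed not to exist, so the Err arm runs.
    match try_get_file("/nonexistent-dir-for-rhexdump-sketch/missing.bin") {
        Ok(_) => println!("opened"),
        Err(e) => eprintln!("rhexdump: {}", e),
    }
}
```

The caller still uses match on the Result, exactly as get_file does, but the decision about how to report the failure moves out of the helper.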
let mut buff = [0; GLOBAL_BUFFER_LENGTH];
let mut offset: usize = 0;
In this operation, I am creating a mutable buffer of length GLOBAL_BUFFER_LENGTH and filling it with zeros. We also initialize a mutable variable called offset to 0.
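The snippet below sketches how such a buffer can be declared. The value 16 for GLOBAL_BUFFER_LENGTH is an assumption (it matches the 16 bytes per line that xxd prints by default); the definition of the constant is not shown in this post. The new_buffer helper is hypothetical as well.

```rust
// Assumed value: the article does not show how GLOBAL_BUFFER_LENGTH is defined.
const GLOBAL_BUFFER_LENGTH: usize = 16;

// Hypothetical helper that builds the zero-filled read buffer.
fn new_buffer() -> [u8; GLOBAL_BUFFER_LENGTH] {
    // [value; count] repeats `value` `count` times; count must be a constant.
    [0; GLOBAL_BUFFER_LENGTH]
}

fn main() {
    let mut buff = new_buffer();
    let mut offset: usize = 0;
    buff[0] = 0x41; // `mut` lets us overwrite the contents later
    offset += buff.len();
    println!("{} {:?}", offset, buff);
}
```

Because the length is part of the array's type, it has to be known at compile time, which is why a const is used rather than a regular variable.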
loop {
    let bytes_read = file_to_read.read(&mut buff);
    match bytes_read {
        Ok(number) => {
            if number == 0 {
                break;
            } else {
                println!("{:08x}: {:40} {:10}",
                    offset,
                    get_hex_rep(&mut buff[0..number]),
                    get_ascii_representation(&mut buff[0..number]));
                // advance by the bytes actually read; read() may return fewer than a full buffer
                offset += number;
            }
        },
        Err(why) => {
            eprintln!("rhexdump: {}", why);
            break;
        }
    }
}
This loop is the real meat of the main function, and we’ll start off by breaking this down from the top on down.
loop {
...
}
This is Rust’s loop construct, so this block of code executes until a break occurs within it.
let bytes_read = file_to_read.read(&mut buff);
This line of code reads data from the file into the buffer we created further up the program. The call returns a Result that, on success, holds the number of bytes that were read from the file.
match bytes_read {
    Ok(number) => {
        ...
    },
    Err(why) => {
        eprintln!("rhexdump: {}", why);
        break;
    }
}
This match block is very similar to what was done in the get_file function. If the operation returned an Ok with the number of bytes read, we continue to process the data. In the event we had an Err, we output to the standard error stream and break out of the loop.
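To see the read-and-match pattern in isolation, here is a small sketch that records how many bytes each read call returns. The read_sizes helper is hypothetical, and a byte slice stands in for the File, since both implement the Read trait.

```rust
use std::io::Read;

// Hypothetical helper: records how many bytes each read() call returns.
fn read_sizes<R: Read>(mut source: R, buf_len: usize) -> Vec<usize> {
    let mut buff = vec![0u8; buf_len];
    let mut sizes = Vec::new();
    loop {
        match source.read(&mut buff) {
            Ok(0) => break,         // nothing left to read
            Ok(n) => sizes.push(n), // only buff[..n] holds fresh data
            Err(_) => break,        // stop on any I/O error
        }
    }
    sizes
}

fn main() {
    // A byte slice implements Read, so it can stand in for a File here.
    let sizes = read_sizes(&b"hello world"[..], 4);
    println!("{:?}", sizes); // prints [4, 4, 3]
}
```

The last read returns fewer bytes than the buffer holds, which is why the main loop slices the buffer to buff[0..number] before formatting it.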
if number == 0 {
    break;
} else {
    println!("{:08x}: {:40} {:10}",
        offset,
        get_hex_rep(&mut buff[0..number]),
        get_ascii_representation(&mut buff[0..number]));
    // advance by the bytes actually read; read() may return fewer than a full buffer
    offset += number;
}
We had an Ok returned, so we may have some data that has to be processed. We first check to see if the number of bytes we read was 0. If we did not read any bytes, we are done and just break out of the loop.
If the number of bytes read was not 0, we use the println! macro to output the current offset, a hexadecimal representation of our data buffer, and an ASCII representation of our data buffer.
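The format string does most of the layout work here. The sketch below applies the same format string to fixed inputs; the format_line helper is hypothetical and exists only to make the padding visible.

```rust
// Hypothetical helper: builds one output line with the article's format string.
fn format_line(offset: usize, hex: &str, ascii: &str) -> String {
    // {:08x} -> offset as 8 zero-padded lowercase hex digits
    // {:40}  -> hex column, left-aligned and padded to 40 characters
    // {:10}  -> ASCII column, left-aligned and padded to 10 characters
    format!("{:08x}: {:40} {:10}", offset, hex, ascii)
}

fn main() {
    let line = format_line(16, "4865 6c6c 6f", "Hello");
    // Brackets make the trailing padding visible.
    println!("[{}]", line);
}
```

A width in a format specifier only pads; it never truncates, so a value longer than its column simply pushes the rest of the line to the right.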
First, we’ll look at the get_hex_rep function:
fn get_hex_rep(byte_array: &mut [u8]) -> String {
    let build_string_vec: Vec<String> = byte_array.chunks(2)
        .map(|c| {
            if c.len() == 2 { format!("{:02x}{:02x}", c[0], c[1]) }
            else { format!("{:02x}", c[0]) }
        }).collect();
    build_string_vec.join(" ")
}
This function contains some of the typical Rust functional operations with iterators and mappings. This may look foreign to some programmers, but seasoned Rustaceans will understand it.
The function starts off by stating that the processed data will be stored in a Vec<String>. We then iterate over the data in byte_array in chunks of 2 bytes. From there, we apply a mapping function to each chunk. If the chunk has a length of 2, we use the format! macro to create a String containing the two-digit hexadecimal representation of each byte. If the chunk does not have a length of 2 (which can only happen for the final chunk of an odd-length slice), we format just its first byte. We then collect all of the results into a vector. Finally, we create our returned string by calling the join function on the Vec<String>, which places a space between the groups.
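As a quick check of that behavior, the sketch below calls the function on an even-length and an odd-length input. The function body is the one from this post; the sample inputs in main are my own.

```rust
fn get_hex_rep(byte_array: &mut [u8]) -> String {
    let build_string_vec: Vec<String> = byte_array
        .chunks(2)
        .map(|c| {
            if c.len() == 2 {
                format!("{:02x}{:02x}", c[0], c[1])
            } else {
                // Final chunk of an odd-length slice holds a single byte.
                format!("{:02x}", c[0])
            }
        })
        .collect();
    build_string_vec.join(" ")
}

fn main() {
    let mut even = *b"Rust"; // 4 bytes -> two full chunks
    let mut odd = *b"Hello"; // 5 bytes -> last chunk has one byte
    println!("{}", get_hex_rep(&mut even)); // prints 5275 7374
    println!("{}", get_hex_rep(&mut odd)); // prints 4865 6c6c 6f
}
```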
That was how we created the hexadecimal representation; we still need to create the ASCII representation, and we take care of that by using the get_ascii_representation function.
pub fn get_ascii_representation(byte_array: &mut [u8]) -> String {
    let build_string_vec: Vec<String> = byte_array.iter().map(|num| {
        if *num >= 32 && *num <= 126 { (*num as char).to_string() }
        else { '.'.to_string() }
    }).collect();
    build_string_vec.join("")
}
get_ascii_representation is very similar to get_hex_rep in its layout. We begin by creating an iterator over the byte_array slice, and we apply a mapping function to its contents. Here we check to see if the byte is in the printable ASCII range (32 through 126); if it is, we convert the byte to a character and then create a string out of it. If the byte is not in the printable range, we substitute a "." string for it. We then collect all of these results into a Vec<String>. Finally, we call join on the vector to create and return our String representation.
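Here is a short usage sketch with a mix of printable and non-printable bytes. The function body is the one from this post; the sample bytes in main are my own.

```rust
pub fn get_ascii_representation(byte_array: &mut [u8]) -> String {
    let build_string_vec: Vec<String> = byte_array
        .iter()
        .map(|num| {
            // 32 (space) through 126 (~) is the printable ASCII range.
            if *num >= 32 && *num <= 126 {
                (*num as char).to_string()
            } else {
                '.'.to_string()
            }
        })
        .collect();
    build_string_vec.join("")
}

fn main() {
    // Newline (10), NUL (0), and DEL (127) fall outside the printable range.
    let mut bytes = [b'H', b'i', b'\n', 0x00, b'~', 0x7f];
    println!("{}", get_ascii_representation(&mut bytes)); // prints Hi..~.
}
```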
Wrap-Up
That concludes our analysis and review of my version of the xxd Linux hex dump utility written using the Rust programming language. Rust has so many good things going for it, and I know that more and more people and organizations will be looking to utilize it in their projects.