How can I create and modify a struct over iterations of a loop?
How can I have a mutable object (for example a vector) that is created inside a loop iteration and needs to be updated in later iterations of said loop?
As a concrete example, consider parsing something similar to an ini file.
[section1]
entry1
entry2
[section2]
entry3
entry4
In a simple scripting language like PHP I would just create an array when a new section starts, append items and store it away when the next section starts.
sample pseudocode
$sections = [];
$currentSection = null;
for line in ini_files {
if ($line is sectionheader) {
if ($currentSection is array) {
$sections[] = $currentSection;
}
$currentSection = [];
}
else {
$currentSection[] = $line;
}
}
Now in Rust it's not that easy, of course. If I understood correctly, the $var = null/None and assign-as-needed pattern is done using Option instead.
I'm currently trying to wrap my head around that. Let's say I have the following loop, and Section is a custom struct that is supposed to store the related entries:
let mut sections: Vec<Section> = Vec::new();
let mut current_section: Option<Section> = None;
for line in read_to_string("input.ini").unwrap().lines() {
if line.trim().ends_with("]") {
if current_section.is_some() {
sections.push(current_section.unwrap());
}
current_section = Some(Section::new(line));
// --------------- this reinitialization might get skipped
}
else {
current_section.unwrap().add_entry(line);
// ^^^^^^^^^^^^^^^ ------- `current_section` moved due to this method call,
// in previous iteration of loop
}
}
additional file: section.rs
#[derive(Debug)]
pub struct Section {
pub name: String,
pub entries: Vec<String>,
}
impl Section {
pub fn new(name: &str) -> Section {
Section {
name: String::from(name),
entries: Vec::new()
}
}
pub fn add_entry(&mut self, line: &str) {
self.entries.push(String::from(line));
}
}
I get that current_section is moved and that I can't access it anymore due to the changed ownership. But how can I store it so that I can access it on further iterations? I tried every combination of referencing and dereferencing with & and * I could think of.
How can I store a variable from a loop iteration and still be able to modify it at later iterations?
Post
The following users marked this post as Works for me:
User | Comment | Date |
---|---|---|
GeraldS | (no comment) | Jul 30, 2024 at 19:12 |
tl;dr: Use current_section.as_mut().unwrap() instead of just current_section.unwrap().
Ownership is a core language-supported concept in Rust, and this means that what might be a single method in most languages can multiply into several that differ in their behavior with respect to ownership. You can see this in the diagram here for C++, which has similar ownership concerns, albeit with far less support from the language itself.
Unlike C++, the Rust type system makes this far more evident and statically checked, instead of just requiring you to know the semantics of various types. In this case, we can see the issue simply by looking at the type of unwrap. Its type is pub fn unwrap(self) -> T. We can see from this that unwrap consumes (takes ownership of) the object it's called on, and it gives the caller ownership of the contained value it returns. For an Option<Section>, T = Section, and thus the result of unwrap will be a Section, which would be dropped at the end of the statement, meaning the next iteration would be working with deallocated memory. Or it would if Rust let you run the code.
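As a small illustration of what the signature of unwrap implies, here is a minimal sketch. The function name take_inner is hypothetical and only exists for this example; it is not part of the question's code.

```rust
/// Hypothetical helper for illustration: moves the String out of the Option.
pub fn take_inner(maybe: Option<String>) -> String {
    // unwrap(self) takes `maybe` by value, so `maybe` is consumed here
    // and ownership of the contained String passes to `inner`.
    let inner: String = maybe.unwrap();

    // Using `maybe` again at this point is rejected at compile time as a
    // use of a moved value (error E0382), for example:
    // println!("{:?}", maybe);

    inner
}
```

Calling take_inner(Some(String::from("entry1"))) hands back the String; calling it with None panics, as unwrap always does on None.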
The solution in this case is to use as_mut, as in current_section.as_mut().unwrap().add_entry(line). as_mut has type pub fn as_mut(&mut self) -> Option<&mut T>. Again, the types communicate the relevant ownership information. We see that as_mut only mutably borrows the object it's called on, instead of consuming it, and it returns a (mutable) reference. We can also see that a new Option is created and the caller is given ownership of it. Now when we invoke unwrap, we're invoking it with T = &mut Section. We're still consuming the Option which as_mut just created, but now we can tell from the type that the result of unwrap is just a mutable reference and that we have not taken ownership of the Section. We do own the mutable reference, but we don't own what it's referencing. The mutable reference dies immediately after the add_entry call, which is good because we can only have one mutable reference to an object at a time.
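Putting this together, one way the loop from the question could look is sketched below. Several things here are assumptions rather than part of the question or this answer: the function name parse_sections is hypothetical, it takes the text as a parameter instead of reading input.ini, it uses Option::take to move a finished section out while leaving None behind, and it pushes the last section after the loop ends.

```rust
#[derive(Debug)]
pub struct Section {
    pub name: String,
    pub entries: Vec<String>,
}

impl Section {
    pub fn new(name: &str) -> Section {
        Section {
            name: String::from(name),
            entries: Vec::new(),
        }
    }

    pub fn add_entry(&mut self, line: &str) {
        self.entries.push(String::from(line));
    }
}

/// Hypothetical helper: parses already-loaded text so the sketch does not
/// depend on a file being present.
pub fn parse_sections(input: &str) -> Vec<Section> {
    let mut sections: Vec<Section> = Vec::new();
    let mut current_section: Option<Section> = None;

    for line in input.lines() {
        if line.trim().ends_with(']') {
            // Assumption: take() moves the finished section out and leaves
            // None in its place, so the variable stays usable.
            if let Some(finished) = current_section.take() {
                sections.push(finished);
            }
            current_section = Some(Section::new(line));
        } else {
            // as_mut() only borrows: Option<Section> -> Option<&mut Section>.
            // unwrap() then consumes that temporary Option, not the Section.
            // This panics if an entry appears before any section header.
            current_section.as_mut().unwrap().add_entry(line);
        }
    }

    // The last section is still pending when the input ends.
    if let Some(finished) = current_section.take() {
        sections.push(finished);
    }
    sections
}
```

With the sample input from the question, this yields two sections with two entries each. Note that Section::new(line) stores the header line as-is, brackets included.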
The following is more of a case study on lifetimes and how they affect Rust API design. It's not specific to your problem.
The entry method for HashMap and the associated Entry type are a good illustration of these ideas and of how they impact interface design in Rust, and they are worth studying. This pattern of needing to introduce auxiliary data types to capture lifetimes appropriately is common in Rust library design. Indeed, earlier versions of Rust had more stuff built into the language until people realized that it could be accomplished without additional language features.
On its face, entry just seems like an efficient way to get at a (potential) location in the collection. If the key is in the hash map, we get an OccupiedEntry that we can access and update in place, much like get_mut. If the key isn't in the hash map, we get a VacantEntry, which we can use to insert an element into the hash map. This is nice because, for many collections, the work to look up a value is similar to the work needed to insert it. With this type, we don't need to do that work twice. The alternative would be to use get_mut and insert, which would likely do that work twice.
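To make the two cases concrete, here is a minimal sketch that matches on the Entry returned by entry. The word-counting task and the function name count_words are hypothetical, chosen only to have something to update and insert.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical example: counts how often each whitespace-separated word occurs.
pub fn count_words(text: &str) -> HashMap<String, u32> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // A single lookup decides which case we are in.
        match counts.entry(word.to_string()) {
            // Key present: update the stored value in place.
            Entry::Occupied(mut occupied) => {
                *occupied.get_mut() += 1;
            }
            // Key absent: insert through the vacant slot, without a second lookup.
            Entry::Vacant(vacant) => {
                vacant.insert(1);
            }
        }
    }
    counts
}
```

In everyday code the same thing is usually written as *counts.entry(key).or_insert(0) += 1; the explicit match is spelled out here only to show both entry kinds.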
Except... as I found out, there are cases where that two-step approach (get_mut, insert) doesn't really work. Recently, I was making a trie type, which is roughly a node that is, recursively, a hash map of child nodes. To insert a value into the trie, I had a mutable "current node" variable whose type was a mutable reference to a node, which would be updated in a loop as I traversed the trie. I originally wrote this with a get_mut-insert approach, but this doesn't work because the get_mut is still mutably borrowing the parent node when you go to try to insert. I couldn't figure out any reasonable way to make the borrow checker happy. Maybe I could have done something with a flag to indicate to insert later, but that would be really ugly and inefficient. Searching the methods of HashMap was when I found entry, which solves this problem elegantly. With entry, I didn't need to attempt to mutably borrow the parent node again; I could just insert via the VacantEntry directly.
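A trie along these lines might be sketched as follows. This is not the original trie code: the TrieNode type, its fields, and the insert and contains functions are all assumptions made for illustration, and it uses Entry::or_default as a shortcut that handles the occupied and vacant cases internally.

```rust
use std::collections::HashMap;

/// Hypothetical trie node: a map of child nodes plus an end-of-word marker.
#[derive(Debug, Default)]
pub struct TrieNode {
    pub children: HashMap<char, TrieNode>,
    pub is_end: bool,
}

/// Inserts `word`, creating missing child nodes along the way.
pub fn insert(root: &mut TrieNode, word: &str) {
    // The "current node" is a mutable reference that is replaced on each step.
    let mut current: &mut TrieNode = root;
    for ch in word.chars() {
        // entry() borrows the parent's map once; or_default() returns a
        // mutable reference to the existing child or to a freshly inserted
        // one, so the parent never has to be borrowed a second time.
        current = current.children.entry(ch).or_default();
    }
    current.is_end = true;
}

/// Returns true if `word` was inserted as a complete word.
pub fn contains(root: &TrieNode, word: &str) -> bool {
    let mut current = root;
    for ch in word.chars() {
        match current.children.get(&ch) {
            Some(child) => current = child,
            None => return false,
        }
    }
    current.is_end
}
```

After inserting "car" and "cat" into a TrieNode::default(), contains reports both words as present, while the shared prefix "ca" is not reported as a word.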
As a final example of multiple methods that differ solely in their ownership behavior and how Rust's types indicate this, the OccupiedEntry type has two very similar methods: get_mut and into_mut. Their types are pub fn get_mut(&mut self) -> &mut V and pub fn into_mut(self) -> &'a mut V. We see from the types that get_mut does not consume the OccupiedEntry and the mutable reference it returns only lives as long as the borrow of the OccupiedEntry. By contrast, into_mut consumes the OccupiedEntry and returns a mutable reference with a different lifetime (necessarily, since the object into_mut is called on doesn't live past the call to into_mut) which, in this case, is tied to the lifetime of the borrow of the HashMap. For my trie use case, I needed to use into_mut since my current node variable lived across iterations of a loop, while the OccupiedEntry had a very brief life in a branch of a match.
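The difference between the two methods can be seen in a small sketch. The helper names slot and bump are hypothetical; the map type HashMap<String, u32> is just a convenient stand-in.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical helper: returns a reference that must outlive the entry.
/// into_mut consumes the OccupiedEntry and yields a reference tied to the
/// borrow of the map ('a), so it can be returned from the match.
pub fn slot<'a>(map: &'a mut HashMap<String, u32>, key: &str) -> &'a mut u32 {
    match map.entry(key.to_string()) {
        // Using occupied.get_mut() here would not compile: that reference
        // borrows from the local `occupied`, which ends with this match arm.
        Entry::Occupied(occupied) => occupied.into_mut(),
        // VacantEntry::insert also returns a reference tied to the map borrow.
        Entry::Vacant(vacant) => vacant.insert(0),
    }
}

/// Hypothetical helper: increments an existing value and ignores missing keys.
/// get_mut is enough here because the reference never leaves the block.
pub fn bump(map: &mut HashMap<String, u32>, key: &str) {
    if let Entry::Occupied(mut occupied) = map.entry(key.to_string()) {
        *occupied.get_mut() += 1;
    }
}
```

A caller can write *slot(&mut map, "a") += 5; and keep using the returned reference for as long as the map stays mutably borrowed, which is the situation a "current node" reference in a traversal loop is in.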