Jarkko Sakkinen

One common pattern in Rust programs that is not great for memory footprint is the heavy use of collect(). Iterators are a great abstraction precisely because you can treat items as a set without having to materialize them as such in memory. One thing no language paradigm can protect against is the overuse of computing resources.
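To make the point about collect() concrete, here is a minimal sketch with two hypothetical functions (the names are made up for illustration). Both compute the same sum; the first materializes an intermediate Vec on the heap, the second consumes the iterator lazily and needs no buffer.

```rust
/// Sums the squares of the even numbers, collecting an intermediate Vec
/// first. This costs one heap allocation per call (for non-empty results).
fn sum_even_squares_collect(values: &[u32]) -> u32 {
    let squares: Vec<u32> = values
        .iter()
        .filter(|v| **v % 2 == 0)
        .map(|v| v * v)
        .collect();
    squares.iter().sum()
}

/// Same result, but items flow through the iterator chain one at a time;
/// no intermediate buffer is created.
fn sum_even_squares_lazy(values: &[u32]) -> u32 {
    values.iter().filter(|v| **v % 2 == 0).map(|v| v * v).sum()
}
```

Calling either function with `&[1, 2, 3, 4]` gives the same value; only the memory behaviour differs.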

One good recipe against this, e.g. in a commercial setting, would be to require core components to be no_std compatible and to allow total artistic freedom only in the I/O code, i.e. the user-facing parts or the storage backend interface.

Then the early steps are slower, but they are an investment rather than debt, because memory is considered early on…

There’s little gain from the added complexity of Rust over something like Go if this is not taken into consideration. Sometimes something like Go could even do a better job, because then at least the garbage collector takes memory constraints into account.

#rustlang

@jarkko Can you elaborate? Are you saying that using collect requires more memory than languages whose iterators aren't lazy in the first place?

@dpom given how tied the standard library is to Vec, using collect() is sort of encouraged by the API; it has nothing to do with Rust as a language. At the binary level Rust is pretty much equivalent to C++.

@jarkko So you're basically saying that Rust docs point to heap-allocated data structures when it would be better to use array primitives?

@jarkko I'm getting a 404 on the link

@dpom use a Vec if you need a Vec, because it is a Vec :-) It is not a great choice just for making code cleaner, unless you actually need to grow it dynamically later on. A function like collect() cleans up code at the cost of using the heap for small, unnecessary allocations. Allocating from the stack, on the other hand, is just pointer arithmetic.
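As a rough illustration of the stack-versus-heap point, the sketch below parses a comma-separated line into a fixed-size array instead of a Vec. The function name `parse_fixed` and the limit of four fields are assumptions chosen for the example, not something from the thread.

```rust
/// Parses up to four comma-separated numbers into a stack-allocated array.
/// Returns the array and the number of filled slots, or None if a field
/// fails to parse or there are more than four fields.
fn parse_fixed(line: &str) -> Option<([u16; 4], usize)> {
    let mut out = [0u16; 4]; // lives on the stack, no allocator involved
    let mut len = 0;
    for field in line.split(',') {
        if len == out.len() {
            return None; // the bound is explicit instead of growing a Vec
        }
        out[len] = field.trim().parse::<u16>().ok()?;
        len += 1;
    }
    Some((out, len))
}
```

The trade-off is that the upper bound has to be known and handled up front, which is exactly the kind of requirement a Vec lets you postpone.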

If the code aims to be on par with equivalent C code, then these things are relevant. If Rust is used more for integrating available components and frameworks, with the aim of not losing the productivity of something like JavaScript or Python, then this whole thing probably does not matter all that much…

In high-performance code, or something you might want to run in a shim, these things do matter, because e.g. one thing you want to optimize in a high-availability service is round trips to the kernel, i.e. the number of context switches. A heap allocation is serviced either by reusing already mapped but freed memory or, in the corner case, by asking the kernel for more. The more random heap allocations a system has circulating in its internals, the higher the risk of unpredictable latency peaks and such. How much this matters depends of course on the scale of the service.

Jarkko Sakkinen

Edited 1 year ago

E.g. here’s a snippet from my zmodem2 crate:

            for (i, field) in payload.split('\0').enumerate() {
                if i == 0 {
                    state.file_name = String::from_str(field).or(Err(Error::Data))?;
                }
                if i == 1 {
                    if let Some(field) = field.split_ascii_whitespace().next() {
                        state.file_size = u32::from_str(field).or(Err(Error::Data))?;
                    }
                }
            }

As Rust code goes it is not that nice looking, but it is a heck of a lot more efficient than a one-liner with collect() calls in between. If this were std code (it actually uses heapless::String), the only heap allocation would happen in String::from_str, but there you actually need heap-allocated memory because the string length is not known at compile time.
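For contrast, here is a hedged, std-only sketch of the two styles. It is not the zmodem2 code: the function names are hypothetical, it returns Option instead of the crate's Error type, and unlike the snippet above it treats a missing size field as a failure. The first variant collects the fields into a Vec before looking at them; the second pulls fields from the iterator on demand.

```rust
/// Collect-based variant: allocates a Vec of field slices before use.
fn parse_header_collect(payload: &str) -> Option<(String, u32)> {
    let fields: Vec<&str> = payload.split('\0').collect();
    let name = fields.first()?.to_string();
    let size = fields
        .get(1)?
        .split_ascii_whitespace()
        .next()?
        .parse::<u32>()
        .ok()?;
    Some((name, size))
}

/// Lazy variant: only the file name String is heap allocated.
fn parse_header_lazy(payload: &str) -> Option<(String, u32)> {
    let mut fields = payload.split('\0');
    let name = fields.next()?.to_string();
    let size = fields
        .next()?
        .split_ascii_whitespace()
        .next()?
        .parse::<u32>()
        .ok()?;
    Some((name, size))
}
```

Both accept a payload of the form `name\0size other-fields\0` and return the name and size; the lazy one simply never builds the intermediate list.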

I think that to go beyond the basics of Rust to actual production code you need to step up from nice conventions to a sort of requirements-based thinking. If code uses Vec, there should always be an answer at hand for why it requires Vec, and nice syntax is not a great answer to that question.

Even with e.g. enterprise Java, a lot of understanding of how the JVM JIT and GC work is required for efficient production code. A great tool needs educated use to actually deliver its added value; otherwise something like Go might be a better option from a pure productivity standpoint.

Jarkko Sakkinen

Edited 1 year ago

There’s a lot of material on how great Rust is at this or that, but not so much on evaluating what is bad Rust code and what is good, so I’ll throw in my 2 cents, more to get some feedback than to deliver the official truth :-) I’m happy to withdraw my views; I’m just throwing this out given the lack of “grown up” and educated viewpoints on efficient Rust code.

Consider e.g. async, which addresses the lack of non-blocking I/O in the standard library. It essentially brings a workqueue or thread pool abstraction, with syntax sugar for polling and for scheduling tasks on the threads in the pool. Inefficient code is still a problem because the thread pool is a limited resource, and keeping it too hot can make code stall just like before the feature. async is pretty much the same thing as struct workqueue in the kernel and not much else.

That makes Rust features sound boring and uninteresting, I guess :-) But this sort of mental exercise is good to do for language features that are essentially glue code generators.

@jarkko Honestly, I think many concepts are elided in introductory Rust documentation because people coming from a non-systems background (like me) are already saturated with new concepts and assimilating the syntax.

But these things are definitely discussed in some circles, like @nnethercote's book:

https://nnethercote.github.io/perf-book/heap-allocations.html

@dpom @nnethercote Thanks, was aware of this book!

For async in particular the real mechanics are sort of “hidden”, given that frameworks (such as tokio) serve them up nicely on a plate. So it makes sense for anyone to see how it is “open-coded”, just to not feel unsafe when putting weird keywords into the code.

I was a bit lost with that in particular until I read a chapter from the book “Rust for Rustaceans”, which has a pretty nice explanation of what it actually is: a polling interface and (usually) a thread pool (i.e. an “executor”). All the syntax is semantically just macros, with language-level syntax to place them.
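To show what “a polling interface plus an executor” can look like when open-coded, here is a minimal sketch using only the standard library. `Countdown`, `NoopWake` and `block_on` are hypothetical names for this example; the executor is a single-threaded busy loop rather than a thread pool, so it only illustrates the mechanics and is not how tokio or any real runtime is implemented.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A hand-written future: pending for `remaining` polls, then ready.
struct Countdown {
    remaining: u32,
}

impl Future for Countdown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            Poll::Ready("done")
        } else {
            self.remaining -= 1;
            // Ask to be polled again. A real future would hand the waker
            // to an I/O source and be woken when data is available.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Waker that does nothing; sufficient because the executor below
/// simply polls in a loop.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor: poll the future until it reports Ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut); // pinned on the stack
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

`block_on(Countdown { remaining: 3 })` polls four times and returns the ready value; `block_on(async { 1 + 2 })` drives a compiler-generated future through the same `poll` interface.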

This is how I unpack Rust concepts. They are either new functions or code generators in some sense. If the latter, I just do the exercise of open-coding the generator to see what it results in.

It is quite a common theme in new Rust language features that they are just code generators that produce a “normalized” subset of “plain” Rust code. Understanding this takes a lot of the magic away and helps to see what they actually achieve.

@dpom @nnethercote I've worked on systems programming for my whole career, so take my opinions from that angle. I'm specifically talking about code that would end up in a kernel, bootloader, firmware, system service etc. I do not mean to discriminate :-) I think Rust is great because it provides an equal platform for teams that work on different areas of software and still lets them integrate their stuff easily.