Communities

Writing
Writing
Codidact Meta
Codidact Meta
The Great Outdoors
The Great Outdoors
Photography & Video
Photography & Video
Scientific Speculation
Scientific Speculation
Cooking
Cooking
Electrical Engineering
Electrical Engineering
Judaism
Judaism
Languages & Linguistics
Languages & Linguistics
Software Development
Software Development
Mathematics
Mathematics
Christianity
Christianity
Code Golf
Code Golf
Music
Music
Physics
Physics
Linux Systems
Linux Systems
Power Users
Power Users
Tabletop RPGs
Tabletop RPGs
Community Proposals
Community Proposals
tag:snake search within a tag
answers:0 unanswered questions
user:xxxx search by author id
score:0.5 posts with 0.5+ score
"snake oil" exact phrase
votes:4 posts with 4+ votes
created:<1w created < 1 week ago
post_type:xxxx type of post
Search help
Notifications
Mark all as read See all your notifications »
Q&A

Welcome to Software Development on Codidact!

Will you help us build our independent community of developers helping developers? We're small and trying to grow. We welcome questions about all aspects of software development, from design to code to QA and more. Got questions? Got answers? Got code you'd like someone to review? Please join us.

Comments on What gets allocated on the stack and the heap?

Parent

What gets allocated on the stack and the heap?

+12
−2

I was told by my professor/book that computer programs use two kinds of memory and that all variables get allocated either on the stack or on the heap. Is this true? How can I tell where a variable gets allocated? Does the compiler handle allocation differently depending on which programming language I use?

History
Why does this post require moderator attention?
You might want to add some details to your flag.
Why should this post be closed?

1 comment thread

General comments (2 comments)
Post
+14
−0

"Stack vs heap" is a common over-simplification and not really a meaningful one, since those two areas have quite different, specialized uses. And no, those are not the only memory regions used by your program.

To understand where variables end up, we must understand how a computer works. Physically, computer memories consist of:

  • Registers (fastest, restricted amount)
  • Cache memory (fast, optional)
  • RAM memory,
  • ROM memory (possibly slow)
  • External memories (very slow, lots of storage capacity).

This holds well for all computers from tiny microcontrollers to PC/desktop ones. "RAM" is a widely but sloppily used term, in this context it actually means read/write memory, which is volatile and loses its data during power down. Whereas ROM could either be a read-only part of the physical RAM memory, or it could be on-chip, non-volatile EEPROM or flash memory. External memories are things like hard drives, memory cards, extended parallel address buses, serial memories etc that have to be accessed through some manner of bus interface.

During optimization, a compiler tries to store as many variables as possible inside registers. Usually this is what happens with local scope variables. It's only when the compiler runs out of registers or when variables turn too large (such as arrays) that it needs to store them in RAM instead.

Regarding cache memories (if present): all variables stored in RAM may be loaded into data cache memory, which gives faster CPU access. This is handled by the CPU hardware, which predicts that certain areas of memory might soon be used and fetch those to cache in advance, while the CPU is busy executing something else. For example if we do a calculation repeatedly inside a loop, the array used by the loop is a likely candidate to end up in cache, given that it is allocated in adjacent memory cells.

In RAM, we typically have four different regions: the stack, the data section, the bss section and the heap.

The stack is what a compiler normally uses automatically when it runs out of registers. Registers + stack are therefore referred to as automatic storage, meaning they are handled automatically by the high level language compiler. The CPU has instruction support for handling that area of memory through a stack pointer (often named "SP"), which keeps track of how much stack that is currently used and where the next free memory cell is. Parameters and return values used during function calls have automatic storage too, stored in registers or on the stack based on the system-specific rules known as calling convention, that also specifies if the caller or callee is the one responsible for storing parameters. The stack is usually restricted to a limited amount of memory per program/process, so allocating very large objects in local scope with atomatic storage is a bad idea, that could lead to stack overflow when the program runs out of stack memory.

The data and bss sections is where variables with static storage duration go. All variables that must persist throughout the execution of the whole program end up in these sections, such as for example "global variables". All such variables that are explicitly initialized by the programmer to a value end up in the data section, and those who aren't explicitly initialized or initialized to zero end up in the bss section, where every variable is zero-initialized during program start-up.

The heap (sometimes called "free store") is a specialized area, either used when the amount of memory needed isn't known at compile time or when large amounts of memory are needed. Memory allocated on the heap is called allocated storage or dynamically allocated memory. It is not commonly used in low-end systems like microcontrollers, since such systems are deterministic, but also since dynamic allocation is often handled by the OS through API functions, so that multiple processes may co-exist and share RAM, requesting more memory from the OS when needed and handing it back when not needed.

In compiled languages such as C or C++, dynamic allocation is handled explicitly by calling functions like malloc/free or operators new/delete. Failing to free up heap memory through a bug is known as a "memory leak". Standard libraries particularly in C++ use dynamic allocation extensively for standard container classes. Higher level byte code or interpreted languages like for example Java, use heap memory even more often, handling dynamic allocation automatically by the compiler. This means that the programmer doesn't need to worry about where variables are stored, but also that they don't have to worry about memory leaks, since a separate thread known as the garbage collector is responsible for freeing up heap memory no longer used by the process.

In addition to the above mentioned read/write segments, variables can also end up allocated in a read-only data section, commonly called "rodata", which may be located either in RAM or ROM depending on system. And in some cases, read-only variables, numeric constants, strings etc end up allocated with the program itself, in the executable code which typically resides in a section called "text", which is always stored in ROM.

History
Why does this post require moderator attention?
You might want to add some details to your flag.

1 comment thread

General comments (7 comments)
General comments
obround‭ wrote over 3 years ago

"Cache memory (fast, optional)" You should most likely add restricted, because cache is usually a few hundred KB to a few MB. "It's only when the compiler runs out of registers or when variables turn too large (such as arrays) that it needs to store them in RAM instead" The "such as arrays" part is false. Arrays cannot be stored in registers. This is because arrays are stored in contiguous blocks of memory, and registers (are not really, but can be thought of as) single blocks of memory.

Lundin‭ wrote over 3 years ago · edited over 3 years ago

@obround I'd be surprised if a compiler didn't store small size arrays inside registers, given that the array address is never taken. Easy enough to prove, here is an array stored in registers: https://godbolt.org/z/Yb134Y

obround‭ wrote over 3 years ago

I seem to have misunderstood that sentence: I though you were saying that compilers store the entire array (not array values) in registers.

Skipping 1 deleted comment.

EJP‭ wrote over 3 years ago

The paragraph starting 'During optimization, a compiler tries to store as many variables as possible inside registers' needs further work. This is entirely untrue of any variable allocated statically, and it is also untrue of any variable that has its address taken, special compiler optimizations apart.

Lundin‭ wrote over 3 years ago

@EJP How is "a compiler tries..." untrue? This post isn't meant to be an exhaustive list of every detail that happens during compilation, but a brief overall summary. If a variable has the address taken, then indeed the compiler can't store it in a register.

Lundin‭ wrote over 3 years ago · edited over 3 years ago

@EJP Regarding static variables, I'm not sure why there's some recurring myth about them not getting optimized... just check https://godbolt.org/z/ezoj7q. Not only did the compiler ignore to allocate separate memory for the two static storage variables a and b, it pre-calculated the result to 3 and stored it in a single register esi. Optimizing away the whole .data initialization part as well.

klutt‭ wrote over 3 years ago

Good answer. However, I'd say that external memory is also optional. It's fairly common in embedded systems.