Welcome to Software Development on Codidact!


How does the strict aliasing rule enable or prevent compiler optimizations?

+2
−0

Inspired by https://meta.stackoverflow.com/questions/432242, and the relevant main-space questions.

I have heard that C and C++ have something called the "strict aliasing rule", which means for example that if we have code like:

int aliased(float *f, int *i) {
    *f = *reinterpret_cast<float*>(i);
    return *f;
}

then we have invoked undefined behaviour.

I also heard that character types are a loophole. That is, if we type-pun through a character pointer (char *, signed char * or unsigned char *), then we can legally copy bytes from the underlying storage of the int to the underlying storage of the float.
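To make my question concrete, here is a sketch of what I understand the byte-by-byte loophole to look like (bits_to_float is a name I made up, and it assumes int and float have the same size, as on most platforms):

```cpp
#include <cstddef>

// Copy the bytes of an int into a float through unsigned char pointers.
// The byte-wise accesses go through a character type, which (as I
// understand it) is exempt from the strict aliasing rule.
float bits_to_float(int i) {
    float f = 0.0f;
    const unsigned char *src = reinterpret_cast<const unsigned char *>(&i);
    unsigned char *dst = reinterpret_cast<unsigned char *>(&f);
    for (std::size_t n = 0; n < sizeof f && n < sizeof i; ++n)
        dst[n] = src[n];   // character-type access: no aliasing violation
    return f;
}
```

On an IEEE-754 platform this reinterprets the int's object representation as a float, so bits_to_float(0x3F800000) yields 1.0f.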

I was told that this has implications for compiler optimizations. For example, here the compiler can treat return *a - t as if it were just return 0:

int f (int *a, long *b)
{
    int t = *a;
    *b = 0;          // cannot change *a
    return *a - t;   // can be folded to zero
}

But on the other hand, if we change how b is used (braces added for clarity):

int f (int *a, long *b)
{
    int t = *a;
    for (int i = 0; i != sizeof *b; ++i)
    {
        ((unsigned char*)b)[i] = 0;
    }
    return *a - t;   // must not be folded
}

Now the same optimization is invalid.

What is the underlying reasoning here? It seems to me like a and b are either compatible pointer types or they aren't. Why does code inside the function change the compiler's reasoning? And how does undefined behaviour enable the compiler to make these kinds of optimizations?


2 answers


+0
−0

Pointer conversions and aliasing

First of all, C historically allows all manner of crazy pointer conversions (whereas C++ is more restrictive) between compatible and non-compatible pointed-at types, as well as to/from the generic void*. There are several reasons to allow that: type punning, generic programming etc. So within reason, for example a float* is allowed to point at an int, though things like correct alignment still have to be fulfilled.

But since we allow that, it becomes problematic to determine whether a certain variable was updated at a certain point, as any pointer can in theory point at anything. A compiler does a lot of reasoning behind the scenes about the internal type system, the types used by temporary expressions and so on. From a compiler's perspective, there's the term "pointer aliasing" (informal term). Two pointers are said to alias if they may point at the same variable. Example:

void func(int* x, int* y) // x and y may alias
{ ... }

int a;
int b;
func(&a, &a); // in this case x and y will point at the same variable
func(&a, &b); // in this case they point at different variables

Since func in the above example has external linkage and might sit in a different source file than the caller, there's no way for the compiler to know if x and y point at the same data - short of comparing their addresses, which would cause execution overhead. Therefore if we do *x = 5; inside the function, we'd have to reload *y when used because they alias and the write to *x could have changed *y.
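A minimal sketch of that conservatism (store_and_read is a name of my choosing):

```cpp
// x and y have the same pointed-at type, so they may alias: the write
// to *x can change what *y reads back. The compiler must therefore
// reload *y after the store instead of reusing any earlier value.
int store_and_read(int *x, int *y) {
    *x = 5;        // may also change *y
    return *y;     // returns 5 when x and y point at the same object
}
```

Called as store_and_read(&a, &a) this returns 5; with two distinct objects it returns whatever the second one holds.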


Strict pointer aliasing

The rationale behind "strict pointer aliasing" (informal term), or just "strict aliasing", was to formalize the rules for how the compiler's internal type system is allowed to behave in these situations. In case we have a scenario such as

int aliased(float *f, int *i)

Then the compiler should reasonably be able to assume that if we modify *f inside the function, we did not modify what i points at, because they aren't compatible types. If the compiler can't assume that, it has to reload *i from memory after every write to *f (or vice versa), because it can no longer assume that the pointed-at data wasn't changed. And that leads to inefficient code.

Therefore, rather than looking at the type of the pointer itself, rules were set up about the actual type of the pointed-at data. The term for this is effective type (formal term). If we declare int x; then the effective type of x is int. More intricate rules for effective type apply in case of dynamic allocation etc but I won't go into that here.

Strict aliasing refers to the scenario where a certain effective type is accessed through a de-referenced pointer, a so-called "lvalue access". C23 6.5 §7:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

  • a type compatible with the effective type of the object,
  • a qualified version of a type compatible with the effective type of the object,
  • a type that is the signed or unsigned type corresponding to the effective type of the object,
  • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  • a character type.
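The permitted cases can be illustrated for an object of effective type int (a sketch; the function names are my own, and each access falls under one of the quoted bullets):

```cpp
#include <climits>

// All three reads below are allowed by the quoted list for an object
// whose effective type is int: a qualified version of int, the
// corresponding unsigned type, and a character type.
int read_via_qualified(const int *p) { return *p; }
unsigned int read_via_unsigned(unsigned int *p) { return *p; }
unsigned char read_first_byte(unsigned char *p) { return *p; }
```

For int x = -1; reading it through unsigned int * yields UINT_MAX, and every byte of its representation reads as 0xFF; none of these accesses violate strict aliasing.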

Example of a "strict aliasing violation"

The following scenario is undefined behavior:

  • We have some variable of effective type int.
  • We point at it with an int* (maybe for passing it to a function like in your example).
  • Then later we cast the int* to a float*. The conversion itself is allowed provided that the alignment requirement of float is met, and int and float commonly have the same alignment.
  • If we later de-reference that float* pointer, we invoke undefined behavior.

Because of this, there is no telling whether your example invokes undefined behavior or not, because we do not know the effective types of the pointed-at variables. Those are most commonly found by checking their declarations, which in your example sit on the caller's side.


Exceptions to strict aliasing

I also heard that character types are a loophole.

Yes, since they are one of the valid exceptions in the list I quoted from the standard. This is to allow us to inspect the raw binary contents of any type by iterating across it byte by byte, using a character type. This is guaranteed well-defined behavior (6.3.2.3 §7), and the whole pointed-at object is then temporarily regarded as a character array.

// This is fine
int some_var;
unsigned char* uptr = (unsigned char*) &some_var; // ok conversion
...
do_something(uptr[i]); // ok to de-reference and iterate across

But that does not mean we can go the other way around!

// This is undefined behavior! (UB)
char str[] = "Hello";
int* iptr = (int*)str; // this might be a misaligned access, if so UB
do_something(*iptr); // this is a strict aliasing violation, always UB

Optimizations and historical problems with the gcc compiler

I was told that this has implications for compiler optimizations.

For good and bad. With the release of C99, the rules about strict aliasing were clarified somewhat, although they had always been there in one form or another. But because C99 brought up the matter, it caught the attention of the gcc compiler team specifically in the early 2000s. They realized that if the standard was read literally, they could start making assumptions about whether a variable was modified or not.

If a float* is written through, then for example an int lvalue cannot have been modified, because that would be a strict aliasing violation. And so there is no need to reload some int from memory after modifying a float through a pointer, making the code more efficient.

That was the original intention of strict aliasing, but the problem was that gcc started to do this in all manner of scenarios and for other types too, ones that weren't so obviously non-compatible. Like accessing an int through a pointer to short.

For example there could be many reasons why we'd want to go through a 32 bit integer 16 bits at a time, or a 64 bit integer 32 bits at a time. Or do word-based copying of aligned data since that's faster. Etc etc - there are lots of scenarios in hardware-related or driver/library implementation when you might want to do things like this.
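For such code there is a well-defined escape hatch: going through memcpy instead of a short pointer, since memcpy behaves as if it copies through character types. A sketch (low_half is my own name; which numeric half of the value you get depends on endianness):

```cpp
#include <cstdint>
#include <cstring>

// Read the first 16 bits of a 32-bit integer's storage without a
// strict aliasing violation. *(uint16_t*)&v would be one; this is not.
std::uint16_t low_half(std::uint32_t v) {
    std::uint16_t half;
    std::memcpy(&half, &v, sizeof half);  // copies as if byte by byte
    return half;
}
```

Modern compilers typically compile such a memcpy down to a plain 16-bit load, so there is no performance penalty for staying within the rules.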

But now gcc treated all such code as undefined behavior and optimized accordingly. Worse yet, they shipped this as a default setting. This caused C programs to crash left and right in the early 2000s. Linux was hit particularly hard, since it relies heavily on gcc and the new defaults broke everything (causing another memorable Torvalds tantrum). But lots of embedded systems got problems too, where gcc was starting to gain popularity thanks to the new ARM cores.

Ever since then, gcc has struggled to recover from all the problems it caused back then, undoing some of the problematic optimizations or making them optional. But gcc did take a reputation blow and became infamous for following the standard to the letter rather than trying to generate the code that the programmer had intended. It is largely a problem with that compiler specifically, since few other compilers made the same call to optimize this aggressively in case of strict aliasing violations.

Yes, the gcc team was indeed reading the standard correctly, and the examples I showed are indeed undefined behavior. However, most compilers strive to be tools useful to programmers in the real world, so they wouldn't start generating all manner of subtle, implicit misbehavior because of it. Rather, they would implement a non-standard compiler extension treating something like *(short*) &my_int as deterministic code.

A professional C or C++ programmer does need to be aware of strict aliasing when doing wild & crazy pointer conversions, but even more so misaligned access. Strict aliasing is an artificial concept of the C standard, but misalignment is a real physical problem based on how CPUs and memories are designed internally.


+1
−0

Aliasing is perhaps better understood as a property of assignments through pointers, not of the pointers themselves.

First, let's consider the reasoning behind applying the optimization to the first example.

In the first code block, the underlying storage of b is only assigned to via the b pointer. Since this assignment occurs through a long *, but a is an int *, the strict aliasing rule means that the compiler may assume that the assignment does not affect the underlying storage of a.

The fundamental idea behind "compiler optimizations enabled by undefined behaviour" is that the compiler may act as though undefined behaviour never actually happens in any program (even if it could statically prove otherwise for the current program).

Therefore, we may reason:

  1. If we know that a and b refer to different locations in memory, then we can conclude that modifying b's memory doesn't modify a's memory. Thus, the value stored at a hasn't changed since it was copied into t; therefore *a - t is equivalent to t - t, which is 0 for all integers.

  2. Suppose we don't know whether a and b refer to different locations in memory, but we do know that if they refer to the same location, then the program has undefined behaviour. Then we may proceed as if we knew they refer to different locations. Why? Because if they refer to different locations, then obviously the result will be correct; and if they refer to the same location, then it doesn't matter what the resulting code does. Either way, the code conforms to the standard.

  3. Suppose we don't know whether a and b refer to different locations in memory, in general. If there is an incompatible pointer assignment between those memory regions, then it would be undefined behaviour for them to be the same location; therefore by the previous step we assume they are different locations. But if there are only compatible pointer assignments, then we can make no assumptions.
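Step 1 can be exercised directly with well-defined inputs (a sketch; f_first is simply the question's first f under a name of my choosing):

```cpp
// With distinct objects, the function must return 0, and the strict
// aliasing rule lets the compiler fold the return to 0 unconditionally.
int f_first(int *a, long *b) {
    int t = *a;
    *b = 0;          // assumed not to touch *a
    return *a - t;   // foldable to zero
}
```

Calling f_first(&x, &y) on distinct int x and long y returns 0, zeroes y, and leaves x untouched, exactly matching the folded version.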

Now we can understand why this reasoning doesn't apply to the second example. It is allowed for a and b to refer to the same memory location, as long as no incompatible assignment occurs to that underlying memory through those pointers. In this code, the underlying storage of b is modified via type punning: the only assignment occurs through an unsigned char *, and the "loophole" for character-pointer types is simply that they are allowed to alias objects of every other type.

In the second code example, if we have a call like

int main() {
    long data = 1;
    f(reinterpret_cast<int*>(&data), &data);
}

then we don't invoke undefined behaviour: sizeof(long) >= sizeof(int), so there is enough valid data to make use of a within the function; and the function doesn't perform an incompatible pointer assignment, therefore it's acceptable that both pointers are to data's memory.

Therefore, the code should compile successfully, without applying the optimization. Depending on the platform's endianness (and on whether long is a larger type than int on the platform), the type-punning may zero out the bytes that a points at, so that the retrieved value of *a differs from the value stored in t and the result may validly be -1.

Whereas if f is implemented as in the first code block, the call invokes undefined behaviour. The optimization may be performed, and this would cause 0 to be returned regardless of the platform (even though -1 would be the natural result on some). This is a consequence of the undefined behaviour, and one of many examples where it doesn't crash the program.
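On this reading of the rules, the second implementation together with that call can be run as-is (a sketch; f_second is the question's second f renamed, and the expected result of -1 versus 0 depends on endianness as described above):

```cpp
#include <cstddef>

// The zeroing goes through unsigned char, so on the reasoning above,
// calling this with both pointers into the same long is defined.
int f_second(int *a, long *b) {
    int t = *a;
    for (std::size_t i = 0; i != sizeof *b; ++i)
        reinterpret_cast<unsigned char *>(b)[i] = 0;
    return *a - t;   // must actually reload *a; not foldable to zero
}
```

With long data = 1, the call f_second(reinterpret_cast<int*>(&data), &data) zeroes data and returns -1 on a little-endian platform (t reads 1, then *a reads 0) or 0 on a big-endian one.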

