How does the strict aliasing rule enable or prevent compiler optimizations?

Inspired by https://meta.stackoverflow.com/questions/432242, and the relevant main-space questions.

I have heard that C and C++ have something called the "strict aliasing rule", which means, for example, that if we have code like:

int aliased(float *f, int *i) {
    *f = *(reinterpret_cast<float*>(i));   // reads an int object through a float lvalue
    return *f;
}

then we have invoked undefined behaviour.

I also heard that character types are a loophole. That is, if we type-pun to a pointer to a character type (char *, signed char * or unsigned char *), then we can legally assign bytes from the underlying storage of the int to the underlying storage of the float.
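For concreteness, here is a minimal sketch of that byte-wise copy. It assumes float and int have the same size on the target, and the names copy_int_bytes_to_float and check_copy are hypothetical, made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

static_assert(sizeof(float) == sizeof(int),
              "this sketch assumes float and int have the same size");

// Hypothetical helper: copy the bytes of *i into the storage of *f,
// touching both objects only through unsigned char pointers.
void copy_int_bytes_to_float(float *f, const int *i) {
    unsigned char *dst = reinterpret_cast<unsigned char *>(f);
    const unsigned char *src = reinterpret_cast<const unsigned char *>(i);
    for (std::size_t n = 0; n != sizeof *i; ++n) {
        dst[n] = src[n];
    }
}

// After the copy, both objects hold the same bytes.
void check_copy() {
    int i = 42;
    float f = 0.0f;
    copy_int_bytes_to_float(&f, &i);
    assert(std::memcmp(&f, &i, sizeof f) == 0);
}
```

std::memcpy(&f, &i, sizeof f) would express the same copy in one call.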

I was told that this has implications for compiler optimizations. For example, here the compiler can treat return *a - t as if it were just return 0:

int f (int *a, long *b)
{
    int t = *a;
    *b = 0;          // cannot change *a
    return *a - t;   // can be folded to zero
}

But on the other hand, if we change how b is used (braces added for clarity):

int f (int *a, long *b)
{
    int t = *a;
    for (int i = 0; i != sizeof *b; ++i)
    {
        ((unsigned char*)b)[i] = 0;
    }
    return *a - t;   // must not be folded
}

Now the same optimization is invalid.

What is the underlying reasoning here? It seems to me like a and b are either compatible pointer types or they aren't. Why does code inside the function change the compiler's reasoning? And how does undefined behaviour enable the compiler to make these kinds of optimizations?


1 answer

Aliasing is perhaps better understood as a property of assignments through pointers, not of the pointers themselves.

First, let's consider the reasoning behind applying the optimization to the first example.

In the first code block, the underlying storage of b is only assigned to via the b pointer. Since this assignment occurs through a long *, but a is an int *, the strict aliasing rule means that the compiler may assume that the assignment does not affect the underlying storage of a.
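The effect of that assumption can be sketched by writing the function out by hand, before and after the fold. The names f_as_written, f_as_folded and check_distinct_objects are hypothetical. The second function only illustrates a transformation the compiler is permitted to make, not what any particular compiler emits.

```cpp
#include <cassert>

// The first example, unchanged apart from the name.
int f_as_written(int *a, long *b) {
    int t = *a;
    *b = 0;          // store through a long lvalue
    return *a - t;
}

// What the compiler may treat it as: the store through b is assumed
// not to touch *a, so *a - t collapses to t - t, which is 0.
int f_as_folded(int *a, long *b) {
    (void)a;         // *a is no longer needed to compute the result
    *b = 0;
    return 0;
}

// With two distinct objects the two versions are indistinguishable.
void check_distinct_objects() {
    int x = 5;
    long y = 7;
    assert(f_as_written(&x, &y) == 0 && y == 0);
    y = 7;
    assert(f_as_folded(&x, &y) == 0 && y == 0);
    assert(x == 5);
}
```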

The fundamental idea behind "compiler optimizations enabled by undefined behaviour" is that the compiler may act as though undefined behaviour never actually happens in any program (even if it could statically prove otherwise for the current program).

Therefore, we may reason:

  1. If we know that a and b refer to different locations in memory, then we can conclude that modifying b's memory doesn't modify a's memory. Thus, the value stored at a hasn't changed since it was copied into t; therefore *a - t is equivalent to t - t, which is 0 for all integers.

  2. Suppose we don't know whether a and b refer to different locations in memory, but we do know that if they refer to the same location, then the program has undefined behaviour. Then we may proceed as if we knew they refer to different locations. Why? Because if they refer to different locations, then obviously the result will be correct; and if they refer to the same location, then it doesn't matter what the resulting code does. Either way, the code conforms to the standard.

  3. Suppose we don't know whether a and b refer to different locations in memory, in general. If there is an incompatible pointer assignment between those memory regions, then it would be undefined behaviour for them to be the same location; therefore by the previous step we assume they are different locations. But if there are only compatible pointer assignments, then we can make no assumptions.
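The last case in step 3 can be illustrated with two pointers of the same type. In this sketch (g and check_g are hypothetical names), both parameters are int *, so passing the same address twice is well defined and the compiler has to reload *a after the store.

```cpp
#include <cassert>

// Same shape as the first example, but both pointers have type int *.
int g(int *a, int *b) {
    int t = *a;
    *b = 0;          // may legitimately change *a
    return *a - t;   // cannot be folded to zero
}

void check_g() {
    int x = 5;
    int y = 7;
    assert(g(&x, &y) == 0);    // distinct objects: *a is unchanged
    assert(g(&x, &x) == -5);   // same object: *a becomes 0, giving 0 - 5
}
```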

Now we can understand why this reasoning doesn't apply to the second example. It is allowed for a and b to refer to the same memory location, as long as no assignment to that underlying memory occurs through an incompatible pointer type. In this code, the underlying storage of b is modified via type punning, so the only assignment occurs through compatible pointers. (The "loophole" exposed by pointer-to-character types is simply that they are compatible with all other pointer types.)
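One way to observe the character-type case without any doubt about the argument types is a variant that receives the second pointer as void * and only ever writes through unsigned char *. The name diff_after_zeroing and the extra size parameter are assumptions made for this sketch, not part of the question's code.

```cpp
#include <cassert>
#include <cstddef>

// Zero `size` bytes at `bytes` through unsigned char lvalues, then report
// how much *a changed. The byte stores may alias *a, so *a is reloaded.
int diff_after_zeroing(int *a, void *bytes, std::size_t size) {
    int t = *a;
    unsigned char *p = static_cast<unsigned char *>(bytes);
    for (std::size_t i = 0; i != size; ++i) {
        p[i] = 0;
    }
    return *a - t;
}

void check_diff_after_zeroing() {
    int x = 5;
    long y = 7;
    assert(diff_after_zeroing(&x, &y, sizeof y) == 0);    // distinct storage
    assert(y == 0);
    assert(diff_after_zeroing(&x, &x, sizeof x) == -5);   // same storage
}
```

If the compiler folded the return value to zero here, the second call would give the wrong answer, which is why the fold is not allowed.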

In the second code example, if we have a call like

int main() {
    long data = 1;
    f(reinterpret_cast<int*>(&data), &data);
}

then we don't invoke undefined behaviour: sizeof(long) >= sizeof(int), so there is enough valid data to make use of a within the function; and the function doesn't perform an incompatible pointer assignment, therefore it's acceptable that both pointers are to data's memory.

Therefore, the code should compile successfully, without applying the optimization. Depending on the platform's endianness (and on whether long is a larger data type than int on the platform), the byte-wise zeroing may clear the part of data that a refers to. In that case the value of *a read at the end differs from the value stored in t, and the result may validly be -1.
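The platform dependence can be checked without any pointer aliasing by copying bytes with std::memcpy. In this sketch, leading_int is a hypothetical helper that reads the first sizeof(int) bytes of a long, which is the value *a would observe in the call above. simulate_call and check_simulate_call are likewise made-up names.

```cpp
#include <cassert>
#include <cstring>

static_assert(sizeof(long) >= sizeof(int),
              "this sketch assumes long is at least as large as int");

// Hypothetical helper: the int formed by the lowest-addressed bytes of data.
int leading_int(long data) {
    int result;
    std::memcpy(&result, &data, sizeof result);
    return result;
}

// Mirrors the second example: remember the leading int, zero every byte
// of the long, then return the difference.
int simulate_call(long data) {
    int t = leading_int(data);
    std::memset(&data, 0, sizeof data);
    return leading_int(data) - t;
}

void check_simulate_call() {
    int r = simulate_call(1);
    assert(r == -1 || r == 0);   // which one depends on the platform
}
```

With data = 1, this typically returns -1 on a little-endian platform (or wherever long and int have the same size), and 0 on a big-endian platform where long is wider than int.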

If f is instead implemented as in the first code block, the call invokes undefined behaviour. The optimization may be performed, which would cause 0 to be returned regardless of the platform (even though the unoptimized code would return -1 on some platforms). This is a consequence of the undefined behaviour, and one of many examples where undefined behaviour doesn't crash the program.
