
Can I access an array element from a pointer to an object contiguous with but outside the array?

+6
−0

C prohibits accessing an array out of bounds even if measures were taken to ensure that what should lie outside those bounds were known:

#include <assert.h>  /* provides static_assert before C23 */

struct MyStruct {
    int x[2];
    int y, z;
};
static_assert(sizeof(struct MyStruct) == sizeof(int[4]), "Unexpected Padding");
struct MyStruct s;

In the above case, y is guaranteed to immediately follow the last element of x, because otherwise the static_assert would fail at compile time. Nevertheless, any attempt to access the memory in s.y using s.x[2] would be undefined behavior.

However, is the converse true? If accessing outside an array given a pointer to inside the array is undefined behavior, is accessing inside an array given a pointer outside the array legal? For example, could I access the memory of s.x[1] using (&s.y)[-1], since s.y is not an array?


1 answer

+5
−0

Undefined behavior from out-of-bounds array access arises whenever we use pointer arithmetic, which is only defined to work within the bounds of an array. As far as pointer arithmetic is concerned, plain variables ("scalars") are defined to behave just like arrays of 1 item.

So in your example, y is to be regarded as an array like int y[1] and therefore (&s.y)[-1] would be just as out of bounds as s.x[2] and therefore also undefined behavior.

The relevant parts of the C standard can be found under "Additive operators". This is because arr[i] is guaranteed to be equivalent to *(arr + i), so the rules for the + operator are what matter.
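To make the equivalence concrete, here is a minimal sketch. The helper name subscript_matches_pointer_arithmetic is made up for this illustration and is not part of any library. It only walks valid indices, so it stays within the array bounds:

```c
#include <stddef.h>

/* Hypothetical helper for illustration: returns 1 if arr[i], *(arr + i)
   and i[arr] all designate the same element for every valid index i,
   and 0 otherwise. Only indices 0..n-1 are touched. */
int subscript_matches_pointer_arithmetic(const int *arr, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* arr[i] is defined as *(arr + i), and addition is commutative,
           so i[arr] means *(i + arr): the same object again. */
        if (&arr[i] != arr + i || &i[arr] != arr + i) {
            return 0;
        }
        if (arr[i] != *(arr + i)) {
            return 0;
        }
    }
    return 1;
}
```

Since subscripting is just the + operator followed by a dereference, any index that makes the addition undefined makes the subscript undefined as well.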

From C17 6.5.6:

For the purposes of these operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.
/--/
If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

This means you can't point at array item [-1] either.
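As a small sketch of what the quoted rule does allow for a scalar (the function name one_past_distance is an assumption made for this example): a pointer one past the object may be formed and used in arithmetic, but a pointer one before it may not.

```c
#include <stddef.h>

/* Hypothetical helper for illustration: treats a scalar as an array of
   length one and returns the distance to its one-past-the-end pointer. */
ptrdiff_t one_past_distance(void)
{
    int y = 3;
    int *first = &y;         /* behaves like &arr[0] of an int arr[1] */
    int *one_past = &y + 1;  /* valid to form and compare, not to dereference */

    /* int *before = &y - 1;    would be undefined behavior: this is item [-1] */

    /* Both operands point into, or one past, the same length-one "array",
       so the subtraction is well-defined. */
    return one_past - first;
}
```

The subtraction yields 1, exactly as it would for the first element and the one-past pointer of an int[1].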


Other forms of UB we might encounter here are misaligned access or "strict aliasing" violations.

In this case alignment is not an issue, and the static assert ensures there is no padding. In practice no known real-world system would insert padding here; it can only happen in theory.
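If you want the layout assumption spelled out member by member, one option is to check offsets with offsetof in addition to the total size. This is only a sketch: the extra assertions and the helper gap_between_x_and_y are additions for illustration, not part of the question's code.

```c
#include <assert.h>
#include <stddef.h>

struct MyStruct {
    int x[2];
    int y, z;
};

/* Same check as in the question: the struct is exactly four ints large. */
static_assert(sizeof(struct MyStruct) == sizeof(int[4]), "Unexpected Padding");

/* More targeted checks (added for illustration): each member starts
   exactly where the previous one ends. */
static_assert(offsetof(struct MyStruct, y) == sizeof(int[2]), "Padding between x and y");
static_assert(offsetof(struct MyStruct, z) == sizeof(int[3]), "Padding between y and z");

/* Hypothetical helper: number of padding bytes between x and y. */
size_t gap_between_x_and_y(void)
{
    return offsetof(struct MyStruct, y)
         - (offsetof(struct MyStruct, x) + sizeof(int[2]));
}
```

On an implementation where the assertions compile, gap_between_x_and_y() returns 0. None of this changes the pointer arithmetic rules; it only documents the layout.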

Strict aliasing is not an issue either as long as we use pointers to int for the access, since the actual type of the object stored in memory is int.


There is a possible workaround which is well-defined: we can always inspect the struct through a pointer to character. This is well-defined due to a special rule in C17 6.3.2.3 that allows any object in C to be inspected byte by byte.

So this code is well-defined:

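#include <assert.h>  /* static_assert (a macro before C23) */
#include <stdio.h>
#include <stddef.h>

struct MyStruct {
    int x[2];
    int y, z;
};

int main(void) {
    struct MyStruct s = { {1,2},3,4 };
    static_assert(sizeof(struct MyStruct) == sizeof(int[4]), "Unexpected Padding");

    /* View the whole struct as an array of bytes. */
    unsigned char* ptr = (unsigned char*) &s;
    ptr += offsetof(struct MyStruct, y);   /* now at the first byte of s.y */
    printf("y: %d\n", *(int*) ptr);
    ptr -= sizeof(int);                    /* step back one int: first byte of s.x[1] */
    printf("x[1]: %d\n", *(int*) ptr);

    return 0;
}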

Output:

y: 3
x[1]: 2

In this case the whole struct is to be regarded as type unsigned char [sizeof(struct MyStruct)] for the purpose of out-of-bounds checks, and within this array we can access any byte. Given that the character pointer points at memory properly aligned for an int, we can safely cast it to a pointer to int and dereference it, because the actual type of the data (the "effective type") is indeed int.
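A variation on the same idea, shown here only as a sketch, is to copy the bytes out with memcpy instead of casting the character pointer back to a pointer to int. This swaps in a different technique from the one above; it sidesteps the alignment question entirely because no int pointer into the struct is ever dereferenced. The helper name read_int_before_y is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct MyStruct {
    int x[2];
    int y, z;
};

static_assert(sizeof(struct MyStruct) == sizeof(int[4]), "Unexpected Padding");

/* Hypothetical helper: reads the int stored directly before member y,
   which is x[1] given the no-padding assertion above. */
int read_int_before_y(const struct MyStruct *s)
{
    /* Byte-wise view of the whole struct; arithmetic on this pointer is
       bounded by sizeof(struct MyStruct), not by the size of one member. */
    const unsigned char *bytes = (const unsigned char *) s;
    size_t offset = offsetof(struct MyStruct, y) - sizeof(int);

    int value;
    memcpy(&value, bytes + offset, sizeof value);  /* copy the object representation */
    return value;
}
```

With struct MyStruct s = { {1,2},3,4 }; a call to read_int_before_y(&s) returns 2, the same value the cast-based version prints for x[1].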
