Can I access an array element from a pointer to an object contiguous with but outside the array?
C prohibits accessing an array out of bounds, even when measures are taken to guarantee what lies immediately past those bounds:
struct MyStruct {
int x[2];
int y, z;
};
static_assert(sizeof(struct MyStruct) == sizeof(int[4]), "Unexpected Padding");
struct MyStruct s;
In the above case, y is guaranteed to be immediately after the last member of x, because otherwise the static_assert would fail at compile-time. Nevertheless, any attempt to access the memory in s.y using s.x[2] would be undefined behavior.
However, is the converse true? If accessing outside an array given a pointer to inside the array is undefined behavior, is accessing inside an array given a pointer outside the array legal? For example, could I access the memory of s.x[1] using (&s.y)[-1], since s.y is not an array?
1 answer
Undefined behavior from out-of-bounds array access arises whenever we use pointer arithmetic, which is only defined to work within the bounds of an array. As far as pointer arithmetic is concerned, plain variables ("scalars") are defined to behave just the same as arrays of 1 item.
So in your example, y is to be regarded as an array like int y[1], and therefore (&s.y)[-1] would be just as out of bounds as s.x[2], and therefore also undefined behavior.
The relevant parts of the C standard can be found under "additive operators". This is because arr[i] is guaranteed to be equivalent to *(arr + i), and so the rules for the + operator are what matter.
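The equivalence between subscripting and pointer addition can be checked with a short sketch. The helper name subscript_matches_pointer_form is hypothetical and only in-bounds indices are visited, so nothing here relies on undefined behavior.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.
   Returns 1 if &arr[i] and arr + i are the same address for every
   in-bounds index i, and 0 otherwise. */
int subscript_matches_pointer_form(const int *arr, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        /* Compare addresses rather than values: both expressions
           designate the very same element. */
        if (&arr[i] != arr + i)
            return 0;
    }
    return 1;
}
```

Calling it with an int array of 4 elements and len set to 4 should return 1, since arr[i] is defined in terms of *(arr + i).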
From C17 6.5.6:
For the purposes of these operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.
/--/
If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
Meaning you can't point at array item [-1] either.
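As a minimal sketch of the two quoted rules, the function below treats a scalar as an array of length one. The name scalar_extent is hypothetical. Forming the one-past-the-end pointer is allowed, while the commented-out line shows the arithmetic that would be undefined.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.
   Returns the distance between a scalar and its one-past-the-end
   pointer, which is 1 because the scalar acts as int y[1]. */
ptrdiff_t scalar_extent(void)
{
    int y = 3;
    int *first = &y;         /* behaves like &arr[0] of int arr[1] */
    int *one_past = &y + 1;  /* may be formed, must not be dereferenced */

    /* int *before = &y - 1;    undefined: result lies outside the array of one */

    return one_past - first; /* both pointers belong to the same "array" */
}
```

The subtraction is well-defined because both operands point into, or one past the end of, the same array object.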
Other forms of UB we might encounter here are misaligned access and "strict aliasing" violations.
In this case alignment is not an issue, and the static assert ensures there is no padding. In practice no known real-world system would insert padding here; it can only happen in theory.
Strict aliasing is not an issue either as long as we use pointers to int for the access, since the actual type of the object stored in memory is int.
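The layout assumptions can also be checked directly with offsetof rather than inferred from the total size. This is only a sketch: the two function names are hypothetical, and _Alignof requires C11 or later.

```c
#include <stddef.h>

struct MyStruct {
    int x[2];
    int y, z;
};

/* Hypothetical helper: 1 if the byte offset of y is suitably aligned
   for an int, 0 otherwise. */
int y_offset_is_int_aligned(void)
{
    return offsetof(struct MyStruct, y) % _Alignof(int) == 0;
}

/* Hypothetical helper: 1 if y starts exactly where x[1] ends,
   i.e. there is no padding between the two members. */
int y_follows_x_without_padding(void)
{
    return offsetof(struct MyStruct, y)
        == offsetof(struct MyStruct, x) + 2 * sizeof(int);
}
```

The first check holds by construction, since every member is aligned for its own type. The second is the property the static assert establishes indirectly.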
There is a possible work-around which is well-defined: we can always inspect the struct through a pointer to character. This is well-defined due to a special rule in 6.3.2.3 that allows any object in C to be inspected byte by byte.
So this code is well-defined:
#include <assert.h> /* static_assert (C11) */
#include <stdio.h>
#include <stddef.h>

struct MyStruct {
  int x[2];
  int y, z;
};

int main(void) {
  struct MyStruct s = { {1,2},3,4 };
  static_assert(sizeof(struct MyStruct) == sizeof(int[4]), "Unexpected Padding");

  /* View the whole struct as an array of bytes. */
  unsigned char* ptr = (unsigned char*) &s;
  ptr += offsetof(struct MyStruct, y);  /* step to the first byte of s.y */
  printf("y: %d\n", *(int*) ptr);
  ptr -= sizeof(int);                   /* step back to the first byte of s.x[1] */
  printf("x[1]: %d\n", *(int*) ptr);
  return 0;
}
Output:
y: 3
x[1]: 2
In this case the whole struct is to be regarded as type unsigned char [sizeof(struct MyStruct)] for the purpose of out-of-bounds checks, and within this array we can access any byte. Given that the character pointer points at memory properly aligned for an int, from there we can safely cast to a pointer to int and de-reference, because the actual type of the data ("effective type") is indeed int.
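A more conservative variant, swapped in here as an alternative rather than taken from the code above, copies the bytes out with memcpy instead of casting the character pointer back to a pointer to int. The function name read_int_before_y is hypothetical, and the sketch assumes the same padding-free layout, which it checks at compile time.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>
#include <string.h>

struct MyStruct {
    int x[2];
    int y, z;
};

/* Assumption: y starts right after the two elements of x. */
static_assert(offsetof(struct MyStruct, y) == 2 * sizeof(int),
              "Unexpected Padding");

/* Hypothetical helper: reads the int stored sizeof(int) bytes before
   member y, which is x[1] under the layout asserted above. */
int read_int_before_y(const struct MyStruct *s)
{
    /* All arithmetic happens on a character pointer into the struct. */
    const unsigned char *bytes = (const unsigned char *) s;
    size_t offset = offsetof(struct MyStruct, y) - sizeof(int);

    int value;
    memcpy(&value, bytes + offset, sizeof value); /* copy the object representation */
    return value;
}
```

For a struct initialized as { {1,2},3,4 }, this should return 2, matching the x[1] line of the output above. The memcpy form avoids any question about alignment of the source address, at the cost of an extra copy that compilers typically optimize away.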