Comments on Storing more bytes than a union member has, but less than the union size, with memcpy(3)

Parent

Storing more bytes than a union member has, but less than the union size, with memcpy(3)

+5
−0

Let's say we have an object, we store it in a union (through a member of some other, narrower type, but with memcpy(3), so it's allowed, I guess), and then read it from the union via its original type (so there are no alignment issues or anything).

$ cat union.c 
#include <string.h>

struct s { int       a;  int       b; };
struct t { int       a;               };
union u  { struct s  s;  struct t  t; };

int
main(void)
{
	struct s  x = {42, 53};
	union u   y;
	int       z;

	memcpy(&y.t, &x, sizeof(x));  // y.t has declared/effective type of 'struct t'
	z = y.s.b;  // Is this UB?

	return z;
}

I would guess the above is undefined behavior, exactly at the point of the read of y.s.b.

The reason is that since we created an object of type struct t via memcpy(3), then the compiler is free to assume that the object is no wider than sizeof(struct t), and so y.s.b (which is beyond that) would be "uninitialized" (even though we really wrote bytes to it).

Is it UB as I expect?

However, neither GCC nor Clang complains about such a program:

$ gcc-13 -Wall -Wextra -Wpedantic -pedantic-errors union.c -O3 -fanalyzer -fsanitize=undefined -fsanitize=address
$ ./a.out; echo $?
53
$ clang-16 -Wall -Wextra -Wpedantic -pedantic-errors union.c -O3 -fsanitize=undefined -fsanitize=address
$ ./a.out; echo $?
53

BTW, does the answer change if I use allocated memory instead?

int
main(void)
{
	struct s  x = {42, 53};
	union u   *y = xmalloc(sizeof(union u));  // No declared/effective type
	int       z;

	memcpy(&y->t, &x, sizeof(x));  // This sets the effective type to 'struct s'
	z = y->s.b;  // No UB?

	return z;
}
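The snippet above calls xmalloc(), which the post does not define. Here is a minimal sketch that makes it compilable, assuming xmalloc() is the usual "malloc(3) or abort" wrapper; that helper and the function name read_b_from_allocated() are only illustrative. Whether the read is strictly conforming is exactly what the question asks, so this only shows what the code does in practice.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct s { int       a;  int       b; };
struct t { int       a;               };
union u  { struct s  s;  struct t  t; };

// Assumed helper: the post does not show xmalloc(), so this is a guess
// at the common "allocate or die" wrapper around malloc(3).
static void *
xmalloc(size_t size)
{
	void  *p = malloc(size);

	if (p == NULL) {
		perror("malloc");
		exit(EXIT_FAILURE);
	}
	return p;
}

// Same steps as the snippet above, plus the free(3) it omits.
int
read_b_from_allocated(void)
{
	struct s  x = {42, 53};
	union u   *y = xmalloc(sizeof(union u));
	int       z;

	memcpy(&y->t, &x, sizeof(x));  // copies all of x, not just sizeof(struct t)
	z = y->s.b;

	free(y);
	return z;
}
```

Calling read_b_from_allocated() from main() and returning its result gives the same kind of test program as union.c.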

0 comment threads

Post
+3
−0

Reading from a union member whose size is larger than that of the last-written member is explicitly allowed since C99, but the value of the extra bytes is unspecified. From the cppreference page on C unions:

If the size of the new type is larger than the size of the last-written type, the contents of the excess bytes are unspecified (and may be a trap representation). Before C99 TC3 (DR 283) this behaviour was undefined, but commonly implemented this way.

Since the value of the bytes is unspecified, the implementation may use any value in any instance, and does not need to document the behavior. That said, GCC does document this behavior (and Clang tries to follow GCC's implementation-defined behavior on Linux):

The relevant bytes of the representation of the [union] object are treated as an object of the type used for the access.
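As a small sketch of what that reinterpretation means for the types in the question (the function names are made up for illustration): both structs start with an int a at offset 0, so reading a through either member sees the same bytes. Only the excess bytes, here y.s.b after a store to y.t, are the unspecified part.

```c
struct s { int       a;  int       b; };
struct t { int       a;               };
union u  { struct s  s;  struct t  t; };

// Store through the wide member, read through the narrow one:
// the bytes of y.s.a are reinterpreted as y.t.a.
int
read_narrow_after_wide(void)
{
	union u  y;

	y.s = (struct s){42, 53};
	return y.t.a;
}

// Store through the narrow member, read the overlapping part through
// the wide one.  y.s.b is deliberately not read here: after the store
// to y.t, its bytes are the "excess bytes" with unspecified contents.
int
read_wide_after_narrow(void)
{
	union u  y = { .s = {1, 2} };

	y.t = (struct t){42};
	return y.s.a;
}
```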

Also note that the implementation must not assume the destination of the memcpy is smaller than a union u. Consider the definition of memcpy:

Copies count characters from the object pointed to by src to the object pointed to by dest. Both objects are interpreted as arrays of unsigned char.

Also consider that the object pointed to by dest is a union u (whose size is the maximum size of all its members), not a struct t, even though it may be treated as a struct t in some cases.
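To illustrate the point about the destination object, here is a sketch (helper names are hypothetical) that copies into the union itself rather than through a member pointer. It also checks the two facts the argument relies on: the union is at least as large as its largest member, and the union and its members share one address.

```c
#include <string.h>

struct s { int       a;  int       b; };
struct t { int       a;               };
union u  { struct s  s;  struct t  t; };

// The union must be able to hold its largest member.
_Static_assert(sizeof(union u) >= sizeof(struct s),
               "union u must be able to hold a struct s");

// Copy into the union object itself, so there is no doubt about how
// large the destination is, then read through the wide member.
int
copy_into_whole_union(void)
{
	struct s  x = {42, 53};
	union u   y;

	memcpy(&y, &x, sizeof(x));
	return y.s.b;
}

// A pointer to a union, suitably converted, points to each of its
// members, so &y, &y.s and &y.t all compare equal as void pointers.
int
members_share_address(void)
{
	union u  y;

	return (void *) &y == (void *) &y.s
	    && (void *) &y == (void *) &y.t;
}
```

Using &y (or &y.s) as the destination sidesteps the question of whether &y.t designates only a struct t.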


1 comment thread

Type of object pointed to by dest (2 comments)
alx‭ wrote 11 months ago

I think the object pointed to by dest is a struct t, not a union, since I use a pointer to the member. Only if I had cast it to the union type, or had used the union directly, would it have been the union, I think.

GrantMoyer‭ wrote 8 months ago

@alx My understanding is that the object at a memory address is whatever object was initially created there, regardless of the type a pointer to that address seems to have. For example, if you create an int n, it doesn't matter if you cast its address like float* a = (float*) &n; the value pointed to by a is still an int (so dereferencing a has undefined behavior). That example is fairly obvious, but the same rule still applies to unions, so whatever the type of a pointer to a union's address is, the pointed-to object is still a union. However, unions have cases where it is explicitly OK to access them as another type.

That said, I don't have a reference for this understanding in the standard. If I find one, I'll provide it.
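For the int/float example above, the usual way to look at the same bytes without dereferencing a float* that points at an int is memcpy(3), which works on the object representation. This is a sketch only: it assumes float and int have the same size (checked at compile time), and the function names are invented.

```c
#include <string.h>

_Static_assert(sizeof(float) == sizeof(int),
               "this sketch assumes float and int have the same size");

// Reinterpret the bytes of an int as a float by copying the object
// representation, instead of reading the int through a float pointer.
float
int_bits_to_float(int n)
{
	float  f;

	memcpy(&f, &n, sizeof(f));
	return f;
}

// The reverse direction: copy the bytes of a float into an int.
int
float_bits_to_int(float f)
{
	int  n;

	memcpy(&n, &f, sizeof(n));
	return n;
}
```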