Comments on Is `uint8_t` always an alias for a character type if it exists?

Parent

Is `uint8_t` always an alias for a character type if it exists?

+8
−0

Is uint8_t guaranteed to be a character type if it exists? Will using a uint8_t* to examine bytes of an object cause violation of the strict aliasing rule? Is the following legal code:

#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

int main() {
  std::string str{"Hello"};
  std::uint8_t* p = reinterpret_cast<std::uint8_t*>(&str);
  for (std::size_t i = 0; i < sizeof str; ++i) {
    std::printf("%d\n", *p++);
  }
}

Post
+6
−0

Yes, it is in practice always a character type and you can safely assume as much, both in terms of (g)lvalue access and in terms of strict pointer aliasing. If not, the compiler would soon render itself completely useless.

C and C++ both have the following rule (C17 7.20.1.1/3):

intN_t ... uintN_t ...

These types are optional. However, if an implementation provides integer types with widths of 8, 16, 32, or 64 bits, no padding bits, and (for the signed types) that have a two’s complement representation, it shall define the corresponding typedef names.

So if your system supports 8 bit 2's complement numbers, it must support uint8_t. No exceptions - not even for freestanding (embedded) systems - stdint.h is one of the mandatory headers for all conforming implementations (C17 4/6).

And for such a system it does not make sense to define unsigned char as anything but 8 bits. CHAR_BIT will be 8.
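The byte-width argument above can be checked at compile time. This is a minimal sketch, assuming a C++11 or later implementation that provides std::uint8_t at all; the helper name byte_width_bits is hypothetical and exists only for illustration.

```cpp
#include <climits>   // CHAR_BIT
#include <cstdint>   // std::uint8_t (optional type: this fails to compile if it is absent)
#include <limits>    // std::numeric_limits

// No object can be smaller than one byte and CHAR_BIT is at least 8,
// so an 8-bit type without padding can only exist when CHAR_BIT == 8.
static_assert(CHAR_BIT == 8, "uint8_t exists, so a byte must be 8 bits");
static_assert(sizeof(std::uint8_t) == 1, "uint8_t occupies exactly one byte");
static_assert(std::numeric_limits<std::uint8_t>::digits == 8,
              "all 8 bits are value bits, there is no padding");

// Hypothetical helper, for illustration only: the number of bits in a byte.
constexpr int byte_width_bits() { return CHAR_BIT; }
```

If any of these assertions fails, the build stops, which is the intended outcome on a platform where the assumptions do not hold.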

Padding bits, trap representations and other such exotic oddities are not allowed for character types either, nor can trap representations exist in 2's complement integers.

In practice, all known real-world compilers will simply implement uint8_t as a typedef for unsigned char. You can easily prove this by trial and error:


C

_Generic((uint8_t){0}, uint8_t:0, unsigned char:0);

error: '_Generic' specifies two compatible types


C++

void f (unsigned char c){}
void f (uint8_t c){}

error: redefinition of void f(uint8_t)
note: void f(unsigned char) previously defined here
void f (unsigned char c){}
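The same check can also be written so that it compiles when the two types are identical, instead of relying on a deliberate error. This is a sketch that assumes a mainstream implementation where the typedef is in fact unsigned char; the constant name uint8_is_unsigned_char is hypothetical.

```cpp
#include <cstdint>      // std::uint8_t
#include <type_traits>  // std::is_same

// True only when uint8_t names the very same type as unsigned char,
// not merely another type with the same size and range.
constexpr bool uint8_is_unsigned_char =
    std::is_same<std::uint8_t, unsigned char>::value;

// Stops the build on an implementation where the assumption does not hold.
static_assert(uint8_is_unsigned_char,
              "uint8_t is not a typedef for unsigned char here");
```

Placing the static_assert in a commonly included header documents the assumption and makes a port to an unusual platform fail at compile time rather than silently.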


Those few exotic systems that have 16 bit bytes or other oddities like 1's complement cannot support uint8_t in the first place.


1 comment thread

General comments (6 comments)
Ayxan Haqverdili‭ wrote almost 4 years ago

Could there be an implementation-specific type like __u8, which is a distinct type from unsigned char even though both are 8 bits? Then uint8_t could be an alias for that.

Lundin‭ wrote almost 4 years ago

@Ayxan Haqverdili‭ The key here is 8 bit two's complement. Some manner of custom type could in theory have all manner of representations. unsigned char can only have one representation and signed char can only be 1's compl, 2's compl or signed magnitude. In case the system uses 2's compl and 8 bit characters, then the uint8_t and int8_t will use that too. Since 2's complement must be supported and UINTN_MAX honored, there can be no trap representations. But __u8 could be anything.

Ayxan Haqverdili‭ wrote almost 4 years ago · edited almost 4 years ago

I am assuming __u8 is an integer type exactly like unsigned char except for the strict aliasing rule. So, a function like void foo(__u8* a, int* b); can assume that the arguments are pointing to different memory addresses. Would it then be possible for an implementation to do typedef __u8 uint8_t;? I don't understand how the representation of the integers (2's compl, etc.) applies here.

Lundin‭ wrote almost 4 years ago

@Ayxan Haqverdili‭ It doesn't make sense for a compiler to implement any other character type than unsigned char so why would it? If there exists a type unsigned char which is 8 bits and another type uint8_t which is also 8 bits and neither has padding bits, why would you make them non-compatible or different types? Simply to break your own compiler?

Ayxan Haqverdili‭ wrote almost 4 years ago

@Lundin It could be done to allow more aggressive optimizations by taking advantage of the fact that strict aliasing applies. The kind of optimization restrict allows in C.

Lundin‭ wrote almost 4 years ago

@Ayxan Haqverdili‭ In order to do that, the compiler would have to implement some crazy scheme to keep the types different and non-compatible internally. And for what purpose, just to break code unexpectedly? Compilers, most notably gcc, have already received a tonne of criticism for strict aliasing abuse in the early 2000s and for attempting to become an ISO compliant death station rather than a quality implementation that is useful for its users. I very much doubt they want to go back to that.
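
The aliasing question debated in this thread can be sketched in code. The example assumes that uint8_t is a typedef for unsigned char, as on mainstream compilers, so the uint8_t pointer may legally alias the int. The function names store_then_read and aliasing_is_observed are hypothetical. With a distinct non-character type such as the hypothetical __u8, a compiler would be free to assume the two pointers never overlap and return 0 directly.

```cpp
#include <cassert>
#include <cstdint>

// Because a character-type pointer may alias any object, the compiler
// cannot assume that the store through `a` leaves `*b` unchanged.
int store_then_read(std::uint8_t* a, int* b) {
    assert(a != nullptr && b != nullptr);
    *b = 0;
    *a = 1;      // may modify one byte of *b
    return *b;   // must be re-read from memory, not folded to 0
}

// Hypothetical driver: both parameters point into the same int.
bool aliasing_is_observed() {
    int x = 0;
    int result = store_then_read(reinterpret_cast<std::uint8_t*>(&x), &x);
    // The exact value depends on byte order, but it is never 0.
    return result != 0;
}
```

Calling aliasing_is_observed() returns true under the stated assumption, because the byte written through the uint8_t pointer is visible in the int that is read back.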