Welcome to Software Development on Codidact!


Where to place digit separators in C23?

+7
−0

C23 introduces the digit separator ' which can be placed between any two digits of an integer constant for the purpose of clarity and self-documenting code. The separators are otherwise ignored by the compiler when determining the value of the number.

However, the language standard provides no guidance on how to use digit separators sensibly. They were introduced to C with proposal N2626, which provides no guidance either. For example, it suggests that 2'3434'5323 might be clearer to read than 234345323, which I, as a frequent user of engineering notation, don't quite agree with. I believe the same feature was introduced in C++14, but with no guidance there either.

Are we to add ' at a whim, or are there any recommended practices to follow?


1 comment thread

Out of curiosity, do you know why they chose this character? I'm asking because many languages use `_... (3 comments)

2 answers

+7
−0

Since this is all new, there might still be time to establish a consensus before this style feature, too, ends up "all over the place" (like upper/lower case hex, upper/lower case integer constant suffixes etc).

Luckily we can lean on established computer science in this case: there are already best engineering practices for how to write numbers in various bases. Applying those existing practices, we end up with something like this:


Decimal integer/floating point constants (base 10)

Since programming falls under the domain of engineering, these should respect engineering notation, which means that decimal values are conveniently expressed in multiples of 10^3 or 10^-3. That is: tera, giga, mega, kilo, milli, micro, nano, pico and so on.

// appropriate style examples:
1'000'000
1'000'000.0
0.000'000
.000'000

// BAD style examples, do not use:
1'0000'0000
1'2'3
12'34'56
12.34'56'78

Binary constants (base 2)

Binary numbers are by convention always grouped either by nibbles or bytes. Grouping them by any larger unit becomes unreadable, and grouping them by anything that isn't a multiple of 4 bits is senseless, except where the number of bits isn't divisible by 4. In that case, the remaining bits are placed to the left.

// appropriate style examples:
0b0000'0000'0000'0000
0b00000000'00000000
0b10'1010'1010

// BAD style examples, do not use:
0b00'00'00'00
0b0000000000000000'0000000000000000
0b1010'1010'10
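
The "remaining bits go to the left" rule can be checked mechanically. The following is only a minimal sketch under my own assumptions: `grouped_from_right` is a hypothetical helper name, it takes the digits with the `0b` prefix already stripped, and it does not verify that the characters are valid digits for the base.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of any standard library).
 * Returns true when `digits` is split by ' into groups of exactly
 * `group` digits, with any shorter remainder group at the far left.
 * A string with no separator only passes if it fits in one group. */
bool grouped_from_right(const char *digits, size_t group)
{
    assert(digits != NULL && group > 0);

    size_t run = 0; /* digits counted in the current group, scanning right to left */

    for (size_t i = strlen(digits); i-- > 0; ) {
        if (digits[i] == '\'') {
            /* Every group to the right of a separator must be full. */
            if (run != group)
                return false;
            run = 0;
        } else {
            run++;
        }
    }

    /* The leftmost group may be shorter, but not empty or oversized. */
    return run >= 1 && run <= group;
}
```

With a group size of 4, `"10'1010'1010"` passes while `"1010'1010'10"` and `"00'00'00'00"` are rejected, which matches the style examples above.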

Hexadecimal constants (base 16)

Hex might be grouped in several different ways. Sometimes it makes sense to group it at byte level, sometimes as 16-bit words. 32-bit words without digit separators are harder to read, and separating individual nibbles doesn't make sense either. For numbers whose width isn't a multiple of 16 bits, the remaining bits are placed to the left.

// appropriate style examples:
0x00'00'00'00
0x0000'0000
0xAA'BBCC

// BAD style examples, do not use:
0x0'0'0'0
0x0000000000000000'0000000000000000
0xAABB'CC
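
If you generate constants (for code generators or debug output), the conventions for all three bases reduce to "count groups from the right". A minimal sketch, assuming a hypothetical helper `group_digits` of my own naming that works on a digit string without its `0x`/`0b` prefix:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of any standard library).
 * Copies `digits` into `out`, inserting ' so that every group counted
 * from the right holds `group` digits; leftover digits stay on the left.
 * Returns the length written, or 0 if `out` (capacity `cap`) is too
 * small. Empty input also yields 0. */
size_t group_digits(const char *digits, size_t group, char *out, size_t cap)
{
    assert(digits != NULL && out != NULL && group > 0);

    size_t len = strlen(digits);
    size_t seps = (len > 0) ? (len - 1) / group : 0;
    if (len + seps + 1 > cap)
        return 0;

    size_t o = 0;
    for (size_t i = 0; i < len; i++) {
        size_t remaining = len - i; /* digits left, including this one */
        if (i > 0 && remaining % group == 0)
            out[o++] = '\''; /* a full group starts here */
        out[o++] = digits[i];
    }
    out[o] = '\0';
    return o;
}
```

Called with a group size of 4, `"AABBCC"` becomes `AA'BBCC`; with 2 it gives byte-level grouping, and with 3 on decimal digits it gives the thousands grouping from the first section.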

2 comment threads

Good idea, but maybe units of four for binary and hex. (1 comment)
How do you suggest formatting `200819` in `#define _POSIX_C_SOURCE 200819`? (2 comments)
+4
−1

This probably makes for a thoroughly unsatisfying answer, but there's probably far stronger cultural pressure than technical pressure. In European-derived cultures, we mostly group numbers by powers of one thousand as you suggest, and anybody doing something else should either have an extremely good reason ("it actually represents a series of decimal values, but we store them together, because we learned programming in 1967") or would get laughed out of any code review.

However, people in East Asian cultures group digits by powers of ten thousand in speech, even though they'll (usually) write numbers following European conventions. India generally uses pairwise separation, except for the final three digits. I assume that other approaches exist, but these are the ones commonly cited.
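
To make the pairwise convention concrete, here is a rough sketch of how it differs from thousands grouping. The helper name `group_indian` is my own invention for illustration, and it assumes a plain string of decimal digits with no sign or fraction.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of any standard library).
 * Groups a decimal digit string the Indian way: the final three digits
 * form one group, and the digits to their left are grouped in pairs.
 * Returns the length written, or 0 if `out` (capacity `cap`) is too
 * small. Empty input also yields 0. */
size_t group_indian(const char *digits, char *out, size_t cap)
{
    assert(digits != NULL && out != NULL);

    size_t len = strlen(digits);
    size_t seps = (len > 3) ? (len - 3 + 1) / 2 : 0;
    if (len + seps + 1 > cap)
        return 0;

    size_t o = 0;
    for (size_t i = 0; i < len; i++) {
        size_t remaining = len - i; /* digits left, including this one */
        /* Separator before the final three digits and before each pair
         * to the left of them. */
        if (i > 0 && remaining >= 3 && (remaining - 3) % 2 == 0)
            out[o++] = '\'';
        out[o++] = digits[i];
    }
    out[o] = '\0';
    return o;
}
```

For `"1234567"` this yields `12'34'567`, where thousands grouping would give `1'234'567`. Going by the question, the compiler treats both spellings as the same value.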

So in power-of-two bases, it's probably safe to assume that some natural word-multiple boundary could and should become a strongly encouraged convention. Eight digits in binary, three in octal, and four in hexadecimal seem consistent with what I've seen over the years, though as another answer points out, architecture may easily figure in here. For decimal, however, we'd want to take care that a convention doesn't ask billions of people to write code that's less readable for them.


1 comment thread

Regarding culture (3 comments)
