Comments on Where to place digit separators in C23?

Parent

Where to place digit separators in C23?

+7
−0

C23 introduces the digit separator ' which can be placed between digits inside an integer constant for the purpose of clarity and self-documenting code. Separators are otherwise ignored by the compiler when determining the value of the number.

However, the language standard provides no guidance on how to use digit separators sensibly. They were introduced to C with proposal N2626, which provides no guidance either: for example, it suggests that 2'3434'5323 might be clearer to read than 234345323, which I, as a frequent user of engineering notation, don't quite agree with. I believe the same feature was introduced in C++14, but with no guidance there either.

Are we to add ' at a whim, or are there any recommended practices to follow?

1 comment thread

Out of curiosity, do you know why they chose this character? I'm asking because many languages use `_... (3 comments)
Post
+7
−0

Since this is all new, there might still be time to establish a consensus before this style feature too ends up "all over the place" (like upper/lower case hex, upper/lower case integer constant suffixes etc).

Luckily we can lean on established computer science in this case - there are already best engineering practices for how to write numbers in various bases. Applying those existing practices, we end up with something like this:


Decimal integer/floating point constants (base 10)

Since programming sorts under the domain of engineering, these should respect engineering notation, which means that decimal values are conveniently expressed in multiples of 10^3 or 10^-3. That is: tera, giga, mega, kilo, milli, micro, nano, pico and so on.

// appropriate style examples:
1'000'000
1'000'000.0
0.000'000
.000'000

// BAD style examples, do not use:
1'0000'0000
1'2'3
12'34'56
12.34'56'78
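
As a rough sketch of the decimal rule above, the helper below inserts ' every three digits, counting leftwards from the decimal point for the integer part and rightwards from it for the fraction. The name `group_decimal` is hypothetical and not part of any standard; it assumes the input holds only digits and at most one '.', with no exponent or suffix, and it is written without digit separators in its own source so that it also builds with pre-C23 compilers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (an illustration, not a standard function).
   Copies `text` into `out`, adding ' so that digits form groups of three
   on both sides of the decimal point.
   Assumes `text` contains only digits and at most one '.'.
   Returns 0 on success, -1 if `out` is too small. */
int group_decimal(const char *text, char *out, size_t out_size)
{
    const char *dot = strchr(text, '.');
    size_t int_len = dot ? (size_t)(dot - text) : strlen(text);
    size_t frac_len = dot ? strlen(dot + 1) : 0;

    /* Exact size: all characters, the separators, and the terminator. */
    size_t needed = int_len + frac_len + (dot ? 1 : 0)
                  + (int_len ? (int_len - 1) / 3 : 0)
                  + (frac_len ? (frac_len - 1) / 3 : 0) + 1;
    if (out_size < needed)
        return -1;

    size_t o = 0;

    /* Integer part: groups are aligned to the right-hand end. */
    for (size_t i = 0; i < int_len; i++) {
        if (i != 0 && (int_len - i) % 3 == 0)
            out[o++] = '\'';
        out[o++] = text[i];
    }

    /* Fraction: groups are aligned to the decimal point. */
    if (dot) {
        out[o++] = '.';
        for (size_t i = 0; i < frac_len; i++) {
            if (i != 0 && i % 3 == 0)
                out[o++] = '\'';
            out[o++] = dot[1 + i];
        }
    }
    out[o] = '\0';
    return 0;
}
```

Given a large enough buffer, `group_decimal("1000000.0", buf, sizeof buf)` writes `1'000'000.0` and `group_decimal(".000000", buf, sizeof buf)` writes `.000'000`, matching the appropriate style examples.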

Binary constants (base 2)

Binary numbers are by convention always grouped either by nibbles or bytes. Grouping them by any larger unit will become unreadable. Grouping them by anything other than multiples of 4 bits is senseless, except for cases where you have a number of bits not divisible by 4. In that case, remaining bits are placed to the left.

// appropriate style examples:
0b0000'0000'0000'0000
0b00000000'00000000
0b10'1010'1010

// BAD style examples, do not use:
0b00'00'00'00
0b0000000000000000'0000000000000000
0b1010'1010'10

Hexadecimal constants (base 16)

Hex might be grouped in several different ways. Sometimes it might make sense to group it on byte level, sometimes as 16 bit words. 32 bit words without digit separators are harder to read. Breaking up nibbles doesn't make sense either. In case of numbers whose width isn't a multiple of 16 bits, the remaining digits are placed to the left.

// appropriate style examples:
0x00'00'00'00
0x0000'0000
0xAA'BBCC

// BAD style examples, do not use:
0x0'0'0'0
0x0000000000000000'0000000000000000
0xAABB'CC
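
The "remaining bits are placed to the left" rule for binary and hex can be sketched the same way: keep the two-character prefix, then group the digits from the right. The helper name `group_prefixed` and its parameters are hypothetical, chosen only for illustration; it assumes a `0b` or `0x` prefix followed by valid digits, and it avoids digit separators in its own source so it does not require a C23 compiler.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (an illustration, not a standard function).
   Copies a 0b/0x constant into `out`, adding ' so that groups of `group`
   digits are aligned to the right; leftover digits end up on the left.
   Assumes the characters after the prefix are valid digits for the base.
   Returns 0 on success, -1 on bad arguments or if `out` is too small. */
int group_prefixed(const char *literal, size_t group, char *out, size_t out_size)
{
    size_t len = strlen(literal);
    if (group == 0 || len < 3 || literal[0] != '0'
        || strchr("bBxX", literal[1]) == NULL)
        return -1;

    const char *digits = literal + 2;
    size_t n = len - 2;

    /* Prefix, digits, one separator per group boundary, terminator. */
    if (out_size < 2 + n + (n - 1) / group + 1)
        return -1;

    size_t o = 0;
    out[o++] = literal[0];
    out[o++] = literal[1];
    for (size_t i = 0; i < n; i++) {
        /* Separator goes wherever a whole number of groups remains. */
        if (i != 0 && (n - i) % group == 0)
            out[o++] = '\'';
        out[o++] = digits[i];
    }
    out[o] = '\0';
    return 0;
}
```

With a group size of 4, `0b1010101010` becomes `0b10'1010'1010` and `0xAABBCC` becomes `0xAA'BBCC`; with a group size of 2, `0x00000000` becomes `0x00'00'00'00`.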

2 comment threads

Good idea, but maybe units of four for binary and hex. (1 comment)
How do you suggest formatting `200819` in `#define _POSIX_C_SOURCE 200819`? (2 comments)
Good idea, but maybe units of four for binary and hex.
hackerb9‭ wrote 4 months ago

I think this is a good idea. What standards do you compare this proposal to? Is there an IEEE/ISO standard for where to put commas in base ten numbers?

I notice that this is, on the surface, similar to "grouping" in locales which is famously flexible in certain ways (varying number of digits between separators) and inflexible in others (no separators for the numbers after the decimal point). See POSIX.1-2024. That makes me think that base-10 may be a can of worms and it might be better to focus on other bases.

I suggest being a bit more prescriptive about binary and hexadecimal separators by choosing just one recommended style. Perhaps a grouping of four digits would be good for both. It is large enough that the human brain can "chunk" it into pieces, but not so large that it exceeds our ability to perceive it quickly.