


What is C23 and why should I care?

+18
−0

The C language has gone through many iterations and the latest one not yet released is informally called "C23", supposedly because they hoped to release it in 2023. The latest draft version N3096 is available for free from the ISO C working group.

The previous standard C17 (ISO 9899:2018) didn't really affect the average C programmer. It was a minor revision and mostly concerned with language error fixes in non-trivial areas of the language, which are likely of peripheral interest to the average user.

The version before that, C11, was a bigger revision but probably went by mostly unnoticed by the average C programmer as well. Some might have picked up _Generic, anonymous structs, static asserts and similar features.

What is new in C23 and how will it affect the average programmer?

EDIT: C23 is now formally the current version of ISO C, as of October 2024.


Post
+15
−0

C23 will be a major revision, certainly the biggest one since C99. It contains lots of changes and new features, both big and small. The linked draft in the question contains a complete list of changes, but it might be a bit overwhelming to read unless you are a "C nerd", particularly if you aren't used to reading technical standards. There are a lot of error fixes, for example volatile access getting fixed and re-defined (it has actually been quite broken since the first C version). But this answer will focus on new features that might have an impact on the way the average C programmer writes code.

  • bool, false and true are now proper keywords in the language. No need to include stdbool.h (which will get phased out). _Bool will remain a keyword. The way C treats booleans does not change though - the result of logical/relational expressions (a == b, a && b, a < b and the like) will still be of type int and of value 1 or 0. But otherwise C will now mostly behave like C++ has always done.

  • Similarly static_assert, alignas, alignof and other keywords that exist as keywords in C++ but only existed through header inclusion in C are now proper keywords. static_assert also gets a little upgrade - the string provided as second argument to it is now optional. So it can be used just like old-school assert but is evaluated at compile-time instead of run-time.

  • nullptr is a new keyword also included mainly for C++ compatibility. It is a constant guaranteed to be a null pointer constant much like NULL has always been, but without all the subtle stuff like 0 being either a null pointer constant or an integer constant (see What's the difference between null pointers and NULL?). A new type nullptr_t was added and a nullptr constant is of this type. Compilers are already starting to implement error/warnings for cases where you pass a nullptr_t to a function - instead of silently accepting it.

    NULL is still available like before.

  • typeof keyword added to the language. Like many others of these features, this has been available as a non-standard extension for quite a while. Along with existing _Generic, it is handy for type safety and type-generic programming, designing with inheritance, maintaining troublesome function API in existing code in a safe manner and so on.

    The usage of typeof is quite easy - given some int x; you can declare another variable or expression as typeof(x) y; and that's the same as int y;. These are resolved at compile-time. There is also a corresponding typeof_unqual keyword which works the same but also discards qualifiers (const, volatile etc).

  • New keyword constexpr (which does not quite work as in C++) allows you to write initializer lists, compound literals and the like which will then definitely get treated as constant expressions. The exact impact of this is still fuzzy to me, since the mainstream compilers gcc and clang are the only ones so far with significant C23 support and both of these had questionable standard compliance prior to C23 when it comes to constant expressions. So I haven't yet been able to test it out with any confidence myself.

  • Binary integer constants (like 0b01010101) were added to the language. This has been a common compiler extension forever, but it's now part of the standard and portable. printf/scanf family of functions now support %b/%B.

  • Digit separator ' added to the language. The main rationale for not including binary constants in standard C was always because something like a 32 bit constant 0b10101010111010100010101110100011 is an unreadable mess and unhelpful for any purpose.

    So the pre-condition for allowing binary numbers was to also add a digit separator to the language - we can now write 0b10101010'11101010'00101011'10100011 or 0b1010'1010'1110'1010'0010'1011'1010'0011. Digit separators are purely aesthetic and are otherwise ignored by the compiler, so we can place them anywhere between digits. But they are also helpful for making large decimal or hex numbers more readable: 1'000'000 or 0xAA'BB'CC'DD.

  • Exotic signed number formats are removed. For various confused and historical reasons, C has until now supported exotic signed number formats such as one's complement and signed magnitude. These are no longer supported - the only signed format now supported is two's complement. Which is what 99.99% of all computers happen to use.

  • Function declarations without prototypes ("K&R style" functions) removed. The feature of declaring a function as void func() ("no prototype") and then later defining it as void func (x) int x;{} has been flagged by ISO as obsolescent since forever - it is now finally removed from the language. This will probably only affect those maintaining really old code bases, as nobody should have been using these in new code during the past 30 years.

    Until C23, void f() has been regarded as "a function accepting any parameter". In C23, this will finally become equivalent to "function taking no parameter - void".

    Because of this, the clang compiler enforced some nasty changes. In newer versions of the compiler, when compiling against an older C standard, clang will whine if you write something like int main() and say "error: a function declaration without a prototype is deprecated in all versions of C". This is true, even Dennis Ritchie who invented C back in the day recognized that this was a design flaw to be fixed in the future. But existing code might have a hard time if you compile it with new versions of clang as anything but C23.

  • Function attributes added to the language. This is a pretty big one, with quite diverse applications. Various compilers have non-standard syntax for "function attributes" and the like (gcc/GNU C in particular) and some of these are now standardized, as well as the syntax. Attributes aren't limited to functions but can be used for variables, types and a lot of other language items. The main purpose of the standardized attributes introduced with C23 is to enable better compiler diagnostics and self-documenting code. I won't go into all of these in detail but some of them are quite handy - see below.

  • Function attribute [[nodiscard]]. Prior to C23 there's a big quality of implementation issue with compilers for situations like calling the function important_t returns_important_info (void);. "PC-like" compilers like gcc and clang traditionally have a lax approach and won't complain if you call this function like returns_important_info(); and silently ignore the returned value. Higher-integrity compilers, such as those for embedded systems, will complain about that. On the other hand, the same compilers will also whine if you ignore the return value of printf etc.

    Declaring the function as [[nodiscard]] important_t returns_important_info (void); will ensure that the compiler always warns when someone tries to call the function as returns_important_info(); without handling the return value.

  • Similarly, the [[maybe_unused]] function attribute can be used for disabling diagnostics when some parameters or variables aren't used. A common trick is to cast such variables to (void) which is quite portable but clutters up the code.

    Also this attribute isn't restricted to variables but can be used for functions, enums etc. volatile int* my_register; ... int dummy = *my_register; is valid code that can be used for a read access that is discarded. Compilers might whine about dummy not getting used, which we can now block with [[maybe_unused]] int dummy = *my_register;.

  • The [[deprecated]] function attribute can be added to give the user a warning when they for example use some old function in your library that has been replaced with a better alternative but left around for backwards compatibility. This will probably be great for maintaining old code bases.

  • The [[fallthrough]] attribute can be used to explicitly document a certain shady practice of writing a switch-case where one case "falls through" to the next one without a break.

    For example if I write something fishy like switch(1){ case 1: puts("1"); case 2: puts("2"); } then any half-decent compiler should warn me "huh didn't you mean to have a break there?". In case we want something to fall through intentionally, we'd usually add a comment about it and then maybe go disable a specific compiler warning. Now we can write switch(1){ case 1: puts("1"); [[fallthrough]]; case 2: puts("2"); } and get self-documenting code as well as the warning suppressed all at once.

  • Write labels everywhere. Arguably nobody should be writing labels (example:) to begin with, since the main use of labels is goto spaghetti programming. However, case statements are also considered labels by C and this rule means that we can now finally declare variables inside case statements, without adding a {} compound statement.

    That sounds nice at first, though be aware that a variable declared inside a case without {} has the same scope as the whole switch. This means that such variables are accessible throughout the entire switch and that a switch with variable declarations inside now comes with the same undefined behavior problems as goto spaghetti "jumping across a variable declaration", for example switch(2){ case 1: int x=5; printf("%d",x); case 2:printf("%d",x); /* UNDEFINED BEHAVIOR! */ } So the previous best practice of always using {} for each case is still the best practice.

  • New functions strdup/strndup were added to string.h. These have been widely used non-standard extensions until now. These functions heap-allocate room for a copy of a string and then copy it. Until now the only portable way to do so was to type out all of char* dst = malloc(strlen(src)+1); strcpy(dst, src);.

  • A new function memccpy was added to string.h. This function copies any data until a certain byte has been found and is likely to be more efficient than the hand-made equivalent. Like memcpy it does not require data to be aligned, but does not support copying between overlapping data. It returns a null pointer if the character was not found within the specified n interval but copies n bytes of data still.

    This function might be handy in various scenarios like parsing input until a new line character has been found. And it is also the final nail in the coffin for the dangerous strncpy function (Is strcpy dangerous and what should be used instead?), since we can now use the safer, faster and less ambiguous memccpy(dst, src, '\0', n) in every situation where one might have previously been tempted to use strncpy(dst, src, n).

  • realloc(ptr,0) is now explicitly undefined behavior. This was always poorly-defined and discouraged, but compilers were allowed to handle this in compiler-specific ways until now.

  • New pre-processor directives #elifdef/#elifndef. Previously we had to write #elif defined(x).

  • New pre-processor directive #warning. This one has been available as a non-standard extension for quite a while, whereas #error has been standardized. Now both are available.

  • New pre-processor directive #embed. This is used for inclusion of the contents of "binary blob" files directly into the initialization lists of your source without having to involve make files and/or linker scripts/tricks.

  • The end (hopefully) to the debate about whether C should support variable-length arrays (VLA) as an optional or mandatory feature. Now declaring a VLA object is an optional feature, but declaring pointers to VLA ("variably-modified types") is a mandatory feature. This makes sense since VLA objects aren't often very helpful, as they tend to be allocated on the stack. But pointers to VLA are a very useful tool in many situations such as when declaring function interfaces or working with multi-dimensional arrays.

  • Bug fixes for a lot of library functions (search functions in particular): we can now pass a const-qualified pointer parameter to a library function and assign the result of the function to a similarly const-qualified pointer. These new features are named QChar and QVoid and are only used for char* and void* parameters to certain library functions.

    A lot of library functions in C have been broken by design since the dawn of time - for example it was not possible to write a string parser using strstr like: current = strstr(current, key) in case current was properly const correct as const char*... because strstr returns a char*. A lot of C library functions had this "broken by design" API which has now been corrected.

    So now these functions look like QChar *strstr(QChar *s1, const char *s2); where QChar or QVoid is a fix which means that the function will return the same type as was passed to s1.

  • Empty initialization lists such as int arr[3] = {}; are now part of the standard and equivalent to initializing everything to zero. This was a non-standard extension/C++-only syntax until now.

  • Trigraphs are removed from the language??! If you know what trigraphs are, then you know that they need to go. If you don't know what trigraphs are, then you are lucky! :) This should have no impact on the average program unless you are maintaining some old, ugly code base in a non-English country. (Digraphs are still there, for all your obfuscated C code needs.)

  • A new library stdbit.h (as of this moment I have not yet seen a compiler implement it alas). This library contains a way to check CPU endianness, and functions to perform various bit handling tricks like counting leading/trailing zeroes/ones in raw binary.

  • A new library stdckdint.h for checked integer arithmetic, meaning operations that perform integer overflow/underflow checks. The ones supported are ckd_add(), ckd_sub() and ckd_mul(), which are type-generic macros for integer types. They perform the operation, store the result through a pointer argument and return false if the stored result is mathematically correct, true in case of overflow/underflow.

  • __STDC_VERSION__ for C23 is confirmed to be 202311L. gcc/clang are so far using a placeholder value 202000L when you compile with -std=c2x.

    If I have missed something, do let me know. I would imagine that this post will be a work in progress and C23 is not yet formally released. The malloc/realloc parts in N3096 are for example filled with errors and inconsistencies even in the latest draft. Also the ISO wheels of bureaucracy grind slowly - rumors say that they might not be able to release it until 2024.
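As a hedged illustration of the memccpy point above: the function leaves the destination unterminated when the terminator is not found within n bytes, so a bounded string copy still has to handle truncation itself. The sketch below assumes a hypothetical helper name, copy_string, and a POSIX-style libc that already declares memccpy before C23 (hence the feature-test macro).

```c
/* On pre-C23 toolchains memccpy is a POSIX/XSI function, so request it. */
#define _XOPEN_SOURCE 700

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * copy_string is a hypothetical helper, not part of any standard.
 * It copies src into dst (capacity n bytes) and always null-terminates
 * dst when n > 0. Returns true if the whole string fit, false if it
 * had to be truncated (or if n is 0).
 */
bool copy_string(char *dst, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    if (n == 0) {
        return false;          /* no room even for the terminator */
    }

    /* memccpy stops right after copying the first '\0', or after n bytes. */
    if (memccpy(dst, src, '\0', n) != NULL) {
        return true;           /* terminator was copied: dst is complete */
    }

    dst[n - 1] = '\0';         /* terminator not found: truncate by hand */
    return false;
}
```

For example, with char buf[8], copy_string(buf, "abc", sizeof buf) returns true, while copy_string(buf, "0123456789", sizeof buf) returns false and leaves "0123456" in buf. Unlike strncpy, nothing past the terminator is zero-filled.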
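The return convention of the stdckdint.h macros is easy to get backwards, so here is a minimal sketch of wrapping ckd_add. The wrapper name safe_add is hypothetical. The fallback to __builtin_add_overflow is an assumption for toolchains that do not ship stdckdint.h yet: it is a GCC/Clang extension, not standard C, but it uses the same "true means overflow" convention.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Use the C23 header when the toolchain provides it. */
#if defined(__has_include)
#  if __has_include(<stdckdint.h>)
#    include <stdckdint.h>
#  endif
#endif

/* Assumption: fall back to the GCC/Clang builtin if ckd_add is missing. */
#ifndef ckd_add
#  define ckd_add(result, a, b) __builtin_add_overflow((a), (b), (result))
#endif

/*
 * safe_add is a hypothetical helper: it stores a + b in *sum and returns
 * true on success, false if the mathematical result does not fit in int.
 */
bool safe_add(int a, int b, int *sum)
{
    assert(sum != NULL);

    /* ckd_add returns true when the result did NOT fit, so invert it. */
    return !ckd_add(sum, a, b);
}
```

Calling safe_add(2, 3, &r) returns true with r == 5, whereas safe_add(INT_MAX, 1, &r) returns false. Inverting the result in one small wrapper keeps the "true means error" convention from leaking into calling code.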


{} initialize to 0 or NULL
alx‭ wrote about 1 year ago

{0} initializes to 0. {} initializes to 0, except pointers, which are initialized to NULL. It's pedantic, but in some unicorn implementations it may be meaningful.

Lundin‭ wrote about 1 year ago · edited about 1 year ago

alx‭ There is no difference. 0 is a null pointer constant. What applies during initialization of pointers is the rules of assignment C17 6.5.16: "- the left operand is an atomic, qualified, or unqualified pointer, and the right is a null pointer constant;". For incomplete initialization lists, 6.7.9 applies: "If an object that has static or thread storage duration is not initialized explicitly, then: - if it has pointer type, it is initialized to a null pointer" /--/ "If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration."

C23 will not change this.

Lundin‭ wrote about 1 year ago

And to be pedantic, nothing is ever initialized to NULL but to a null pointer. For confusion between NULL and null pointers, check out: What's the difference between null pointers and NULL?

alx‭ wrote about 1 year ago · edited about 1 year ago

D'oh! Yep, I meant a null pointer. You're right to be pedantic. :)

Hmm, sorry, for some reason I thought {0} was shorthand for memset(3). It isn't.

Nevertheless, the wording in the answer "equivalent to initializing everything to zero" is technically wrong, I'd say, since the word zero doesn't count as a null pointer.

Also, I learnt that {0} doesn't write to padding bits, whereas {} does.

https://thephd.dev/ever-closer-c23-improvements#consistent-warningless-and-intuitive-initialization-with--

Lundin‭ wrote about 1 year ago

alx‭ Stop reading fishy blogs (JeanHeyd Meneide? he should know better) and instead read the mentioned part in C17 6.7.9 §10 (C23 6.7.10 §11) regarding implicit initialization of variables with static storage duration. It clearly states "if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits." This was always the case, I have no idea what he is on about.

alx‭ wrote about 1 year ago · edited about 1 year ago

Agree. That blog seems wrong (I'll write to JeanHeyd later). This morning I was a bit dense, and didn't find the relevant paragraph. 6.7.9p21 seems to be it http://port70.net/%7Ensz/c/c11/n1570.html#6.7.9p21

So, it seems both {0} and {} do the right thing for pointers, and so does implicit initialization for objects with static storage duration (6.7.9p10).

alx‭ wrote about 1 year ago · edited about 1 year ago

Lundin‭ I got a quick response from JeanHeyd. The standard seems unclear in the wording, and a reasonable interpretation could be that (some) padding is not zeroed. I'll clarify below the quotes.

6.7.9p21 "If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate,"...

That is the case, as 0 can only represent the first member.

... "the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration."

With the remainder, we could understand everything that is not the first subobject (as we thought), or we could understand every subobject except for the first one, and not including padding. The latter interpretation is the one implemented by GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77992. Clang seems to differ, and zeroes the padding (which is valid even with that interpretation, as zero is a subset of garbage).

alx‭ wrote about 1 year ago · edited about 1 year ago

Here's an example:

struct s2 {
	int8_t   c;
	int32_t  i;
};

struct s1 {
	int8_t     c;
	int32_t    i;
	struct s2  s;
};

void f(void)
{
    struct s1  x = {0};
}

The padding within x itself would not be zeroed, but the padding within x.s would be zeroed (because x.s is part of "the remainder", and static rules say the padding is zeroed).
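For readers who want to check this on their own implementation, here is a small sketch using the same struct layout. The helper names are hypothetical. It measures the padding after the first member with offsetof and uses memset as one way to zero every byte, padding included, regardless of how the compiler treats {0}. Note that all-bits-zero is only guaranteed to be the value zero for integer members, which is all these structs contain, and that a later store to a member may leave the padding unspecified again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct s2 {
    int8_t   c;
    int32_t  i;
};

struct s1 {
    int8_t     c;
    int32_t    i;
    struct s2  s;
};

/* Padding bytes between members c and i of struct s1. Often 3 where
 * int32_t is 4-byte aligned, but the amount is implementation-defined. */
size_t s1_padding_after_c(void)
{
    return offsetof(struct s1, i) - (offsetof(struct s1, c) + sizeof(int8_t));
}

/* Hypothetical helper: zero every byte of *p, padding included. */
void s1_clear(struct s1 *p)
{
    assert(p != NULL);
    memset(p, 0, sizeof *p);
}

/* Returns 1 if every byte of the object representation is zero, else 0. */
int s1_all_bytes_zero(const struct s1 *p)
{
    const unsigned char *bytes = (const unsigned char *)p;

    for (size_t k = 0; k < sizeof *p; k++) {
        if (bytes[k] != 0) {
            return 0;
        }
    }
    return 1;
}
```

After s1_clear(&x), s1_all_bytes_zero(&x) reports 1 on any implementation. Running the same check right after struct s1 x = {0}; is a way to observe what a particular compiler does with the padding, though it proves nothing about what the standard requires.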

Lundin‭ wrote about 1 year ago

alx‭ That seems like an unreasonable interpretation. The whole purpose of initializing something "as if it has static storage duration" with {0} is that the object may then potentially get allocated in .bss (rather than .data), which leads to faster initialization. Any implementation doing a selective zero-initialization of struct members and purposely not touching padding bytes located in the middle of the struct is very fishy. "The standard allows us to produce slower code here so let's go for it!" Eeeh?

Lundin‭ wrote about 1 year ago

I would guess that the gcc rationale is rather the one from 6.2.6.1 "When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values.51)" Where the footnote says: "Thus, for example, structure assignment need not copy any padding bits." Assignment and initialization often follow the same rules in C, particularly so during parameter passing. Arguably, the 6.2.6.1 part isn't compatible with 6.7.9, given that initialization "stores" something.

alx‭ wrote about 1 year ago · edited about 1 year ago

Lundin‭

In the case of {0}, it looks rather weird. But imagine this structure:

auto struct s {
    int8_t   x;
    int32_t  y;
} s = {.x = foo, .y = bar};

Initializing the padding could make it slower. I agree that the standard could have special-cased {0}, since anyway it's already special (it works even if the first subobject is not a scalar).