
Why is the new auto keyword from C++11 or C23 dangerous?

+7
−0

In older C and C++ standards, the auto keyword simply meant automatic storage duration. That is, the compiler automatically decides where the variable is stored, typically on the stack or in a register. It was a pretty useless keyword, since it can only be used at local scope, where all variables default to automatic storage duration anyway.

The C++11 committee decided to change the meaning of this keyword so that during declaration, the type is picked based on the initializer(s) provided. For example auto i=0; will result in int because the integer constant 0 is of type int.

As I understand it, the main rationale was to get rid of cumbersome declarations in for loops in particular.

for(auto i = cont.begin(); ...

is admittedly easier for the eye than

for(std::vector<std::string>::iterator i = cont.begin(); ...

However, veteran programmers seem to raise concerns about auto being unsafe. It seems to be a topic where there are plenty of personal opinions, as seen over at SO: How much is too much with C++11 auto keyword? Some people just happily encourage "go for it everywhere". Others, including various well-known C++ gurus, speak in favour of using it with caution.

Now C too is adopting the same functionality of auto as C++, as per C23.

What exactly is dangerous with the auto keyword?


1 comment thread

The question just assumes the `auto` keyword is "dangerous". Are there examples that are "dangerous" ... (3 comments)
Post
+6
−0

The auto feature was indeed mainly meant to solve long cumbersome template container declarations in C++. But when introduced to C23 - where there are no templates let alone template containers - it just ends up as a solution without any problem that it solves.

auto can create new problems just fine, however! And that goes for C and C++ both, although this answer will mainly focus on C where the feature is just about to get introduced. In C++ you can use auto as long as you know what you are doing and it is done with caution.

The only problem that the language committee(s) seem to have considered was backwards compatibility with the previous use of auto. C++20 (annex C), about compatibility, notes for example that using auto as a classic storage class specifier when no initializers are present is problematic. But I think that scenario is the least concerning use of auto. The main problem lies in how it behaves as a new feature.

The problem with the new use of the auto keyword is that the actual type of the initializer is often not obvious. In many cases you won't even know which type you actually ended up with, which is often something very important to know. A lot of these problems are caused by well-known design mistakes and old language bugs in C, where adding auto to the pot makes things even worse.

In general, when we write an initializer which is wrong for whatever reason, we like to be informed by the compiler that we messed up, rather than having the code silently accepted. This is the very reason why horribly dangerous language features like "implicit int" were removed from C ages ago.


Old, well-known language problems in C colliding with new language problems in C23
auto is particularly problematic in C23 because C has not come as far as C++ in correcting old sins of the past. For example auto ch = 'A' will give you a char in C++ but an int in C.

Or when dealing with boolean logic, something like auto a = b && c; will give you a bool in C++ but an int in C, even if b and c happen to be bool operands.

Similarly, auto ptr = NULL may give you an int rather than a void* in both languages. Both languages supposedly encourage the use of nullptr instead, but there's a whole lot of old code out there using NULL.

Re-writing the old malloc(n * sizeof(*ptr)) trick will also suffer as it can't be written as auto ptr = malloc(n * sizeof(*ptr));

Having some typedef enum { A } a; and then auto x = A; will result in an int and not an a, even though a may be a smaller integer type than int.

Except when you use the new enum feature in C23 and do typedef enum : int8_t { A } a;. Now auto x = A; suddenly results in an a type.
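Some of the claims above can be checked without a C23 compiler. A minimal sketch, assuming that C11 `_Generic` applies the same conversions to its controlling expression as `auto` applies to its initializer; the `DEDUCED_TYPE` macro and the function names are hypothetical helpers, not part of any standard:

```c
#include <stdbool.h>

/* Hypothetical helper: names the type of an expression as seen by _Generic,
   which is assumed here to match what C23 `auto` would deduce. */
#define DEDUCED_TYPE(expr) _Generic((expr), \
    bool:    "bool",                        \
    char:    "char",                        \
    int:     "int",                         \
    default: "other")

typedef enum { A } a;   /* same declaration as in the text above */

/* Character constants have type int in C (char in C++). */
const char *type_of_char_literal(void)
{
    return DEDUCED_TYPE('A');
}

/* The result of && is int in C, even with bool operands. */
const char *type_of_logical_and(void)
{
    bool b = true, c = false;
    return DEDUCED_TYPE(b && c);
}

/* An enumeration constant without a fixed underlying type is an int. */
const char *type_of_enum_constant(void)
{
    return DEDUCED_TYPE(A);
}
```

Compiled as C, all three functions report "int", which is the type an `auto` declaration would silently end up with.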


Const/qualifier correctness
Another sin of the past would be that auto ptr = "hello" leads to a char* in C and not a const char* as in C++.

Well we can fix that easily enough, we just write const auto ptr or auto const ptr right? Not quite... Just as in the case of hiding a pointer behind a typedef, we end up with a char* const and not a const char* as was the intention.

So it simply turns out that you can't meaningfully combine auto and const in C. Meaning you can't have auto and const correctness at the same time.
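The typedef analogy can be demonstrated in plain C11. This is only a sketch of the typedef case, under the assumption (as described above) that `const auto` places the qualifier the same way; `string_ptr` and `describe_const_typedef` are made-up names for illustration:

```c
typedef char *string_ptr;   /* hypothetical typedef hiding a pointer */

const char *describe_const_typedef(void)
{
    char buffer[] = "hello";
    const string_ptr p = buffer;   /* const applies to the pointer itself */

    /* &p has type `char *const *` if p is `char *const`,
       and would be `const char **` if p were `const char *`. */
    return _Generic(&p,
        char *const *: "char *const",
        const char **: "const char *",
        default:       "other");
}
```

The function reports "char *const": the pointer is read-only, but the characters it points at are still writable.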


Subtle type rules
auto is particularly nasty when used in low-level programming, together with certain operators, resulting in another type and/or signedness than expected.

Consider something like this:

unsigned int i = 1+1;
i = ~i;
printf("%#x\n", i); // prints 0xfffffffd 
i += 3;
printf("%#x\n", i); // prints 0

That's well-defined code. Now how about auto...

auto i = 1+1;
i = ~i;
printf("%#x\n", i); // undefined behavior, wrong conversion specifier
i += 3;             // signed arithmetic: -3 + 3 is 0, no unsigned wrap-around
printf("%#x\n", i);

Oops. Well how about this?

auto i = 0xFFFFFFFF;
i = ~i;
printf("%#x\n", i); // well-defined, prints 0
i -= 3;             // well-defined
printf("%#x\n", i); // well-defined, prints 0xfffffffd

A slip of the type used by the initializer can obviously have major consequences and tracking down the root cause of that bug may not be easy.

auto f = true ? 1.0f : 0.0; is another example of the subtle type conversion rules of C. Here the two result operands are balanced to a common type, so f ends up as double, which might not have been expected.

Something like auto c = a | b; where a and b are bool, char or unsigned short etc will result in c becoming an int in both C and C++ due to integer promotion.

In case of short a = 1; auto b = -a; we might have expected b to also become short and not int.

And so on.
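The types in this section can be inspected with C11 `_Generic`, again assuming it sees the same expression type that `auto` would deduce. The `TYPE_NAME` macro and the function names below are hypothetical, and the result for the hex constant depends on the width of int:

```c
#include <stdbool.h>

/* Hypothetical helper naming the type of an expression. */
#define TYPE_NAME(expr) _Generic((expr), \
    short:        "short",               \
    int:          "int",                 \
    unsigned int: "unsigned int",        \
    float:        "float",               \
    double:       "double",              \
    default:      "other")

/* 1+1 is a plain signed int. */
const char *type_of_sum(void)
{
    return TYPE_NAME(1 + 1);
}

/* float and double operands are balanced to double. */
const char *type_of_conditional(void)
{
    return TYPE_NAME(true ? 1.0f : 0.0);
}

/* Unary minus promotes a short operand to int. */
const char *type_of_negated_short(void)
{
    short a = 1;
    return TYPE_NAME(-a);
}

/* Assumption: with a 32-bit int this constant does not fit in int,
   so it becomes unsigned int. Other int widths give other types. */
const char *type_of_hex_constant(void)
{
    return TYPE_NAME(0xFFFFFFFF);
}
```

On a typical platform with 32-bit int, the four functions report "int", "double", "int" and "unsigned int" respectively.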


Wrong initializer by mistake
When dealing with more complex declarations like 2D arrays and pointers to them, a simple slip of the finger can silently result in the wrong type.

int arr [2][2];
auto p1 = arr;
auto p2 = *arr;
auto p3 = &arr;

Here p1 is int(*)[2] (array decayed), p2 is int* (array decayed) and p3 is int (*)[2][2] (array did not decay). A simple miss of * or & will lead to a very different type.

Now had we typed this out explicitly, like int (*p1)[2] = &arr, then we would get a compiler message informing us that we typed & when we shouldn't have. In case of auto anything goes and the program might compile cleanly, but with a different result.

Also throw type qualifiers into the declaration on top of that and we are guaranteed to have a complete mess if we use auto.
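The three pointer types from the array example can be confirmed in C11. A minimal sketch, assuming the compiler applies array decay to the controlling expression of `_Generic` (current GCC and Clang do); `POINTER_KIND` and the function names are hypothetical:

```c
/* Hypothetical helper naming the pointer type of an expression. */
#define POINTER_KIND(expr) _Generic((expr), \
    int *:         "int *",                 \
    int (*)[2]:    "int (*)[2]",            \
    int (*)[2][2]: "int (*)[2][2]",         \
    default:       "other")

int arr[2][2];   /* same array as in the example above */

/* arr decays to a pointer to its first row. */
const char *kind_of_arr(void)
{
    return POINTER_KIND(arr);
}

/* *arr is the first row, which decays to a pointer to its first int. */
const char *kind_of_deref_arr(void)
{
    return POINTER_KIND(*arr);
}

/* &arr does not decay: it is a pointer to the whole 2D array. */
const char *kind_of_addr_arr(void)
{
    return POINTER_KIND(&arr);
}
```

One character of difference in the initializer selects a different branch each time, and nothing about an `auto` declaration would flag the slip.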


Known problems in C23
The C23 standard notes under the 6.7.10 Type inference chapter that using auto together with anonymous struct/union declarations would cause implementation-defined behavior as the declared variable and its members may end up in the tag namespace, rather than the ordinary namespace as may have been expected.


The (lack of) rationale why auto was added to C23

auto was added as per proposal N3007. The main reason appears to be making C in sync with C++. However, in C++ auto is somewhat handy and actually solves a few problems, as previously mentioned. Whereas the "rationale", if there ever was one, in N3007 boils down to subjective statements like:

However when the definition includes an initializer, it makes sense to derive this type directly from the type of the expression used to initialize the variable.

As we can see from the numerous examples I made above, deriving the type from the initializer does not obviously make sense. At all.

Or worse:

...obvious convenience for programmers who are perhaps too lazy to lookup the type

Oh come on! If they are too lazy for proper engineering they should maybe consider a different career. Maybe their boss ought to help them out with a swift career change even!

Or just maybe they should start using a programming IDE that does this for them, by a single keystroke or a few mouse clicks. Such IDEs became popular in the 1990s; it's hardly a new tool for the average programmer out there.


Recommended usage

In C++, it is recommended to use auto to make long object type declarations readable, where you don't really care about the exact type. Particularly when reaching for an iterator or a returned type from a member function in some verbose template class.

In C, it is not recommended to use auto at all, because it only serves to create problems. It is a poorly researched and poorly implemented feature.

If anyone can actually give a non-subjective example of when it makes sense to use auto to clearly improve everyday C code, I will certainly reconsider.


3 comment threads

auto is useful in C for avoiding double-evaluation in macros (6 comments)
C++ actually has an answer to the const pointer problem - you can write `const auto* ptr = &x;` just ... (1 comment)
Note to self: stick to C11. (1 comment)
auto is useful in C for avoiding double-evaluation in macros
alx‭ wrote 3 months ago

Re: "If anyone can actually give a non-subjective example of when it makes sense to use auto to clearly improve everyday C code, I will certainly reconsider.":

I fully agree with your reasons to consider auto dangerous and avoid it as much as possible.

However, there's one case where auto is useful. Preventing double evaluation in macros.
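For readers unfamiliar with the problem, here is a minimal sketch of double evaluation in standard C. `NAIVE_MIN`, `next_value` and the counting functions are hypothetical names made up for this illustration:

```c
/* Naive macro: each argument appears twice in the expansion. */
#define NAIVE_MIN(x, y)  ((x) < (y) ? (x) : (y))

static int calls;   /* counts how often next_value() runs */

static int next_value(void)
{
    calls++;
    return calls;
}

/* A function evaluates each argument exactly once. */
static int min_int(int x, int y)
{
    return (x < y) ? x : y;
}

/* How many times next_value() runs for one use of the naive macro. */
int count_evaluations_macro(void)
{
    calls = 0;
    (void)NAIVE_MIN(next_value(), 100);
    return calls;
}

/* How many times next_value() runs for one call of the function. */
int count_evaluations_function(void)
{
    calls = 0;
    (void)min_int(next_value(), 100);
    return calls;
}
```

The macro version runs next_value() twice and the function version once. Copying each macro argument into a local variable, whether declared with auto or typeof, is one way to get the single evaluation while staying type-generic.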

alx‭ wrote 3 months ago · edited 3 months ago

Here's my implementation of swap():

#define is_same_type(a, b)    __builtin_types_compatible_p(a, b)
#define is_same_typeof(a, b)  is_same_type(typeof(a), typeof(b))

#define SWAP(ap, bp)  do                                              \
{                                                                     \
	auto          ap_ = ap;                                       \
	auto          bp_ = bp;                                       \
	typeof(*ap_)  tmp_;                                           \
								      \
	static_assert(is_same_typeof(ap_, bp_), "");                  \
								      \
	tmp_ = *ap_;                                                  \
	*ap_ = *bp_;                                                  \
	*bp_ = tmp_;                                                  \
} while (0)

I use a GNU C builtin for some extra added type safety, but you could ignore that part. I can probably rewrite that part with _Generic(3).

alx‭ wrote 3 months ago · edited 3 months ago

And here's my implementation of MIN(3) and MAX(3):

#define MIN(x, y)                                                     \
({                                                                    \
	auto  x_ = x;                                                 \
	auto  y_ = y;                                                 \
                                                                      \
	(x_ < y_) ? x_ : y_;                                          \
})

#define MAX(x, y)                                                     \
({                                                                    \
	auto  x_ = x;                                                 \
	auto  y_ = y;                                                 \
                                                                      \
	(x_ > y_) ? x_ : y_;                                          \
})

This one uses a GNU C extension, ({}), which I hope gets into ISO C some day.

Lundin‭ wrote 2 months ago · edited 2 months ago

alx‭ I don't see why auto ap_ = ap; is any better than typeof(ap) ap_ = ap; though. And in either case you risk getting parameters that have gone through implicit conversion by accident, so this would be better off as a function.

alx‭ wrote 2 months ago · edited 2 months ago

Lundin‭ It has been hard to come up with an input that would trigger double evaluation and would be valid input to MIN() and MAX(), but here it is:

$ cat auto.c 
#include <stdio.h>

int
main(void)
{
	int  i = 3;
	int  j[2 * i];
	int  (*p)[2 * i];

	printf("%d\n", i);

	p = &j;
	typeof(p + i++) x = p + i++;
	printf("%d\n", i);

	auto y = p + i++;
	printf("%d\n", i);
}
$ cc -Wall -Wextra -std=gnu23 auto.c -Wno-unused-variable
auto.c: In function ‘main’:
auto.c:13:34: warning: operation on ‘i’ may be undefined [-Wsequence-point]
   13 |         typeof(p + i++) x = p + i++;
      |                                 ~^~
auto.c:16:23: warning: operation on ‘i’ may be undefined [-Wsequence-point]
   16 |         auto y = p + i++;
      |                      ~^~

(The second diagnostic seems bogus; I'll report it to GCC.)

$ ./a.out 
3
5
6
alx‭ wrote 2 months ago · edited 2 months ago

Lundin‭ Re: implicit conversions and using functions instead:

To SWAP(), I pass pointers, so I don't think implicit conversions should be a problem here.

To MIN() and MAX(), implicit conversions are usually not a problem, as long as the original types are of the same sign. The compiler should issue warnings about sign mismatch. In any case, I could add some static assertion to check that they have the same sign. Something like the following should make MIN/MAX safe enough:

static_assert(is_signed(a) == is_signed(b), "");

But the point on not having a function is that I can reuse the macro for every integer (or even pointer) type.

I am only slightly concerned about implicit conversions here, I think.