

Why is the new auto keyword from C++11 or C23 dangerous?

+7
−0

In older C and C++ standards, the auto keyword simply meant automatic storage duration. As in the compiler automatically handles where the variable is stored, typically on the stack or in a register. And it was a pretty useless keyword since it can only be used at local scope, where all variables default to automatic storage duration anyway.

The C++11 committee decided to change the meaning of this keyword so that during declaration, the type is picked based on the initializer(s) provided. For example auto i=0; will result in int because the integer constant 0 is of type int.
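As a minimal C++ sketch of that rule (the variable names are made up for illustration), the deduced type follows the literal's own type, including its suffix, and can be checked at compile time:

```cpp
#include <type_traits>

// Hypothetical variables, named only for this illustration.
auto plain_zero    = 0;     // integer literal without suffix: int
auto unsigned_zero = 0u;    // the u suffix makes the literal unsigned int
auto double_zero   = 0.0;   // floating literal without suffix: double
auto float_zero    = 0.0f;  // the f suffix makes the literal float

static_assert(std::is_same<decltype(plain_zero), int>::value, "0 deduces int");
static_assert(std::is_same<decltype(unsigned_zero), unsigned int>::value, "0u deduces unsigned int");
static_assert(std::is_same<decltype(double_zero), double>::value, "0.0 deduces double");
static_assert(std::is_same<decltype(float_zero), float>::value, "0.0f deduces float");
```

Nothing on the left-hand side records which of these was intended, so a dropped suffix silently changes the variable's type.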

As I understand it, the main rationale was to get rid of cumbersome declarations in for loops in particular.

for(auto i = cont.begin(); ...

is admittedly easier for the eye than

for(std::vector<std::string>::iterator i = cont.begin(); ...

However, veteran programmers seem to raise concerns about auto being unsafe. It seems to be a topic where there are plenty of personal opinions, as seen over at SO: How much is too much with C++11 auto keyword? Some people just happily encourage "go for it everywhere". Others, including various well-known C++ gurus, speak in favour of using it with caution.

Now C too is adopting the same functionality of auto as C++, as per C23.

What exactly is dangerous with the auto keyword?


1 comment thread

The question just assumes the `auto` keyword is "dangerous". Are there examples that are "dangerous" ... (3 comments)

2 answers


+6
−0

The auto feature was indeed mainly meant to solve long cumbersome template container declarations in C++. But when introduced to C23 - where there are no templates let alone template containers - it just ends up as a solution without any problem that it solves.

auto can create new problems just fine, however! And that goes for C and C++ both, although this answer will mainly focus on C where the feature is just about to get introduced. In C++ you can use auto as long as you know what you are doing and it is done with caution.

The only problem that the language committee(s) seem to have considered was backwards compatibility with the previous use of auto. C++20 (annex C) on compatibility, for example, notes that using auto as a classic storage class specifier when no initializers are present is problematic. But I think that scenario is the least concerning use of auto. The main problem lies in how it behaves as a new feature.

The problem with the new use of the auto keyword is that the actual type of the initializer is often not obvious. In many cases you won't even know which type you actually ended up with, which is often something very important to know. A lot of these problems are caused by well-known design mistakes and old language bugs in C, where adding auto to the pot makes things even worse.

In general, when we write an initializer which is wrong for whatever reason, we like to be informed by the compiler that we messed up, rather than having the code silently accepted. This is the very reason why horribly dangerous language features like "implicit int" were removed from C ages ago.


Old, well-known language problems in C colliding with new language problems in C23
auto is particularly problematic in C23 because C has not come as far as C++ in correcting old sins of the past. For example auto ch = 'A' will give you a char in C++ but an int in C.

Or when dealing with boolean logic, something like auto a = b && c; will give you a bool in C++ but an int in C, even if b and c happen to be bool operands.

Similarly, auto ptr = NULL may give you an int rather than a void* in both languages. Both languages supposedly encourage the use of nullptr instead, but there's a whole lot of old code out there using NULL.

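Re-writing the old malloc(n * sizeof(*ptr)) trick will also suffer, as it can't be written as auto ptr = malloc(n * sizeof(*ptr)); since the type of ptr would have to be known inside its own initializer, and malloc returns void* in any case.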

Having some typedef enum { A } a; and then auto x = A; will result in an int and not an a. Where a may be a smaller integer type than int.

Except when you use the new enum feature in C23 and do typedef enum : int8_t { A } a;. Now auto x = A; suddenly results in an a type.
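The C++ half of these comparisons can be checked at compile time. The following is only a C++ sketch with made-up variable names; the C results described above differ and are noted in the comments rather than tested here:

```cpp
#include <type_traits>

// Hypothetical enum mirroring the `typedef enum { A } a;` example.
enum a { A };

bool b = true, c = false;

auto ch   = 'A';     // character literal: char in C++ (int in C)
auto both = b && c;  // logical AND: bool in C++ (int in C)
auto x    = A;       // enumerator: the enum type a in C++ (int in C for a plain enum)

static_assert(std::is_same<decltype(ch), char>::value, "'A' is char in C++");
static_assert(std::is_same<decltype(both), bool>::value, "&& yields bool in C++");
static_assert(std::is_same<decltype(x), a>::value, "enumerator has the enum type in C++");
```

The same three lines of source therefore declare three different types depending on which language compiles them.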


Const/qualifier correctness
Another sin of the past would be that auto ptr = "hello" leads to a char* in C and not a const char* as in C++.

Well we can fix that easily enough, we just write const auto ptr or auto const ptr right? Not quite... Just as in the case of hiding a pointer behind a typedef, we end up with a char* const and not a const char* as was the intention.

So it simply turns out that you can't meaningfully combine auto and const in C. Meaning you can't have auto and const correctness at the same time.
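For comparison, here is a small C++ sketch (the names buffer, top and low are hypothetical) showing both outcomes: const auto makes the pointer itself const, while the C++-only spelling const auto* puts the qualifier on the pointee, as one of the comments on this answer points out.

```cpp
#include <type_traits>

char buffer[] = "hello";    // hypothetical mutable buffer

const auto  top = buffer;   // auto deduces char*, so const lands on the pointer
const auto* low = buffer;   // auto deduces char, so const lands on the pointee

static_assert(std::is_same<decltype(top), char* const>::value,
              "const auto gives a const pointer to mutable char");
static_assert(std::is_same<decltype(low), const char*>::value,
              "const auto* gives a pointer to const char");
```

This only illustrates the C++ rules; it does not claim that the second spelling is available to C23 code.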


Subtle type rules
auto is particularly nasty when used in low-level programming, together with certain operators, resulting in another type and/or signedness than expected.

Consider something like this:

unsigned int i = 1+1;
i = ~i;
printf("%#x\n", i); // prints 0xfffffffd 
i += 3;
printf("%#x\n", i); // prints 0

That's well-defined code. Now how about auto...

auto i = 1+1;
i = ~i;
printf("%#x\n", i); // undefined behavior, wrong conversion specifier
i += 3;             // signed arithmetic now: -3 + 3 gives 0, no unsigned wraparound
printf("%#x\n", i);

Oops. Well how about this?

auto i = 0xFFFFFFFF;
i = ~i;
printf("%#x\n", i); // well-defined, prints 0
i -= 3;             // well-defined
printf("%#x\n", i); // well-defined, prints 0xfffffffd

A slip of the type used by the initializer can obviously have major consequences and tracking down the root cause of that bug may not be easy.

auto f = true ? 1.0f : 0.0; is another example of the subtle type promotion rules of C. Here f ends up as double, which might not have been expected.

Something like auto c = a | b; where a and b are bool, char or unsigned short etc will result in c becoming an int in both C and C++ due to integer promotion.

In case of short a = 1; auto b = -a; we might have expected b to also become short and not int.

And so on.
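These promotion rules are the same in C++, so the deduced types can be verified there at compile time. A rough sketch, with hypothetical variable names and assuming a typical platform where int is 32 bits and wider than short:

```cpp
#include <type_traits>

unsigned short us_a = 1, us_b = 2;  // hypothetical operands
short s = 1;

auto sum  = 1 + 1;              // int
auto mask = 0xFFFFFFFF;         // unsigned int (assumes 32-bit int)
auto pick = true ? 1.0f : 0.0;  // double: float and double balance to double
auto bits = us_a | us_b;        // int: both operands are promoted first
auto neg  = -s;                 // int: unary minus promotes short

static_assert(std::is_same<decltype(sum), int>::value, "1 + 1 is int");
static_assert(std::is_same<decltype(mask), unsigned int>::value, "assumes 32-bit int");
static_assert(std::is_same<decltype(pick), double>::value, "conditional yields double");
static_assert(std::is_same<decltype(bits), int>::value, "unsigned short | unsigned short is int");
static_assert(std::is_same<decltype(neg), int>::value, "-short is int");
```

None of the operands above is written as int or double on the line that uses auto, which is exactly why the result is easy to misjudge.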


Wrong initializer by mistake
When dealing with more complex declarations like 2D arrays and pointers to them, a simple slip of the finger can silently result in the wrong type.

int arr [2][2];
auto p1 = arr;
auto p2 = *arr;
auto p3 = &arr;

Here p1 is int(*)[2] (array decayed), p2 is int* (array decayed) and p3 is int (*)[2][2] (array did not decay). A simple miss of * or & will lead to a very different type.

Had we typed this out explicitly, as in int (*p1)[2] = &arr;, the compiler would have informed us that we typed & when we shouldn't have. In case of auto anything goes and the program might compile cleanly, but with a different result.

Also throw type qualifiers into the declaration on top of that and we are guaranteed to have a complete mess if we use auto.
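Array decay works the same way in C++, so the three types from the array example above can be confirmed with a compile-time check. This is just a sketch reusing the names from that example:

```cpp
#include <type_traits>

int arr[2][2];

auto p1 = arr;   // arr decays to a pointer to its first row
auto p2 = *arr;  // the first row decays to a pointer to its first element
auto p3 = &arr;  // address of the whole array, no decay

static_assert(std::is_same<decltype(p1), int (*)[2]>::value, "p1 is int (*)[2]");
static_assert(std::is_same<decltype(p2), int*>::value, "p2 is int*");
static_assert(std::is_same<decltype(p3), int (*)[2][2]>::value, "p3 is int (*)[2][2]");
```

All three declarations compile without a diagnostic, which is the point: the compiler has no declared type to compare the initializer against.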


Known problems in C23
The C23 standard notes under the 6.7.10 Type inference chapter that using auto together with anonymous struct/union declarations would cause implementation-defined behavior as the declared variable and its members may end up in the tag namespace, rather than the ordinary namespace as may have been expected.


The (lack of) rationale why auto was added to C23

auto was added as per proposal N3007. The main reason appears to be bringing C in sync with C++. However, in C++ auto is somewhat handy and actually solves a few problems, as previously mentioned, whereas the "rationale", if there ever was one, in N3007 boils down to subjective statements like:

However when the definition includes an initializer, it makes sense to derive this type directly from the type of the expression used to initialize the variable.

As we can see from the numerous examples I made above, deriving the type from the initializer does not obviously make sense. At all.

Or worse:

...obvious convenience for programmers who are perhaps too lazy to lookup the type

Oh come on! If they are too lazy for proper engineering they should maybe consider a different career. Maybe their boss ought to help them out with a swift career change even!

Or just maybe they should start using a programming IDE that does this for them, with a single keystroke or a few mouse clicks. Such IDEs became popular in the 1990s, so this is hardly a new tool for the average programmer out there.


Recommended usage

In C++, it is recommended to use auto to make long object type declarations readable, where you don't really care about the exact type. Particularly when reaching for an iterator or a returned type from a member function in some verbose template class.

In C, it is not recommended to use auto at all, because it only serves to create problems. It is a poorly researched and poorly implemented feature.

If anyone can actually give a non-subjective example of when it makes sense to use auto to clearly improve everyday C code, I will certainly reconsider.


3 comment threads

auto is useful in C for avoiding double-evaluation in macros (6 comments)
C++ actually has an answer to the const pointer problem - you can write `const auto* ptr = &x;` just ... (1 comment)
Note to self: stick to C11. (1 comment)
+2
−0

A pitfall in C++ that I didn't see mentioned in the other answer is that it might give unexpected results with libraries using expression templates.

What are expression templates?

In a nutshell, expression templates are a technique that allows writing efficient numeric code with intuitive notation.

Consider for example a matrix library with straightforward implementation using operator overloading. Then when you write e.g.

Matrix A = B + C + D

where B, C and D are also of type Matrix, what will happen is that B + C will generate a temporary matrix, which is then passed to the second operator+ as first argument, where the second argument is D; this will then generate yet another temporary matrix that is used to initialise A. Now with move semantics, one may actually get rid of the temporary storage (I've not checked if that is actually possible), but the fact remains that the order of accesses will be very cache-unfriendly.

Now one way to solve this is to have instead a function that directly implements the optimal access sequence (and also ensures no temporary storage even without optimisations):

add_three_matrices(A, B, C, D);

however that doesn't give the nice intuitive syntax. What expression templates do is that the expression B + C + D does not actually calculate the sum, but creates an object built from templates that represents the expression, and initialising the Matrix A then triggers the actual, optimised code. That is, you can now write

Matrix A = B + C + D

and still get the optimised code.

How does auto affect this?

One might think that

auto A = B + C + D

gives equivalent code to the one above, but that is not the case. Instead auto is determined to be the expression template type describing the operation. This is particularly bad if the expression contains some actual temporary that will have been destroyed at the end of the statement; say you are scaling D with a double returned from a function:

auto A = B + C + D*f(x)

The return value of f is bound to a reference inside the expression object. Since that reference is not A itself, but some reference stored inside the expression, it won't extend the lifetime of the temporary. So if A is ever used later (in a way that actually triggers the calculation), it will access a dangling reference.
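To make this concrete, here is a deliberately tiny expression-template sketch. It is not taken from any real library: Vec3, Sum and sum_three are hypothetical names, Vec3 stands in for Matrix, and only addition is modelled. It shows that the type auto would deduce for B + C + D is the expression node, not the vector type:

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Expression node: stores references to its operands and computes nothing
// until an element is requested.
template <typename L, typename R>
struct Sum {
    const L& lhs;
    const R& rhs;
    double operator[](std::size_t k) const { return lhs[k] + rhs[k]; }
};

// Hypothetical 3-element vector standing in for the Matrix type.
struct Vec3 {
    double v[3];

    Vec3(double a, double b, double c) : v{a, b, c} {}

    // Evaluation happens only here, when an expression is converted to Vec3.
    template <typename L, typename R>
    Vec3(const Sum<L, R>& e) : v{e[0], e[1], e[2]} {}

    double operator[](std::size_t k) const { return v[k]; }
};

// Adding two vectors only builds a node.
Sum<Vec3, Vec3> operator+(const Vec3& a, const Vec3& b) { return {a, b}; }

// Adding a vector to an existing node builds a nested node.
template <typename L, typename R>
Sum<Sum<L, R>, Vec3> operator+(const Sum<L, R>& a, const Vec3& b) { return {a, b}; }

// Safe use: the conversion to Vec3 runs within the same full expression,
// while every intermediate node is still alive.
Vec3 sum_three(const Vec3& b, const Vec3& c, const Vec3& d) {
    Vec3 a = b + c + d;
    return a;
}

// What `auto a = b + c + d;` would deduce instead.
using Deduced = decltype(std::declval<Vec3>() + std::declval<Vec3>() + std::declval<Vec3>());
static_assert(std::is_same<Deduced, Sum<Sum<Vec3, Vec3>, Vec3>>::value,
              "auto would name the expression node, not Vec3");
```

In this sketch the outer node holds a reference to the inner Sum<Vec3, Vec3> temporary, so a variable declared with auto would already dangle once its declaration statement ends, even without a call like f(x) in the expression.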
