

Comments on When does it not work to dereference the pointer for sizeof during malloc?

Post

When does it not work to dereference the pointer for sizeof during malloc?


Background

This is kind of a subquestion to How to properly use malloc?

When allocating, there are basically two common ways of using the sizeof operator:

int *p;
p = malloc(n * sizeof *p);   // Method 1: Dereference the pointer
p = malloc(n * sizeof(int)); // Method 2: Explicitly use the type

I personally prefer method 1, because it reduces code duplication.* And the rule is pretty simple. Take whatever you have on the left of the equal sign, and add an asterisk to get the argument for sizeof. It also works for 2D or 3D:

int ***p;
p = malloc(x * sizeof *p); // Add asterisk to p
for(int i=0; i<x; i++) {
    p[i] = malloc(y * sizeof *p[i]); // Add asterisk to p[i]
    for(int j=0; j<y; j++) {
        p[i][j] = malloc(z * sizeof *p[i][j]); // Add asterisk to p[i][j]
    }
}

* In the sense that if you have multiple malloc calls where you assign to the pointer p, then if you want to change the type of the pointer, you would need to find ALL malloc calls and change them.
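
A self-contained sketch of the nested allocation above, with the error handling and cleanup that the snippet leaves out. The helper names alloc3d and free3d are hypothetical, not part of the original post. Every sizeof operand is still just the left-hand side with an asterisk added.

```c
#include <stdlib.h>

/* Hypothetical helper: frees a fully or partially built structure.
   Relies on alloc3d having set not-yet-allocated slots to NULL. */
void free3d(int ***p, size_t x, size_t y) {
    if (p == NULL) return;
    for (size_t i = 0; i < x; i++) {
        if (p[i] == NULL) continue;
        for (size_t j = 0; j < y; j++)
            free(p[i][j]);              /* free(NULL) is a no-op */
        free(p[i]);
    }
    free(p);
}

/* Hypothetical helper: allocates x * y rows of z ints each. */
int ***alloc3d(size_t x, size_t y, size_t z) {
    int ***p = malloc(x * sizeof *p);               /* asterisk added to p */
    if (p == NULL) return NULL;
    for (size_t i = 0; i < x; i++) p[i] = NULL;     /* makes cleanup safe */
    for (size_t i = 0; i < x; i++) {
        p[i] = malloc(y * sizeof *p[i]);            /* asterisk added to p[i] */
        if (p[i] == NULL) goto fail;
        for (size_t j = 0; j < y; j++) p[i][j] = NULL;
        for (size_t j = 0; j < y; j++) {
            p[i][j] = malloc(z * sizeof *p[i][j]);  /* asterisk added to p[i][j] */
            if (p[i][j] == NULL) goto fail;
        }
    }
    return p;
fail:
    free3d(p, x, y);
    return NULL;
}
```

A caller would pair alloc3d(x, y, z) with free3d(p, x, y) once it is done with the data.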

Array parameters

I know of one instance where it "doesn't work", and that is when you have complex parameters to functions. Like this:

void foo(int n, int32_t p[5][5]) {

But in this case, I'd say the problem is that when arrays are declared as function parameters like that, they are not arrays. Indeed, the C syntax here is a bit strange. The equivalent declaration int32_t (*p)[5] works as expected with this rule, because the type of p is pointer to array 5 of int32_t. In other words, I expect sizeof *p to be 20, which it is. The solution to the above problem is to introduce a temporary variable, either explicitly in code or mentally:

void foo(int n, int32_t p[5][5]) {
    int32_t (*tmp)[5];             // the type that the parameter p is adjusted to
    tmp = malloc(n * sizeof *tmp); // n rows of 5 int32_t each
    // ...
}
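
To check the claim about the adjusted parameter type, here is a small sketch that reports sizeof *p for both spellings of the parameter. The function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Declared with array syntax, but p is adjusted to int32_t (*)[5]. */
size_t row_size_via_array_syntax(int32_t p[5][5]) {
    return sizeof *p;   /* size of one row: array 5 of int32_t */
}

/* The same parameter type written out explicitly. */
size_t row_size_via_pointer_syntax(int32_t (*p)[5]) {
    return sizeof *p;
}
```

Both functions return 5 * sizeof(int32_t), which is the 20 mentioned above on platforms with 8-bit bytes.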

void pointers

I know that it does not work if you have void pointers. For example:

void *p;
p = malloc(n * sizeof *p); // Error: A void pointer cannot be dereferenced
p = malloc(sizeinbytes);   // Works fine

Void pointers are hardly an extreme or unusual case, but this is still a rather special one, and it is obviously an exception: a C programmer should know that void pointers cannot be dereferenced. Do note, however, that both gcc and clang had to be run with -pedantic before they warned about this.
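
When only a void pointer is available, the element size has to come from somewhere else. A minimal sketch, assuming a hypothetical helper alloc_array (not from the post) that takes the element size as a parameter and also guards against overflow in the multiplication:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates n elements of elem_size bytes each.
   Returns NULL if n * elem_size would overflow size_t. */
void *alloc_array(size_t n, size_t elem_size) {
    if (elem_size != 0 && n > SIZE_MAX / elem_size)
        return NULL;
    return malloc(n * elem_size);
}
```

At the call site the asterisk rule still applies, for example double *d = alloc_array(4, sizeof *d);.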

Flexible array members

Another case is a pointer to a struct that contains a flexible array member. Although this is a valid real-world use case where the approach does not work, it is also abundantly obvious that it is an exception, since the whole point of flexible array members is that you manually compute the allocation size. Furthermore, the rule can be shoehorned in if you want to, at least for the case where you want several instances of the struct with the flexible array equally big in each of them. Like this:

struct s {
    int x;
    char y;
    double z;
    long a[];
};

// m is how many elements the array s::a should have in each instance of s
// n is how many instances of s we want
struct s *p = malloc(n * (sizeof(*p) + m * sizeof *(p->a)));
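
For comparison, the more common single-instance allocation follows the same pattern. This is a sketch with a hypothetical helper alloc_s; struct s is repeated so that the block compiles on its own. Note that with the n-instance layout above, p[i] still advances by sizeof *p only, so any instance after the first would have to be located with manual address arithmetic.

```c
#include <stdlib.h>

struct s {
    int x;
    char y;
    double z;
    long a[];   /* flexible array member */
};

/* Hypothetical helper: one struct s whose array a has room for m elements. */
struct s *alloc_s(size_t m) {
    struct s *p = malloc(sizeof *p + m * sizeof *p->a);
    return p;   /* may be NULL; the caller checks and later calls free() */
}
```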

Question

So my question is, when does this approach with just adding an asterisk to the pointer not work? I'm interested in both realistic use cases and examples created just to break this rule of thumb. Please specify if your example is a real issue or just a theoretical one.


3 comment threads

"[...] because it reduces code duplication." Does it, though? (9 comments)
De-referencing void pointers (3 comments)
Typo? (2 comments)
"[...] because it reduces code duplication." Does it, though?
elgonzo‭ wrote almost 3 years ago · edited almost 3 years ago

With

int *p;
p = malloc(n * sizeof(int));

the type identifier ("int") is being duplicated, thus one duplicated token.

With

int *p;
p = malloc(n * sizeof *p);

the variable identifier ("p") is being duplicated, thus one duplicated token.

Where exactly is there a reduction of code duplication to be found when in each case one identifier/token is being duplicated?

How is duplicating one identifier somehow less or more code duplication than duplicating another identifier? ;-)

elgonzo‭ wrote almost 3 years ago · edited almost 3 years ago

As an advantage of malloc(n * sizeof *p), i would consider the locality (in terms of location in the source code) of the identifiers used in p = malloc(n * sizeof *p);. This potentially reduces the chance of coding mistakes when a function declares a number of pointers of different types (or a program does, where global/static pointers are concerned), and/or when the declaration of the pointer variable is so far from the malloc line that the two do not fit together within a smallish/average editor view (my totally subjective rule of thumb: more than 20..30 lines). There is then less chance to mis-remember and choose the wrong sizeof type when writing the line with the malloc statement. (Not because of reduced code duplication, though, as i fail to see any reduction of code duplication.)

klutt‭ wrote almost 3 years ago

The duplication comes if you have this:

int *p;
p = malloc(10 * sizeof(int));
free(p);
p = malloc(10 * sizeof(int));
free(p);
p = malloc(10 * sizeof(int));

Changing p to long* would require three extra changes.
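
For comparison, a sketch of the same sequence written with sizeof *p. The wrapper function alloc_three_times is made up so that the block compiles on its own. Changing the element type then only touches the declaration:

```c
#include <stdlib.h>

/* Hypothetical wrapper around the allocate/free sequence above. */
long *alloc_three_times(void) {
    long *p;                      /* was int *p; the only line that changed */
    p = malloc(10 * sizeof *p);
    free(p);
    p = malloc(10 * sizeof *p);
    free(p);
    p = malloc(10 * sizeof *p);
    return p;                     /* may be NULL; the caller frees it */
}
```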

Lundin‭ wrote almost 3 years ago

elgonzo‭ That's an excellent argument actually. The argument in favour of this sizeof *p style is that someone would aimlessly change the type of a variable without checking the code using that variable. But what if they change the identifier of that variable instead - again without checking the code using it. That's equally dangerous: if p was renamed to something else, the p in the malloc code could now refer to some other variable in an outer scope. Like I ended my answer to that other question: "Overall the programmer must actually know what they are doing - no way around it. If you change the type of some variable, then you better review every single line in the code base using that variable - simple as that." This naturally goes for changing the variable name too.
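
The renaming hazard described here can be sketched as follows. All names are hypothetical: a file-scope char *p stands in for the unrelated outer variable, and buf is the local that used to be called p.

```c
#include <stddef.h>

static char *p;   /* unrelated outer-scope variable that happens to be called p */

/* Returns the byte count that the stale malloc argument would request. */
size_t stale_request(size_t n) {
    long *buf = NULL;        /* this local was called p before the rename */
    (void)buf;
    return n * sizeof *p;    /* still compiles, but now measures char, not long */
}
```

Since sizeof(char) is 1, stale_request(10) yields 10 rather than 10 * sizeof(long), and the compiler has no reason to complain.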

Lundin‭ wrote almost 3 years ago · edited almost 3 years ago

klutt‭ And how exactly is it less duplication to increase the references to the p identifier from 3 times to 6 times? Imagine if it isn't called p but long_contrived_name_foo, which happens to have a sibling long_contrived_name_bar and if we mix up these identifiers, everything will crash & burn in subtle but severe ways.

elgonzo‭ wrote almost 3 years ago · edited almost 3 years ago

klutt‭, thanks for adding a footnote to your question text that contextualizes where you see the reduction of code duplication. Otherwise, in a general sense there would still be no reduction in code duplication, because changing the variable name instead of the type identifier would also require three "extra" replacements/changes, but this time favoring the malloc(n * sizeof(int)); case. :-)

klutt‭ wrote almost 3 years ago

Lundin‭ I'm 100% confident that you understand what I'm trying to illustrate. If you think that "code duplication" isn't a good phrase to use, please suggest another one.

elgonzo‭ wrote almost 3 years ago · edited almost 3 years ago

Lundin‭ i guess this could be a matter of refactoring support offered by IDEs. I don't have a good overview of the refactoring capabilities across different popular C/C++ IDEs, but it would probably be much easier to implement and offer a refactoring assistant that can reliably track variable identifiers and their scopes, thus enabling easy renaming of all instances of a given variable identifier, than to implement one that can reliably analyze expressions and statements like p = malloc(10 * sizeof(int)); so as to auto-rename type identifiers somehow associated with a certain variable without either failing to replace related type identifiers or mis-replacing unrelated ones throughout the source code. (And in many cases, a text replacement function in a text editor can be sufficient help in renaming variable identifiers, but less so for type identifiers, especially for common ones like "int".)

elgonzo‭ wrote almost 3 years ago · edited almost 3 years ago

Lundin‭

"Overall the programmer must actually know what they are doing - no way around it. If you change the type of some variable, then you better review every single line in the code base using that variable - simple as that." This naturally goes for changing the variable name too.

That's true. And yet, there are things that encourage and increase the risk of errors, and there are things that help reduce the risk of making errors. While i generally agree that one must know what they are doing, humans are unfortunately not computers, and humans tend to make mistakes. With respect to what i was talking about, it could be summed up under "Out of sight, out of mind". Keeping related stuff reasonably close together can help keep things in the eye and in the mind.