When does it not work to dereference the pointer for sizeof during malloc?

Background

This is kind of a subquestion to "How to properly use malloc?"

When allocating, there are basically two common ways of using the sizeof operator:

int *p;
p = malloc(n * sizeof *p);   // Method 1: Dereference the pointer
p = malloc(n * sizeof(int)); // Method 2: Explicitly use the type

I personally prefer method 1, because it reduces code duplication.* And the rule is pretty simple: take whatever you have on the left of the equals sign and add an asterisk to get the argument for sizeof. It also works for 2D or 3D:

int ***p;
p = malloc(x * sizeof *p); // Add asterisk to p
for(int i=0; i<x; i++) {
    p[i] = malloc(y * sizeof *p[i]); // Add asterisk to p[i]
    for(int j=0; j<y; j++) {
        p[i][j] = malloc(z * sizeof *p[i][j]); // Add asterisk to p[i][j]
    }
}

* In the sense that with method 2, if multiple malloc calls assign to the pointer p and you want to change the type of the pointer, you need to find ALL of those malloc calls and change them.
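For completeness, here is a minimal sketch of the 3D pattern with error checking and cleanup added. The names alloc3d and free3d are made up for this illustration, and the sketch assumes x, y and z are all nonzero and small enough that the size multiplications do not overflow.

```c
#include <stdlib.h>

/* Hypothetical helper: free a (possibly partially built) jagged 3D array.
   Relies on unallocated slots having been set to NULL; free(NULL) is a no-op. */
void free3d(int ***p, size_t x, size_t y)
{
    if (p == NULL)
        return;
    for (size_t i = 0; i < x; i++) {
        if (p[i] == NULL)
            continue;
        for (size_t j = 0; j < y; j++)
            free(p[i][j]);
        free(p[i]);
    }
    free(p);
}

/* Hypothetical helper: allocate an x * y * z jagged array of int.
   Every sizeof operand is the assignment target with one asterisk added.
   Returns NULL if any allocation fails. */
int ***alloc3d(size_t x, size_t y, size_t z)
{
    int ***p = malloc(x * sizeof *p);
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < x; i++)
        p[i] = NULL;                 /* lets free3d clean up a partial build */

    for (size_t i = 0; i < x; i++) {
        p[i] = malloc(y * sizeof *p[i]);
        if (p[i] == NULL)
            goto fail;
        for (size_t j = 0; j < y; j++)
            p[i][j] = NULL;
        for (size_t j = 0; j < y; j++) {
            p[i][j] = malloc(z * sizeof *p[i][j]);
            if (p[i][j] == NULL)
                goto fail;
        }
    }
    return p;

fail:
    free3d(p, x, y);
    return NULL;
}
```

A caller would write something like int ***g = alloc3d(2, 3, 4); and later free3d(g, 2, 3);. Changing int to another element type only requires touching the declarations, not the sizeof expressions.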

Array parameters

I know of one instance where it "doesn't work", and that is when you have complex parameters to functions. Like this:

void foo(int n, int32_t p[5][5]) {

But in this case, I'd say the real problem is that arrays declared as function parameters like that are not arrays at all; the C syntax here is a bit strange. The equivalent declaration int32_t (*p)[5] works as expected with this rule, because the type of p is pointer to array 5 of int32_t. In other words, I expect sizeof *p to be 20, which it is. The solution to the above problem is to introduce a temporary variable, either explicitly in code or just mentally:

void foo(int n, int32_t p[5][5]) {
    int32_t (*tmp)[5];              // the type that p is adjusted to
    tmp = malloc(n * sizeof *tmp);  // n rows of 5 int32_t each
    // ...
}
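To make the parameter adjustment concrete, here is a small sketch. The names row_size, row5 and alloc_rows are invented for this example; the typedef is only there to keep the return type readable.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical typedef: one row of 5 int32_t. */
typedef int32_t row5[5];

/* The parameter is written as a 2D array, but it is adjusted to
   int32_t (*p)[5], so sizeof *p is the size of one row. */
size_t row_size(int32_t p[5][5])
{
    return sizeof *p;
}

/* Allocate n rows using the "add an asterisk" rule on a pointer-to-row. */
row5 *alloc_rows(size_t n)
{
    row5 *tmp = malloc(n * sizeof *tmp);
    return tmp;   /* NULL on failure; caller frees with free() */
}
```

Under these assumptions, row_size returns 5 * sizeof(int32_t) for any 5x5 argument, which matches the 20 mentioned above on platforms with 8-bit bytes.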

void pointers

I know that it does not work if you have void pointers. For example:

void *p;
p = malloc(n * sizeof *p); // Invalid in ISO C: void is an incomplete type (gcc and clang accept it as an extension)
p = malloc(sizeinbytes);   // Works fine

However, even though this is not a particularly extreme or unusual case, it is still a special case, and it is pretty obvious that it is an exception: a C programmer should know that void pointers cannot be dereferenced. Do note, however, that both gcc and clang had to be run with -pedantic before they warned about this.
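One way to handle the void pointer case, sketched here under the assumption that the caller knows the element size, is to pass that size in explicitly. The helper name alloc_elems is hypothetical, and the overflow check is an addition that the snippets above do not have.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate room for n elements of elem_size bytes each.
   Returns NULL if n * elem_size would overflow or if malloc fails. */
void *alloc_elems(size_t n, size_t elem_size)
{
    if (elem_size != 0 && n > SIZE_MAX / elem_size)
        return NULL;
    return malloc(n * elem_size);
}
```

A caller that later uses the buffer as an array of double could write void *p = alloc_elems(n, sizeof(double));. The type then appears explicitly again, which is exactly the trade-off the rule of thumb tries to avoid.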

Flexible array members

Another case is when you have a pointer to a struct that contains a flexible array member. Although this is a very valid real-world use case where the approach does not work, it is also abundantly obvious that it is an exception, since the whole foundation of flexible array members is that you manually calculate the size to allocate. Furthermore, the rule can be shoehorned in if you want to, at least for the case where you want several instances of the struct with an equally big flexible array in each of them. Like this:

struct s {
    int x;
    char y;
    double z;
    long a[];
};

// m is how many elements the array s::a should have in each instance of s
// n is how many instances of s we want
struct s *p = malloc(n * (sizeof(*p) + m * sizeof *(p->a)));
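For comparison, the more common single-instance case might look like the sketch below. The function name alloc_s is made up for this example, and the sketch assumes m is small enough that the size computation does not overflow.

```c
#include <stdlib.h>

struct s {
    int x;
    char y;
    double z;
    long a[];   /* flexible array member */
};

/* Hypothetical helper: allocate one struct s whose array a has room
   for m elements. Both sizeof operands are derived from p itself. */
struct s *alloc_s(size_t m)
{
    struct s *p = malloc(sizeof *p + m * sizeof *p->a);
    return p;   /* NULL on failure; caller frees with free() */
}
```

Note that if several instances are packed into one allocation, as in the n * (...) expression above, p[i] still steps by sizeof *p and not by the enlarged per-instance size, so each instance would have to be located with manual byte arithmetic.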

Question

So my question is, when does this approach with just adding an asterisk to the pointer not work? I'm interested in both realistic use cases and examples created just to break this rule of thumb. Please specify if your example is a real issue or just a theoretical one.

c pointers malloc

posted over 3 years ago

CC BY-SA 4.0

3y ago by hkotsubo‭

klutt‭

3 comment threads

"[...] because it reduces code duplication." Does it, though? (9 comments)

De-referencing void pointers (3 comments)

Typo? (2 comments)

elgonzo‭ wrote over 3 years ago · edited over 3 years ago

With

int *p;
p = malloc(n * sizeof(int));

the type identifier ("int") is being duplicated, thus one duplicated token.

With

int *p;
p = malloc(n * sizeof *p);

the variable identifier ("p") is being duplicated, thus one duplicated token.

Where exactly is there a reduction of code duplication to be found when in each case one identifier/token is being duplicated?

How is duplicating one identifier somehow less or more code duplication than duplicating another identifier? ;-)

elgonzo‭ wrote over 3 years ago · edited over 3 years ago

As an advantage of malloc(n * sizeof *p), i would consider the locality (in terms of location in the source code) of the identifiers used in p = malloc(n * sizeof *p);. This potentially reduces the chance of coding mistakes when a function declares a number of pointers of different types (or a program does, where global/static pointers are concerned), and/or when the declaration of the pointer variable is so far away from the malloc line that the two do not fit together within a smallish/average editor view (my totally subjective rule of thumb: more than 20..30 lines). There is then less chance to misremember and choose the wrong sizeof type when writing the malloc line. (Not because of reduced code duplication, though, as i fail to see any reduction of code duplication.)

klutt‭ wrote over 3 years ago

The duplication comes if you have this:

int *p;
p = malloc(10 * sizeof(int));
free(p);
p = malloc(10 * sizeof(int));
free(p);
p = malloc(10 * sizeof(int));

Changing p to long* would require three extra changes.

Lundin‭ wrote over 3 years ago

elgonzo‭ That's an excellent argument actually. The argument in favour of this sizeof *p style is that someone would aimlessly change the type of a variable without checking the code using that variable. But what if they change the identifier of that variable instead - again without checking the code using it. That's equally dangerous: if p was renamed to something else, the p in the malloc code could now refer to some other variable in an outer scope. Like I ended my answer to that other question: "Overall the programmer must actually know what they are doing - no way around it. If you change the type of some variable, then you better review every single line in the code base using that variable - simple as that." This naturally goes for changing the variable name too.
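The renaming hazard described in this comment can be sketched as follows. This is an illustration only: the file-scope variable and the function name bytes_requested are invented for the example and do not come from the thread.

```c
#include <stddef.h>

char *p;   /* hypothetical outer-scope variable that happens to be named p */

/* The local pointer used to be declared as `long *p` and was renamed to q,
   but the sizeof expression was not updated. */
size_t bytes_requested(size_t n)
{
    long *q = NULL;
    (void)q;                /* silence unused-variable warnings */
    return n * sizeof *p;   /* still compiles, but now measures char, not long */
}
```

The code compiles without complaint, yet the computed size is n * sizeof(char) instead of n * sizeof(long).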

elgonzo‭ wrote over 3 years ago · edited over 3 years ago

klutt‭, thanks for adding a footnote to your question text that contextualizes where you see the reduction of code duplication. Otherwise, in a general sense there would still be no reduction in code duplication, because changing the variable name instead of the type identifier would also require three "extra" replacements/changes, but this time favoring the malloc(n * sizeof(int)); case. :-)

elgonzo‭ wrote over 3 years ago · edited over 3 years ago

Lundin‭ i guess this could be a matter of the refactoring support offered by IDEs. I don't have a good overview of the refactoring capabilities across different popular C/C++ IDEs, but it would probably be much easier to implement and offer a refactoring assistant that can reliably track variable identifiers and their scopes, thus enabling easy renaming of all instances of a given variable identifier, than to implement one that can reliably analyze expressions and statements like p = malloc(10 * sizeof(int)); so as to successfully auto-rename type identifiers somehow associated with a certain variable, without mucking up the show by either failing to replace related type identifiers or mis-replacing unrelated ones throughout the source code. (And in many cases, a text replacement function in a text editor can be sufficient help in renaming variable identifiers, but less so for type identifiers, especially common ones like "int".)

elgonzo‭ wrote over 3 years ago · edited over 3 years ago

Lundin‭

"Overall the programmer must actually know what they are doing - no way around it. If you change the type of some variable, then you better review every single line in the code base using that variable - simple as that." This naturally goes for changing the variable name too.

That's true. And yet, there are things that encourage errors and increase their risk, and there are things that help reduce the risk of making errors. While i generally agree that one must know what they are doing, humans are unfortunately not computers, and humans tend to make mistakes. With respect to what i was talking about, it could be summed up as "Out of sight, out of mind". Keeping related stuff reasonably close together can help keep things in view and in mind.
