Communities

Writing
Writing
Codidact Meta
Codidact Meta
The Great Outdoors
The Great Outdoors
Photography & Video
Photography & Video
Scientific Speculation
Scientific Speculation
Cooking
Cooking
Electrical Engineering
Electrical Engineering
Judaism
Judaism
Languages & Linguistics
Languages & Linguistics
Software Development
Software Development
Mathematics
Mathematics
Christianity
Christianity
Code Golf
Code Golf
Music
Music
Physics
Physics
Linux Systems
Linux Systems
Power Users
Power Users
Tabletop RPGs
Tabletop RPGs
Community Proposals
Community Proposals
tag:snake search within a tag
answers:0 unanswered questions
user:xxxx search by author id
score:0.5 posts with 0.5+ score
"snake oil" exact phrase
votes:4 posts with 4+ votes
created:<1w created < 1 week ago
post_type:xxxx type of post
Search help
Notifications
Mark all as read See all your notifications »
Q&A

Welcome to Software Development on Codidact!

Will you help us build our independent community of developers helping developers? We're small and trying to grow. We welcome questions about all aspects of software development, from design to code to QA and more. Got questions? Got answers? Got code you'd like someone to review? Please join us.

Comments on How to properly use malloc?

Parent

How to properly use malloc?

+10
−0

I have seen this construct quite a lot:

int *p = (int*) malloc(n * sizeof(int));

Is this how it should be? What is the proper way to use malloc? I have also seen this:

int *p = malloc(n * sizeof *p);

Which one is preferable?

History
Why does this post require moderator attention?
You might want to add some details to your flag.
Why should this post be closed?

0 comment threads

Post
+2
−2

There are several things to consider when calling the malloc(3) family of functions:

  • nelem * sizeof(elem) or sizeof(elem) * nelem?
  • Use the type or the pointer name in sizeof()?
  • To cast or not to cast?

TL;DR:

Use a simple and readable interface, and let it handle the safety for you. Of course, you need to write such an interface, wrapping the standard C library functions.

foo_t  *p;

p = MALLOC(nelem, foo_t);

Let's answer the questions one by one.


nelem * sizeof(elem) or sizeof(elem) * nelem?

None of the alternatives is good.

Use functions that take separate arguments:

  • calloc(3) (ISO C)
  • reallocarray(3) (several Unix systems)
  • recallocarray(3) (several Unix systems)

I suggest writing your own mallocarray() and reallocarrayf() as wrappers around these. reallocarrayf() is more complex (read about reallocf(3)), but mallocarray() can be trivially implemented:

#define mallocarray(nmemb, size)  reallocarray(NULL, nmemb, size)

As a rule, I always use array allocation functions, even if I only want 1 element:

foo_t *p;

// p = MALLOC(1, foo_t);  // which expands to:

p = (foo_t *) mallocarray(1, sizeof(foo_t));

This has several benefits:

  • It clearly shows the number of elements I want. If it's 1, or if it's 42.

    An exception is when allocating a structure with a flexible array member. In such case, you really need to call malloc(3) with the precise size you want:

struct foo {
    int    a;
    bar_t  b[];
};

struct foo *p;

p = (struct foo *) malloc(sizeof(struct foo) + sizeof(bar_t) * n);
  • The multiplication is performed inside the function, inside a check to prevent overflow.

    However, as a general rule, when you multiply numbers, put first the value with the widest type, to avoid overflowing intermediate operations (read about precedence in operators).

int     a, b;
size_t  c, d;

...

d = a * b * c;  // might overflow `a * b` easily
d = c * a * b;  // less likely

Use the type or the pointer name in sizeof()?

None of the alternatives is good.

If you use the type, then if you change the type of the pointer might introduce silent bugs in malloc(3) calls, since there's no way that the compiler knows if your doing the right thing.

If you use the name, then you reduce ability to search in the code. Code is more often read (and searched) than modified. It's quite useful to be able to search through a code base for all allocations of a given type with something like the following.

grep -rni 'alloc.*\bfoo_t)' src

The solution comes thanks to the next question, which is about casting. As a starter, I'll say that in this case readability is preferable to safety; we'll try to make it safe somehow... continue reading.


To cast or not to cast? That is the question.

The unsurprising answer, again, is that none of the alternatives is good.

Casting, in general, can disable many warnings, and is a dreaded thing that should be used very carefully. However, that doesn't apply to malloc(3), since it (by returning void *) implicitly converts to any pointer type, so we can't do more damage by forcing a conversion.

Casting also had issues regarding implicit functions, but those are gone for good (since C99), so this is a non-issue. If you lived in a cave for the last $year - 1999 years, please get out and see the light.

Casts still have issues, which is why they're not ideal:

  • Clutters the code, considerably reducing readability.
  • They aren't DRY.

So let's not write the cast (which is not the same as don't cast).

Casting the result of malloc(3) is good because:

  • It couples the call with the type of the pointer. Since the cast converts the void * to a non-void pointer, it no longer converts to anything implicitly. This means that if the type of the pointer ever changes, malloc(3) calls assigned to that pointer will no longer compile, and we will be able to fix any compilation errors.

So, the solution is to write the cast inside a macro, which uses the same information that we will need for sizeof(), thus avoiding repeating ourselves, and also avoiding readability issues.

#define MALLOC(n, type)  ((type *) mallocarray(n, sizeof(type)))

which is to be used as this:

foo_t  *p;

p = MALLOC(42, foo_t);  // 42 elements of type foo_t.

It's readable, it's concise, it's easily searchable, and it's safe.

History
Why does this post require moderator attention?
You might want to add some details to your flag.

1 comment thread

Re-inventing C (4 comments)
Re-inventing C
Lundin‭ wrote about 1 year ago

Re-inventing the C language by replacing it with your private macro language is a cardinal sin. Yes, a whole lot of standard lib functions have horrible APIs, but other C programmers supposedly know about those quirks. They don't know how your secret macro language works however, and will be really pissed if you force them to learn in order to understand your code. If you wish to write self-documenting code showing how many items to allocate, that can already be done with p = malloc( sizeof(foo_t[nelem]) ); Everyone who knows basic C will understand that this allocates an array. Whereas MALLOC(42, foo_t) is not readable because it forces the reader to go dig up the definition of the macro in order to tell what magic that was hidden inside it. The readable solution is rather to hide all this relatively low-level allocation behind an abstraction layer, so that the person using your code doesn't have to care of how allocations were done to begin with.

alx‭ wrote about 1 year ago · edited about 1 year ago

I guess your comment only applies to the macros, and not the Unix-only *array() variants like reallocarray(3), right?

The malloc API is not really horrible; it's good, as far as C functions go. There's really not much more that could be acomplished, in terms of type safety. Overflow due to multiplication is one thing they could have solved, and well, the BSD *array() variants did solve it.

So the only real remaining issue of the functions is that of type safety; this issue is not resolvable via functions for generic code. If you have allocators for specific objects, then for sure, define your wrapper allocator that already knows the size, and returns a specific pointer.

For generic allocations, you need to rely on macros to achieve type safety (or be content with risky code). My macros are just examples to show how type safety can be achieved with a simple interface; or rather how I did it. Of course, readers are encouraged to rewrite those macros as they better suit them.

alx‭ wrote about 1 year ago · edited about 1 year ago

As for having to read the definition of the macros: that really goes for any code you read. I don't know why these wrappers would be any different. Encapsulation is not bad (it can be bad, but it's not necessarily bad); especially if it solves a long-standing issue of type safety.

How to read the definitions for these macros? It's simple, with the right tools. Let's say we're on a project that uses these macros (e.g., shadow utils, where I first implemented these macros). Then we run grepc CALLOC:

$ grepc CALLOC
./lib/alloc.h:23:
#define CALLOC(n, type)   ((type *) calloc(n, sizeof(type)))

I hope C programmers can read that macro. :) If not, the context in which it is used should be straightforward.

alx‭ wrote about 1 year ago · edited about 1 year ago

Regarding VLA notation, is it mandated by the standard, or is it optional? Anyway, I'm not going to complain about good taste extensions to the language:) I'll complain about the real issues of it (IMO): while it solves the readability issues, it doesn't solve type safety. In fact, that's one of the least safe approaches to calling malloc(3), since changing either the type or the name of the pointer will break the code, possibly adding silent bugs.