Comments on Which platforms return a non-null pointer on malloc(0)

Parent

Which platforms return a non-null pointer on malloc(0)

+4
−2

What is the portability of malloc(0);?

  • Which platforms return NULL without setting errno?
  • Which platforms return a non-null pointer?
  • Do any platforms have some other behavior?

1 comment thread

realloc(ptr, 0) (1 comment)
Post
+3
−0

It is trivial enough to test:

#include <stdlib.h>
#include <stdio.h>
#include <errno.h>

#define KNOWN_GARBAGE ((int*)~0u)

int main (void)
{
  int* ptr = KNOWN_GARBAGE;
  errno = 0; /* clear any stale value so a change can be attributed to malloc */
  ptr = malloc(0);
  int errno_changed = errno;

  if(ptr == NULL)
    puts("Returned NULL.");
  else if(ptr == KNOWN_GARBAGE)
    puts("Didn't modify the pointer, non-conforming?");
  else
    puts("Returned non-zero, modified the pointer.");
  if(errno_changed)
    printf("Weird use of errno detected, error code 0x%X\n", errno_changed);
}

Then try it with whatever compiler, version, standard lib and system you want. The vast majority of gcc-like compilers + libc flavours appear to return a new non-zero address.

The setting errno part appears to be some old Unix gunk from the 90s according to man(?), so you may have to find some sufficiently obscure computer for that, I guess.


Related to this question: as of C23, C no longer has standard support for realloc(ptr, 0), and the standard likely comes with major defects here. See realloc(ptr, 0) in C23 - now what?


2 comment threads

re: give me a list (18 comments)
realloc(ptr, 0) (2 comments)
re: give me a list
alx‭ wrote 3 months ago · edited 3 months ago

While giving a list of libraries or other software is typically not wanted, I think a list answer would be the most specific answer to a broader question "How portable is malloc(0) returning a valid pointer?", which is what I really wanted to know.

My system (glibc) returns a valid pointer (which I think is the only correct thing to do, FWIW), but I'd like to know how far from my system I can go without problems. I'm also interested in knowing if the systems that return an invalid pointer are dying platforms, or if they're still alive and widely used.

Lundin‭ wrote 3 months ago

alx‭ I'd say that code bases relying on malloc(0) are dying platforms. The appropriate way to write programs is to initialize the pointer to NULL, only allocate when there is something to allocate, and upon deallocation pass the pointer along to free() no matter if malloc was ever called or not.
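
A minimal sketch of the pattern described in this comment: start from NULL, allocate only when there is something to allocate, and free unconditionally. The int_buf type and both function names are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical container used only for illustration. */
typedef struct {
    int    *items;
    size_t  count;
} int_buf;

/* Returns false only on a real allocation failure.
   A count of zero leaves items == NULL and still succeeds. */
bool int_buf_init(int_buf *b, size_t count)
{
    assert(b != NULL);
    b->items = NULL;                 /* start from a known state */
    b->count = 0;
    if (count > 0) {                 /* allocate only when needed */
        b->items = calloc(count, sizeof *b->items);
        if (b->items == NULL)
            return false;
        b->count = count;
    }
    return true;
}

/* Safe whether or not anything was ever allocated:
   free(NULL) is defined to do nothing. */
void int_buf_destroy(int_buf *b)
{
    assert(b != NULL);
    free(b->items);
    b->items = NULL;
    b->count = 0;
}
```

With this shape, malloc(0) is never called, so its implementation-defined result never matters to the caller.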

alx‭ wrote 2 months ago

But think of the following algorithm:

We want to have an array of N elements, which correspond to the emails in my mailbox. Unconditionally allocate an array of N elements of mail_t.

Iterate over those N elements, to do some work.

Free the array when we've finished.

If the algorithm is well written, 0 might be just one case that is not special at all. You'll enter a loop with 0 iterations, and fall through to free(). If you treat 0 as a special case, you're adding unnecessary code.

Of course, if you write portable code, you'll need to do it, because of the broken malloc() implementations that return an invalid pointer for 0. But it's not something inherent to the algorithm.
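
As a rough sketch of the algorithm in this comment (the mail_t layout and the function name process_mailbox are assumptions, not anything from a real mail client): the loop and the free() need no special case for zero, and the only place zero shows up is the portable error check.

```c
#include <stdlib.h>

/* Hypothetical element type; a real mail_t would hold more. */
typedef struct {
    size_t id;
} mail_t;

/* Returns the number of mails processed, or -1 on allocation failure.
   Overflow of n * sizeof(mail_t) is not checked in this sketch. */
long process_mailbox(size_t n)
{
    mail_t *mails = malloc(n * sizeof *mails);

    /* On implementations where malloc(0) may return NULL, a NULL result
       is only an error when n is nonzero. Where malloc(0) always returns
       a unique pointer, the "&& n != 0" part would be unnecessary. */
    if (mails == NULL && n != 0)
        return -1;

    long processed = 0;
    for (size_t i = 0; i < n; i++) {   /* zero iterations when n == 0 */
        mails[i].id = i;
        processed++;
    }

    free(mails);                       /* fine for NULL as well */
    return processed;
}
```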

Lundin‭ wrote 2 months ago

alx‭ Sounds like premature optimization, if(size>0) ptr=malloc(size); else ptr=NULL; is hardly some massive overhead. You can still have your for loop after that and then pass the pointer to free when done.

alx‭ wrote 2 months ago

It's not about the optimization, but rather about readability. Removing branches is one of the things that works best --at least for me-- to reduce the complexity of a function.

It would be along the lines of Linus's good taste example: https://youtu.be/o8NPllzkFhE?feature=shared&t=858 https://github.com/mkirchner/linked-list-good-taste
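
For readers who have not followed those links, the idea can be sketched roughly as below. The node type and function names are hypothetical, and both versions assume the target really is in the list. The first version needs a branch for the head; the second uses a pointer-to-pointer so the head is no longer a special case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical singly linked list node. */
typedef struct node {
    int          value;
    struct node *next;
} node;

/* Removal with an explicit special case for the head of the list. */
void remove_with_branch(node **head, node *target)
{
    assert(head != NULL && target != NULL);
    node *prev = NULL;
    node *cur = *head;
    while (cur != target) {
        prev = cur;
        cur = cur->next;
    }
    if (prev == NULL)
        *head = target->next;        /* target was the first node */
    else
        prev->next = target->next;
}

/* Removal through a pointer to the link that points at target.
   That link is either *head or some node's next member, so the
   same assignment handles both cases. */
void remove_indirect(node **head, node *target)
{
    assert(head != NULL && target != NULL);
    node **indirect = head;
    while (*indirect != target)
        indirect = &(*indirect)->next;
    *indirect = target->next;
}
```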

Lundin‭ wrote 2 months ago

alx‭ Obviously you can write that code in any number of ways. ptr = (size ? malloc(size) : NULL); for example though I personally think that's less readable. Error handling always comes with overhead - it's a fact. The most elegant code out there is usually not very rugged.

Lundin‭ wrote 2 months ago

alx‭ As for Linus Torvalds he's probably as far from a C style authority as you can possibly get. Anyone who disagrees ought to read the Linux kernel source. I was a fan of Linux until I read that... afterwards I started to wonder if I need to keep a 10 meter distance to all Linux computers. Similarly, the "Linux kernel coding style" is naive, amateur-level stuff lacking every little bit of rationale.

Lundin‭ wrote 2 months ago · edited 2 months ago

And well seriously, who thinks while((*indirect) != entry) indirect = &(*indirect)->next; is good-looking code? The first parenthesis is superfluous. &(*indirect)->next; is obfuscating posing and here an extra parenthesis could perhaps have been motivated: &((*indirect)->next). But the main problem is obviously that some dummy decided to iterate across a linked list using a pointer-to-pointer rather than using a plain pointer. Either that or it's one big syntax error with double de-referencing. The API seems to suggest that the caller is passing along a pointer to the item they wish to remove. This is a bunch of nonsense to be immediately dismissed.

alx‭ wrote 2 months ago · edited 2 months ago

I don't necessarily agree with Linus on everything, but in this case I tend to prefer fewer branches, even if that means using double de-referencing.

And in the case of malloc(0), there's no double de-referencing issue, or an issue of any other kind. It's a free removal of the branches.

Here's a dummy example of an algorithm that benefits from malloc(0) returning non-NULL. Rewriting it to work with NULL is non-trivial, and necessarily hurts readability.

    n = 0;

    p = malloc(n);
    if (p == NULL)
        err(1, "malloc");

    bzero(p, n);
    len = arc4random_uniform(n + 1);
    memset(p, 'x', len);

    assert(len == strnlen(p, n));

    free(p);
Lundin‭ wrote 2 months ago

alx‭ As I said in the initial comment, it is probably naive to think malloc(0) results in fewer branches, because there will likely be a special case inside the malloc implementation instead, meaning that a rare branch gets hit. Also the implementation of heap allocation in itself is usually huge, so micro-optimizing the code around it isn't very meaningful.

Lundin‭ wrote 2 months ago

...now we could of course always over-allocate malloc with one garbage byte. malloc(n+1) but the program only uses n bytes. That's 100% portable.

alx‭ wrote 2 months ago · edited 2 months ago

Adding an unconditional +1 would significantly hurt readability (a reader may wonder why we're adding 1). However, malloc(foo ?: 1) might make sense (especially in combination with a useful commit message that explains why). But that's still a workaround for an unportable specification. In an ideal world, users shouldn't have to worry about it.

I don't care at all about performance, since malloc(3) is a huge function. But branches do hurt readability, at least to my brain. Everyone's brain works differently, so yours may not suffer from that. :)

Lundin‭ wrote 2 months ago

alx‭ The general approach for readability could be to add an abstraction layer on top of malloc to hide away all the ugly gunk. Or the other way around, strip away malloc and go straight for the system APIs which will be better defined than the C standard libs - at the expense of non-portable code.
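
One possible shape for such an abstraction layer, as a sketch only: a thin wrapper (the name malloc_nonzero is made up here) that never passes zero to malloc, so a NULL return always means failure. It uses the portable form of the conditional, since the two-operand ?: mentioned earlier in the thread is a GNU extension.

```c
#include <stdlib.h>

/* Hypothetical wrapper: requests at least one byte so that the
   implementation-defined behavior of malloc(0) is never reached.
   The caller still treats the result as a size-byte object and
   releases it with free() as usual. */
void *malloc_nonzero(size_t size)
{
    return malloc(size != 0 ? size : 1);
}
```

The explanation for the extra byte then lives in one place instead of at every call site.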

JohnRando‭ wrote 2 months ago

" system APIs which will be better defined"

Can you provide a simple example?

Lundin‭ wrote 2 months ago

JohnRando‭ System APIs have no benefits from providing a scenario where the program may potentially be running off into the woods. They are non-portable, whereas the C standard libs seek to be portable and therefore have vague wording and poorly-defined behavior - sometimes where it is justified, sometimes where it is not. A concrete example would be the memoryapi.h in Windows. If you check for example VirtualAlloc, it won't have any documentation saying things like "if you pass zero to this parameter, we don't know what the function will return".

alx‭ wrote 2 months ago · edited 2 months ago

I've poked a friend of mine to look into this, and they have done some quite interesting research. This all seems to originate from a bug in SysV r2, which later got standardized by AT&T's SVID out of thin air.

https://nabijaczleweli.xyz/content/blogn_t/017-malloc0.html

Every sane malloc(0) has always returned a unique pointer. In fact, I've read the allocator code of Unix V7, and it seems natural to just return a unique pointer.

Lundin‭ wrote 2 months ago

alx‭ Once some behavior makes it into the Unix/POSIX libs, it tends to stay the same (for good or bad). But I would imagine that you would find oddities if you'd go on a tour and inspect all embedded systems compilers out there - most were designed for targets where heap allocation is senseless. So an implementation hiccup like this would easily fly under the radar there.

Also as someone pointed out in a similar "has there ever been..." question: C has been a very popular language for a very long time, meaning there's been all manner of truly awful implementations made by students and hobbyists over the years.

alx‭ wrote 2 months ago

Yeah, I guess low QoI implementations exist. But the standard shouldn't be bound by them. A newer revision could just tighten malloc(0) to return a unique pointer since it's simpler to implement, it's more useful (simpler to use), and almost ubiquitous. Just like C23 mandated 2's complement.