Should I check if pointer parameters are null pointers?
When writing any form of custom function such as this:

```c
void func (int* a, int* b)
```

should I add code to check if `a` and `b` are null pointers or not?

```c
if (a == NULL)
    /* error handling */
```
When posting code for code review, a frequent comment is that such checks against null should be added for robustness. However, those reviewers are then just as often corrected by others telling them not to add such checks, because they are bad practice.

What should be done? Is there a universal policy to check or not to check, or should this be decided on a case-by-case basis?
4 answers
Whether null pointer checking should be the caller's or the callee's responsibility is debatable. It is probably a matter of opinion, local convention, and sometimes logical choice, but in all cases it should be documented explicitly.
Consider for example `memcpy()`: it is the caller's responsibility; the behavior is undefined if either pointer is invalid (unless the size is 0) or if the destination and source ranges overlap. There are good reasons for this: `memcpy` is meant to be as efficient as possible, so no extra tests are performed. This allows compilers to generate optimal inline code for `memcpy` calls, sometimes as small as a single instruction.
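To make the division of responsibility concrete, here is a minimal sketch (the `copy_buffer` wrapper is a hypothetical example): the guard lives at the call site, while `memcpy` itself performs no checks.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: the null check is the caller's job, not memcpy's. */
void copy_buffer(unsigned char *dst, const unsigned char *src, size_t len)
{
    if (dst != NULL && src != NULL)  /* caller-side guard */
        memcpy(dst, src, len);       /* memcpy performs no checks of its own */
}
```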
Conversely, consider `free()`: this function specifically accepts a null pointer argument, so no test is required at the call site to free any allocated data, whether the allocation succeeded or not, as long as the pointers were initialized to `NULL`.
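For illustration, a minimal sketch of this pattern:

```c
#include <stdlib.h>

int main(void)
{
    char *buf = NULL;   /* initialize pointers to NULL */
    buf = malloc(100);  /* may fail and return NULL */

    /* ... use buf only if it is non-null ... */

    free(buf);          /* safe either way: free(NULL) is a no-op */
    return 0;
}
```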
The original rationale of the C language was to favor performance at the cost of increased programmer responsibility. Note that it is always the programmer's responsibility to pass valid pointers along with enough information for the callee to access only relevant data via the pointer. Small functions can be more efficient without null checking; larger ones can have extra checks at minimal cost (quite negligible on CPUs with branch prediction). Handling null pointers gracefully is more a question of quality of implementation in this case (e.g. `printf("%s", (char *)NULL);`).
Another sound approach is to add assertions in the function prologue so that null checking is performed only in DEBUG builds. Such functions should be documented as taking non-null pointers and, if annotations are available, purposely adorned in the source code to specify this restriction (via `__attribute__((__nonnull__))`, `static` size specifications, or similar annotations).
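A minimal sketch of that approach (the function name is hypothetical; the attribute is a GCC/Clang extension):

```c
#include <assert.h>

/* Annotated prototype: callers must pass non-null pointers. */
void copy_int(int *dst, const int *src) __attribute__((__nonnull__(1, 2)));

void copy_int(int *dst, const int *src)
{
    /* Prologue assertion: active only when NDEBUG is not defined. */
    assert(dst != NULL && src != NULL);
    *dst = *src;
}
```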
As with most everything in engineering, how thoroughly the arguments to a subroutine should be validated is a tradeoff. There is no single universal right answer. Note that checking for null pointers is only one instance of validating information from elsewhere before acting on it.
Advantages of data validation:
- Helps during development and debugging.
- Protects against possible malicious use.
- Reduces cascading errors. When everything is a mess, it's often hard to figure out what initially went wrong.
- Allowing bad data to pass may cause a hard failure later and crash the whole system. Trying to dereference a NIL pointer will usually crash a program, at which point it can't apply any corrective action. However, if a NIL pointer is detected, then there is an opportunity to handle the error without disrupting other parts of the operation (see the sketch after this list).
- There might be legitimate reasons data could be invalid. Your subroutine might be the right place to check for that.
- Some errors can cause correct-looking results. Getting wrong answers can be far worse than the process stalling and giving no answer.
- Safety-critical applications. "Invalid message received about adjusting the control rods" may be preferred over "Raise the control rods by 10^38 meters".
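As a minimal sketch of detecting bad data and reporting it instead of crashing later (all names here are hypothetical):

```c
#include <stdio.h>

/* Returns 0 on success, -1 on invalid arguments. */
int read_sample(const int *raw, int *out)
{
    if (raw == NULL || out == NULL) {
        fprintf(stderr, "read_sample: invalid argument\n");
        return -1;   /* caller can retry or degrade gracefully */
    }
    *out = *raw;
    return 0;
}
```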
Of course nothing is free. Data validation comes at the cost of larger code size and slower run-time response. Whether that is a worthwhile tradeoff depends on a number of parameters:
- Does the extra delay in responsiveness matter? In critical control applications, it might.
- Does the extra code space matter? If it's in a high-volume throw-away consumer product that maxes out a cheap microcontroller, then using the next size up micro may make the whole product non-viable.
- What's the cost of failure? Misreading a button press on a child's toy and misreading one on an x-ray cancer treatment machine have vastly different downsides.
- How likely is bad data? Something a user types in could be anything, whereas the reading from your 10-bit A/D is always going to be 0-1023. Also, at some point data needs to be trusted. Is this a low-level routine that should only be handed good data that has already been validated by upper layers? (See the sketch after this list.)
- Is there anything you can do about it? On an embedded microcontroller there may be no operating system, no user console to write messages to, etc. In the end, you're not really going to output 10^38 volts. Resetting the system may be worse than the disruption due to a bad data value.
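A minimal sketch of the trust-boundary idea from the list above (names are hypothetical): validate at the public entry point, then trust the data in lower layers.

```c
#include <stddef.h>

/* Internal helper: only ever called with already-validated pointers,
   so it performs no checks of its own. */
static void store_sample_unchecked(int *dest, const int *src)
{
    *dest = *src;
}

/* Public entry point: input may come from anywhere, so validate it. */
int api_store_sample(int *dest, const int *src)
{
    if (dest == NULL || src == NULL)
        return -1;   /* report the error instead of crashing later */
    store_sample_unchecked(dest, src);
    return 0;
}
```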
Everything is a tradeoff.
To extend a bit on some points already mentioned:
Regarding where to add checks on a case-by-case basis: if you have a static code analysis tool that can tell you about areas where provably no null pointer will be passed / no undefined behavior can occur, you can omit the checks in those areas. (Most tools will indicate if they find a problem, but if they don't report anything you still can't be sure whether there is provably no problem or whether the tool's analysis depth was merely exceeded - PolySpace is one example of a tool that can provide that more precise information.) Obviously, that only works if you have the full program code available or are looking at some `static` function.
Regarding the possibility mentioned by @chqrlie to use a configurable approach as with assertions (assuming some assertion concept adapted to your situation): assertions are obviously a run-time mechanism to detect issues, but they are also beneficial for readers of your code - as well as for many static analysis tools that can use them to improve their analysis. (Well, the de-referencing of a `null` pointer will be detected anyway by the analysis tools, but there will likely also be more general function argument checks...)
Regarding weighing the pros and cons: in addition to weighing the pros and cons of having the checks in the code shipped to the field, one should also think about the pros and cons of having them in different development scenarios, like static code analysis, code reviews, unit testing, integration testing on different levels, and so on. The fact that you typically want the checks in some of these cases but not in others often makes a configurable approach desirable.
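A minimal sketch of such a configurable approach, assuming a project-specific build switch (all names here are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Argument checks can be enabled per build configuration. */
#ifdef ENABLE_ARG_CHECKS
#define CHECK_NOT_NULL(p) assert((p) != NULL)
#else
#define CHECK_NOT_NULL(p) ((void)0)
#endif

void process(int *data)
{
    CHECK_NOT_NULL(data);  /* compiled out when ENABLE_ARG_CHECKS is undefined */
    *data += 1;
}
```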
The kind of comments telling you to add checks against null typically come from programmers mostly used to dealing with higher-level programming languages. They think that more explicit error handling is generally a good thing. And that is true in most cases, but not in this one.
The spirit of C has always been performance over safety: give the programmer the freedom to do what they want, as fast as possible, at the price of potential risks. For example, arrays were purposely not designed with a stored size or any bounds-checking (The Development of the C Language - Dennis M. Ritchie).
If we look at the standard library, lots of functions like `memcpy`/`strcpy` do not, for example, support overlapping arguments even though they could. Instead a specialized function, `memmove`, was designed for that purpose - safer but slower. Similarly, `malloc` does not needlessly initialize the allocated memory to zero/null, because that would add execution overhead. A specialized function, `calloc`, was designed for that purpose. And so on.
We should never add checks against null, because they add needless overhead code. Instead, the function should be explicitly documented as not handling the case where a null pointer argument is passed, thereby leaving error handling to the caller.
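For example, such a documentation comment might look like this (a sketch; the function is hypothetical):

```c
#include <stddef.h>

/* Copies n ints from src to dst.
 * dst and src must be valid, non-null pointers; this function performs
 * no null checks, and passing a null pointer is undefined behavior. */
void copy_ints(int *dst, const int *src, size_t n);
```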
The reason for this is simple: there are a whole lot of scenarios where the caller is sure that the passed arguments are most definitely not null pointers. Having the function needlessly check for null pointers only adds bloat in the form of additional branches. Take this example:

```c
char* array[] = { ... };  /* all elements set to point at valid data */
const size_t n = sizeof array / sizeof *array;

for (size_t i = 0; i < n; i++)
{
    func(array[i]);
}
```
Now if this snippet is performance-critical, we stall the whole loop in case `func` repeatedly checks the passed pointer for null. We know it isn't a null pointer, but the repeated check may lead to branch prediction problems or cache misses on some systems. On any system, it means a useless check eating performance. And it cannot get optimized away unless the function is inlined, perhaps not even then.
To give the caller the freedom to do as they like, we should let them handle the null pointer check, ideally as close as possible to the point where something might end up as a null pointer, rather than inside some unrelated library function.
As a side note, some very modern compilers like recent gcc or clang versions can do limited static analysis at compile time, in case we use the exotic `static` declarator feature (`-Wall` is necessary for this in gcc). But it has very limited use:
```c
// must point at an array of at least 1 item and not be null
void func (int a[static 1]);

int main (void)
{
    func(NULL); // warning: null passed to a callee that requires a non-null argument

    int* x = NULL;
    func(x);    // no warning, the compiler can't predict what x contains
}
```
It's better to use dedicated static analyser tools to find bugs like this, and then we don't need the exotic language feature either. To this date, compilers are still not very good at finding application bugs through static analysis.