

Review Suggested Edit


Approved.
This suggested edit was approved and applied to the post over 3 years ago by Alexei‭.

  • **Summary (TL;DR)**
  • - Using `strcpy` directly on non-sanitized user input is bad, otherwise it's fine.
  • - `strncpy` is a dangerous function that should be avoided. Its presence in your source is a much greater danger than buffer overruns.
  • - If portability and backwards-compatibility are no concerns, then there's nothing wrong with using `strcpy_s`, given that the function is available.
  • ---
  • **What is a buffer overflow/overrun exploit?**
  • A long time ago, Microsoft did a study/article (I can't find the link, seems MS removed it from their site) where they analysed hacks and exploits, to see which functions were most often exploited by hackers. They looked at a broad range of functions, not just standard library ones but Microsoft-specific and POSIX ones too. They found that `strcpy` is often exploited when it is used directly on raw user input.
  • Old school "buffer exploits" use various command-line input or command-line arguments to provide more data than the input buffer of the program was designed for. This could in the easiest form be abused to simply crash the program.
  • The more sinister hacks would however rather disassemble the target executable, finding out where exactly on the stack something like a return address was stored, then use the buffer exploit to overwrite that particular location. You could then sneak in something like the address to some location at the bottom of the executable, where you have injected your potentially malicious program.
  • So if the application programmer just merrily `strcpy`s some provided `argv` command-line argument into a 100-byte stack-allocated buffer, and there's a return address sitting on the stack 5 bytes further down, then the hacker would provide those extra bytes to overwrite that address.
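A minimal sketch of how that scenario can be avoided, assuming a hypothetical helper `copy_arg` and buffer size `BUF_SIZE` (neither is from the original post). Strings in `argv` are null terminated, so `strlen` is safe to call on them. The length is checked before `strcpy` ever runs:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 100  /* same size as the stack buffer in the scenario above */

/* Hypothetical helper: copies arg into dst only if it fits, terminator
   included. Returns false and leaves dst untouched if arg is NULL or too long. */
bool copy_arg(char dst[BUF_SIZE], const char *arg)
{
    assert(dst != NULL);              /* programmer error, not user input */
    if (arg == NULL || strlen(arg) >= BUF_SIZE) {
        return false;                 /* reject instead of overflowing */
    }
    strcpy(dst, arg);                 /* safe: the length was verified above */
    return true;
}
```

A caller would typically do something like `if (!copy_arg(buf, argv[1])) { /* report the error */ }` rather than passing `argv[1]` straight to `strcpy`.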
  • ---
  • **Is `strcpy` dangerous?**
  • Based on this, Microsoft naively drew the wrong conclusion that the `strcpy` function is dangerous, since it was a recurring function abused by a lot of such exploits. For example, if you don't provide a null terminated string, the function will just keep on going, copying beyond array bounds.
  • They came to the conclusion that this was the fault of `strcpy` since it doesn't check the amount of characters to copy. After which they listed `strcpy` as deprecated and dangerous. They started to lobby for alternative non-standard functions invented by themselves, such as [`strcpy_s`](https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/strcpy-s-wcscpy-s-mbscpy-s?view=msvc-160).
  • _However_, the actual problem isn't `strcpy` but programmers who don't sanitize their program input. This could be done with functions like `fgets` or `memchr` where you can set a fixed size, then only copy as much as the set limit allows. In case of strings you can then parse the input to verify that it contains a null terminator, all before you label the user input as a valid C string. `strcpy_s` works in a similar manner, taking a size and stopping upon encountering a null terminator.
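The `memchr` check described above could be sketched as follows. The helper name `copy_if_valid_string` and its parameters are assumptions made for illustration, not something from the original post:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: treats raw[0..raw_size) as untrusted bytes and copies
   them to dst only if they contain a null terminator and the string fits. */
bool copy_if_valid_string(char *dst, size_t dst_size,
                          const char *raw, size_t raw_size)
{
    assert(dst != NULL && raw != NULL);
    const char *end = memchr(raw, '\0', raw_size); /* never reads past raw_size */
    if (end == NULL) {
        return false;                 /* no terminator: not a C string */
    }
    size_t len = (size_t)(end - raw);
    if (len >= dst_size) {
        return false;                 /* would not fit with its terminator */
    }
    strcpy(dst, raw);                 /* now known to be terminated and to fit */
    return true;
}
```

Only after this check succeeds is the input treated as a valid C string, so the `strcpy` call itself needs no further bounds checking.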
  • If you know that the C string is in fact null terminated and proper, then there is no harm in calling `strcpy` - it is perfectly safe and likely quite efficient. From an old answer of mine at [another site](https://stackoverflow.com/a/23490019/584518):
  • > There is nothing wrong with the `strcpy()` function, that's a myth. This function has existed for some 30-40 years and every little bit of it is properly documented. So what the function does and what it does not should not come as a surprise, even to beginner C programmers.
  • >
  • > What `strcpy` does and does not:
  • >
  • > - It copies a null-terminated string into another memory location.
  • > - It does not take any responsibility for error handling.
  • > - It does not fix bugs in the caller application.
  • > - It does not take any responsibility for educating C programmers.
  • >
  • > Because of the last remark above, you must know the following before calling `strcpy`:
  • >
  • > - If you pass a string of unknown length to strcpy, without checking its length in advance, you have a bug in the caller application.
  • > - If you pass some chunk of data which does not end with \0, you have a bug in the caller application.
  • > - If you pass two pointers to `strcpy()`, which point at memory locations that overlap, you invoke undefined behavior. Meaning you have a bug in the caller application.
  • Summary: using `strcpy` directly on non-sanitized user input is bad, otherwise it's fine.
  • ---
  • **What about `strncpy`?**
  • Around the time when Microsoft flagged `strcpy` as obsolete and dangerous, another misguided rumour started. This nasty rumour said that `strncpy` should be used as a safer version of `strcpy`, since it takes the size as a parameter and is already part of the C standard library, so it's portable. This seemed very convenient - spread the word, forget about non-standard `strcpy_s`, let's use `strncpy`! No, this is not a good idea...
  • Looking at the history of `strncpy`, it goes back to the very earliest days of Unix, where several string formats co-existed. Something called "fixed width strings" existed - they were not null terminated but came with a fixed size stored together with the string. One of the things Dennis Ritchie (the inventor of the C language) wished to avoid when creating C was storing the size together with arrays [[_The Development of the C Language, Dennis M. Ritchie_](https://www.bell-labs.com/usr/dmr/www/chist.html)]. Likely in the same spirit as this, the "fixed width strings" were getting phased out over time, in favour of null terminated ones.
  • The function used to copy these old fixed width strings was named `strncpy`. This is the sole purpose that it was created for. It has no relation to `strcpy`. In particular it was never intended to be some more secure version - computer program security wasn't even invented when these functions were made.
  • Somehow `strncpy` still made it into the first C standard in 1989. A whole lot of highly questionable functions did - the reason was always backwards compatibility. We can also read the story about `strncpy` in the [C99 rationale](http://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.10.pdf) 7.21.2.4:
  • > **The strncpy function**
  • >
  • > strncpy was initially introduced into the C library to deal with fixed-length name fields in structures such as directory entries. Such fields are not used in the same way as strings: the trailing null is unnecessary for a maximum-length field, and setting trailing bytes for shorter names to null assures efficient field-wise comparisons. strncpy is not by origin a “bounded strcpy,” and the Committee preferred to recognize existing practice rather than alter the function to better suit it to such use.
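To illustrate the use case the rationale describes, here is a small sketch with a made-up `struct dir_entry` and field width `NAME_LEN` (both are assumptions for illustration, not taken from any real file system):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NAME_LEN 8   /* hypothetical field width, chosen for illustration */

struct dir_entry {
    char name[NAME_LEN];   /* fixed-width field, NOT necessarily null terminated */
};

/* Fills the field: shorter names are padded with null bytes,
   longer names are cut at NAME_LEN with no terminator stored. */
void set_name(struct dir_entry *e, const char *name)
{
    assert(e != NULL && name != NULL);
    strncpy(e->name, name, NAME_LEN);
}

/* Because of the padding, whole fields can be compared with memcmp. */
bool same_name(const struct dir_entry *a, const struct dir_entry *b)
{
    return memcmp(a->name, b->name, NAME_LEN) == 0;
}
```

Here the missing terminator on a maximum-length name is intended, because `name` is a field and never handed to functions that expect a C string. Some compilers may warn about possible truncation on this call, which is another hint that the function is rarely used this way today.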
  • This is where it starts to smell fishy. "The trailing null is unnecessary"? Yet somewhere on the way to standardization, they made `strncpy` stop upon encountering null termination. But what if it doesn't find one within `n` characters? That's where the function becomes wildly dangerous. From the C standard (ISO 9899:2018) 7.24.2.4 we can read:
  • `char *strncpy(char * restrict s1, const char * restrict s2, size_t n);`
  • > If the array pointed to by s2 is a string that is shorter than n characters, null characters are appended to the copy in the array pointed to by s1, until n characters in all have been written.
  • _If_ it is shorter... uh-oh. Else go haywire and _don't_ null terminate the string.
  • Now how do programmers usually and most naturally call this supposed safe function? Like most other functions - by passing along the buffer size. Like in this little program:
```c
#include <string.h>
#include <stdio.h>
#define n 11

int main(void)
{
    char str[n];
    char src[] = "hello world eat deadbeef";
    strncpy(str, src, n); /* copies 11 characters, stores no null terminator */
    puts(str);            /* undefined behavior: str is not a valid C string */
    return 0;
}
```
  • This prints `hello world` when I try it on Windows (gcc/mingw x86_64). But there is undefined behavior... when I try it on gcc Linux x86_64, I get `hello worldhello world eat deadbeef`. This is simply because the `strncpy()` call doesn't store the null terminator, since there was no room - the source string is much longer than the destination. Passing `n-1` won't solve it on its own either, since `str[n-1]` is then left uninitialized. We have to stomp in and manually null terminate it. This is all very unintuitive and `strncpy` was never intended to be used in this manner in the first place.
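For completeness, the manual termination mentioned above might look like the sketch below. The helper `copy_truncated` is hypothetical and only shows the extra step needed. Note that it truncates silently, which may or may not be what the caller wants:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies at most dst_size - 1 characters with strncpy,
   then stores the terminator that strncpy may have left out. */
void copy_truncated(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';         /* the manual step that is easy to forget */
}
```

With an 11-byte destination and the source string from the program above, this leaves `hello worl` in the destination: ten characters plus the terminator.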
  • Summary: `strncpy` is a dangerous function that should be avoided. Its presence in your source is a much greater danger than buffer overruns.
  • ---
  • **What about `strcpy_s`?**
  • Originally released as a non-standard function by Microsoft, it comes with a size parameter. `strcpy_s` returns an error code if it fails, rather than a pointer. You'll need to check this error code.
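A rough sketch of checking that error code, assuming a hypothetical wrapper `copy_checked`. Since many C libraries don't ship Annex K, the sketch only calls `strcpy_s` when the library announces support through `__STDC_LIB_EXT1__`, and otherwise falls back to a plain length check that is assumed to be close enough for illustration:

```c
#define __STDC_WANT_LIB_EXT1__ 1   /* request Annex K declarations, if any */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: returns true only if src was copied completely. */
bool copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
#ifdef __STDC_LIB_EXT1__
    /* Annex K: zero means success; the result must be checked. */
    return strcpy_s(dst, dst_size, src) == 0;
#else
    /* Fallback for libraries without Annex K (an assumption of this sketch). */
    if (strlen(src) >= dst_size) {
        dst[0] = '\0';                /* leave an empty string on failure */
        return false;
    }
    strcpy(dst, src);
    return true;
#endif
}
```

On an Annex K implementation, a failing `strcpy_s` call also invokes the currently registered runtime-constraint handler, whose default behavior is implementation-defined, so the two branches are not guaranteed to behave identically.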
  • Using this function is however the wrong solution to the problem of no input sanitation, so it is dubious which problem this function was supposed to solve in the first place.
  • Later on, somehow, all of these `_s` functions made it into an optional library of the C standard "C11", the so-called "Annex K bounds-checking interface". They were first introduced by a pre-study technical report known as [TR 24731-1](http://open-std.org/jtc1/sc22/wg14/www/projects#24731-1). But even to this day, this library is barely implemented by any C compiler - it is barely implemented in Microsoft Visual Studio even though they invented most of it. Annex K is not always compatible with the Microsoft functions using the same names.
  • Overall, the "bounds-checking interface" was a big fiasco. Experts from within the C standard committee itself filed some valid criticism against the library [here](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1967.htm). They address problems with `strcpy_s` specifically in the report. Most notably, switching out `strcpy` for `strcpy_s` in existing code comes with numerous pitfalls.
  • So while `strcpy_s` might be safer than `strcpy` in some special cases (and most certainly safer than `strncpy`) it suffers from portability and compatibility concerns. It should be regarded just like any system-specific API function and can't be assumed to be portable.
  • Summary: if portability and backwards-compatibility are no concerns, then there's nothing wrong with using `strcpy_s`, given that the function is available.
  • ---
  • **What about other similar functions: `memcpy`? `strncpy_s`? `strlcpy`?**
  • `memcpy` is preferred when you know the size in advance. It is generally at least as fast as `strcpy`, since it doesn't have to look for a null terminator. It is safe and portable.
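A minimal sketch of the `memcpy` case, assuming a hypothetical helper `copy_known_length` where the caller already has the length at hand:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the caller already knows len (for example from an
   earlier strlen or from a length-prefixed record), so no terminator search
   is needed during the copy. dst must have room for len + 1 bytes. */
void copy_known_length(char *dst, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, len);   /* copies exactly len bytes */
    dst[len] = '\0';         /* terminate so dst is a valid C string */
}
```

As with `strcpy`, making sure the destination is large enough remains the caller's job. Also like `strcpy`, `memcpy` must not be used on overlapping buffers.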
  • There exist various other "safe" versions in the criticised "bounds-checking interface", including `strncpy_s` which fixes the null termination problem mentioned earlier.
  • The `strlcpy` etc functions originate from BSD/Unix and are basically the non-standard Unix equivalents to the non-standard Microsoft ones. And similarly, `strlcpy` etc are fine to use if portability is not a concern.
  • There are lots of subtle details and differences between all of these functions; I won't go into them here.
  • ---
  • **EDIT:**
  • While I didn't find the original Microsoft article, I did find an old related one here: [Security Development Lifecycle (SDL) Banned Function Calls](https://docs.microsoft.com/en-us/previous-versions/bb288454(v=msdn.10)). Notably, Microsoft also raises the same valid concerns against `strncpy` etc as I do above - Microsoft is likely innocent of the rumour that `strncpy` is a safe version of `strcpy`.

Suggested over 3 years ago by J-hen‭