
Comments on How can I manage multiple consecutive strings in a buffer (and add more later)?

Parent

How can I manage multiple consecutive strings in a buffer (and add more later)?

+2
−0

This question is inspired by "If I have a char array containing strings with a null byte (\0) terminating each string, how would I add another string onto the end?" on Stack Overflow.

Suppose I have a char[] buffer that I'm using to represent multiple null-terminated (ASCII) strings, one after the other. I can easily set up an initial state that has two strings and sufficient room to add a third:

/* The exact amount of space is not critical to the question; it's enough
   to store these strings and leave room for more. */
char buffer[80] = {'o', 'n', 'e', '\0', 't', 'w', 'o', '\0'};

Now suppose I have char* another_string = "three";. How can I append another_string to the buffer, generally? I do not want to concatenate the text "three" onto "two"; I want to put it in the buffer as a separate string.

I already know that the <string.h> library functions expect a string to be null-terminated, so it seems like they won't help here. For example, strcat would find the first null in the array instead of the second, and overwrite it; and strncpy would need a pointer to where to start writing.
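To make the strcat problem concrete, here is a minimal sketch. The helper name naive_append is hypothetical and exists only for illustration; it assumes the caller's buffer has enough free space after the first string.

```c
#include <string.h>

/* Hypothetical helper: shows what a plain strcat does to the packed buffer.
   Assumes the buffer has enough room for the appended text. */
void naive_append(char *buffer, const char *s)
{
    /* strcat scans for the FIRST null byte, so it appends to the first
       string and overwrites whatever was stored after it. */
    strcat(buffer, s);
}
```

Called with the buffer from the question and "three", this leaves "onethree" as the first string, and the bytes that used to hold "two" are overwritten.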


1 comment thread

Shows why C strings are a bad idea in the first place. (1 comment)
Post
+3
−0

The fundamental problem here is that it is already ambiguous where the "end" of the data in the buffer is. Strings can be empty (have zero length as reported by strlen); as such, buffer could equally well be interpreted as containing three strings, where the last is empty. Or more than that - up to what the buffer can hold.

The situation is even worse if we start with uninitialized memory; then there's no way to tell whether the byte after the last intentionally-written null is just uninitialized garbage, or the start of another actual string.

If we don't need to be able to store empty strings, one way around the problem is to mimic how null-termination works, but at a string level rather than a byte level. That is to say, we can establish a convention that the sequence of strings is "empty-string-terminated", and use strlen repeatedly to search for this string. That will tell us where to copy the new string.
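As a rough sketch of the "empty-string-terminated" convention, the search could look like the following. The function name find_sequence_end is an assumption made for illustration, and the code assumes that an empty string really does occur somewhere inside the buffer (for example because the array was zero-initialized).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the offset of the terminating empty string,
   which is where the next string would be written.
   Assumes the sequence is terminated by an empty string inside the buffer;
   otherwise strlen would read past the end of the array. */
size_t find_sequence_end(const char *buffer)
{
    size_t offset = 0;
    size_t len;

    while ((len = strlen(buffer + offset)) != 0) {
        offset += len + 1;  /* skip the string and its null terminator */
    }
    return offset;
}
```

For the buffer in the question this returns 8. Whoever writes a new string at that offset also has to make sure a further null byte follows it, so that the sequence stays empty-string-terminated.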

However, it will be both simpler and more flexible to just remember where the end of the string sequence is, and update it whenever another string is added. For example, we could do this using an integer index:

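/* Offset of the first free byte: the lengths of the two initial
   strings plus their null terminators ("one\0" and "two\0"). */
size_t used = 8;
size_t usable = sizeof(buffer);
/* Copy at most the remaining space; strncpy does not null-terminate
   if it reaches that limit. */
strncpy(buffer + used, another_string, usable - used);
/* Guarantee termination even if the new string was truncated. */
buffer[usable - 1] = '\0';
used += strlen(another_string) + 1;
/* Clamp, so that later appends see zero bytes available. */
if (used > usable) used = usable;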

This code takes care of a few important issues. Note the pointer arithmetic: buffer decays to a pointer to the start of the array, so buffer + used is the desired destination pointer. We need to restrict strncpy to the amount of space that remains in the buffer - between buffer + used and the end of the buffer - to avoid writing beyond the end of the array.

Note that strncpy avoids writing more than the declared amount of room, but does not null-terminate if it reaches that limit. To avoid ending up with non-null-terminated data at the end of the array, we can just unconditionally add a null to the last spot in the buffer each time, as shown. (A more sophisticated approach might detect this situation and report an error somehow.)

After writing, we need to update the record of how much space is used. (When the buffer is full, used will be limited to the array length; future attempts at strncpy will see that zero bytes are available.)
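One possible shape for the "more sophisticated approach" that reports an error is sketched below. This is not the code from the answer: the function name append_string and its return convention are assumptions, and it swaps strncpy for an explicit length check followed by memcpy, so that a string is either stored whole or not at all.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not from the original answer.
   Appends s (with its terminator) at offset *used in a buffer of
   `usable` bytes. Returns 0 on success, or -1 if s does not fit, in
   which case nothing is written and *used is unchanged. */
int append_string(char *buffer, size_t usable, size_t *used, const char *s)
{
    size_t needed = strlen(s) + 1;  /* characters plus null terminator */

    if (*used > usable || needed > usable - *used) {
        return -1;  /* not enough room: refuse instead of truncating */
    }
    memcpy(buffer + *used, s, needed);
    *used += needed;
    return 0;
}
```

With the buffer from the question and used starting at 8, append_string(buffer, sizeof buffer, &used, "three") would store the string at offset 8 and advance used to 14. The trade-off against the strncpy version is that an oversized string is rejected rather than silently truncated.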

Also keep in mind that a representation like this is not convenient for modifying the strings later. In particular, anything that tries to change the length of a string that isn't at the end of the sequence, will cause a major headache - because every other string after it will need to be shifted around to make room or close a gap. (This is the same reason that you can't easily modify a single line of a text file "in place".)
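To illustrate the shifting that such a modification requires, here is a hedged sketch. The function replace_string and its parameters are hypothetical; it assumes offset is the start of a string stored within the first *used bytes, and that s does not point into the buffer itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: replaces the string starting at `offset` with s,
   shifting every later string to make room or close the gap.
   Returns 0 on success, -1 if the result would not fit in `usable` bytes. */
int replace_string(char *buffer, size_t usable, size_t *used,
                   size_t offset, const char *s)
{
    size_t old_size = strlen(buffer + offset) + 1;
    size_t new_size = strlen(s) + 1;
    size_t tail_start = offset + old_size;  /* first byte after the old string */
    size_t tail_size = *used - tail_start;  /* bytes taken by the later strings */

    if (*used - old_size + new_size > usable) {
        return -1;
    }
    /* memmove rather than memcpy: source and destination may overlap. */
    memmove(buffer + offset + new_size, buffer + tail_start, tail_size);
    memcpy(buffer + offset, s, new_size);
    *used = *used - old_size + new_size;
    return 0;
}
```

Every call moves all the bytes after the edited string, so the cost grows with the amount of data stored behind it.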


1 comment thread

strncpy (4 comments)
Lundin‭ wrote 10 months ago

strncpy should pretty much never be used, since it was never intended for use on null-terminated C strings. See "Is strcpy dangerous and what should be used instead?". In this case you added the null terminator manually, but people tend to forget that. Since this kind of self-answered Q&A post is supposed to be educational, we shouldn't teach anyone to use the dangerous strncpy function. memcpy or strcpy can be used safely - unless the program is already full of bugs at the point where it takes string inputs, in which case that's a separate problem entirely.

Lundin‭ wrote 10 months ago

I hope you don't mind that I now posted a complementary answer. The fight against strncpy is kind of a pet peeve of mine, see :)

Karl Knechtel‭ wrote 10 months ago

I'm afraid I don't follow. With strcpy or memcpy, it's still necessary to verify that the buffer has enough room, and it comes across to me that this is not any less work or harder to overlook.

I'm familiar with your previous Q&A and I have to admit I didn't find it very convincing. That said, your answer here provides a lot of very good supplemental information - the bit about alternate designs is quite important. (As it happens, I have done some work before that involved pre-computing a string table and storing pointers somewhere else - and also storing lengths somewhere else and skipping null terminators, so that the string table could be optimized ahead of time by allowing the strings to overlap. This was a table with I think a few thousand strings, none longer than 255 bytes.)

Lundin‭ wrote 10 months ago

Karl Knechtel, the whole point is that, design-wise, input sanitization should happen at the point where input is taken. If that part is done correctly, strcpy is safer than strncpy. And memccpy is arguably a bit safer too, since it has no misleading str prefix.