Welcome to Software Development on Codidact!
Will you help us build our independent community of developers helping developers? We're small and trying to grow. We welcome questions about all aspects of software development, from design to code to QA and more. Got questions? Got answers? Got code you'd like someone to review? Please join us.
A simple implementation of a mutable String in C
(Brought over from SE.)
The following code implements a simple interface to operate on mutable* String
s in C. It is composed of two files: one for the structure definition and the available operations, and the other with the actual implementation.
* = This can be inexact, as only adding characters to the end is implemented.
Any comments on coding style and improvements are greatly appreciated.
String.h
#ifndef __STRING_H__
#define __STRING_H__
#include <stdio.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#define BLOCK_SIZE 256
typedef struct {
char *data;
int length;
int blocks;
} String;
String * string_create();
void string_dispose(String *str);
bool string_empty(String *str);
void string_append_char(String *str, char c);
char * string_get_all(String *str);
char string_get(String *str, int pos);
int string_find(String *str, char c);
#endif /* __STRING_H__ */
String.c
#include "String.h"
/** Create a String */
String * string_create() {
String *ans = calloc(1, sizeof *ans);
if (ans != NULL) {
ans->data = malloc( BLOCK_SIZE * sizeof *(ans->data) );
ans->length = 0;
ans->blocks = (ans->data == NULL) ? 0 : 1;
}
return ans;
}
/** Free the memory associated with a String */
void string_dispose(String *str) {
if (str != NULL) {
free(str->data);
free(str);
}
}
/** Is the String empty? */
bool string_empty(String *str) {
if (str == NULL) return true;
if (str->length == 0) return true;
return false;
}
/** Add a character to the end of the String */
void string_append_char(String *str, char c) {
if (str != NULL) {
if (str->length == str->blocks * BLOCK_SIZE) {
char *new_str = realloc( str->data, BLOCK_SIZE * (str->blocks + 1) * sizeof *(str->data));
if (new_str != NULL) {
str->data = new_str;
++(str->blocks);
}
}
if (str->length < str->blocks * BLOCK_SIZE) {
str->data[str->length] = c;
++(str->length);
}
}
}
/** Get a C-String with the proper null-terminator */
char * string_get_all(String *str) {
char *res = NULL;
if (str != NULL) {
res = malloc((str->length + 1) * sizeof *str->data);
if (res != NULL) {
memcpy(res, str->data, str->length);
res[str->length] = '\0';
}
}
return res;
}
/** Get a character at a given position in the String */
char string_get(String *str, int pos) {
char res = '\0';
if (str != NULL) {
if (pos >= 0 && pos < str->length) {
res = str->data[pos];
}
}
return res;
}
/** Get where the first occurrence of a character in the String is */
int string_find(String *str, char c) {
int pos = -1;
if (str != NULL) {
pos = 0;
while (str->data[pos] != c) ++pos;
if (pos == str->length) pos = -1;
}
return pos;
}
test_string.c
This is a test file that uses some of the core methods implemented in a String
.
#include <stdio.h>
#include "String.h"
int main() {
String *str = string_create();
char i;
for (i = 'a'; i <= 'z'; i++)
string_append_char(str, i);
char *cstr = string_get_all(str);
printf("%s\n", cstr);
int pos = string_find(str, 'f');
printf("Character 'f' occurs at position %d.\n", pos);
printf("Reading from the String, we get \"%c\".\n", string_get(str, pos));
free(cstr);
string_dispose(str);
return 0;
}
3 answers
You are accessing this answer with a direct link, so it's being shown above all other answers regardless of its score. You can return to the normal view.
(User JS1 answered as follows; source.)
Unsafe loop
This loop in string_find()
is unsafe since it could read past the end of your buffer:
while (str->data[pos] != c) ++pos;
You should add an additional check like this:
while (pos < str->length && str->data[pos] != c) ++pos;
Short circuit error conditions
Rather than doing this:
if (str != NULL) { // Rest of function indented }
it would be easier to read if you rewrote it like this:
if (str == NULL)
return NULL;
// Rest of function, not indented
Other things
Your reallocation strategy will lead to O(n^2) behavior when appending to long strings. You might want to double the allocation instead of adding a fixed amount.
You might want to use size_t
for your sizes and lengths instead if int
, because an int
might overflow at 32KB on some platforms.
0 comment threads
Design/API
- Consider hiding all details of the String internals to the caller by implementing opaque type - How to do private encapsulation in C? (As it happens, that post contains a example taken from another string handling class.)
- The code responsible for allocation should also be responsible for freeing. Avoid interfaces where you return pointers to dynamically allocated memory, then expect the caller to clean that up - that's exactly how memory leaks are made. Instead leave memory allocation to the caller whenever possible.
- Avoid
bool
as error information type. Ok so this is a simple little library here so it's OK in this context. But for more complex libraries, consider implementing anenum
return code instead. So that the caller gets information about what exactly failed, rather than just "something failed". - Similarly, returning
-1
orNULL
etc isn't very helpful either. Avoid mixing up error information with the data, if possible. The return type of the function is usually reserved for the error information and data should be passed through parameters.
Bugs
-
string_find
ought to both check for end of the array as well as the null terminator. You could just callstrchr
instead. Or otherwise, maybe implement it along the lines of this:for(int i=0; i<str->length; && str[i]!='\0'; i++) { if(str[i] == c) { return i; } } /* error handling here */
But again, better yet would be to implement some error information enum.
-
for (i = 'a'; i <= 'z'; i++)
Strictly speaking, the letters of the alphabet are actually not guaranteed to be placed in sequence, so this loop could get stuck forever or add stuff that isn't letters to the string. It will work fine on mainstream implementations (ASCII, UTF8 etc) but fail on strange and exotic implementations (EBCDIC, extended locales etc).One possible fix:
const char alpha[] = "abcdefghijklmnopqrstuvwxyz"; for(int i=0; i<26; i++) string_append_char(str, alpha[i]);
Best practices & coding style
- Use const correctness on all parameters passed through pointers when the function should not modify that parameter.
- I agree with the previous review that
if (str == NULL) return NULL;
can help clarity a lot, since deeply nested functions with many brace levels turns harder to read.
Standard compliance
- Although it is common to name header guards something like
#ifndef __STRING_H__
(totally guilty of that myself in the past), this is actually a reserved identifier for the compiler. Avoid any identifier beginning with an underscore, in particular double underscore or underscore followed by an upper-case letter.
0 comment threads
My biggest gripe with your code is the (almost) complete lack of comments! Nothing gives even a brief overview of what this library is supposed to do, and we are left to guess what the subroutines do from their names only.
By the way, if you want strings that can automatically grow and shrink as needed at run time, you are welcome to use my STRFLEX library and/or its code as you see fit, with the only restriction being proper attribution. I use these strings when they need to arbitrarily grow, like in my ESCR interpreter and pre-processor.
For strings with fixed maximum length, I use my STRING library. Both the current length and maximum storage length are kept for each string. That gets around many of the problems with C strings. The library contains a rather extensive set of string manipulation capabilities, including quite flexible string to/from number conversions.
0 comment threads