Communities

Writing
Writing
Codidact Meta
Codidact Meta
The Great Outdoors
The Great Outdoors
Photography & Video
Photography & Video
Scientific Speculation
Scientific Speculation
Cooking
Cooking
Electrical Engineering
Electrical Engineering
Judaism
Judaism
Languages & Linguistics
Languages & Linguistics
Software Development
Software Development
Mathematics
Mathematics
Christianity
Christianity
Code Golf
Code Golf
Music
Music
Physics
Physics
Linux Systems
Linux Systems
Power Users
Power Users
Tabletop RPGs
Tabletop RPGs
Community Proposals
Community Proposals
tag:snake search within a tag
answers:0 unanswered questions
user:xxxx search by author id
score:0.5 posts with 0.5+ score
"snake oil" exact phrase
votes:4 posts with 4+ votes
created:<1w created < 1 week ago
post_type:xxxx type of post
Search help
Notifications
Mark all as read See all your notifications »
Code Reviews

Welcome to Software Development on Codidact!

Will you help us build our independent community of developers helping developers? We're small and trying to grow. We welcome questions about all aspects of software development, from design to code to QA and more. Got questions? Got answers? Got code you'd like someone to review? Please join us.

A simple implementation of a mutable String in C

+2
−0

(Brought over from SE.)


The following code implements a simple interface to operate on mutable* Strings in C. It is composed of two files: one for the structure definition and the available operations, and the other with the actual implementation.

* = This can be inexact, as only adding characters to the end is implemented.

Any comments on coding style and improvements are greatly appreciated.

String.h

#ifndef __STRING_H__
#define __STRING_H__

#include <stdio.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 256

typedef struct {
    char *data;
    int length;
    int blocks;
} String;

String * string_create();
void string_dispose(String *str);

bool string_empty(String *str);
void string_append_char(String *str, char c);

char * string_get_all(String *str);
char string_get(String *str, int pos);

int string_find(String *str, char c);

#endif /* __STRING_H__ */

String.c

#include "String.h"

/** Create a String */
String * string_create() {
    String *ans = calloc(1, sizeof *ans);

    if (ans != NULL) {
        ans->data = malloc( BLOCK_SIZE * sizeof *(ans->data) );

        ans->length = 0;
        ans->blocks = (ans->data == NULL) ? 0 : 1;
    }

    return ans;
}

/** Free the memory associated with a String */
void string_dispose(String *str) {
    if (str != NULL) {
        free(str->data);
        free(str);
    }
}

/** Is the String empty? */
bool string_empty(String *str) {
    if (str == NULL) return true;
    if (str->length == 0) return true;
    return false;
}

/** Add a character to the end of the String */
void string_append_char(String *str, char c) {
    if (str != NULL) {
        if (str->length == str->blocks * BLOCK_SIZE) {
            char *new_str = realloc( str->data, BLOCK_SIZE * (str->blocks + 1) * sizeof *(str->data));

            if (new_str != NULL) {
                str->data = new_str;
                ++(str->blocks);
            }
        }

        if (str->length < str->blocks * BLOCK_SIZE) {
            str->data[str->length] = c;
            ++(str->length);
        }
    }
}

/** Get a C-String with the proper null-terminator */
char * string_get_all(String *str) {
    char *res = NULL;

    if (str != NULL) {
        res = malloc((str->length + 1) * sizeof *str->data);

        if (res != NULL) {
            memcpy(res, str->data, str->length);
            res[str->length] = '\0';
        }
    }

    return res;
}

/** Get a character at a given position in the String */
char string_get(String *str, int pos) {
    char res = '\0';

    if (str != NULL) {
        if (pos >= 0 && pos < str->length) {
            res = str->data[pos];
        }
    }

    return res;
}

/** Get where the first occurrence of a character in the String is */
int string_find(String *str, char c) {
    int pos = -1;

    if (str != NULL) {
        pos = 0;
        while (str->data[pos] != c) ++pos;

        if (pos == str->length) pos = -1;
    }

    return pos;
}

test_string.c

This is a test file that uses some of the core methods implemented in a String.

#include <stdio.h>

#include "String.h"

int main() {
    String *str = string_create();

    char i;
    for (i = 'a'; i <= 'z'; i++)
        string_append_char(str, i);

    char *cstr = string_get_all(str);

    printf("%s\n", cstr);

    int pos = string_find(str, 'f');

    printf("Character 'f' occurs at position %d.\n", pos);
    printf("Reading from the String, we get \"%c\".\n", string_get(str, pos));

    free(cstr);
    string_dispose(str);

    return 0;
}
History
Why does this post require attention from curators or moderators?
You might want to add some details to your flag.
Why should this post be closed?

0 comment threads

3 answers

You are accessing this answer with a direct link, so it's being shown above all other answers regardless of its score. You can return to the normal view.

+1
−0

(User JS1 answered as follows; source.)


Unsafe loop

This loop in string_find() is unsafe since it could read past the end of your buffer:

while (str->data[pos] != c) ++pos;

You should add an additional check like this:

while (pos < str->length && str->data[pos] != c) ++pos;

Short circuit error conditions

Rather than doing this:

if (str != NULL) {
    // Rest of function indented
}

it would be easier to read if you rewrote it like this:

if (str == NULL)
    return NULL;

// Rest of function, not indented

Other things

Your reallocation strategy will lead to O(n^2) behavior when appending to long strings. You might want to double the allocation instead of adding a fixed amount.

You might want to use size_t for your sizes and lengths instead if int, because an int might overflow at 32KB on some platforms.

History
Why does this post require attention from curators or moderators?
You might want to add some details to your flag.

0 comment threads

+3
−0

Design/API

  • Consider hiding all details of the String internals to the caller by implementing opaque type - How to do private encapsulation in C? (As it happens, that post contains a example taken from another string handling class.)
  • The code responsible for allocation should also be responsible for freeing. Avoid interfaces where you return pointers to dynamically allocated memory, then expect the caller to clean that up - that's exactly how memory leaks are made. Instead leave memory allocation to the caller whenever possible.
  • Avoid bool as error information type. Ok so this is a simple little library here so it's OK in this context. But for more complex libraries, consider implementing an enum return code instead. So that the caller gets information about what exactly failed, rather than just "something failed".
  • Similarly, returning -1 or NULL etc isn't very helpful either. Avoid mixing up error information with the data, if possible. The return type of the function is usually reserved for the error information and data should be passed through parameters.

Bugs

  • string_find ought to both check for end of the array as well as the null terminator. You could just call strchr instead. Or otherwise, maybe implement it along the lines of this:

    for(int i=0; i<str->length; && str[i]!='\0'; i++)
    {
      if(str[i] == c)
      {
        return i;
      }
    }
    
    /* error handling here */
    

    But again, better yet would be to implement some error information enum.

  • for (i = 'a'; i <= 'z'; i++) Strictly speaking, the letters of the alphabet are actually not guaranteed to be placed in sequence, so this loop could get stuck forever or add stuff that isn't letters to the string. It will work fine on mainstream implementations (ASCII, UTF8 etc) but fail on strange and exotic implementations (EBCDIC, extended locales etc).

    One possible fix:

    const char alpha[] = "abcdefghijklmnopqrstuvwxyz";
    for(int i=0; i<26; i++)
       string_append_char(str, alpha[i]);
    

Best practices & coding style

  • Use const correctness on all parameters passed through pointers when the function should not modify that parameter.
  • I agree with the previous review that if (str == NULL) return NULL; can help clarity a lot, since deeply nested functions with many brace levels turns harder to read.

Standard compliance

  • Although it is common to name header guards something like #ifndef __STRING_H__ (totally guilty of that myself in the past), this is actually a reserved identifier for the compiler. Avoid any identifier beginning with an underscore, in particular double underscore or underscore followed by an upper-case letter.
History
Why does this post require attention from curators or moderators?
You might want to add some details to your flag.

0 comment threads

+0
−0

My biggest gripe with your code is the (almost) complete lack of comments! Nothing gives even a brief overview of what this library is supposed to do, and we are left to guess what the subroutines do from their names only.

By the way, if you want strings that can automatically grow and shrink as needed at run time, you are welcome to use my STRFLEX library and/or its code as you see fit, with the only restriction being proper attribution. I use these strings when they need to arbitrarily grow, like in my ESCR interpreter and pre-processor.

For strings with fixed maximum length, I use my STRING library. Both the current length and maximum storage length are kept for each string. That gets around many of the problems with C strings. The library contains a rather extensive set of string manipulation capabilities, including quite flexible string to/from number conversions.

History
Why does this post require attention from curators or moderators?
You might want to add some details to your flag.

0 comment threads

Sign up to answer this question »