Welcome to Software Development on Codidact!
Will you help us build our independent community of developers helping developers? We're small and trying to grow. We welcome questions about all aspects of software development, from design to code to QA and more. Got questions? Got answers? Got code you'd like someone to review? Please join us.
Post History
General/program design: I would have expected a public header file with the API for the whole thing, rather than having some main() test case calling static functions from the same file. Sure,...
Answer
#1: Initial revision
**General/program design:** - I would have expected a public header file with the API for the whole thing, rather than having some main() test case calling `static` functions from the same file. Sure, this program is small, but having everything in a single file makes things less maintainable. - A unified error type/handling throughout the whole library would be nice. Rather than handling errors in many different ways: returning null pointers, -1, bools on a case by case basis. Normally done with an `enum` instead. Document which error codes each function may return. You can use the same error type internally as you use for the exposed API to the user. - Weird use of `restrict`. It doesn't make sense to use `restrict` for example here: ```c static void parse_options(const struct option *restrict long_options, struct flags *restrict opt_ptr, int argc, char *const argv[], const char **restrict prefix) ``` None of these types may alias and you have no file scope variables of those types either which could confuse the optimizer. Just drop `restrict`. You are using it at a couple of other strange places as well. A `char*` won't alias with a `char***` and so on, so no use for `restrict`. - The code isn't thread-safe because of `Trie` and other file scope variables/statics, but maybe that's not a concern. One possible improvement if you'd want to add a GUI in the future would otherwise be to outsource stuff like file handling to separate threads, not to lag up the rest of the application. - You could make `Trie` use a flexible array member instead and allocate the whole struct dynamically. Probably won't improve performance in any notable way though. - I don't recommend compiling with `-Wconversion`. That one is prone to false positives. Use it temporarily for code review purposes, treat the results with skepticism and then turn it off again. **Performance:** - Calling `realloc` repeatedly in a loop gives bad performance. You do this at a couple of places. Probably dwarfed by the file I/O etc, but still. You could consider using a common reallocation pattern where we start by allocating a "large enough" chunk, often at some suitable aligned size (divisible by 8 on 64 bit systems etc). Then keep track of how much of that memory that is actually used. When running out/low of it, realloc twice that amount. And next time you run out of it, twice the amount again, making it grow exponentially. Never realloc to a smaller size. This is a speed over memory use optimization. **Bad practices:** - You should avoid assignment inside conditions like the plague, since it is such a well-known source of bugs (and usually results in compiler warnings because of it). Consider something like this instead: ```c int c=0; while(c != -1) { c = getopt_long(argc, argv, "shkc:p:", long_options, NULL); ... switch(c) ... } ``` Or maybe even `while(1)` ... `case -1: ...` if that makes more sense. - You use a lot of `*const` qualified pointers which is just odd and don't really add anything except potential pointer conversion issues. For example `char *const argv[]` boils down to `char*const*` which is probably not what anyone expected. - "Three star programming". There are a few valid uses for `char***` but generally they are just code smell. `split_lines` in particular is a seriously evil-looking function, I would recommend to rewrite it. Maybe something like this? (not tried, obviously, so treat it as pseudo code) ```c static char** split_lines(char* s, size_t* line_count) { char** lines = NULL; const size_t chunk_size = 1024; size_t capacity = 0; line_count = 0; for(; s != NULL && *s != '\0'; s++) { if (line_count >= capacity) { ... } lines[*line_count] = s; (*line_count)++; s = strchr(s, '\n'); if (s != NULL) { *s = '\0'; } } return lines; } ``` - Similarly, do not design a "leaky API" where your lib does some allocation internally and then leave it the caller to clean up. Such designs probably stand for the majority of all memory leaks in C, rather than leaks through bugs. Instead do like the `free_pool` case, where the one responsible for the allocation (your lib) is also responsible for the clean-up through a provided function. The above rewrite of `split_lines` should come with a corresponding clean-up function. - "Magic numbers". Don't type out stuff like `1024` in the middle of the code, where did that number come from? Place it behind a named identifier/macro with a meaningful name instead. Some times you do this properly, some times you don't. **Impl.defined behavior (minor):** - `ULL` does not necessarily correspond to `uint64_t`, that's implementation-defined behavior and non-portable. If portability is important, you could instead use the standard C constant `UINT64_C(1024)` which results in a portable integer constant of type `uint_least64_t`. **Style (minor):** - For long function declarations/definitions with a lot of parameters, consider different formatting for readability. This style is common: ```c static void parse_options (const struct option* long_options, struct flags* opt_ptr, int argc, char *const argv[], const char** prefix) ``` - You keep mixing "K&R `{` brace" style with `{` on a line of its own style. Both of these styles are fine but pick one and use it consistently. - `while (*text)` should ideally be replaced with explicit, self-documenting style `while (*text != '\0')` to avoid mix-ups with cases where you compare a pointer against null etc. This program here has a lot of null termination checks and a lot of null pointer checks, so you should be explicit each time. - `new` is a bad identifier name since it is a reserved keyword in C++. Even if you never care for porting/compatibility with C++, you should still avoid naming identifiers after keywords from that language, since IDE style formatting tends to have a "C/C++" setting and then it will format `new` in weird ways.