array of arrays vs array of pointers to store array of string literals
Let's consider the following code:
const char a[][4] = {"aa", "aaa"};
const char *b[] = {"bb", "bbb"};
const char *const c[] = {"cc", "ccc"};
For shared libraries, both the b and c arrays require the array of pointers to be filled in at load time (through relocations), which has a performance cost.
See section 2.4.3 of https://www.akkadia.org/drepper/dsohowto.pdf
But for a standalone program, does the same issue exist, or is the array generated at link time (ld(1))?
If the array can be read-only for c in a standalone program, it could be even better than a, since it doesn't consume an unnecessary byte for "cc".
Although... since the array of pointers approach requires an array of pointers that is separate from the strings, it might use even more than an extra byte:
Array of arrays:
aa\0\0aaa\0 // total 8 bytes
Array of pointers:
cc\0ccc\0ppppppppqqqqqqqq // total 23 bytes
where p and q stand for the bytes of the two (64-bit) pointers to the strings.
So the strings would have to differ in size much more (on average, more than 8 bytes shorter than the longest string) for the saved padding to compensate for the extra array. Unless I'm missing something.
Edit: This issue came up in a patch submitted to NGINX Unit. The pull request may be helpful, since it contains some real numbers coming from real code: https://github.com/nginx/unit/pull/721
Edit2: Experimentally, an array of pointers seems to be much worse than an array of arrays (see that link). That more or less confirms my expectation that a large array gets put in the (initialized) data section of the binary. I expect that to slow down the startup.
The most relevant part of that link is the following:
$ git switch array_of_pointers
$ git clean -dffx
$ ./configure
$ make -j
$ size build/unitd
   text    data     bss     dec     hex filename
 374088   29640    1224  404952   62dd8 build/unitd
$ git switch array_of_arrays
$ git clean -dffx
$ ./configure
$ make -j
$ size build/unitd
   text    data     bss     dec     hex filename
 375266   29000    1224  405490   62ff2 build/unitd