Post History

80%
+6 −0
Q&A array of arrays vs array of pointers to store array of string literals

Let's consider the following code: const char a[][4] = {"aa", "aaa"}; const char *b[] = {"bb", "bbb"}; const char *const c[] = {"cc", "ccc"}; For shared libraries, both b and c arrays require...

0 answers  ·  posted 2y ago by alx‭  ·  edited 2y ago by alx‭

#8: Nominated for promotion by Alexei · 2022-06-12T05:34:26Z (almost 2 years ago)
#7: Post edited by alx · 2022-06-11T22:26:33Z (almost 2 years ago)
  • Let's consider the following code:
  • ```c
  • const char a[][4] = {"aa", "aaa"};
  • const char *b[] = {"bb", "bbb"};
  • const char *const c[] = {"cc", "ccc"};
  • ```
  • For shared libraries, both `b` and `c` arrays require the array of pointers to be generated at runtime, which implies performance costs.
  • See <https://www.akkadia.org/drepper/dsohowto.pdf> 2.4.3
  • But for a standalone program, does the same issue exist, or is the array generated at link time (ld(1))?
  • If the array can be read-only for `c` in a standalone program, it could be even better than `a`, since it doesn't consume an unnecessary byte for `"cc"`.
  • Although... since the array of pointers approach requires an array of pointers that is separate from the strings, it might use even more than an extra byte:
  • ```txt
  • Array of arrays:
  • aa\0\0aaa\0 // total 8 bytes
  • Array of pointers:
  • cc\0ccc\0ppppppppqqqqqqqq // total 23 bytes
  • p and q being pointers (64-bit) to the strings
  • ```
  • so the strings would need to differ much more in size (on average, more than 8 bytes shorter than the longest string) to compensate for the extra array. Unless I'm missing something.
  • Edit:
  • This issue was raised in a patch submitted to NGINX Unit. The pull request may be helpful, since it contains some real numbers coming from real code: <https://github.com/nginx/unit/pull/721>
  • Edit2:
  • Experimentally, an array of pointers seems to be much worse than an array of arrays (see that link). It more or less confirms my expectation that a large array is put in the (initialized) data section of the binary. I expect that to slow down the startup.
  • The most relevant part of that link is the following:
  • ```
  • $ git switch array_of_pointers
  • $ git clean -dffx
  • $ ./configure
  • $ make -j
  • $ size build/unitd
  •    text    data     bss     dec     hex filename
  •  374088   29640    1224  404952   62dd8 build/unitd
  • ```
  • ```
  • $ git switch array_of_arrays
  • $ git clean -dffx
  • $ ./configure
  • $ make -j
  • $ size build/unitd
  •    text    data     bss     dec     hex filename
  •  375266   29000    1224  405490   62ff2 build/unitd
  • ```
#6: Post edited by alx · 2022-06-11T22:26:02Z (almost 2 years ago)
  • Let's consider the following code:
  • ```c
  • const char a[][4] = {"aa", "aaa"};
  • const char *b[] = {"bb", "bbb"};
  • const char *const c[] = {"cc", "ccc"};
  • ```
  • For shared libraries, both `b` and `c` arrays require the array of pointers to be generated at runtime, which implies performance costs.
  • See <https://www.akkadia.org/drepper/dsohowto.pdf> 2.4.3
  • But for a standalone program, does the same issue exist, or is the array generated at link time (ld(1))?
  • If the array can be read-only for `c` in a standalone program, it could be even better than `a`, since it doesn't consume an unnecessary byte for `"cc"`.
  • Although... since the array of pointers approach requires an array of pointers that is separate from the strings, it might use even more than an extra byte:
  • ```txt
  • Array of arrays:
  • aa\0\0aaa\0 // total 8 bytes
  • Array of pointers:
  • cc\0ccc\0ppppppppqqqqqqqq // total 23 bytes
  • p and q being pointers (64-bit) to the strings
  • ```
  • so strings should be much more different (> 8 bytes in average compared to the longest string) in size to compensate for the extra array. Unless I'm missing something.
  • Edit:
  • This issue was raised in a patch submitted to NGINX Unit. It can be helpful to see it, which contains some real numbers coming from real code: <https://github.com/nginx/unit/pull/721>
  • Edit2:
  • Experimentally, an array of pointers seems to be much worse that an array of arrays (see that link). It more or less confirms my expectations of a large array that is put in the (initialized) data section of the binary. I expect that to cause the startup to slow down.
  • The most relevant part of that link is the following:
  • ```
  • $ git switch array_of_pointers
  • $ git clean -dffx
  • $ ./configure
  • $ make -j
  • $ size build/unitd
  • text data bss dec hex filename
  • 374088 29640 1224 404952 62dd8 build/unitd
  • ```
  • ```
  • $ git switch array_of_arrays
  • $ git clean -dffx
  • $ ./configure
  • $ make -j
  • $ size build/unitd
  • text data bss dec hex filename
  • 375266 29000 1224 405490 62ff2 build/unitd
  • ```
#5: Post edited by alx · 2022-06-11T22:25:11Z (almost 2 years ago)
experimental data
  • Let's consider the following code:
  • ```c
  • const char a[][4] = {"aa", "aaa"};
  • const char *b[] = {"bb", "bbb"};
  • const char *const c[] = {"cc", "ccc"};
  • ```
  • For shared libraries, both `b` and `c` arrays require the array of pointers to be generated at runtime, which implies performance costs.
  • See <https://www.akkadia.org/drepper/dsohowto.pdf> 2.4.3
  • But for a standalone program, does the same issue exist, or is the array generated at link time (ld(1))?
  • If the array can be read-only for `c` in a standalone program, it could be even better than `a`, since it doesn't consume an unnecessary byte for `"cc"`.
  • Although... since the array of pointers approach requires an array of pointers that is separate from the strings, it might use even more than an extra byte:
  • ```txt
  • Array of arrays:
  • aa\0\0aaa\0 // total 8 bytes
  • Array of pointers:
  • cc\0ccc\0ppppppppqqqqqqqq // total 23 bytes
  • p and q being pointers (64-bit) to the strings
  • ```
  • so strings should be much more different (> 8 bytes in average compared to the longest string) in size to compensate for the extra array. Unless I'm missing something.
  • Edit:
  • This issue was raised in a patch submitted to NGINX Unit. It can be helpful to see it, which contains some real numbers coming from real code: <https://github.com/nginx/unit/pull/721>
  • Edit2:
  • Experimentally, an array of pointers seems to be much worse that an array of arrays (see that link). It more or less confirms my expectations of a large array that is put in the initialized section of the binary. I expect that to cause the startup to slow down.
  • The most relevant part of that link is the following:
  • ```
  • $ git switch array_of_pointers
  • $ git clean -dffx
  • $ ./configure
  • $ make -j
  • $ size build/unitd
  • text data bss dec hex filename
  • 374088 29640 1224 404952 62dd8 build/unitd
  • ```
  • ```
  • $ git switch array_of_arrays
  • $ git clean -dffx
  • $ ./configure
  • $ make -j
  • $ size build/unitd
  • text data bss dec hex filename
  • 375266 29000 1224 405490 62ff2 build/unitd
  • ```
#4: Post edited by alx · 2022-06-11T21:48:50Z (almost 2 years ago)
add link to NGINX Unit
  • Let's consider the following code:
  • ```c
  • const char a[][4] = {"aa", "aaa"};
  • const char *b[] = {"bb", "bbb"};
  • const char *const c[] = {"cc", "ccc"};
  • ```
  • For shared libraries, both `b` and `c` arrays require the array of pointers to be generated at runtime, which implies performance costs.
  • See <https://www.akkadia.org/drepper/dsohowto.pdf> 2.4.3
  • But for a standalone program, does the same issue exist, or is the array generated at link time (ld(1))?
  • If the array can be read-only for `c` in a standalone program, it could be even better than `a`, since it doesn't consume an unnecessary byte for `"cc"`.
  • Although... since the array of pointers approach requires an array of pointers that is separate from the strings, it might use even more than an extra byte:
  • ```txt
  • Array of arrays:
  • aa\0\0aaa\0 // total 8 bytes
  • Array of pointers:
  • cc\0ccc\0ppppppppqqqqqqqq // total 23 bytes
  • p and q being pointers (64-bit) to the strings
  • ```
  • so strings should be much more different (> 8 bytes in average compared to the longest string) in size to compensate for the extra array. Unless I'm missing something.
  • Edit: This issue was raised in a patch submitted to NGINX Unit. It can be helpful to see it, which contains some real numbers coming from real code: <https://github.com/nginx/unit/pull/721>
#3: Post edited by alx · 2022-06-11T20:51:08Z (almost 2 years ago)
  • Let's consider the following code:
  • ```c
  • const char a[][4] = {"aa", "aaa"};
  • const char *b[] = {"bb", "bbb"};
  • const char *const c[] = {"cc", "ccc"};
  • ```
  • For shared libraries, both `b` and `c` arrays require the array of pointers to be generated at runtime, which implies performance costs.
  • See <https://www.akkadia.org/drepper/dsohowto.pdf> 2.4.3
  • But for a standalone program, does the same issue exist, or is the array generated at link time (ld(1))?
  • If the array can be read-only for `c` in a standalone program, it could be even better than `a`, since it doesn't consume an unnecessary byte for `"cc"`.
  • Although... since the array of pointers approach requires an array of pointers that is separate from the strings, it might use even more than an extra byte:
  • ```txt
  • Array of arrays:
  • aa\0\0aaa\0 // total 8 bytes
  • Array of pointers:
  • cc\0ccc\0ppppppppqqqqqqqq // total 23 bytes
  • p and q being pointers (64-bit) to the strings
  • ```
  • so strings should be much more different (> 8 bytes in average compared to the longest string) in size to compensate for the extra array. Unless I'm missing something.
#2: Post edited by alx · 2022-06-11T20:50:51Z (almost 2 years ago)
array of pointers really uses a lot more space
  • Let's consider the following code:
  • ```c
  • const char a[][4] = {"aa", "aaa"};
  • const char *b[] = {"bb", "bbb"};
  • const char *const c[] = {"cc", "ccc"};
  • ```
  • For shared libraries, both `b` and `c` arrays require the array of pointers to be generated at runtime, which implies performance costs.
  • See <https://www.akkadia.org/drepper/dsohowto.pdf> 2.4.3
  • But for a standalone program, does the same issue exist, or is the array generated at link time (ld(1))?
  • If the array can be read-only for `c` in a standalone program, it could be even better than `a`, since it doesn't consume an unnecessary byte for `"cc"`.
  • Although... since the array of pointers approach requires an array of pointers that is separate from the strings, it might use even more than an extra byte:
  • ```txt
  • Array of arrays:
  • aa\0\0aaa\0 // total 8 bytes
  • Array of pointers
  • cc\0ccc\0ppppppppqqqqqqqq // total 23 bytes
  • p and q being pointers (64-bit) to the strings
  • ```
  • so strings should be much more different (> 8 bytes in average compared to the longest string) in size to compensate for the extra array. Unless I'm missing something.
#1: Initial revision by alx · 2022-06-11T09:26:12Z (almost 2 years ago)
array of arrays vs array of pointers to store array of string literals
Let's consider the following code:

```c
const char a[][4] = {"aa", "aaa"};
const char *b[] = {"bb", "bbb"};
const char *const c[] = {"cc", "ccc"};
```

For shared libraries, both `b` and `c` arrays require the array of pointers to be generated at runtime, which implies performance costs.

See <https://www.akkadia.org/drepper/dsohowto.pdf> 2.4.3

But for a standalone program, does the same issue exist, or is the array generated at link time (ld(1))?

If the array can be read-only for `c` in a standalone program, it could be even better than `a`, since it doesn't consume an unnecessary byte for `"cc"`.