Are there technical reasons to pick one struct coding style over the other?
C offers two different styles when it comes to structs (and union/enum too). Either declare them using a struct tag only ("struct tag style"):
struct my_type
{ ... };
struct my_type x;
Or by using a typedef, where the tag is optional ("typedef style"):
typedef struct optional_tag
{ ... } my_type;
my_type x;
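A minimal sketch of the two styles side by side, in case it helps to see them compile. The type names, members and helper functions are made up for illustration and are not taken from any real code base.

```c
#include <assert.h>

/* "struct tag style": the struct keyword is part of every use of the type. */
struct tagged_point
{
  int x;
  int y;
};

/* "typedef style": point_tag is the optional tag, point is the typedef name. */
typedef struct point_tag
{
  int x;
  int y;
} point;

/* Hypothetical helpers, one per style. */
int tagged_sum (struct tagged_point p) { return p.x + p.y; }
int typedef_sum (point p)              { return p.x + p.y; }

void demo_styles (void)
{
  struct tagged_point a = { 1, 2 };
  point b = { 3, 4 };
  struct point_tag c = b; /* since the tag was kept, both spellings name one type */

  assert(tagged_sum(a) == 3);
  assert(typedef_sum(b) == 7);
  assert(typedef_sum(c) == 7);
}
```

Calling demo_styles() from main exercises both spellings. If the tag is left out of the typedef, only the point spelling remains available.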
My personal observation is that the latter style is far more common, though the former style is dominant in Linux programming (but perhaps not necessarily in older *nix programming?).
Are there any real technical arguments in favour of one style or the other?
Some mostly subjective arguments that I've heard:
-
The so-called "Linux kernel coding style" rants subjectively against using typedef in general, using "arguments" like "Lots of people think that typedefs help readability. Not so."
There may be a grain of truth in what they are trying to say here - we shouldn't use typedef to come up with some home-made, "local garage standard" type system such as typedef uint8_t my_little_uint8; or typedef bool boule; - crap like that is far too common.
But in general the document fails completely to make a case in favor of the struct tag style, which is the coding standard used in Linux. Basically the reasons for sticking to the style are 100% subjective - because some random open source dude(s) say so. Which isn't necessarily wrong: if all arguments for/against are subjective, then we just have to pick something and stick with it.
They do manage to make one good argument against their own style: "totally opaque objects (where the typedef is actively used to hide what the object is)". This is a valid argument against the style used in Linux.
-
"Save typing" is a misguided argument often used when defending all manner of more or less bad coding practices, such as coming up with strange typedef types. This is a poor argument, because 1) programming is all about typing - those who have a problem with it should consider a different trade, 2) copy/paste has been available since the dawn of time, as has the pro tip ctrl+shift+left/right arrow key - who cares how long the words are, and 3) various forms of code completion are nowadays supported by all programming IDEs to some extent.
-
"Everyone hates Hungarian notation". That is, the type system which Windows has used internally since the first version, where you would name types so that it would be clear what the underlying type is. LPVOID would mean "long pointer to void", essentially just a void*. The Linux side of things is happy to jump on the bandwagon and rant against this style too, and yet they preach "struct tag style" for the very same reasons - to explicitly make it clear what lies underneath a certain type.
Today you can just hit a shortcut in your IDE and land in the struct/typedef, so the need for creative naming like this was perhaps more of a thing in the past when coding in raw text editors with no C awareness.
None of these arguments are all that convincing for one style over the other.
1 answer
Arguments in favour of "struct tag style":
-
Less namespace clutter.
C has a peculiar name space system where all identifiers belong to one of: labels, tags, members, ordinary. Struct tag style only occupies a name in the tag name space.
Meaning that struct tags will not collide with anything in the ordinary name space, and so using that style means fewer naming collisions overall if the structs are exposed to the whole program through a header file.
(They may of course still collide with other structs, unions or enums however.)
Similarly, self-referencing structs that use "typedef style" will occupy twice the number of identifiers:
typedef struct node { ... struct node* next; } node;
This occupies the identifier node in both the tag and ordinary name spaces.
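To make the name space argument above concrete, here is a small sketch where a struct tag and a function share the name node without colliding. The function and the list helper are hypothetical, chosen only to show that the tag and ordinary name spaces are separate.

```c
#include <assert.h>
#include <stddef.h>

/* Tag name space: "node" is only meaningful after the struct keyword. */
struct node
{
  int value;
  struct node* next;
};

/* Ordinary name space: a function that is also called "node".
   It does not collide with the tag above. */
struct node node (int value, struct node* next)
{
  struct node n = { value, next };
  return n;
}

/* With "typedef style", as in
     typedef struct node { ... } node;
   the name would be taken in the ordinary name space too, and the
   function above could no longer be called node. */

int list_sum (const struct node* n)
{
  int sum = 0;
  for (; n != NULL; n = n->next)
    sum += n->value;
  return sum;
}

void demo_name_spaces (void)
{
  struct node tail = node(2, NULL);
  struct node head = node(1, &tail);
  assert(list_sum(&head) == 3);
}
```

Whether sharing a name like this is good style is a separate question; the sketch only shows that the language permits it with struct tag style.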
Arguments in favour of "typedef style":
-
Type swapping/punning/serializing struct types with the help of unions.
Suppose you have some struct
struct s { uint16_t x; uint16_t y; };
and later on realize that it would be handy to access this struct in more ways than one. With the nice little feature of C11 anonymous structs, we can rewrite this declaration into a union while remaining backwards-compatible:
union s
{
  struct
  {
    uint16_t x;
    uint16_t y;
  };
  struct
  {
    uint8_t x_lo; // assuming little endian & no padding
    uint8_t x_hi;
    uint8_t y_lo;
    uint8_t y_hi;
  };
  uint8_t raw_data [4];
};
You'll see unions like this all the time for embedded system MCU register maps, data protocol definitions and similar.
Thanks to C11 anonymous structs we can rewrite the struct like this and not break any existing code already using it as obj.x.
However, if we used "struct tag style", we would have to rewrite all the code using what was once a struct and instead type union all over the place.
Similarly, in case of the mentioned register maps, we may have all manner of hardware registers where some are declared as structs and some as unions. It becomes an unholy mess for the user of the register map if they are to type out either struct or union depending on register. I've never seen an MCU vendor-provided register map using "struct tag style", likely for this very reason.
-
Type-generic and/or macro trick programming.
Type-generic programming in old school C typically uses void pointers in combination with some enum to keep track of what type the data is. There may also be various macro tricks where you want to pass along a type as a pre-processor token, for whatever reason.
In both of these cases, it is highly inconvenient to have types consisting of multiple words with white space in between them.
Example - suppose we have a "X macro" listing all supported types by some program:
#define SUPPORTED_TYPES(X) \
  X(int)                   \
  X(double)                \
We might now use this for example to generate an enum, to get a unique integer value corresponding to each type supported:
typedef enum
{
  #define ENUM_NAME(type) SOMETHING_##type,
  SUPPORTED_TYPES(ENUM_NAME)
} type_t;
So far so good. Then I want to add a struct to the list:
struct my_struct
{
  int x;
};

#define SUPPORTED_TYPES(X) \
  X(int)                   \
  X(double)                \
  X(struct my_struct)      \
And bam: syntax error upon token concatenation SOMETHING_##type, because we suddenly try to create an enumeration constant called SOMETHING_struct my_struct.
Whereas typedef struct { int x; } my_struct; works seamlessly.
Similarly, only using one single pre-processor token for the type might help with macro "stringification" using the # operator.
-
Opaque types.
Yes, that subjective Linux kernel document did manage to find a single non-subjective argument and ironically it was one against using "struct tag style".
When declaring opaque types - see How to do private encapsulation in C? - the preferred forward declaration in the header is typedef struct my_struct my_struct;. Mainly because the user of the opaque type need not and should not worry about what type they are actually dealing with.
But at the same time, before C11 we cannot use typedef twice for the same name, once for the forward declaration and once for the struct definition. So the actual struct definition in the .c file would typically still have to be struct my_struct {...}; even if the API of our opaque type always uses my_struct* without the struct keyword everywhere else. It kind of ends up a combination of the two styles.
-
C++ compatibility.
In C++, structs are just a poor man's class and tags do not work quite the same - what you type after struct is the name of the type, period. You never type out struct my_type x; in C++. Or well, you can, but it looks quite alien and out of place among the rest of the language conventions.
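Returning to the X macro and union arguments in the list above, the following is one possible end-to-end sketch using "typedef style". The my_union type, the SOMETHING_COUNT constant and the two lookup tables are assumptions added for illustration; only SUPPORTED_TYPES, ENUM_NAME, type_t and my_struct come from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Typedef style: each type name is one pre-processor token, and the
   user need not know which of them is a struct and which is a union. */
typedef struct { int x; } my_struct;

typedef union
{
  struct { uint16_t x; uint16_t y; }; /* C11 anonymous struct */
  uint8_t raw_data [4];
} my_union; /* hypothetical, added for this sketch */

#define SUPPORTED_TYPES(X) \
  X(int)                   \
  X(double)                \
  X(my_struct)             \
  X(my_union)

/* One enumeration constant per supported type. */
typedef enum
{
  #define ENUM_NAME(type) SOMETHING_##type,
  SUPPORTED_TYPES(ENUM_NAME)
  SOMETHING_COUNT /* hypothetical: number of supported types */
} type_t;

/* One string per supported type, made with the # operator. */
static const char* const type_names [] =
{
  #define TYPE_NAME(type) #type,
  SUPPORTED_TYPES(TYPE_NAME)
};

/* One size per supported type. */
static const size_t type_sizes [] =
{
  #define TYPE_SIZE(type) sizeof(type),
  SUPPORTED_TYPES(TYPE_SIZE)
};

void demo_x_macro (void)
{
  assert(SOMETHING_int == 0);
  assert(SOMETHING_my_union == 3);
  assert(strcmp(type_names[SOMETHING_my_struct], "my_struct") == 0);
  assert(type_sizes[SOMETHING_my_union] == sizeof(my_union));
}
```

Replacing X(my_struct) with X(struct my_struct) breaks the enum generation, as the token concatenation bullet above describes. With the typedef names, every table stays in sync from one list.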
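The opaque type argument from the answer can be sketched in a single file as below. The counter type and its functions are hypothetical, and the comments mark which part would normally live in a header and which in the .c file.

```c
#include <assert.h>
#include <stdlib.h>

/* --- would go in the header (hypothetical counter.h) --- */
typedef struct counter counter; /* incomplete type: the layout stays hidden */

counter* counter_create    (int start);
void     counter_increment (counter* c);
int      counter_value     (const counter* c);
void     counter_destroy   (counter* c);

/* --- would go in the .c file (hypothetical counter.c) --- */
struct counter /* the definition uses the tag, not a second typedef */
{
  int value;
};

counter* counter_create (int start)
{
  counter* c = malloc(sizeof *c);
  if (c != NULL)
    c->value = start;
  return c;
}

void counter_increment (counter* c)    { c->value++; }
int  counter_value (const counter* c)  { return c->value; }
void counter_destroy (counter* c)      { free(c); }

/* --- caller code: only ever sees counter*, never the struct keyword --- */
void demo_opaque (void)
{
  counter* c = counter_create(41);
  assert(c != NULL);
  counter_increment(c);
  assert(counter_value(c) == 42);
  counter_destroy(c);
}
```

In a real project the caller would only include the header, so any attempt to access c->value directly would fail to compile against the incomplete type.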