C naming convention, module trigrams?
For my company, I'm writing naming conventions for embedded code in the C language.
- Function names must be in lowerCamelCase() and start with a verb.
- Global variables are in Capitalized_snake_case (upper-case first letter, then lower case with underscores) and are noun phrases.
Example:
void initCanDriver();
uint8_t *Can_driver_buffer;
I have received proposals to start our names with a module trigram. Example:
void CDR_init();
uint8_t *CDR_buffer;
Pros of the trigram method:
- clearly identifies modules (and forces us to have modules)
- avoids duplicate names
Cons of the trigram method:
- currently, most of our code doesn't really have modules, so how would we name a new function in it? We would have to define all the modules.
- often less readable (you have to know that CDR means CAN Driver)
Do you think it is a good idea?
3 answers
Starting with the goals: A naming convention typically aims at supporting several, partly contradictory goals, among which are the following:
- Compliance with the limitations of the language standard and possibly other standards that apply. There are many symbols that are reserved by the ISO-C standard.
- Avoid name clashes between code pieces. This can be code from different supplier companies, but even code from different teams. The risk of name clashes increases with scope, but also with the degree of re-use of the respective code in different contexts.
- Clarity ("Use variable names that mean something" - Kernighan and Plauger, 1974)
- Brevity, to reduce typing and (more importantly) reading effort.
- Indicate architectural / module structure.
The second of these points in particular is a good reason to introduce prefixes for globally visible symbols (macros in header files, functions and variables with external linkage, ...), but module prefixes also support the fifth point. You are calling them "module trigrams", but I would recommend thinking of them in a more general way:
- Think of them as namespaces that isolate/protect different sets of symbols from each other.
- Do not limit yourself to three letters (it could be more, but also fewer).
- Consider even using them hierarchically.
The last point can be relevant if you need to combine code from different companies or plan your code to be re-used across many projects: I have worked at companies that, for this reason, put a company-prefix before the module prefix, even for smaller embedded systems.
Now, to the contradicting goal of having short names: Prefixes increase symbol length, but they reduce the risk of name clashes. The risk of name clashes, however, is much smaller and more manageable in smaller scopes. For example, local variables don't have to (and should not) follow the same rules; using the prefix approach there is overkill. (Don't forget, it is all about risk, and there is no absolute protection: someone somewhere might still have the great idea to define a macro called "index" in a public header file...)
For symbols with internal linkage (such as static helper functions) there is no simple rule. Depending on how tools such as debuggers work with the symbol names, it may still be beneficial to use the prefixes there as well, and certainly so if you cannot rule out that foreign header files are included into your .c code.
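The scoping rules above could be sketched roughly as follows. This is only an illustration: the company prefix "acme", the buffer size and both function names are made up, and the choice to keep the prefix on the static helper is one of the two options the text leaves open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* "acme" is a hypothetical company prefix, "can" the module prefix. */
#define ACME_CAN_BUFFER_SIZE 8u

/* External linkage: carries the full hierarchical prefix. */
uint8_t acme_can_buffer[ACME_CAN_BUFFER_SIZE];

/* Internal linkage: the prefix is kept here so the symbol stays recognizable
   in a debugger or map file; dropping it would also be legal. */
static uint8_t acme_can_checksum(const uint8_t *data, size_t length)
{
    uint8_t sum = 0u; /* local variables: small scope, no prefix */
    for (size_t i = 0u; i < length; i++) {
        sum = (uint8_t)(sum + data[i]);
    }
    return sum;
}

/* Copies a payload into the module buffer and returns its additive
   checksum (sum of all bytes modulo 256). */
uint8_t acme_can_store(const uint8_t *data, size_t length)
{
    assert(data != NULL && length <= ACME_CAN_BUFFER_SIZE);
    for (size_t i = 0u; i < length; i++) {
        acme_can_buffer[i] = data[i];
    }
    return acme_can_checksum(acme_can_buffer, length);
}
```

A second module from the same hypothetical company, say acme_relay_, could define its own store and checksum routines without any risk of clashing with these.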
Summary: The use of prefixes is a proven practice to avoid name clashes and show component structure. Understand your goals to see if and how to apply this practice best in your context.
For what it's worth, I have some 20 years of experience designing embedded C systems, with both large and small code bases. Code design is one of the hardest things to do, since books about object orientation (OO) etc. only get you so far; you have to learn what works from experience.
Generally speaking, style and design matters become increasingly important when your code base grows. Code re-use and portability between projects is of course always nice too.
Camel vs snake style
Check out this interesting research paper regarding "CamelCase" vs "snake_case", which was carried out with eye tracking. Unsurprisingly, they found that people used to reading one style or the other found their preferred style easier to read. They also found that, when such bias was removed, camel case proved harder to read.
I have used all manner of styles over the years. I used to prefer camel at a time when I was doing a lot of PC programming, because it's more common there. Whereas snake is more common in embedded systems (and also in the *nix world). Generally I would say that when it comes to pure C (not C++), snake case code bases are more common.
Nowadays I solely use snake case and would recommend that if you have the option. Partially because of the above study, partially because it is more common in the code bases and protocol stacks etc you are likely to encounter when writing embedded C.
As for starting with lower/upper case using either style, that's still quite subjective and it's hard to argue for/against one or the other.
The most important thing of all, though, is that you have a coding standard and that everyone follows it. It might be wise to separate coding style (identifiers, indentation, brace placement) from coding guidelines (use this feature, don't use that feature). For example, MISRA C mostly covers only the latter.
Identifier naming
First of all, please note that the C standard reserves a whole lot of identifiers. I won't go into this in detail since it's a big topic, but at least steer clear of all identifiers starting with an underscore, since they may collide with compiler libs.
Regarding constants and/or pre-processor macros, there is a wide consensus to use ALL_CAPS style for those. That's so common that I'd say it is an industry de facto standard, both in the PC world and the embedded world.
Regarding types, there's one very old but perhaps not necessarily common convention that types should be written starting with an upper case letter. I prefer to declare types with _t at the end instead, which is fine for embedded but not when coding for POSIX, since POSIX reserves that suffix.
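A minimal sketch of these conventions side by side, assuming a made-up "sensor" module (none of these names come from a real code base):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pre-processor constant: ALL_CAPS. */
#define SENSOR_CHANNEL_COUNT 4u

/* Type name with a _t suffix. POSIX reserves this suffix, so the choice
   suits bare-metal code rather than code written against POSIX. */
typedef struct {
    uint16_t raw[SENSOR_CHANNEL_COUNT];
} sensor_sample_t;

/* Functions and variables: lower case snake_case.
   Returns the largest raw value found in the sample. */
uint16_t sensor_sample_max(const sensor_sample_t *sample)
{
    assert(sample != NULL);
    uint16_t max_value = 0u;
    for (size_t i = 0u; i < SENSOR_CHANNEL_COUNT; i++) {
        if (sample->raw[i] > max_value) {
            max_value = sample->raw[i];
        }
    }
    return max_value;
}
```

With this split, a reader can tell from the spelling alone whether an identifier is a macro, a type or a function.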
Regarding identifiers for variables/functions: assuming you went with snake case, these should be lower case most of the time. However, in my experience, one thing that really helps make code manageable is to use source module prefixes.
That is, if you are writing a CAN driver, you have for example some can.h and can.c. You would then prefix all your functions with CAN_ or can_ and prefix all constants with CAN_, documenting that these belong to the can.h driver. Names need to start with the prefix, so you'd write can_init and not init_can. It has to be a prefix so that the reader can quickly see which module a function belongs to.
In your code templates for headers, you could have something at the top like "Prefix: CAN". For the coding standard I've developed at my current company, we use something like this at the top of each header:
/*
  <legal/copyright stuff here>
  File          can.h
  Replaces      old_can.h
  Status        Active
  Conformance   MISRA C:2012 // filled in after passing code review
  Author        John Doe
  Created       2022-09-26
  Prefix        CAN
  Description   This is a HAL for CAN drivers-...
*/
Further down you'll have the documentation for each function together with the function declaration in prototype format. In this case maybe something like "can_err_t can_init (void);", assuming you have created a common result/error type (some enum) for all functions in the driver (another good idea).
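As a rough sketch of how the prefix and the common error type might look together: only can_err_t, can_init and the name can_send are taken from the text. The enumerators, CAN_MAX_DATA_LENGTH, the parameters of can_send and the buffer standing in for real hardware access are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Constants carry the CAN_ prefix. Classic CAN frames hold at most 8 bytes. */
#define CAN_MAX_DATA_LENGTH 8u

/* Common result type returned by every function of the module
   (hypothetical enumerators). */
typedef enum {
    CAN_OK,
    CAN_ERR_NOT_INITIALIZED,
    CAN_ERR_INVALID_LENGTH
} can_err_t;

/* Module-private state; a real driver would talk to hardware instead. */
static bool can_initialized = false;
static uint8_t can_tx_data[CAN_MAX_DATA_LENGTH];
static uint8_t can_tx_length = 0u;

can_err_t can_init (void)
{
    can_tx_length = 0u;
    can_initialized = true;
    return CAN_OK;
}

/* Stand-in for a transmit routine: validates the request and copies the
   payload into the module buffer. */
can_err_t can_send (const uint8_t *data, uint8_t length)
{
    assert(data != NULL);
    if (!can_initialized) {
        return CAN_ERR_NOT_INITIALIZED;
    }
    if (length > CAN_MAX_DATA_LENGTH) {
        return CAN_ERR_INVALID_LENGTH;
    }
    for (uint8_t i = 0u; i < length; i++) {
        can_tx_data[i] = data[i];
    }
    can_tx_length = length;
    return CAN_OK;
}
```

Because every function returns the same can_err_t, callers can check results in a uniform way, and the CAN_ prefix on the enumerators shows which module reported the error.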
One thing I would recommend avoiding is mixing vague terms like "handler", "driver" etc. into the identifier name. It's rather superfluous and it's very hard to read code that calls "can_handler(); //handles CAN"... this literally tells the reader nothing other than "do not worry your pretty head". Better names are specific: can_read_rxfifo, can_send or whatever the function is actually doing.
I have received proposals to start our names with a module trigram.
In general, for the love of obfuscation, please never invent new TLAs. Huh, what's a TLA? Yeah, exactly that... three letter abbreviations. In your case CAN is a standardized and well-known one (Controller Area Network), so the reader is expected to understand what it means. That's fine, it's an established term. They do not, however, have a clue what CDR is supposed to mean... recordable compact disc? You should be using CAN in this case.
However, letters are cheap, they are in fact free. So not everything has to be an evil 3-letter abbreviation. Naming your relay driver with the prefix relay_ rather than rel_ is in fact far more readable, so go with that.
currently, most of our code doesn't really have modules, so how would we name a new function in it?
You should start using more discipline when designing, then. Sure, there will be the application-level top tier and it may not need to have prefixes or be organized in modules. But everything else, particularly drivers, protocol handlers and specific algorithms, should have its own module. This is correct design according to object orientation and other popular ways of designing programs. It reduces tight coupling (important!), it allows code re-use and it reduces namespace collisions.
If I am to do a new project with CAN, I should just be able to grab your CAN driver, import it into the new project and be ready to go. Now if the CAN driver instead starts whining about some missing "relay module", then it was badly designed with tight coupling - it knows things about other modules that should be no business for a CAN driver, and consequently it also depends on other unrelated modules. That's a very bad state of affairs. The main problem is that tight coupling causes bugs to escalate through your whole program. Write a bug in the relay driver and suddenly the CAN driver stops working too.
Also consider using hardware abstraction layers (HAL). CAN is a perfect example of when this is useful. I have one HAL for CAN drivers (prefixed CAN) which I re-use across a whole lot of different microcontrollers. The HAL is the user interface but also dictates which functions each driver needs to implement. The user calls the HAL, ensures that the correct drivers are linked to the project, and then everything just works. You can then port the code to another MCU with a minimum of effort, assuming that the CAN driver for that system is already written and tested. When using HAL, the prefix belongs to the HAL and not to the underlying drivers, which probably have some more cryptic name taken from the specific CAN Controller (FlexCAN, bxCAN or whatever it might be called).
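One possible shape for such a HAL is sketched below. Note the assumptions: the answer describes selecting the driver at link time, whereas this sketch uses a table of function pointers so that it fits into a single file. The names can_driver_t, can_register_driver, CAN_ERR_NO_DRIVER and the stub bxcan_ driver are all hypothetical, and the stub counts frames instead of touching real registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { CAN_OK, CAN_ERR_NO_DRIVER } can_err_t;

/* The HAL dictates which functions each MCU-specific driver implements. */
typedef struct {
    can_err_t (*init)(void);
    can_err_t (*send)(uint32_t id, const uint8_t *data, uint8_t length);
} can_driver_t;

static const can_driver_t *can_driver = NULL;

/* Selects the underlying driver (stands in for link-time selection). */
void can_register_driver(const can_driver_t *driver)
{
    assert(driver != NULL);
    can_driver = driver;
}

/* HAL functions: the only CAN API the application calls. */
can_err_t can_init(void)
{
    return (can_driver != NULL) ? can_driver->init() : CAN_ERR_NO_DRIVER;
}

can_err_t can_send(uint32_t id, const uint8_t *data, uint8_t length)
{
    return (can_driver != NULL) ? can_driver->send(id, data, length)
                                : CAN_ERR_NO_DRIVER;
}

/* Stub driver named after the controller; it only counts sent frames. */
static unsigned bxcan_sent_frames = 0u;

static can_err_t bxcan_init(void)
{
    bxcan_sent_frames = 0u;
    return CAN_OK;
}

static can_err_t bxcan_send(uint32_t id, const uint8_t *data, uint8_t length)
{
    (void)id; (void)data; (void)length; /* unused in this stub */
    bxcan_sent_frames++;
    return CAN_OK;
}

const can_driver_t bxcan_driver = { bxcan_init, bxcan_send };
```

The application only ever sees the can_ prefix. Porting to another MCU would mean registering (or linking) a different driver table while the calling code stays unchanged.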
Personally, I don't like the first form (initCanDriver) at all.
The routine name is supposed to present some information as to where/how the routine fits into the larger software world. Information is best presented in global to local context order. This is because the local information often makes no sense outside of its context hierarchy.
To me, "CanDriverInit" or "CanDriver_init" is much better. It also has the advantage that all the CAN driver routines will show up in a sorted list next to each other. That can be useful.
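The sorting argument can be tried out with a few lines of C. This is just an illustrative sketch: sort_names is a made-up helper, and the Relay routine names in the usage note are invented for contrast.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C strings. */
static int compare_names(const void *lhs, const void *rhs)
{
    const char *const *left = lhs;
    const char *const *right = rhs;
    return strcmp(*left, *right);
}

/* Sorts routine names in place, the way a symbol list or an editor's
   outline view might present them. */
void sort_names(const char **names, size_t count)
{
    assert(names != NULL);
    qsort(names, count, sizeof names[0], compare_names);
}
```

Sorting CanDriver_init, CanDriver_send, Relay_init and Relay_set keeps the two CAN routines adjacent. Sorting the verb-first names initCanDriver, sendCanDriver, initRelay and setRelay gives initCanDriver, initRelay, sendCanDriver, setRelay, with the CAN routines interleaved with the relay ones.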
I think the concept behind "CDR_init" is OK, but the implementation is lacking. Only three letters to indicate the package or library is too cryptic. In this case, someone seeing just "CDR" won't have much of a guess what the library is about, and there is a significant possibility of a name collision in the future. "CANDR" would already be much better in my opinion.
I do appreciate the effort to minimize gratuitous typing somewhat. Long routine names make the mind focus on getting the name right instead of on what the code is supposed to do at that point, and they reduce the space for end-of-line comments.
In the end, it's a judgement call based on how you value the various tradeoffs. There is no universal right answer. Personally, I like the "CanDriverXxx" or "CanDriver_xxx" naming schemes, but wouldn't be too opposed to "CANDR_xxx" or "CanDr_xxx". I'd really not like "CDR_xxx".