C Language Standard Linking Specifications
What (if anything) does the C standard have to say about linking objects? My guess is that, because C only defines language->behavior rules, it completely ignores any potential intermediate form the code may take. Obviously, C doesn't guarantee anything about the executable itself because it may well be "compiled" down to something that isn't an executable for a computer platform or even something similar. That said, does C make any guarantees about interoperability between compiled output targets on a given platform? For example, if I add this to a source file:
extern int foo(void);
the resulting C program will expect to be able to jump to such a symbol but doesn't require any knowledge of the definition to successfully compile and output valid code. Does C require that the above eventually resolve to an executable entity (i.e. during linking), or is C itself completely indifferent to any linking mechanics that take place after compilation? If the former, what guarantees does C make?
EDIT: Following some commentary, I think the question can be further clarified.
In a freestanding C program, there is no I/O capability, simply because the language imposes no requirements whatsoever on the capabilities of a compliant C implementation without a hosted environment like an operating system. In practice, to perform a processor I/O call on an x86 machine in a kernel program, a call must be made out of the C code to assembly (or something similar). An assembly module is, of course, by no means required to comply with the C standard in any way. Can such a module be considered a separate translation unit? According to the standard, I'd assume it cannot. The way the C module and the assembly module are linked together is completely dependent on the environment and implementation.
Having said all this, consider the above function foo once more. If this function is defined in assembly, it is not part of any C translation unit, and therefore can only be linked by relying on implementation details. My question is this: is this environment-specific linking behavior implementation-defined? If so, there should be reference to it within the C standard somewhere, and I'd like to know exactly what can be said about linking C translation units with other non-C libraries. However, intuitively I'd assume that, since the definition of foo is not contained within any C translation unit, the C program should not be valid. This would mean that every kernel ever written that has any I/O of any kind is not standards-compliant C. That doesn't necessarily matter in practice because a non-compliant program that does its job is just as useful as a compliant one. My question is purely theoretical: is my analysis correct? Is a compliant freestanding C program unable to produce observable results? Or am I missing something in the standard that relates to linking against something that is not a C translation unit?
1 answer
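The standard does not talk about object files or the mechanics of linking (its translation phases only say that external references are resolved and that the translator output is collected into a program image), but it does talk about translation units. In typical compilers, a single translation unit translates into a single object file.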
Obviously the C standard does not talk about translation units that are not themselves C; that's what a platform's ABI is about.
Now what the C standard tells you is this:
- If you have the declaration extern int foo(void); in some translation unit, then it makes the name available to the current translation unit. The declaration itself does not imply that this function exists anywhere.
- If you do call that function from anywhere in your translation unit, then some translation unit must define it, and it must define it with exactly that signature. From the view of the C standard, that other translation unit is, of course, another C file compiled with the same C implementation.
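The two points above can be illustrated with a minimal sketch. For brevity it shows both "translation units" in a single file; the file names in the comments and the names bar and call_foo are hypothetical and only serve the illustration.

```c
/* Sketch only: two translation units shown in one file for brevity.
   In practice the definition of foo would live in a separate .c file. */

/* --- caller.c (hypothetical file name) --- */
extern int foo(void);   /* declaration: introduces the name and its type */
extern int bar(void);   /* declared but never used: no definition is required */

int call_foo(void)
{
    /* Using foo here is what obliges the program as a whole
       to contain a definition of foo somewhere. */
    return foo() + 1;
}

/* --- foo.c (hypothetical file name) --- */
int foo(void)           /* the one external definition, matching the declared type */
{
    return 41;
}
```

If the definition of foo were removed, each file would still compile on its own; with typical toolchains the failure would only show up when the program is linked.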
Your compiler's ABI (which is not covered by the C standard) then tells you what this means in terms of symbols and code. For example, it tells you that there has to be a symbol named foo (in some ABIs, it's _foo instead), which must lead to executable code that leaves some integer value in a specific register (and usually has some further requirements, e.g. on what it does with the stack).
Obviously, an object file generated with another compiler or from hand-written assembly will work if and only if the correct symbol exists and points to code fulfilling all the conditions required by the ABI.
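As a rough illustration of what "fulfilling the conditions required by the ABI" can look like, here is a sketch that assumes the x86-64 System V ABI and GNU assembler syntax; neither is prescribed by the C standard, and other platforms differ. The assembly appears only in a comment, and a stand-in C definition is included so that the sketch links on its own. The name use_foo is hypothetical.

```c
/* Assumption: x86-64 System V ABI, GNU assembler (AT&T) syntax.
   A hand-written foo.s satisfying the declaration below could be:

       .globl foo          # export the symbol the C caller refers to
   foo:
       movl $42, %eax      # an int return value goes in EAX under this ABI
       ret

   On ABIs that prefix C names with an underscore, the label would be _foo. */

extern int foo(void);

int use_foo(void)
{
    return foo();       /* the calling code is the same whoever defines foo */
}

/* Stand-in C definition so this sketch links by itself; it would be
   dropped when linking against the assembly version instead. */
int foo(void)
{
    return 42;
}
```

From the C side nothing changes between the two variants: the caller only relies on the declared type, and the ABI decides whether the symbol provided by the other object file is usable.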
These days, the C ABI is usually fixed by the platform. That is, all C compilers on the same platform are interoperable. Note however that this is not mandated by the C standard, which always assumes a single C implementation. Note also that a compiler may support different ABIs (this was especially the case with early compilers on 8086, which supported different so-called memory models).
Note that different languages usually have different calling conventions, but languages often allow you to explicitly use the platform's C calling conventions. Sometimes this is even part of that language's specification (such as extern "C" in C++). But obviously the C language standard doesn't tell you anything about other languages.