

Comments on C Language Standard Linking Specifications

Parent

C Language Standard Linking Specifications

+5
−0

What (if anything) does the C standard have to say about linking objects? My guess is that, because C only defines language->behavior rules, it completely ignores any potential intermediate form the code may take. Obviously, C doesn't guarantee anything about the executable itself because it may well be "compiled" down to something that isn't an executable for a computer platform or even something similar. That said, does C make any guarantees about interoperability between compiled output targets on a given platform? For example, if I add this to a source file:

extern int foo(void);

the resulting C program will expect to be able to jump to such a symbol but doesn't require any knowledge of the definition to successfully compile and output valid code. Does C require that the above eventually resolve to an executable entity (i.e. during linking), or is C itself completely indifferent to any linking mechanics that take place after compilation? If the former, what guarantees does C make?

EDIT: Following some commentary, I think the question can be further clarified.

In a freestanding C program, there is no I/O capability, simply because the language imposes no requirements whatsoever on the capabilities of a compliant C implementation without a hosted environment like an operating system. In practice, to perform a processor I/O call on an x86 machine in a kernel program, a call must be made out of the C code to assembly (or something similar). An assembly module is, of course, by no means required to comply in any way with the C standard. Can such a module be considered a separate translation unit? According to the standard, I'd assume it cannot. The way the C module and the assembly module are linked together is completely dependent on the environment and implementation.

Having said all this, consider the above function foo once more. If this function is defined in assembly, it is not part of any C translation unit, and therefore can only be linked by relying on implementation details. My question is this: is this environment-specific linking behavior implementation-defined? If so, there should be reference to it within the C standard somewhere, and I'd like to know exactly what can be said about linking C translation units with other non-C libraries. However, intuitively I'd assume that, since the definition of foo is not contained within any C translation unit, the C program should not be valid. This would mean that every kernel ever written that has any I/O of any kind is not standards-compliant C. That doesn't necessarily matter in practice because a non-compliant program that does its job is just as useful as a compliant one. My question is purely theoretical: is my analysis correct? Is a compliant freestanding C program unable to produce observable results? Or am I missing something in the standard that relates to linking against something that is not a C translation unit?


1 comment thread

"My question is this: is this environment-specific linking behavior implementation-defined?" (6 comments)
Post
+3
−0

The standard does not talk about object files and/or linking, but it does talk about translation units. In typical compilers, a single translation unit translates into a single object file.

Obviously the C standard does not talk about translation units that are not themselves C; that's what a platform's ABI is about.

Now what the C standard tells you is this:

  • If you have the declaration

    extern int foo(void);
    

    in some translation unit, then it makes the name available to the current translation unit. The declaration itself does not imply that this function exists anywhere.

  • If you do call that function from anywhere in your translation unit, then some translation unit must define it, and it must define it with exactly that signature. From the view of the C standard, that other translation unit is, of course, another C file compiled with the same C implementation.

    Your compiler's ABI (which is not covered by the C standard) then tells you what this means in terms of symbols and code. For example, it tells you that there has to be a symbol that's named foo (in some ABIs, it's _foo instead), which must lead to executable code that leaves some integer value in a specific register (and usually has some further requirements, e.g. on what it does with the stack).

    Obviously an object file generated with another compiler, or from hand-written assembly, will work if and only if the correct symbol exists and points to code fulfilling all the conditions required by the ABI.

    These days, the C ABI is usually fixed by the platform. That is, all C compilers on the same platform are interoperable. Note however that this is not mandated by the C standard, which always assumes a single C implementation. Note also that a compiler may support different ABIs (this was especially the case with early compilers on 8086, which supported different so-called memory models).
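
The difference between merely declaring foo and actually using it can be sketched in a single translation unit. This is a minimal illustration, not something taken from the standard's text: size_of_foo_result is a hypothetical helper name, and the sketch relies on the rule (C17 6.9p5) that an identifier with external linkage needs a definition only if it is used in an expression other than as the operand of a sizeof whose result is an integer constant.

```c
/* Declaration only: this makes the name foo visible in this translation
   unit, but it does not by itself require foo to be defined anywhere. */
extern int foo(void);

/* Hypothetical helper, defined in this translation unit. */
int size_of_foo_result(void)
{
    /* sizeof does not evaluate its operand, so foo is never called here
       and the program remains complete without a definition of foo. */
    return (int)sizeof(foo());
}
```

If size_of_foo_result instead returned foo(), the program would need exactly one external definition of foo; on a typical toolchain a missing definition shows up as an unresolved symbol at link time.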

Note that different languages usually have different calling conventions. But languages often allow you to explicitly use the C calling conventions of the platform. Sometimes this is even part of that language's specification (such as extern "C" in C++). But obviously the C language standard doesn't tell you anything about other languages.
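
As a hedged illustration of the extern "C" remark: headers meant to be shared between C and C++ are commonly wrapped in a __cplusplus guard, so that a C++ compiler applies C linkage while a C compiler sees only a plain declaration. The function add_ints is a hypothetical name used only for this sketch.

```c
/* A C++ compiler defines __cplusplus, so only it sees the extern "C"
   wrapper. A C compiler sees nothing but the plain declaration. */
#ifdef __cplusplus
extern "C" {
#endif

int add_ints(int a, int b);   /* hypothetical function, for illustration */

#ifdef __cplusplus
}
#endif

/* The definition, compiled as C. */
int add_ints(int a, int b)
{
    return a + b;
}
```

The guard is a convention of the C++ side and of the platform ABI; the C standard itself says nothing about it.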


1 comment thread

This is certainly a good answer and packed with useful information, and I'm hoping to get more theore... (8 comments)
Josh Hyatt‭ wrote almost 3 years ago

This is certainly a good answer and packed with useful information, but I'm hoping to get more theoretical information than practical. For example, with that extern function declaration, from the standard's perspective, what happens when I make a call to foo? If the standard is unconcerned with the problem of linking, how can C tell me that the call will result in anything other than undefined behavior? Do external object files and other static libraries count as "translation units"? I'm less interested in the practical implementation on specific platforms than I am in the theory behind it, with the aim of potentially writing my own compiler.

elgonzo‭ wrote almost 3 years ago · edited almost 3 years ago

I am not quite sure what you are trying to ask about, but I am assuming for now you are asking wrt calling conventions. The C standard is unconcerned about calling conventions and does not try to lay down a "law" about that. It explicitly leaves the layout of storage for parameters of externally defined functions unspecified (see C standard section 6.9.1, paragraph 9). This is basically something the build tools (compiler+linker) have to define depending on certain factors. A major factor in this, albeit not the only one, is that different target platforms a C program might be built for can require or prefer different calling conventions. As you might already guess, the C standard is also not concerned about any particular target platform C source code might be compiled for. That is naturally something the build tools deal with, not the C standard. (1/2)

elgonzo‭ wrote almost 3 years ago · edited almost 3 years ago

(2/2) Compilers might (and often do) provide additional keyword extensions not covered by the C standard to allow specifying a particular calling convention (such as _stdcall, _cdecl or _fastcall, for example).

If your interrogation of "[...] from the standard's perspective, what happens when I make a call to foo? If the standard is unconcerned with the problem of linking, how can C tell me that the call will result and anything other than undefined behavior?" was not aimed at calling conventions, please clarify.

Josh Hyatt‭ wrote almost 3 years ago

My question indeed is not concerned with calling conventions unless I am misunderstanding the term. I understand calling convention to be a protocol specific to a particular build environment that ensures proper communication across function calls. Let me try to clarify: It says here: "Each function that is actually called must be defined only once in a program." By that reasoning, if a function is never defined, the program should fail to compile. Of identifiers with external linkage the same site says: "The identifier can be referred to from any other translation units in the entire program." From this, I deduce that a set of C sources should not compile to a valid C program if linking fails. Therefore, the C standard cannot confine itself solely to rules applying to a single translation unit; it must have something to say about linking. Does that clarify the question?

elgonzo‭ wrote almost 3 years ago · edited almost 3 years ago

"By that reasoning, if a function is never defined, the program should fail to compile."

If your code calls some externally defined function, and the linker is unable to link the call with the actual function implementation because there is no such function implementation, of course the build will fail. Do you see reasons to believe the build process should succeed despite the linkage failing?

"From this, I deduce that a set of C sources should not compile to a valid C program if linking fails."

Incorrect (unless compiling and linking are merged together into a single inseparable build phase/tool). Linking is a build phase that comes after the compilation phase and depends on the result generated by the compilation phase. For your deduction to become true would require a time machine or some other mechanism capable of reversing the arrow of time ;-)

elgonzo‭ wrote almost 3 years ago · edited almost 3 years ago

"Therefore, the C standard cannot confine itself solely to rules applying to a single translation unit; it must have something to say about linking."

It does. See section 6.2.2 Linkages of identifiers. In case you don't have the actual C standard at hand, here is the PDF for C18: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf
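
For readers without the document at hand, the three kinds of linkage described in 6.2.2 can be sketched as follows. This is only an illustration under assumed names (shared_total, calls_made, and record are hypothetical), not an excerpt from the standard.

```c
/* External linkage: the same object in every translation unit of the
   program that declares it. */
int shared_total = 0;

/* Internal linkage: 'static' at file scope confines the name to this
   translation unit. */
static int calls_made = 0;

/* Functions have external linkage unless declared static. */
int record(int amount)
{
    int doubled = amount * 2;   /* no linkage: block-scope object */

    calls_made += 1;
    shared_total += doubled;
    return calls_made;          /* number of calls made so far */
}
```

Another translation unit could refer to shared_total and record through extern declarations, but it has no way to name calls_made.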

celtschk‭ wrote almost 3 years ago

Josh Hyatt‭ A program consists of several translation units. As far as the C standard is concerned, a program consists of nothing but translation units written in C and the standard library supplied by the implementation. It doesn't say there needs to be a separate linking step; a compiler that required all translation units of a program to be listed on the compile command and then directly spits out an executable would also be conforming.

Josh Hyatt‭ wrote almost 3 years ago

celtschk‭ Thank you for the insight. I've updated the question to try and better reflect the intent accordingly.