Several episodes have mentioned how we can hint the linker to control accessibility between different compile units.
Specifically, they covered how internal and external linkage work for each compile unit, how that can affect link times, and how it can produce linker errors if applied incorrectly.
After experimenting with a minimal code example, I want to share what I have put together and the results.
I would appreciate any corrections. I will try to keep the discussion within the C realm, but I have found that C++ classes might also be a topic of interest when trying to further understand how the linker works.
Specs: Visual Studio Community 2019 (MSVC), Win10 Pro
Multiple compile units
Command line
cl /c -Z7 ModuleA.cpp
cl /c -Z7 ModuleB.cpp
dumpbin /SYMBOLS ModuleA.obj > ModuleA.dmp
dumpbin /SYMBOLS ModuleB.obj > ModuleB.dmp
cl -Z7 linker.cpp ModuleA.obj ModuleB.obj /link
dumpbin /SYMBOLS linker.obj > linker.dmp
linker.cpp
#include "ModuleA.h"
#include "ModuleB.h"
int main()
{
    externalG();
    internalA = 9; // internal linkage, behaves like a local variable
    // internalF(); // error! function not accessible, internal linkage
    externalF();
    externalA = 19; // external linkage
    return (0);
}
ModuleA.h/cpp. According to the documentation, internalA should not be visible to any compile unit other than ModuleA.
// ModuleA.h
#if !defined(MODULEA_H)
#define MODULEA_H
static int internalA; // internal linkage
extern int externalA; // external linkage
inline void internalF(); // internal linkage
extern void externalF(); // external linkage
#endif

// ModuleA.cpp
#include "ModuleA.h"
void internalF()
{
    internalA = 2;
    externalA = 23;
}
void externalF()
{
    internalA = 22;
}
ModuleB.h/cpp
// ModuleB.h
#if !defined(MODULEB_H)
#define MODULEB_H
extern void externalG(); // external linkage
#endif

// ModuleB.cpp
#include "ModuleB.h"
#include "ModuleA.h"
int externalA;
void externalG()
{
    externalA = 200;
    internalA = 400; // internal linkage, behaves like a local variable
    // internalF(); // error, UNDEF symbol f() because of internal linkage
    //internalA = 10;
}
At this point, the interesting finding is related to internalA, declared in ModuleA.h: even though it is declared static, it can be accessed from any other module.
More importantly, it behaves like a local variable: its value is different in every module where it is used.
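A minimal sketch of what I believe is going on, assuming the usual C++ rule that a `static` variable at file scope has internal linkage: since ModuleA.h is pasted into every .cpp that includes it, each compile unit ends up with its own private internalA. The two namespaces below only simulate two compile units inside one file; the names `module_a_tu`, `linker_tu`, `readInternalA` and `setFromMain` are made up for the illustration.

```cpp
// Simulation only: each namespace stands in for one compile unit after the
// preprocessor has pasted "static int internalA;" from ModuleA.h into it.

namespace module_a_tu {                   // plays the role of ModuleA.cpp
    static int internalA;                 // this unit's private copy
    void externalF() { internalA = 22; }  // writes ModuleA's copy only
    int readInternalA() { return internalA; }
}

namespace linker_tu {                     // plays the role of linker.cpp
    static int internalA;                 // a different object, same name
    void setFromMain() { internalA = 9; } // what main() does to its copy
    int readInternalA() { return internalA; }
}
```

Calling `linker_tu::setFromMain()` and then `module_a_tu::externalF()` leaves the two copies holding 9 and 22 respectively, which would match the "different value in every module" observation. Because this comes from the language rules rather than from MSVC, I would expect g++ to behave the same way.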
Does this mean that internalA should only be declared static in the cpp file to really behave as internal?
For those who have experience with other compilers, should I expect the same behavior from g++?
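One possible layout, sketched under the assumption that the goal is for internalA to be private to ModuleA: the `static` definition moves into ModuleA.cpp and the header only exposes functions. Both parts are shown in one file for brevity; `readInternalA` is a hypothetical accessor that is not part of my test code.

```cpp
// --- what ModuleA.h would expose (no mention of internalA) ---
void externalF();        // external linkage
int  readInternalA();    // hypothetical accessor, added for this sketch

// --- what ModuleA.cpp would contain ---
static int internalA;    // internal linkage: one copy, visible only here

void externalF()
{
    internalA = 22;
}

int readInternalA()
{
    return internalA;
}
```

With this layout, a line such as `internalA = 9;` in linker.cpp should fail to compile, because the name is never declared in that compile unit.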
The functions work as expected. For instance, ModuleB.cpp includes ModuleA.h but cannot access the internalF function, and the linker reports that as an "unresolved external symbol" error.
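As far as I understand the C++ rules, `inline` on its own does not give a function internal linkage; it instead requires the definition to be visible in every compile unit that calls the function, which might be the real reason for the unresolved symbol. A hedged sketch of two header-only alternatives follows. `internalStaticF` is a made-up name, and the bodies are placeholders that return a value only so the behavior is easy to check.

```cpp
// Sketch of a revised ModuleA.h; both functions are defined in the header.
#if !defined(MODULEA_H)
#define MODULEA_H

// inline with the body in the header: every includer sees the definition,
// so the linker never has to look for it in another .obj file.
inline int internalF()
{
    return 2;
}

// static: internal linkage, each including compile unit gets its own copy.
static int internalStaticF()
{
    return 3;
}

#endif
```

Either form should let ModuleB.cpp call the function after including ModuleA.h; only the `static` one is internal linkage in the strict sense.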
The externalA variable, which is defined in ModuleB, is still accessible and has global scope. All good.
Single compile unit (Unity build)
Command line
cl -Z7 linker.cpp /link
dumpbin /SYMBOLS linker.obj > linker.dmp
linker.cpp
#include "ModuleA.cpp"
#include "ModuleB.cpp"
int main()
{
    externalG();
    internalA = 9; // internal linkage, behaves like a local variable
    internalF(); // unity build now can access the internal linkage function
    externalF();
    externalA = 19; // external linkage
    return (0);
}
internalA now really behaves as a global variable, even though it is declared static.
Again, should the internalA variable be declared in the cpp file to get internal linkage?
Furthermore, internalF is now accessible regardless of its internal linkage. Is this result correct, or a misconception on my side?
It seems that with a unity build the concept of internal/external linkage is less meaningful to the linker and more informative to the coder?
I appreciate your feedback.