How to distinguish between variable and function declarations?

I'm adding some extras to basic meta-programming for C, and I can't come up with a simple but fairly robust way to distinguish between a global variable declaration and a function declaration.

Function declaration being:
int oompaloompa(int a, int b);

Since anything can be a pointer to a function returning yet another function pointer, nested N levels deep, and since I don't know the full extent of the craziness going on there (such as the rules for omitting function return types, and god knows how deep the rabbit hole goes), I'm not quite sure how to do this.

Any ideas?

For brevity's sake, let's assume that I know all the defined symbols (e.g. the typedefs).

Edited by pragmatic_hero on
To solve this issue, you will probably need to actually parse C. This is not that difficult, but C does have many weird edge cases like the ones you mentioned.

There is no easier way, because C does not have a simple grammar like LL(1), LALR(1), etc. C declarations are commonly read using the Clockwise/Spiral Rule: http://c-faq.com/decl/spiral.anderson.html
So far I've come up with a "hypothesis": if, scanning left to right, an identifier has an open parenthesis right next to it, it has to be a function.

int(*oompaloompa(int a, int b))(void);
     -----------^------------^
       this is function decl


As long as I have the symbol list, this is a simple test.
(Can this test be done without the symbol list?)

Can anyone come up with a counter-example where this isn't true?

Edited by pragmatic_hero on
What does your hypothesis say about the following declarations?

int (*a[10])(int (*b)(int c));
int (*(*x)(int y))[10];
int (*a)(int b(int c));




Edited by Mārtiņš Možeiko on
mmozeiko wrote:
What does your hypothesis say about the following declarations?

int (*a[10])(int (*b)(int c));
int (*(*x)(int y))[10];
int (*a)(int b(int c));




The first two would be classified as variables.
The last one would be erroneously recognized as a function, since 'b' has '(' right next to it.

I suppose the rule has to be amended that it's the first identifier that has to be looked at - a, x, a. Then it would work with the given samples.
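
Here is a rough sketch of the amended rule, just to make it concrete. Everything in it is made up for illustration: the function name looks_like_function_decl, the keyword lists, and the idea that the symbol list arrives as an array of typedef names. It works on raw text rather than real tokens, ignores comments and the preprocessor, and it will still misclassify a declaration whose type is itself a typedef'd function type, so treat it as a heuristic and not a parser.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Words that can appear before the declared name (not exhaustive). */
static const char *const kTypeWords[] = {
    "void", "char", "short", "int", "long", "float", "double", "signed",
    "unsigned", "const", "volatile", "static", "extern", "register"
};
/* Keywords that are followed by a tag rather than by the declared name. */
static const char *const kTagWords[] = { "struct", "union", "enum" };

/* Returns 1 if the len-byte token at tok equals one of the n names. */
static int in_list(const char *tok, size_t len,
                   const char *const *names, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (strlen(names[i]) == len && memcmp(names[i], tok, len) == 0)
            return 1;
    return 0;
}

/* Hypothetical helper: finds the first identifier that is neither a keyword,
   a tag, nor a known typedef, and returns 1 if it is followed by '('. */
int looks_like_function_decl(const char *decl,
                             const char *const *typedefs, size_t n_typedefs)
{
    assert(decl != NULL);
    int after_tag_word = 0;  /* previous word was struct/union/enum */
    const char *p = decl;
    while (*p) {
        unsigned char c = (unsigned char)*p;
        if (isdigit(c)) {
            /* Skip numeric literals such as the 10 in [10]. */
            while (isalnum((unsigned char)*p)) p++;
        } else if (isalpha(c) || c == '_') {
            const char *start = p;
            while (isalnum((unsigned char)*p) || *p == '_') p++;
            size_t len = (size_t)(p - start);
            if (after_tag_word) {
                after_tag_word = 0;  /* this word is a tag, not the name */
            } else if (in_list(start, len, kTagWords,
                               sizeof kTagWords / sizeof *kTagWords)) {
                after_tag_word = 1;
            } else if (!in_list(start, len, kTypeWords,
                                sizeof kTypeWords / sizeof *kTypeWords) &&
                       !in_list(start, len, typedefs, n_typedefs)) {
                /* First "real" identifier: apply the '(' test. */
                while (isspace((unsigned char)*p)) p++;
                return *p == '(';
            }
        } else {
            p++;  /* punctuation or whitespace */
        }
    }
    return 0;  /* no declared name found */
}
```

Called on the declarations from this thread with an empty typedef list, it reports both oompaloompa declarations as functions and the three a, x, a samples as variables. A declaration like "u32 f(void);" only comes out right if "u32" is passed in the typedef list, which is exactly why the symbol list matters.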

Edited by pragmatic_hero on
And this definitely can't be done without symbol list:
typedef cool(beans);

cool beans; // function declaration

beans() {
  cool a, b, c; // func decl
}


All legit C apparently.
Whenever you see "a b;" anywhere, at any scope, it might just as well be a function declaration.
And you can't tell for sure unless you have the FULL symbol list in your head.
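
For anyone who wants to try this on a current compiler, here is the same trick spelled with explicit types (the snippet above leans on implicit int, which C99 removed). The names helper and check_beans are made up for this example.

```c
#include <assert.h>

typedef int cool(int);   /* cool names a function type, not an object type */

cool beans;              /* same as: int beans(int); */

int beans(int x)         /* the definition still has to spell out the declarator */
{
    cool helper;         /* block-scope function declaration, not a variable */
    return helper(x) + 1;
}

int helper(int x) { return 2 * x; }

/* Small self-check: beans(3) calls helper(3), so the result is 2*3 + 1. */
void check_beans(void)
{
    assert(beans(3) == 7);
}
```

Nothing on the "cool beans;" line says "function" syntactically; only the typedef does.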

Great language design.
"Declaration reflects use" - What a genius idea!
Cool beans.

Edited by pragmatic_hero on
int (*(*x)(int y))[10];

Not on topic, but I was going to ask about that one, and I think I figured it out:
x is a pointer to a function taking an int and returning a pointer to an array of exactly 10 int.
int array[ 10 ] = { 0 };

int (* fn( int y ) )[ 10 ] {
    return &array;
}

int main( int argc, char* argv[ ] ) {
    
    int ( * ( *x )( int y ) )[ 10 ];
    
    x = fn;
    
    return 0;
}

I didn't know you could write [ 10 ] in a function definition. If I understand correctly, it makes the compiler check that the returned value is a pointer to an array of exactly 10 int. If I change
int array[ 10 ] = { 0 };

to
int array[ 11 ] = { 0 };

it doesn't compile.
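
One way to double-check that reading is to build the same type up from typedefs, one layer at a time. This is only a sketch; the typedef names int_array10, array10_ptr and fn_ptr, and the function demo, are invented for the example.

```c
#include <assert.h>

typedef int int_array10[10];          /* array of 10 int */
typedef int_array10 *array10_ptr;     /* pointer to an array of 10 int */
typedef array10_ptr (*fn_ptr)(int);   /* pointer to a function taking int
                                         and returning that pointer */

static int array[10] = { 0 };

/* Same function as fn above, written with the typedef'd return type. */
array10_ptr fn(int y)
{
    (void)y;          /* parameter unused in this sketch */
    return &array;
}

void demo(void)
{
    int (*(*x)(int y))[10] = fn;   /* the original spelling */
    fn_ptr z = x;                  /* the typedef spelling: same type */

    assert(z(0) == &array);
    /* The array length is part of the pointed-to type; sizeof does not
       evaluate the call, it only looks at the type. */
    assert(sizeof *z(0) == 10 * sizeof(int));
}
```

Because the length is part of the type, a pointer to int[11] is a different type from a pointer to int[10], which is what the compiler complains about when the array size is changed.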