
The following is invalid code:

int i = 0, double j = 2.0;

The draft standard says why:

[N4140/7.1.6]

2 As a general rule, at most one type-specifier is allowed in the complete decl-specifier-seq of a declaration or in a type-specifier-seq or trailing-type-specifier-seq. The only exceptions to this rule are the following:

const can be combined with any type specifier except itself.

volatile can be combined with any type specifier except itself.

signed or unsigned can be combined with char, long, short, or int.

short or long can be combined with int.

long can be combined with double.

long can be combined with long.

Yes, it prevents something silly like int int, but I don't see anything wrong with the invalid code posted above. Quoting [N4140/7], a simple-declaration consists of decl-specifier-seq(opt) init-declarator-list(opt) ;

[N4140/8] then shows that an init-declarator-list is either a single init-declarator or an init-declarator-list , init-declarator,

and an init-declarator is a declarator initializer(opt).

Since we're only concerned with syntax of the form int i = 0, the declarator we care about is a ptr-declarator, which is a noptr-declarator, which is a declarator-id attribute-specifier-seq(opt); finally, a declarator-id consists of merely ...(opt) id-expression.

For completeness, [N4140/5.1.1] says an id-expression can be an unqualified-id, which in turn can be simply an identifier.

If I haven't tripped up so far, this is what the grammar reflects.

int        decl-specifier-seq

i          unqualified-id

= 0        initializer

i = 0      init-declarator

Since the simple-declaration has the decl-specifier-seq, only one decl-specifier-seq applies to the entire init-declarator-list.

Funnily enough, that means you can't do something like this:

int i, const j;

Yet:

int i, * j;

is perfectly legal because the star is part of a ptr-operator. But you can't do this:

int i, const * j; // pointer to const int

This means in the following code that i becomes a pointer to const int. Surprise!

int h = 25;
int const * j, * i = &h;
*i = 50; // error: assignment of read-only location '* i'
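A minimal sketch of the same surprise, checked at compile time. The function name `read_through_i` is only illustrative, and both pointers are initialized here so that the function can be run:

```cpp
#include <cassert>
#include <type_traits>

// Illustrative helper: reads h through i after modifying h directly.
int read_through_i() {
    int h = 25;
    int const *j = &h, *i = &h;  // both declarators share "int const"

    // The decl-specifier-seq "int const" applies to every declarator,
    // so i is a pointer to const int, exactly like j.
    static_assert(std::is_same<decltype(j), const int*>::value, "j is const int*");
    static_assert(std::is_same<decltype(i), const int*>::value, "i is const int*");

    // *i = 50;  // would not compile: *i is a read-only location
    h = 50;      // h itself is not const, so it can still be modified directly
    assert(*j == 50);
    return *i;
}
```

Calling `read_through_i()` should return 50, since the pointee was changed through `h` rather than through either pointer.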

The intent is clear in [N4140/8] with:

3 Each init-declarator in a declaration is analyzed separately as if it was in a declaration by itself. (footnote 99)

99) A declaration with several declarators is usually equivalent to the corresponding sequence of declarations each with a single declarator. That is

T D1, D2, ... Dn;

is usually equivalent to

T D1; T D2; ... T Dn;
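To make the "usually" concrete, here is a rough sketch with hypothetical names (`usual_case`, `hidden_type_name`, `struct S`). As far as I recall, the footnote goes on to name the exception: a declarator that hides a type name used by the decl-specifiers. The second function is my attempt to reproduce that case.

```cpp
#include <cassert>
#include <type_traits>

struct S { int value; };

// The usual case: one declaration with two declarators behaves like
// two separate declarations that repeat the same decl-specifier-seq.
int usual_case() {
    int a = 1, *p = &a;   // T D1, D2;
    int b = 1;            // T D1;
    int *q = &b;          // T D2;
    static_assert(std::is_same<decltype(a), decltype(b)>::value, "a and b match");
    static_assert(std::is_same<decltype(p), decltype(q)>::value, "p and q match");
    assert(*p == *q);
    return *p + *q;
}

// The exception: the first declarator hides the type name S.
int hidden_type_name() {
    S S = {1}, T = {2};   // OK: S and T are both objects of type struct S
    // Splitting this into "S S = {1}; S T = {2};" would not compile, because
    // in the second declaration S would name the variable, not the type.
    return S.value + T.value;
}
```

So the single declaration and the split sequence are interchangeable only as long as no declarator changes what the decl-specifiers mean.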

The question is why is it this way?


If it was legal, you could do it in for loops, which is somewhat useful.

user4351360
  • Because it happened to be that way in K&R C and it never bothered people enough to change it? You are allowed to declare more variables of the same type at the same time to save some keystrokes; if you have to change the type you don't save anything (you just write commas instead of semicolons). Regarding the pointer thing, AFAICT it's mostly a confusing historical accident; linking the star to the type - and not to the variable - would have made much more sense. – Matteo Italia Dec 11 '14 at 21:21
  • At the very least, `unsigned u, long l;` would be confusing. What would `l`'s type be? `long`, or `unsigned long`? `unsigned u; long l;` on the other hand is perfectly clear. I'm pretty certain this isn't really the reason though. –  Dec 11 '14 at 21:26
  • I think @MatteoItalia hits it right on the mark: the multiple declaration syntax is a leftover from K&R's days back when we still counted storage media capacities in bytes instead of terabytes. In 2014 everybody prefers readability over brevity, and many code conventions even explicitly forbid multiple declarations on a single line of code, barring specific exceptions like in closed `for`-loops. – Niels Keurentjes Dec 11 '14 at 21:31
  • @MatteoItalia `int const * j, * i = &h;` declares `i` to be an `int const *`, but `int * const j = &h, * i;` only makes `j` an `int * const`. – user4351360 Dec 11 '14 at 21:35
  • According to [The Development of the C Language](http://cm.bell-labs.com/who/dmr/chist.html), this form of declaration was introduced in "new B", which was a predecessor to C. It might have been an artefact from a form of declaration in its typeless predecessor B: "B declarations begin with a specifier like `auto` or `static`, followed by a list of names" – dyp Dec 12 '14 at 00:15
  • @MatteoItalia a practical use for this would be as the declaration part of a `for` loop – M.M Dec 15 '14 at 23:38
  • @MattMcNabb: that's certainly true (and I did miss that possibility), but IMO the flaw is of `for` allowing only one statement, not of the declaration allowing just one type. – Matteo Italia Dec 15 '14 at 23:47
  • I am dubious about the use of *general rule* and *usually* in the standard... I feel that those kinds of words require footnotes referring to the exceptions to the rule which they seem to imply. – nonsensickle Dec 17 '14 at 01:40
  • Remove the nonsense ptr-operator rule. Allow multiple type declarations as OP suggests. Always write `*` where it belongs, i.e. `f(int* p)` not `f(int *p)`. `int* i, j;` means what it should. Suddenly, you have a better language. Also, not going to happen. Thanks backward compatibility. – Roman L Dec 17 '14 at 17:11
  • If multiple types were added though, it would then be possible to write `int i, j, double x, k;`, and a rule would be needed to specify whether `k` is an int or a double in this case. – Roman L Dec 17 '14 at 17:23
  • @ShafikYaghmour I do read your comments, but I think a proper answer would require a serious amount of research (history and specification of "new B" and ancestors; communication with people involved in the processes etc.) This could answer why you may declare multiple things in one statement. Then, you'd still need to answer why there are restrictions on the types of those things, which probably requires searching for proposals etc. – dyp Dec 22 '14 at 21:49

3 Answers


Short answer: a single statement can hold only one declaration with one type, but that declaration can declare multiple identifiers. Also, const/volatile are either type qualifiers or pointer qualifiers, so they need a type or a pointer to bind to.

Long answer:

I haven't ever read the standard but here I go...

"it prevents something silly like int int, but I don't see anything wrong with the invalid code posted above."

  • The way I see it, one statement allows you only one declaration, but one declaration allows you to declare multiple identifiers of the same type.
  • So the problem with int i = 0, double j = 2.0; is that you have two types, int and double, which goes against [N4140/7.1.6].
  • That is what the language allows, and thus the above statement is invalid.

But you went ahead and dug deeper, and I believe your confusion began from "Since we're concerned with only syntax of the form..." onward. Wouldn't the declaration break up as follows?

int i = 0 ::= simple-declaration
Where in...
int ::= type-specifier
i ::= identifier
= 0 ::= initializer

More

You mentioned...

Not Allowed: int i, const j;
Allowed: int i, * j;
Not Allowed: int i, const * j; // pointer to const int
Allowed: int const * j, * i = &h;

My response:

Not Allowed: int i, const j; - because const is a type modifier and, syntactically, there is no type specified for it to bind to.
Allowed: int i, * j; - because * grabs the type int and j has a complete type.
Not Allowed: int i, const * j; - because const is not associated with a type here. It is the same problem as in the first case. Read it as "j is a pointer to <unexpected word in between>", which breaks the grammar.
Allowed: int const * j, * i = &h; - because syntactically const has a type to bind to.

"The question is why is it this way?"

  • When I was learning C, I was initially confused by the use of const before or after the type name. To clear up the confusion, I tried some test code and figured out what the language allows; the following is what I came up with. It is from my old notes and definitely looks like something made by a new programmer, but it clears up most of the doubts.

    [storage class] [sign qualifier] [size qualifier] [type qualifiers] <[* [type qualifiers]] [symbol name] [[] | ([parameters])]>

    storage classes: auto, register, static, extern, typedef

    sign qualifiers: signed, unsigned

    size qualifiers: short, long, long long

    basic types: char, int, float, double, void

    type qualifiers: const, volatile

    symbol name could be a variable, constant, type(def) name and function

    A * prefixed to the symbol makes it a pointer. * can appear N number of times, making it pointer-to-pointer and so on.

    A [] suffixed to the symbol makes it an array. [] can appear N number of times, making it a multi-dimension array.

    A () suffixed to the symbol makes it a function. () can appear more than once, but since a function cannot return a function, a further () appears only when a function returns a function pointer.

The above helped me think straight when declaring variables.
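A small sketch of those three declarator shapes, using made-up names (`add_one`, `pick`, `declarator_shapes`); treat it as an illustration of the notes rather than anything authoritative:

```cpp
#include <cassert>
#include <type_traits>

int add_one(int x) { return x + 1; }

// A function cannot return a function, but it can return a pointer to one.
// Read inside out: pick is a function () returning a pointer to a function (int) -> int.
int (*pick())(int) { return &add_one; }

int declarator_shapes() {
    int value = 41;
    int *p = &value;        // one * prefix: pointer to int
    int **pp = &p;          // two * prefixes: pointer to pointer to int
    int grid[2][3] = {};    // two [] suffixes: 2 x 3 array of int, zero-initialized

    static_assert(std::is_same<decltype(pp), int**>::value, "pointer to pointer");
    static_assert(std::is_same<decltype(grid), int[2][3]>::value, "two-dimensional array");
    static_assert(std::is_same<decltype(pick()), int (*)(int)>::value, "function pointer");

    assert(grid[0][0] == 0);
    grid[1][2] = **pp;          // copies 41 into the last element
    return pick()(grid[1][2]);  // the second () calls through the returned pointer
}
```

With these definitions, `declarator_shapes()` should return 42.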

Modifying the type specifier syntax from my age old notes:

[storage class] [sign qualifier] [size qualifier] <type> [type qualifiers] [* [pointer qualifiers]] [symbol name] [[] | ([parameters])]

That is, const and volatile are either type qualifiers or pointer qualifiers, and they need a type or a pointer to bind to.

Consider the idea of "one statement can allow you only one declaration but one declaration can allow you to declare multiple identifiers of the same type." This means the type specifier syntax can be broken down as follows:

type ::= [storage class] [sign qualifier] [size qualifier] <type> [type qualifiers]

symbol ::= [* [pointer qualifiers]] [symbol name] [[] | ([parameters])]

And a simplified syntax of a declaration would be:

type symbol[, symbol...]

Clearly,

int i, const j; - does not agree with the grammar.

int const i, j; - agrees with the grammar.
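As a quick check of the form that agrees with the grammar, this sketch (the function name `shared_type_part` is just for illustration) shows that the qualifier in the type part reaches every symbol in the list:

```cpp
#include <cassert>
#include <type_traits>

int shared_type_part() {
    int const i = 1, j = 2;   // type part: "int const"; symbols: i, j

    // The qualifier sits in the type part, so it applies to both symbols.
    static_assert(std::is_same<decltype(i), const int>::value, "i is const int");
    static_assert(std::is_same<decltype(j), const int>::value, "j is const int");

    // int k, const m = 3;    // rejected: const appears where a symbol is expected
    // j = 5;                 // would not compile: j is const as well
    assert(i + j == 3);
    return i + j;
}
```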

I am sure a person good with the standard can use the standard and provide the answer using the correct terminology and references. However, please keep in mind that less experienced programmers might find a less technical answer easy to understand.

If the form "int i, const j" were allowed, then one could write "int const i, const j", and that would mean j is const twice over. That does not make any sense.

RcnRcf
  • Your comment `// pointer to const int - you got it wrong. Read it as constant pointer to int` is wrong. A `const` pointer to an int is declared as `int * const i;` **NOT** `int const * i;` which is what you are suggesting. I think you meant `Allowed: int i, const * j;` to be `Allowed: int i, * const j;`. -1 until edited. See http://stackoverflow.com/questions/9779540/c-const-pointer-declaration – nonsensickle Dec 16 '14 at 00:06
  • Hmm, I'm not sure I'm understanding that line correctly. `int i, const * j;` and `int i, * const j;` are not the same. They mean different things. Your response still starts with the `Not Allowed: int i, const * j;` even though you then correctly continue on by referring to `"int i, * const j;"`. Your comment doesn't make sense to me and it still seems ambiguous. – nonsensickle Dec 17 '14 at 01:32
  • @nonsensickle I removed the ambiguous statement and added a short answer at the top and also extended the long answer to bring everything together. Now see if it makes sense or not. – RcnRcf Dec 17 '14 at 16:37
  • Yup, now it makes sense. Removing the -1. :) However, I still don't think that your answer gets to the core point. Although it is a good answer I'm not going to give out a +1 just yet. Mainly because you are explaining the way things are rather than *why they are*. I mean, `const` needs a type to bind to but so does the `*`, how are they different in this case? I think only the committee can answer this one and it's likely to be legacy. Good try anyway though :) – nonsensickle Dec 18 '14 at 22:36
  • Sorry, on my second read through it does seem that you mentioned the statements that I was looking for: "agrees with the grammar" and "does not agree with the grammar". So +1 :) I think that it is the parser of the compiler that dictates this one. – nonsensickle Dec 18 '14 at 22:46

The question is why is it this way?

Unfortunately, I can't answer this with certainty. My guess is that it is a short-hand that came about to save keystrokes in C or one of its predecessors. I will say though that the * operator, IMHO, does change the type of the variable you have declared and so I don't know why it is allowed where const is not.

I would also like to add an example that you have not included which is legal:

int i = 0, * const j = &i;

And it is legal because the const, in this instance, is being applied to the * and not the int. At least according to ideone.
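For anyone who wants more than ideone's word for it, here is a minimal sketch (the function name `const_pointer_demo` is made up) that checks the two types at compile time:

```cpp
#include <cassert>
#include <type_traits>

int const_pointer_demo() {
    int i = 0, *const j = &i;

    // The const after the * belongs to the second declarator only:
    // i stays a plain int, while j is a const pointer to (non-const) int.
    static_assert(std::is_same<decltype(i), int>::value, "i is int");
    static_assert(std::is_same<decltype(j), int* const>::value, "j is int* const");

    *j = 7;          // fine: the pointee is not const
    // j = nullptr;  // would not compile: j itself is const
    assert(i == 7);
    return i;
}
```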

It is likely that this is just legacy which needed to be brought forward.

Your extract from the standard sheds some light on the topic:

99) A declaration with several declarators is usually equivalent to the corresponding sequence of declarations each with a single declarator. That is

T D1, D2, ... Dn;
is usually equivalent to

T D1; T D2; ... T Dn;

The intent seems to be that the Dn's, when declared together separated by commas, would be the same type, because if you're changing types you might as well use a semicolon, since the comma saves you no keystrokes there. With matching types it does, e.g.

int i, j = 0; double k, l = 7.3;

Here you've saved yourself typing int and double twice by using the comma in the right place.

I can see your point with the type modifiers like const, volatile etc. Why not allow us to combine those at will? I mean how is this different from the *? I don't think that it is different and I hope that someone smarter than me will come and explain why. I would say either ban the * or allow us to use other modifiers too.

So in short, I don't know, but here's a cool additional example of the craziness:

int i = 0, * const j = &i;
nonsensickle

The question is why is it this way?

Because that's how it was in C at the Dawn of Time (1973, actually); see here

If it was legal, you could do it in for loops, which is somewhat useful.

It is legal, just not very pretty:

// needs <iostream> and <tuple>
for (std::tuple<int, double> t = std::make_tuple(0, 10.0);
     std::get<0>(t) < 10;
     ++std::get<0>(t), std::get<1>(t) *= 1.1)
{
    std::cout << std::get<0>(t) << ", " << std::get<1>(t) << std::endl;
}
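A possibly tidier variant is the unnamed-struct trick that comes up in the comments. This is only a sketch: the function name `grow`, the bounds, and the doubling step are arbitrary choices made so the result is easy to check.

```cpp
#include <cassert>
#include <vector>

// Collects the values of the double "loop variable" over three iterations.
std::vector<double> grow() {
    std::vector<double> out;
    // A single declaration of an unnamed struct carries both an int and a double.
    for (struct { int i; double d; } s = {0, 10.0}; s.i < 3; ++s.i, s.d *= 2.0) {
        out.push_back(s.d);
    }
    assert(out.size() == 3);
    return out;
}
```

It is still one declaration with one type, so it stays within the grammar; the members merely stand in for the two variables you wanted.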
Richard Hodges
  • If you are going to make your first argument you should at least quote from the document dyp quotes in the comments to the question, otherwise that is not really sufficient. – Shafik Yaghmour Dec 19 '14 at 18:44
  • Also, the original hacky answer to "how do you declare multiple types in a for loop" is probably the one in [this answer](http://stackoverflow.com/a/2687427/1708801) ... although C++ does give you more options, that one also applies to C. – Shafik Yaghmour Dec 19 '14 at 18:48
  • I'd never thought of using an anonymous struct. It's actually a nicer (if still horrible) solution – Richard Hodges Dec 19 '14 at 18:49
  • Preferably actually quoting from the document. The interesting information there is, one, that it is authoritative, since the author was one of the creators of C, and two, that the style came from "New B" via Fortran, and we can see it better by looking at the Fortran syntax. – Shafik Yaghmour Dec 19 '14 at 18:50
  • The author mentions the "accident of syntax" that led to the requirement of being touched by a deity in order to decode complex type declarations, but does not mention this specific point. Given that C is little more than a portable macro-assembler I would be amazed if K&R had given it any thought at all at the time. – Richard Hodges Dec 19 '14 at 18:54