38

The following quote is from C++ Templates, published by Addison-Wesley. Could someone please help me understand its gist in plain English?

Because string literals are objects with internal linkage (two string literals with the same value but in different modules are different objects), you can't use them as template arguments either:

sbi
Aquarius_Girl
  • I removed the `c++-faq` tag. Feel free to explain why you think it is warranted, if you think it is. – sbi Apr 05 '11 at 07:12
  • @sbi Are you talking to me? If yes, then let me tell you that the ONLY tag added by me was "templates". – Aquarius_Girl Apr 05 '11 at 07:16
  • @Anisha: I wasn't talking to you, I was talking to whoever put that tag there. Mind you, I didn't look who did it, so it _could_ have been you. `:)` – sbi Apr 05 '11 at 07:20
  • Mhmm. Now I see that @GMan did this. I suppose he knew what he was doing. Well, GMan, just speak up if you disagree. I'm not in for edit wars. `:)` – sbi Apr 05 '11 at 07:21
  • @sbi: Just saw now, but we're in chat now, no worries. :) – GManNickG Apr 05 '11 at 07:22
  • "Because string literals are objects with internal linkage (two string literals with the same value but in different modules are different objects), you can't use them as template arguments either", which is a flawed reasoning for C++0x, so you better get it out of your head for future C++ work. Template arguments can have internal linkage now. You can instead say "Because a string literal does not match any allowed form of template arguments ...". – Johannes Schaub - litb Apr 05 '11 at 15:07
  • @Johannes: Can you elaborate on that? It was my understanding the template arguments pointing to things with internal linkage didn't make sense, but I was wrong. What criteria does a string literal fail at? – GManNickG Apr 05 '11 at 18:04
  • @GMan it is not of the correct form: It's not an integral constant expression, it is not a template parameter, not a pointer or reference to an object or function that has linkage (internal or external) *expressed as `& id-expression` or `id-expression`* and it is not a pointer-to-member expressed as `& qualified-id` etc. – Johannes Schaub - litb Apr 05 '11 at 19:07
  • Note that C++0x has still a non-normative note that talks about names without linkage not being valid template arguments. I did send an issue report to Pete about this some time ago (Standard's Editor), so this should be resolved for the FDIS. – Johannes Schaub - litb Apr 05 '11 at 19:12
  • @Johannes: Oh, I see it now, thanks. Good to know. – GManNickG Apr 05 '11 at 19:14
  • @GMan and @Johannes: I could not understand very well your geek talk :) Would you please take some time to explain it to me in a layman's language? __Johannes__ second post is difficult for me to understand and secondly if both of you agree that __Johannes__'s answer is the correct one then why not put it as a new answer. Secondly the book I am referring has been published in 2002 :eek: – Aquarius_Girl Apr 06 '11 at 05:48
  • @Anisha: Just so you know, C++0x is the new version of C++ slated to come out this year. Put simply, they made it so you can use any pointer value as long as it has a *name*. String literals do not have a name. – GManNickG Apr 06 '11 at 05:56
  • @GMan Many thanks to you for bothering to explain, so this means that some new version of C++ (which hasn't yet been released) doesn't have this problem of internal linkage. But then since I am using the current version of C++, the concept explained by you in your answer holds true? – Aquarius_Girl Apr 06 '11 at 06:10

5 Answers

50

Your compiler ultimately operates on things called translation units, informally called source files. Within these translation units, you identify different entities: objects, functions, etc. The linker's job is to connect these units together, and part of that process is merging identities.

Identifiers have linkage†: internal linkage means that the entity named in a translation unit is visible only to that translation unit, while external linkage means that the entity is visible to other units as well.

When an entity at namespace scope is marked static, it is given internal linkage. So given these two translation units:

// a.cpp
static void foo() { /* in a */ } 

// b.cpp
static void foo() { /* in b */ }

Each of those foos refers to an entity (a function in this case) that is visible only to its own translation unit; that is, each translation unit has its own foo.

Here's the catch, then: a string literal behaves like a static const char[N] array. That is:

// str.cpp
#include <iostream>

// this code:

void bar()
{
    std::cout << "abc" << std::endl;
}

// is conceptually equivalent to:

// (__literal0 stands in for a hidden, compiler-generated name)
static const char __literal0[4] = {'a', 'b', 'c', 0};

void bar()
{
    std::cout << __literal0 << std::endl;
}

And as you can see, the literal's value is internal to that translation unit. So if you use "abc" in multiple translation units, for example, they all end up being different entities‡.
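A minimal single-file sketch of that point, assuming two hypothetical functions (`first_mention` and `second_mention`) that stand in for two separate uses of the same literal. The text of the two literals always matches, but nothing in the language promises that their addresses do:

```cpp
#include <cassert>
#include <cstring>

// Two separate mentions of the same literal, as they might appear in
// two different source files. (Hypothetical helper names.)
inline const char* first_mention()  { return "abc"; }
inline const char* second_mention() { return "abc"; }

// The characters always compare equal...
inline bool same_text() {
    return std::strcmp(first_mention(), second_mention()) == 0;
}

// ...but whether the two pointers are equal is left to the implementation.
inline bool same_address() {
    return first_mention() == second_mention();
}

inline void check_literals() {
    assert(same_text());   // holds on every conforming implementation
    (void)same_address();  // deliberately not asserted: either answer is allowed
}
```

Calling `same_address()` may yield true on one compiler or optimization level and false on another, which is exactly why a pointer to a literal makes a poor identity for a template instantiation.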

Overall, that means this is conceptually meaningless:

template <const char* String>
struct baz {};

typedef baz<"abc"> incoherent;

This is because "abc" is different for each translation unit: each translation unit would be given a different class, since each "abc" is a different entity, even though they all provided the "same" argument.

At the language level (in C++03), this is enforced by requiring that a pointer used as a non-type template argument point to an entity with external linkage; that is, something that does refer to the same entity across translation units.

So this is fine:

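// good.hpp
// Declare an array, not a pointer: the array's address is a constant,
// whereas the value stored in a pointer variable is not.
extern const char my_string[];

// good.cpp
#include "good.hpp"
const char my_string[] = "any string";

// anything.cpp
#include "good.hpp"
typedef baz<my_string> coherent; // okay; all instantiations use the same entity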
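To see that it is the identity of the entity, not its contents, that selects the class, here is a small sketch. It assumes C++11 or later (for `static_assert` and `std::is_same`), and the array names `abc_one` and `abc_two` are made up for illustration:

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Two distinct arrays with external linkage that happen to hold the same text.
extern const char abc_one[];
extern const char abc_two[];
const char abc_one[] = "abc";
const char abc_two[] = "abc";

template <const char* String>
struct baz {
    static const char* get() { return String; }
};

// Same entity as the argument: same class.
static_assert(std::is_same<baz<abc_one>, baz<abc_one>>::value,
              "one entity names one class");

// Different entities with equal contents: different classes.
static_assert(!std::is_same<baz<abc_one>, baz<abc_two>>::value,
              "equal text does not mean equal type");

inline void check_baz() {
    // The text is identical even though the types differ.
    assert(std::strcmp(baz<abc_one>::get(), baz<abc_two>::get()) == 0);
}
```

If string literals were allowed as arguments, every translation unit's "abc" would play the role of a separate `abc_two`, and the "same" typedef could name a different class in each file.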

†Not all identifiers have linkage; some have none, such as function parameters.

‡ An optimizing compiler may store identical literals at the same address to save space, but that's a quality-of-implementation detail, not a guarantee.

GManNickG
  • Isn't "translation unit" the canonical name and "source file" an informal/unofficial one? – jalf Apr 05 '11 at 06:38
  • @jalf: Oops, yeah, crossed my words up. – GManNickG Apr 05 '11 at 06:39
  • @GMan: You might want to add this link to your answer: http://stackoverflow.com/questions/2795443/in-c-whats-the-difference-between-the-terms-source-file-and-translation-un – sbi Apr 05 '11 at 07:17
  • Thank you for the detailed answer. What I have understood from your answer is, **a string in double quotes is equivalent ALWAYS to a static char array**. Static char array falls under "internal linkage" and templates fall under "external linkage". Therefore the two are incompatible. _Please confirm whether what I have understood is clear or not_. – Aquarius_Girl Apr 05 '11 at 08:58
  • @Anisha: your "therefore" loosely implies that - in general - things with internal linkage and external linkage are "incompatible": that's not true. IMHO, the crucial point is that the same double-quoted string literal, appearing in different translation units in a program, can end up being recreated in multiple memory areas, and hence there can be multiple distinct pointer values to the same textual content. This implies code bloat from unnecessary or accidental template instantiations, and template specialisations not being reliably matched, so C++ disallows it for safety. – Tony Delroy Apr 05 '11 at 10:25
  • @Anisha: That's mostly correct. In C++03, template *arguments* need to have external linkage. In C++0x they can have internal linkage, but string literals don't satisfy other requirements. I myself am not sure about the C++0x changes, so I'm hoping Johannes will elaborate on that. – GManNickG Apr 05 '11 at 18:03
  • I have tried the extern trick with GCC 4.2 and it doesn't work, you have to go through another level of indirection, "extern char**" in order to get it to work. Not sure if newer versions or other compilers act the same. – Joseph Garvin Jul 17 '11 at 00:51
  • If I have a string `char* a = "a"` and another string `char* b = "a"`, are they at the same memory address? – 0x499602D2 Feb 14 '13 at 20:28
  • @David: Like footnote two notes, whether or not two identical string literals are stored at the same address is implementation-defined, they may be or may not be depending on your compiler, compiler flags, etc. You shouldn't rely on it if you can help it. So in your code `&a` is not `&b`, and their values may or may not be the same. – GManNickG Feb 14 '13 at 20:32
  • to make it compile with gcc4.9 I had to use `extern const char my_string[]; const char my_string[] = "any string";` – marcinj Dec 29 '14 at 22:55
11

It means you can't do this...

#include <iostream>

template <const char* P>
void f() { std::cout << P << '\n'; }

int main()
{
    f<"hello there">();
}

...because "hello there" isn't 100% guaranteed to resolve to a single integral value that can be used to instantiate the template once (though most good linkers will attempt to fold all usages across linked objects and produce a new object with a single copy of the string).

You can, however, use extern character arrays/pointers:

...
extern const char p[];
const char p[] = "hello";
...
    f<p>();
...
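A complete version of that snippet might look like the following sketch. The names `greeting` and `text` are placeholders chosen for illustration, and the template returns the string instead of printing it so that the result is easy to check:

```cpp
#include <cassert>
#include <string>

// An array with external linkage: it has a name, and that name denotes
// the same object in every translation unit.
extern const char greeting[];
const char greeting[] = "hello";

// A template parameterised on a pointer to characters.
template <const char* P>
std::string text() { return P; }

inline void check_text() {
    assert(text<greeting>() == "hello");  // a named array is accepted
    // text<"hello">();  // rejected: a string literal is not a valid argument here
}
```

As the comments under the question note, C++11 relaxed the linkage requirement, so an array declared `static` should also be accepted by compilers implementing that standard; the literal itself is still rejected.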
Tony Delroy
  • I didn't know the second form was possible, even in the current standard? – Nim Apr 05 '11 at 06:23
  • @Nim: yes, weird but true :-) – Tony Delroy Apr 05 '11 at 06:29
  • Thanks for bothering, but I still couldn't understand "Integral value", "Integral linkage"? – Aquarius_Girl Apr 05 '11 at 06:43
  • EDIT: The below post by Mikael explains that. – Aquarius_Girl Apr 05 '11 at 06:48
  • @Anisha: integral as in integer, 1, 2, 3. Template parameters have to be types or integers; that's why enums, characters, short/int/long etc. are ok but float and double are not, nor are actual objects. The point here is that a pointer with a compile-time-constant value IS an integer of sorts, and "extern" variables satisfy that requirement while "static" variables - which are effectively hidden within the object being compiled at the time and so don't have their integer address "broadcast" for the linker to stitch together with other objects, can't be used. See also GMan's answer ;-) – Tony Delroy Apr 05 '11 at 06:51
  • In the second form, you don't need two declarations for `p`. `extern char const p[] = "..."` does the trick quite nicely. – James Kanze Apr 05 '11 at 08:16
  • @James: good point, though it's common for the declaration to be in a header and the definition in an implementation file. – Tony Delroy Apr 05 '11 at 08:20
  • @Tony: Now I have understood your explanation, I think. Please read my comment on Mikael's post below just to check whether my understanding is correct or not. And I always read and reply to all the answers, one by one :) – Aquarius_Girl Apr 05 '11 at 08:28
  • @Tony: and what about the "Global" variables? Are they under internal or external linkage? Is their behaviour different in C and C++? and also could you confirm the comment on GMan's post? He seems to be offline. – Aquarius_Girl Apr 05 '11 at 09:35
  • @Anisha: being "global" (in the sense of not within an anonymous or named namespace, but being in "::") is orthogonal to (independent of) linkage... such variables can be `extern` or `static`. – Tony Delroy Apr 05 '11 at 10:18
  • @Tony: So you mean to say that any Global variable which is not in any "block" has NO linkage? Or I am barking up the wrong tree? – Aquarius_Girl Apr 05 '11 at 10:24
  • @Anisha: oh, they do have linkage :-)... see http://stackoverflow.com/questions/3281925/what-is-default-storage-class-for-global-variables – Tony Delroy Apr 05 '11 at 10:27
  • @Tony: Thanks for the thread :hattip: So I conclude that a variable which is outside any "block" AND has no storage specifier specified with it, lies under "external linkage" and has the _duration_ static. Is that correct now? – Aquarius_Girl Apr 05 '11 at 11:46
7

Obviously, string literals like "foobar" are not like other literal built-in types (like int or float). They need to have an address (const char*). The address is really the constant value that the compiler substitutes in place of where the literal appears. That address points to somewhere, fixed at compile-time, in the program's memory.

It has to have internal linkage because of that. Internal linkage just means that it cannot be linked across translation units (compiled cpp files). The compiler could try to do this, but is not required to. In other words, internal linkage means that if you took the addresses of two identical string literals (i.e. the values of the const char* they translate to) in different cpp files, they wouldn't be the same, in general.

You can't use them as template arguments because checking that two of them are the same would require a strcmp(). If you used ==, you would just be comparing the addresses, which wouldn't be the same when templates are instantiated with the same string literal in different translation units.

Other, simpler built-in types, as literals, also have internal linkage (they don't have an identifier and can't be linked together from different translation units). However, their comparison is trivial, as it is by value. So they can be used for templates.
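A short sketch of the by-value comparison for integral arguments, assuming C++11 or later for `static_assert` and `std::is_same`. The template name `tagged` is invented for this example:

```cpp
#include <cassert>
#include <type_traits>

// A class template parameterised on an integer value.
template <int N>
struct tagged {
    static const int value = N;
};

// Integral arguments are compared by value, so however the 3 is spelled,
// every translation unit ends up with the same class.
static_assert(std::is_same<tagged<3>, tagged<1 + 2>>::value,
              "equal values name the same class");
static_assert(!std::is_same<tagged<3>, tagged<4>>::value,
              "different values name different classes");

inline void check_tagged() {
    assert(tagged<1 + 2>::value == 3);
}
```

No address is involved here, so there is nothing for two translation units to disagree about.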

Mikael Persson
  • Thank you very much for the detailed explanation of the terminology. I searched Google further to make the concepts more clear. I have understood that External linkage means that objects lying under this title can communicate with others in other source files. Example: _printf_ declared in stdio.h can be used in any other source file. Now, **the template falls under external linkage**. Is that correct? Because two same strings will have different addresses in different cpp files, they can't be used to refer to the same thing. That's why they are not permitted. – Aquarius_Girl Apr 05 '11 at 08:27
  • @Anisha: sounds like you've got it... :-) – Tony Delroy Apr 05 '11 at 08:39
  • @Tony: I wrote that explanation without looking at GMan's post. Now I am reading his post and the concept is being more clear now. :) – Aquarius_Girl Apr 05 '11 at 08:43
3

As mentioned in other answers, a string literal cannot be used as a template argument. There is, however, a workaround which has a similar effect, but the "string" is limited to four characters. This is due to multi-character constants which, as discussed in the link, are probably rather unportable, but worked for my debug purposes.

#include <cstdint>
#include <sstream>
#include <string>

template<std::int32_t nFourCharName>
class NamedClass
{
public:  // GetName must be public to be callable from outside the class
    std::string GetName(void) const
    {
        // Evil code to extract the four-character name, one byte at a time,
        // starting with the most significant byte:
        const char cNamePart1 = static_cast<char>(static_cast<std::uint32_t>(nFourCharName >> 8*3) & 0xFF);
        const char cNamePart2 = static_cast<char>(static_cast<std::uint32_t>(nFourCharName >> 8*2) & 0xFF);
        const char cNamePart3 = static_cast<char>(static_cast<std::uint32_t>(nFourCharName >> 8*1) & 0xFF);
        const char cNamePart4 = static_cast<char>(static_cast<std::uint32_t>(nFourCharName       ) & 0xFF);

        std::ostringstream ossName;
        ossName << cNamePart1 << cNamePart2 << cNamePart3 << cNamePart4;
        return ossName.str();
    }
};

Can be used with:

NamedClass<'Greg'> greg;
NamedClass<'Fred'> fred;
std::cout << greg.GetName() << std::endl;  // "Greg"
std::cout << fred.GetName() << std::endl;  // "Fred"

As I said, this is a workaround. I don't pretend this is good, clean, portable code, but others may find it useful. Another workaround could involve multiple char template arguments, as in this answer.
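The multiple-char alternative could look roughly like this sketch. It assumes C++11 variadic templates, and `NamedByChars` is a made-up name, not taken from the answer linked above:

```cpp
#include <cassert>
#include <string>

// Each character is its own integral template argument, so the name is
// compared by value and is not limited to four characters.
template <char... Chars>
struct NamedByChars {
    static std::string GetName() {
        // Expand the pack into a null-terminated array.
        const char name[] = {Chars..., '\0'};
        return std::string(name);
    }
};

inline void check_named_by_chars() {
    assert((NamedByChars<'G', 'r', 'e', 'g'>::GetName() == "Greg"));
    assert((NamedByChars<'F', 'r', 'e', 'd', 'd', 'y'>::GetName() == "Freddy"));
}
```

This avoids the portability concerns of multi-character constants, at the cost of spelling the name out one character at a time.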

Diamond Python
0

The idea behind the C++ standard allowing only certain kinds of template parameters is that the argument should be constant and known at compile time, so that the compiler can generate the "specialized class" code.

For this specific case: when you create a string literal, its address is unknown until link time (linking happens after compilation), because two string literals in different translation units are two different objects (as explained brilliantly by the accepted answer). At compile time, we don't know which string literal's address to use to generate the specialized class code from the class template.

PnotNP