107

A slightly strange question; however, if I remember correctly, C++ source code doesn't require a file system to store its files.

A compiler that scans handwritten papers via a camera would be a conforming implementation, although it wouldn't make much practical sense.

However, C++20 now adds `std::source_location` with `file_name()`. Does this imply that source code must always be stored in a file?

Peter Mortensen
JVApen
  • 14
    This has been in C since forever - `__FILE__`. Class `source_location` just allows you to get it at function call site. – StaceyGirl Aug 18 '19 at 20:42
  • 29
    Can't you give filename to your handwritten papers? – Jarod42 Aug 18 '19 at 20:43
  • 8
    I think it is an implementation detail whether the source code is in files, or something else. If the compiler can be fed source code through stdin, the source could be in a database. – Eljay Aug 18 '19 at 20:44
  • Good point, I forgot about preprocessor already – JVApen Aug 18 '19 at 20:46
  • 8
    My example may be a bit off, but if you use some on-the-fly compiler, such as TCC, you can always supply some human-readable source name for the sake of error reporting even though you compile directly from memory. That is, having a "file name" does not imply being stored as a file at all. – user7860670 Aug 18 '19 at 20:54
  • 2
    Surely it's the implementation files such as `` that may not be files (if you see what I mean), not the files written by developers? –  Aug 18 '19 at 20:55
  • If you run the OS in a container chances are that there are no separate physical "files", however exactly you care to define those in this complicated world. Windows allegedly wanted to migrate to a database file system for Windows 8 (but scrapped that), which could conceivably be in the cloud, so there would be no files whatsoever. The compiler likely "opens" something having some properties of a "file" (hey, I can read and close!), but even for that there is no guarantee in an integrated environment or an interpreter where everything may be in memory. **These terms are purely conceptual.** – Peter - Reinstate Monica Aug 19 '19 at 09:56
  • @Jarod42 such as `""`? ☻ – mirabilos Aug 19 '19 at 17:09
  • 1
    @mirabilos: `""` seems more appropriate for source provided by `stdin`. `"Handwritten paper 01"` might be more appropriate in such a case. – Jarod42 Aug 19 '19 at 18:21
  • Your question is a little circular. What is a file ? What is a file system ? – Yves Daoust Aug 20 '19 at 19:06
  • There was "Everything is file" philosophy and now "why source must be in file"... – i486 Mar 03 '21 at 19:21

2 Answers

111

No, source code doesn't have to come from a file (nor go to a file).

You can compile (and link) C++ completely within a pipe, putting your compiler in the middle, e.g.

generate_source | g++ -o- -xc++ - | do_something_with_the_binary

and it's been like that for decades.

The introduction of std::source_location in C++20 doesn't change this state of affairs. It's just that some code will not have a well-defined source location (or it may be well-defined, but not very meaningful). Actually, I'd say that the insistence on defining std::source_location using files is a bit myopic... although in fairness, it's just a macro-less equivalent of __FILE__ and __LINE__ which already exist in C++ (and C).

@HBv6 notes that if you print the value of __FILE__ when compiling using GCC from the standard input stream:

echo -e '#include <iostream>\n int main(){std::cout << __FILE__ ;}' | g++ -xc++  -

running the resulting executable prints <stdin>.

Source code can even come from the Internet.

@Morwenn notes that this code:

#include <https://raw.githubusercontent.com/Morwenn/poplar-heap/master/poplar.h>

// Type your code here, or load an example.
void poplar_sort(int* data, size_t size) {
    poplar::make_heap(data, data + size);
    poplar::sort_heap(data, data + size);
}

works on GodBolt (but won't work on your machine - no popular compiler supports this.)

Are you a language lawyer? OK, so let's consult the standard.

The question of whether C++ program sources need to come from files is not answered clearly in the language standard. Looking at a draft of the C++17 standard (n4713), section 5.1 [lex.separate] reads:

  1. The text of the program is kept in units called source files in this document. A source file together with all the headers (20.5.1.2) and source files included (19.2) via the preprocessing directive #include, less any source lines skipped by any of the conditional inclusion (19.1) preprocessing directives, is called a translation unit.

So, the source code is not necessarily kept in a file per se, but in a "unit called a source file". But then, where do the includes come from? One would assume they come from named files on the filesystem... but that too is not mandated.

At any rate, std::source_location does not seem to change this wording in C++20 or to affect its interpretation (AFAICT).

einpoklum
  • 10
    That pipe is a "source file" for the purposes of the standard. – melpomene Aug 18 '19 at 21:32
  • 8
    @melpomene: The units are just *called* source files, it doesn't say that they actually have to be source files. But I'll edit the answer to include this. – einpoklum Aug 18 '19 at 21:55
  • Are we in Humpty Dumpty land, where words mean whatever we want them to mean? We're going to call them source files, even though they might not actually be what computer programmers normally mean by that phrase? – Barmar Aug 19 '19 at 05:42
  • The standard also says "kept", yet in the pipe example the text of the program isn't even kept anywhere, it's generated dynamically on the fly. It would probably have been better if it said something like "If the text of the program is in a named file, `source_location::file_name` contains the name of that file." – Barmar Aug 19 '19 at 05:44
  • 3
    @Barmar: The answer seems to be "Yes, we are." But hey - don't shoot the messenger, I didn't write the standard :-) – einpoklum Aug 19 '19 at 06:18
  • 1
    *"I'd say that the insistence on defining source_location using files is somewhat short-sighted or myopic"* while I see your point, this is the exact kind of mindset that's kept C++ ages behind other languages. That is the reason we don't have a C++ repository to manage and pull libraries from, why we don't have standard stack trace, or `#pragma once` standardised, or even a file system library (until now), or a memory model until c++11 and the list goes on and on. [...] – bolov Aug 19 '19 at 08:13
  • [...] I understand there are difficulties defining these in a language meant to be implemented on any platform present or future, but still, they are keeping C++ from becoming a truly modern language, despite all of the recent efforts. Imho not having a library manager is just inexcusable at this point. – bolov Aug 19 '19 at 08:13
  • 2
    @bolov: That's even more incendiary than what I said... also, comments on this answer are not the appropriate venue for this discussion. I also disagree at least in part with your claims. – einpoklum Aug 19 '19 at 08:27
  • 1
    "We're going to call them source files, even though they might not actually be...?" -- if they're stored somewhere, and they're referenced by names, they're files in a filesystem. Whether that filesystem is ext2, NTFS, WebDAV or IMAP is irrelevant. The filesystem is just an abstraction over a name : byte-sequence mapping. – Roger Lipscombe Aug 19 '19 at 09:12
  • 1
    @RogerLipscombe: The source in a pipe _is_ stored somewhere - in memory; and - you can't reference it by name. – einpoklum Aug 19 '19 at 09:25
  • 13
    Just tried this with GCC: "echo '#include <stdio.h>\nint main(){printf("%s\\n", __FILE__); return 1;}' | gcc -o test -xc -" (without quotes). When executed, it prints out <stdin>. – HBv6 Aug 19 '19 at 09:59
  • That's a bonus feature of gcc, not C++. – Roger Lipscombe Aug 19 '19 at 10:10
  • 1
    @RogerLipscombe: Does it contradict the standard? – einpoklum Aug 19 '19 at 10:19
  • 5
    I don't see how it's a "bonus feature". It's a compliant result of a standard feature. – Lightness Races in Orbit Aug 19 '19 at 11:05
  • 11
    Here is a funny thing about terms and names and concepts in standards (and sciences): they're usually atomic. That is, "source file" is not necessarily a "file" that is "source", in fact, the term "file" may simply not be defined — compare with numbers in the maths: there is no such thing as just a "number", only "natural number", "rational number", "real number", etc. – Joker_vD Aug 19 '19 at 19:43
  • @Joker_vD: Well, maybe, but it doesn't say source-file or "source file" in quotes. Still, good point. – einpoklum Aug 19 '19 at 20:15
  • 1
    The ability of Godbolt's compiler explorer to include files over the internet when specifying an URL in a `#include` directive is another example of file mapping that bypasses the traditional filesystem. The implementation is certainly a hack, but the standard definition seems to allow it. – Morwenn Aug 22 '19 at 10:08
  • @Morwenn: Link to an example? – einpoklum Aug 22 '19 at 11:02
53

Even before C++20, the standard has had:

__FILE__

The presumed name of the current source file (a character string literal).

The definition is the same for source_location::file_name.

As such, there has not been a change in regard to support for file system-less implementations in C++20.

The standard doesn't exactly define what "source file" means, so whether it refers to a file system may be up to interpretation. Presumably, it could be conforming for an implementation to produce "the handwritten note that you gave to me just then" if that indeed identifies the "source file" in that implementation of the language.
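
One way to see that the "presumed" name need not correspond to anything in a file system is the standard `#line` directive, which overrides the presumed line number and file name for the source lines that follow it. Here is a small sketch; the name `Handwritten paper 01` and the variable names are purely illustrative, and `std::string_view` is used only to make the comparison checkable at compile time.

```cpp
#include <string_view>

// Override the presumed location: the next source line is treated as
// line 17 of a "file" called "Handwritten paper 01".
#line 17 "Handwritten paper 01"
constexpr std::string_view presumed_file = __FILE__;  // presumed line 17
constexpr int presumed_line = __LINE__;               // presumed line 18

// The compiler now reports the overridden name and line numbers.
static_assert(presumed_file == "Handwritten paper 01");
static_assert(presumed_line == 18);
```

As far as I can tell, `source_location::file_name` is specified in terms of the same presumed name, so it would be expected to report the overridden value as well.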


In conclusion: Yeah, sources are referred to as "files" by the standard, but what a "file" is and whether a file system is involved is unspecified.

Community
eerorika
  • What does "*presumed*" mean in this context? Can there be an ambiguity? – Yksisarvinen Aug 18 '19 at 20:49
  • I'm wondering, does my initial claim (not stored in file) contain a mistake, or is there a behavior for that case? – JVApen Aug 18 '19 at 20:54
  • 2
    @Yksisarvinen I don't know exactly the intention of the "presumption" qualification of the rule, but I *presume* :) that it is a clarification that the file name doesn't need to be absolute or canonical, but rather that a relative name from the perspective of the compiler is sufficient. I could be wrong. – eerorika Aug 18 '19 at 21:11
  • 4
    I can just see `scanner-c++` returning *"Left-cabinet, third-drawer, fourth red-tabbed folder, page 17"*. – dmckee --- ex-moderator kitten Aug 19 '19 at 17:53
  • 2
    FWIW, in the POSIX sense, a pipe (or any other file-ish thing) is a "file" - as such, stdin/stdout are "files", just not disk files etc. in this sense. –  Aug 19 '19 at 19:09
  • 3
    @Yksisarvinen: The Committee often makes allowances for situations where obscure implementations might have good reasons to do something contrary to commonplace behavior. In so doing, it relies upon compiler writers to judge whether their customers would find the commonplace behavior more or less useful than some alternative. The fact that such things are left to implementers' judgment may be viewed as an "ambiguity", but it's a deliberate one, since good compiler writers will know more about their customers' needs than the Committee ever could. – supercat Aug 19 '19 at 20:13
  • 1
    @dmckee ... *in a disused lavatory with a sign on the door saying ‘Beware of the Leopard.’* – Andrew Henle Aug 20 '19 at 13:05