Why do we use functions that return a data structure in C++?

Question

I have been learning C++ and came across a function, but the return type was a vector.

Here is the code:

vector<Name> inputNames() {
    ifstream fin("names.txt");
    string word;
    vector<Name> namelist;

    while (!fin.eof()) {
        Name name;
        fin >> name.first_name;
        fin >> name.last_name;
        namelist.push_back(name);
    }

    return namelist;
}

name is part of a struct defined as:

struct Name {
    string first_name;
    string last_name;

    bool operator<(const Name& d) const {
        return last_name > d.last_name;
    }

    void display() {
        cout << first_name << " " << last_name << endl;
    }
};

What is the purpose of declaring the function as vector<Name> inputNames()? What is it doing?

And why can't I just create a void function and pass a vector to it by reference?

I.e.:

void input(vector<Name>&v){
    ifstream fin("names.txt");
    string word;

    while (!fin.eof()) {
        Name name;
        fin >> name.first_name;
        fin >> name.last_name;
        v.push_back(name);
    }
}

I was tense before reading this question, and after reading it a smile got stuck on my face... Well, it depends on the use case. For example, if the programmer doesn't want to mutate the argument passed to the function and prefers getting a new value back, they will not pass the object as a function argument. — Amit Upadhyay, Jun 23 '20 at 06:07
`vector` is the *return type* of the function `inputNames()`. It tells you the function will return an object of that type. Within the function you read multiple names from `names.txt`. By using a vector of names, you can store as many as needed, and `vector` manages the memory for you automatically. But please read [Why !.eof() inside a loop condition is always wrong.](https://stackoverflow.com/q/5605125/9254539) — David C. Rankin, Jun 23 '20 at 06:08
[This](https://stackoverflow.com/questions/33994995/which-is-more-efficient-return-a-value-vs-pass-by-reference) asks the same question; read the answers there. I hope this helps you. — yaodav, Jun 23 '20 at 06:10
Why should we treat vectors or trees differently from integers or floats? All of those are data. — n. 'pronouns' m., Jun 23 '20 at 06:31

score 24 · Accepted Answer · edited Jun 23 '20 at 15:55

Your question is basically: Do I return by value or do I use an output argument?

The general consensus in the community is to return by value, especially from C++17 on, with its guaranteed copy elision. That said, I also recommend it from C++11 onwards, where move semantics already make returning a vector cheap. If you use an older version, please upgrade.

We consider the first snippet more readable and understandable and even more performant.
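A minimal sketch of the performance claim, assuming a C++11 or later compiler. `Counted`, `makeItems`, and `copiesAfterReturn` are hypothetical names invented for this illustration; `Counted` simply counts how often an element is copied. Returning the local vector by value either elides the copy or moves the vector, so no element is copied on the way out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type that counts copy operations.
struct Counted {
    static int copies;
    Counted() = default;
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) noexcept = default;
    Counted& operator=(const Counted&) { ++copies; return *this; }
    Counted& operator=(Counted&&) noexcept = default;
};
int Counted::copies = 0;

// Builds a vector locally and returns it by value.
std::vector<Counted> makeItems(std::size_t n) {
    std::vector<Counted> items;
    items.reserve(n);              // avoid reallocation while filling
    for (std::size_t i = 0; i < n; ++i) {
        items.emplace_back();      // constructed in place, no copy
    }
    return items;                  // copy elision or a move of the whole vector
}

// Reports how many element copies happened while returning the vector.
int copiesAfterReturn(std::size_t n) {
    Counted::copies = 0;
    std::vector<Counted> items = makeItems(n);
    assert(items.size() == n);
    return Counted::copies;
}
```

Calling `copiesAfterReturn(1000)` should report zero copies: the caller ends up owning the buffer that was filled inside the function.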

From a caller's perspective:

std::vector<Name> names = inputNames();

It's clear that inputNames returns some values without changing the existing state of the program, assuming it doesn't depend on global state (which it actually does here, through the file names.txt).

The second snippet would be called like this:

std::vector<Name> names;
// Other code
input(names);

This raises a lot of questions:

Does the function use names as input, or does it extend it?
If there are already values in names, what does the function do with them?
Does the function have a return value to indicate success?

This style used to be good practice when computers were slow and compilers had trouble optimizing. At this point, don't use it for pure output arguments.

When should you use the second style? When you want an in-out argument. If the vector already holds data and you intend to append to it, passing it by reference actually makes sense.
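To make the two styles concrete, here is a small sketch under stated assumptions: `readNames`, `appendNames`, and `demoAppend` are hypothetical names, and both functions take a `std::istream&` instead of opening names.txt so the example is self-contained. The loops test the extraction itself rather than `eof()`, in line with the comment linked under the question.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct Name {
    std::string first_name;
    std::string last_name;
};

// Return-by-value style: every call yields a fresh vector.
std::vector<Name> readNames(std::istream& in) {
    std::vector<Name> names;
    Name name;
    // The loop stops as soon as a first/last pair cannot be read.
    while (in >> name.first_name >> name.last_name) {
        names.push_back(name);
    }
    return names;
}

// In-out style: appends to whatever the caller already holds.
void appendNames(std::istream& in, std::vector<Name>& names) {
    Name name;
    while (in >> name.first_name >> name.last_name) {
        names.push_back(name);
    }
}

// Usage: start with a fresh vector, then extend it in place.
std::size_t demoAppend() {
    std::istringstream first("Ada Lovelace\nGrace Hopper\n");
    std::vector<Name> names = readNames(first);
    assert(names.size() == 2);

    std::istringstream second("Alan Turing\n");
    appendNames(second, names);    // existing entries are kept
    return names.size();
}
```

The name `appendNames` tells the caller that existing content survives, which answers the questions a bare `input(names)` call leaves open.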

It's worth noting that using the second one is still the only sane way to reuse buffers. — Sopel, Jun 23 '20 at 19:56
Not that relevant for this example, though, yes. In that case it's an in-out argument. Although useful, I have only used that trick in a couple of places where it really matters. Often, changing to boost::container::small_vector has a much better effect if you know your data. — JVApen, Jun 24 '20 at 06:10

DevSolar · Answer 2 · 2020-06-23T06:20:59.053

This basically mirrors the mathematical definition of a function as...

...a relation that associates an input to a single output.

While you could write void functions that modify their parameters, this has disadvantages:

Expression of intent. Consider a function taking multiple parameters. Which ones are input, which ones are output?
Clarity of purpose. A function modifying multiple values at once is usually (not always) attempting to do too many things at once. Focussing on one return value per function helps with keeping your program logic under control.
RAII. You can't use a void function to initialize a variable, meaning you would have to declare that variable first (initialized to some "default" value), then initialize it to the desired value.
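The initialization point can be sketched as follows. This is only an illustration: `defaultNames`, `fillDefaultNames`, and the two counting functions are hypothetical names, and the data is hard-coded rather than read from a file.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Name {
    std::string first_name;
    std::string last_name;
};

// Return-value style.
std::vector<Name> defaultNames() {
    std::vector<Name> names;
    names.push_back(Name{"Ada", "Lovelace"});
    names.push_back(Name{"Grace", "Hopper"});
    return names;
}

// Out-parameter style: overwrites the caller's vector.
void fillDefaultNames(std::vector<Name>& out) {
    out = defaultNames();
}

std::size_t countWithReturnValue() {
    // Declared and initialized in one step, so it can be const.
    const std::vector<Name> names = defaultNames();
    return names.size();
}

std::size_t countWithOutParameter() {
    // Must first exist as an empty, non-const vector...
    std::vector<Name> names;
    assert(names.empty());
    // ...and only then receives its real value.
    fillDefaultNames(names);
    return names.size();
}
```

Both functions end up with the same data, but only the return-value version lets the variable be `const` and never exposes it in a half-initialized state.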

There are languages that work without return values, using "out parameters" instead. You can do it this way in C++ as well. But all in all, using return values as the one output of a function helps the structure of your program.

score 0 · Answer 3 · edited Jun 23 '20 at 15:57

vector<Name> is the return type of the function. The function creates a new vector of the structs and returns it.

Your implementation uses "call by reference": the function receives a reference to an existing vector. For example, with the call-by-reference implementation you could call input(v) multiple times, and your preexisting vector would then contain the file's content multiple times. If you did the same with the version that returns a vector, each call would create a new object holding only one pass of the data.
