8

I have encountered code which does this:

SomeObject parse (std::istream && input) {....

The input argument is an rvalue reference, which normally means the function is intended to take ownership of the argument. That's not quite what is happening here.

The parse function will completely consume the input stream, and it demands an rvalue reference because then calling code will give away ownership of the istream and hence this is a signal that the input stream will be unusable.

I think this is okay because, since the parse function doesn't actually move the object around, there is no danger of slicing out the subtype. This is basically behaving as a normal reference from parse's point of view, only there is a kind of compilable comment to the calling function that you have to give up ownership of the stream.

Is this code actually safe? Or is there some overlooked subtlety which makes this dangerous?

spraff
  • 29,265
  • 19
  • 105
  • 197
  • 6
    Undoubtedly the C++ language lawyers will have a few things to say about this. In the meantime, can you define the terms "okay," "safe," "reasonable" and "dangerous" within the context of your question, please? One man's cognac is another man's poison. – Robert Harvey Jan 16 '19 at 15:58
  • Safe from **what**? **OK** by whom? Dangerous by what? – SergeyA Jan 16 '19 at 16:00
  • Seems to me this is just a way to force this parameter to always be a temporary. – Sam Varshavchik Jan 16 '19 at 16:03
  • Is your goal with `SomeObject parse (std::istream && input)` is to get the function to only accept temporaries? – NathanOliver Jan 16 '19 at 16:04
  • 1
    @SamVarshavchik or to force a programmer to `std::move` the input stream that's worked on, which will indicate that noone should try to extract from it afterwards. – Fureeish Jan 16 '19 at 16:05
  • It doesn't seem "dangerous" given the current state of the `std::istream` interface. But the semantics here are IMHO questionable. What does it mean for the calling code to "give away ownership" but at the same time "not transferring it"? Who "owns" the stream after `parse()` returns!? In what way exactly does `parse()` make the stream "unusable"? What if parsing fails due to some error before the entire stream is "consumed"? Is the stream "unusable" then!? Is "ownership" somehow "given back" to the calling code in this case? – Michael Kenzel Jan 16 '19 at 16:24
  • Is this worth the WTF factor of somewhere down the line a caller being forced to write `parse(std::move(std::cin))`? – Caleth Jan 16 '19 at 16:29
  • @Caleth Consider that it could very well be used with other streams, and `prase(std::move(my_string_stream));` is much less shocking. – François Andrieux Jan 16 '19 at 16:30
  • 1
    @Caleth and after the initial "WTF" you begin to understand that "oh, so it will parse the entire input, since I am giving up the ownership of `std::cin` and I should not use it afterwads. Got it. Neat". At least in my perception. Consider also the usage of `std::stringstream`s and `std::fstream`s. – Fureeish Jan 16 '19 at 16:30
  • 1
    So what can it possibly do to make `std::cin` unusable? This aspect particularly doesn't make any sense to me given the very concept of a stream. A stream can be at EOF. That doesn't make that stream object unusable. A stream can be in a bad state. That doesn't make the object unusable… – Michael Kenzel Jan 16 '19 at 16:31
  • @MichaelKenzel whatever in the entire world the programmer of `parse` wants it to do. It's not promising that "I will destroy this stream so much you won't even be able to recongise what's left". It's expressing that "I want to own that stream, do whatever I want with it and you won't be using it for other things than assigning to it". – Fureeish Jan 16 '19 at 16:34
  • 1
    But what's the point of introducing this arbitrary restriction to the interface when there's literally nothing an implementation could possibly do to "take advantage" of it!? – Michael Kenzel Jan 16 '19 at 16:35
  • To indicate that it will read everything. It's not about taking advantage of rvalue references. It's about using rvalue references to express the intent in a very clever way that's possible to perform a check on at compile time. – Fureeish Jan 16 '19 at 16:39
  • By passing a reference to a non-const `std::istream`, the caller already has to assume that whoever gets the stream will read from it. If it reads beyond the end of the stream, the stream will be in an EOF state. If an error occurs, the stream will be in a bad state. In no case will the stream object itself be in an "unusable" state. That's what the whole concept of an `std::istream` is about… – Michael Kenzel Jan 16 '19 at 16:44
  • 1
    "very clever way" which a number of us think is *confusing*. And it isn't a great check, because you can fairly easily *lie*, see `parse(std::move(std::cin))` – Caleth Jan 16 '19 at 16:45
  • While it's not perfect (and a lot of discussions could fire off, since it's actually the first time I see something like that) and potentially tricky to 'get', I still think it actually expresses the intent very clearly. Also, I don't consider `parse(std::move(std::cin))` a *lie*. Go ahead and parse everything from `std::cin`. I *could* obviously do something with it afterwads (you *can* do pretty much everything in `C++`), but I know I shouldn't. If you want to read from `std::cin` in other way than `parse` does, then... don't use `parse`? – Fureeish Jan 16 '19 at 16:48
  • What if it is something more than a regular array of characters or a file. It could be, for example, a network connection, or an input-output pipe, or .... Just because `parse` will read everything, doesn't mean that it becomes useless to the caller. It could, for example, do something asynchronously in another thread. For example, reads from the stream could block current thread, waiting for another thread to write to the same object. But if you force it to be `std::move`-d, you hint that it shouldn't be used that way. – CygnusX1 Jan 16 '19 at 17:02
  • I will repeat this once again, since I already provided the answer to that question - if `parse` does something you don't want to do, then simply **don't** use `parse`. – Fureeish Jan 16 '19 at 17:08
  • @Fureeish If you are refering to my examples - I don't see `parse` doing anything that I don't want it to do, *except* for taking my `istream` by r-value reference - and that is the OP's question - or at least close to it - "is it ok?", or as I interpret - "do I lose something by doing so?" – CygnusX1 Jan 16 '19 at 17:17
  • `parse` could spawn a thread that will wait for any signal from given stream and in that case you would want to ensure that *nobody else will ever use such stream*. There are a few examples, but I think it's not the best place to discuss this idea. You certainly have valid points and I am leaning towards your position, but I still have some doubts. Let's wait for other possible answers. – Fureeish Jan 16 '19 at 17:21
  • Whatever `parse` does, the premise of the question is that it renders the stream unreadable. If you can't agree that the premise might be true then replace `std::istream` with `foo` which can be anything compatible with the premise. – François Andrieux Jan 16 '19 at 17:21
  • @Fureeish Watch out though. That example may not be appropriate here. If it performs asynchronous operations it either has to actually take ownership of the stream somehow to preserve it's lifetime or the function caller has a huge burden of making the the stream outlives the asynchronous operation. – François Andrieux Jan 16 '19 at 17:22

4 Answers4

6

An std::istream isn't moveable, so there's no practical benefit to this.

This already signals that the thing may be "modified", without the confusion of suggesting you're transferring ownership of the std::istream object (which you're not doing, and can't do).

I can kind of see the rationale behind using this to say that the stream is being logically moved, but I think you have to make a distinction between "ownership of this thing is being transferred", and "I keep ownership of this thing but I will let you consume all of its services". Ownership transfer is quite well understood as a convention in C++, and this is not really it. What will a user of your code think when they have to write parse(std::move(std::cin))?

Your way isn't "dangerous" though; you won't be able to do anything with that rvalue reference that you can't with an lvalue reference.

It would be far more self-documenting and conventional to just take an lvalue reference.

Lightness Races in Orbit
  • 358,771
  • 68
  • 593
  • 989
  • 3
    Taking `rvalue` reference and moving is not necessarily married to each other. – SergeyA Jan 16 '19 at 16:01
  • @SergeyA Not seeing the purpose of an rvalue reference if you're not going to transfer ownership via a move (and you don't want to limit calls to rvalue expressions for arcane overloading reasons). Can you give an example of when this would be useful? – Lightness Races in Orbit Jan 16 '19 at 16:02
  • 2
    "*Can you give an example*", well, the code in the question... At least in my opinion, the fact that one will *transfer ownership* of the input stream (which passing by rvalue reference means) will indicate that noone should try to extract from it afterwards, which I consider a neat way of using `C++` syntax to express the intent. But that's just my opinion. – Fureeish Jan 16 '19 at 16:04
  • You gave one yourself - you want your call be limited to rvalues. For example, (I am not saying this OPs case, I think the question isn't clear and closable) one might want to request that object provided to the function is a "scratchpad". I am using it myself in a certain case. – SergeyA Jan 16 '19 at 16:05
  • If the function must make the received object unusable it might as well end it's lifetime. You might not actually move in a sink function, you might just put that object in a limited functionality state (similar or identical to a moved-from state). But functionally, it looks exactly like it took ownership of the object. In a way it did even if it doesn't keep that ownership. – François Andrieux Jan 16 '19 at 16:11
  • 1
    Transferring ownership is quite a well-defined context in C++ and it usually involves a move. You simply cannot do that with a `std::istream`. You can make your function accept only an rvalue to signify that it's going to read all the data from the stream, sure, but that's not what transferring ownership has ever meant in C++, so this would be quite strange IMO! I mean I can _kind of_ see it, but I can't see that it's worthwhile. – Lightness Races in Orbit Jan 16 '19 at 16:45
  • Are you sure you can’t move `std::istream`s? [This suggests otherwise](https://en.cppreference.com/w/cpp/io/basic_istream/basic_istream) and I’m fairly certain that I’ve done this before. – templatetypedef Jan 16 '19 at 16:59
  • @LightnessRacesinOrbit There are places in the standard library where an xvalue is taken, but not necessarily moved from, e.g. `std::string operator +(std::string&&, std::string&&)` (I know it's actually a template, I'm following KISS.) As such, I don't think rvalues *imply* moving. – Arne Vogel Jan 16 '19 at 17:26
  • 1
    @templatetypedef Only from within derived types - the move ctor is protected. – Lightness Races in Orbit Jan 16 '19 at 18:05
6

std::move just produces an rvalue reference from an object and nothing more. The nature of an rvalue is such that you can assume nobody else will care about it's state after you're done with it. std::move then is used to allow developpers to make that promise about objects with other value categories. In other words calling std::move in a meaningful context is equivalent to saying "I promise I don't care about this object's state anymore".

Since you will be making the object essentially unusable and you want to make sure the caller won't use the object anymore using an rvalue reference enforces this expectation to some extent. It forces the caller to make that promise to your function. Failure to make the promise will result in a compiler error (assuming there isn't another valid overload). It does not matter if you actually move from the object or not, only that the original owner has agreed to forfeit it's ownership.

François Andrieux
  • 24,129
  • 6
  • 46
  • 72
  • Mostly agreed but I still don't see why you'd want to do that to a stream. It's unnecessarily constrictive unless there are domain constraints the OP hasn't mentioned. Michael's answer puts it quite nicely. – Lightness Races in Orbit Jan 16 '19 at 18:08
2

What you're trying to do here is not "dangerous" in the sense that, given the current std::istream interface, there don't seem to be any circumstances under wich taking an rvalue reference here would necessarily lead to undefined behavior when taking an lvalue reference wouldn't have. But the semantics of this whole contraption are IMHO very questionable at best. What does it mean for the calling code to "give away ownership" but at the same time "not transferring it"? Who "owns" the stream after parse() returns!? In what way exactly does parse() make the stream "unusable"? What if parsing fails due to some error before the entire stream is "consumed"? Is the stream "unusable" then!? Is no one allowed to try read the rest? Is "ownership" somehow "given back" to the calling code in this case?

A stream is an abstract concept. The purpose of the stream abstraction is to serve as an interface through which someone can consume input without having to know where the data comes from, lives, or how it is accessed and managed. If the purpose of parse() is to parse input from arbitrary sources, then it should not be concerned with the nature of the source. If it is concerned with the nature of the source, then it should request a specific kind of source. And this is where, IMHO, your interface contradicts itself. Currently, parse() takes an arbitrary source. The interface says: I take whatever stream you give me, I don't care how it's implemented. As long as it's a stream, I can work with it. At the same time, it requires the caller to relinquish the object that actually implements the stream. The interface requires the caller to hand over something that the interface itself prevents any implementation behind the interface to ever know about, access, or use in any way. For example, how would I have parse() read from an std::ifstream? Who closes the file afterwards? If can't be the parser. It also can't be me because calling the parser forced me to hand over the object. At the same time, I know that the parser could never even know it had to close the file I handed over…

It'll still do the right thing in the end because there was no way an implementation of the interface could've actually done what the interface suggested it would do and so my std::ifstream destructor will just run and close the file. But what exactly did we gain by lying to each other like that!? You promised to take over the object when you were never going to, I promised to never touch the object again when I knew I'll always have to…

Michael Kenzel
  • 14,512
  • 1
  • 25
  • 36
2

Your assumption that rvalue reference parameter imply "taking ownership" is completely incorrect. Rvalue reference is just a specific kind of reference, which comes with its own initialization rules and overload resolution rules. No more, no less. Formally, it has no special affinity with "moving" or "taking ownership" of the referenced object.

It is true that support for move semantics is considered one of the primary purposes of rvalue references, but still you should not assume that this is their only purpose and that these features are somehow inseparable. Just like any other language feature, it might allow a significant number of well-developed alternative idiomatic uses.

An example quote similar to what you just quoted is actually present in the standard library itself. It is the extra overloads introduced in C++11 (and C++17, depending on some nuances)

template< class CharT, class Traits, class T >
basic_ostream< CharT, Traits >& operator<<( basic_ostream<CharT,Traits>&& os, 
                                            const T& value );

template< class CharT, class Traits, class T >
basic_istream<CharT,Traits>& operator>>( basic_istream<CharT,Traits>&& st, T&& value );

Their main purpose is to "bridge" the difference in the behavior between member and non-member overloads of operator <<

#include <string>
#include <sstream>

int main() 
{
  std::string s;
  int a;

  std::istringstream("123 456") >> a >> s;
  std::istringstream("123 456") >> s >> a;
  // Despite the obvious similarity, the first line is well-formed in C++03
  // while the second isn't. Both lines are well-formed in C++11
}

It takes advantage of the fact that rvalue reference can bind to temporary objects and still see them as modifiable objects. In this case rvalue reference is used for purposes that have nothing to do with move semantics. This is perfectly normal.

AnT
  • 291,388
  • 39
  • 487
  • 734
  • What are those purposes? I think that's the key point to this question. If you can explain what purpose this holds in that example, and how it relates to the OP's situation, that's probably the answer. – Lightness Races in Orbit Jan 16 '19 at 18:06