9

I have a moderately complex C++ class which holds a set of data read from disc. It contains an eclectic mix of floats, ints and structures and is now in general use. During a major code review it was asked whether we have a custom assignment operator or rely on the compiler-generated version and, if so, how do we know it works correctly? Well, we didn't write a custom assignment operator, so a unit test was added to check that if we do:

CalibDataSet datasetA = getDataSet();
CalibDataSet dataSetB = datasetA;

then dataSetB is the same as datasetA. The test is a couple of hundred lines or so. Now the customer insists that we cannot rely on the compiler (gcc) being correct for future releases and that we should write our own. Are they right to insist on this?

Additional info:

I'm impressed by the answers/comments already posted and the response time. Another way of asking this question might be: when does a POD structure/class become a 'not' POD structure/class?

ExpatEgghead
  • 415
  • 4
  • 15
  • 3
    Note that your code example does not invoke the copy assignment operator. It invokes the copy constructor. The `=` here is not an assignment, it's an initialization. You would need something along the lines of `CalibDataSet dataSetB; dataSetB = datasetA;` to invoke the copy assignment operator. – James McNellis Aug 26 '10 at 04:35
  • Well it depends. Show the definition of CalibDataSet. Does it manage resource? – Martin York Aug 26 '10 at 04:40
  • 4
    Rule of thumb: If everything in your class can and should be copied by 'a.thing = b.thing', then the auto-generated operator is probably fine. – Michael Kohne Aug 26 '10 at 04:51
  • 3
    One note: Find out WHY your customer insists you can't rely on gcc. Are they planning to take the class in a direction where the generated operator would be wrong? Have they had a bad experience and decreed an internal rule due to some previous screw-up? And always remember the golden rule: The one with the gold makes the rules. – Michael Kohne Aug 26 '10 at 04:54
  • 2
    You can trust gcc to always generate the operator you request it to. But you can't necessarily trust yourself to ask it to generate the right one. So at the very least, your unit tests are worth while, because hopefully they'll catch you if the compiler generated one stops being sufficient. – Dennis Zickefoose Aug 26 '10 at 05:01
  • No resources of any type. Just data and no strings. – ExpatEgghead Aug 26 '10 at 06:18

5 Answers

14

It is well-known what the automatically-generated assignment operator will do - that's defined as part of the standard and a standards-compliant C++ compiler will always generate a correctly-behaving assignment operator (if it didn't, then it would not be a standards-compliant compiler).

You usually only need to write your own assignment operator if you've written your own destructor or copy constructor. If you don't need those, then you don't need an assignment operator, either.
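To make that concrete, here is a minimal sketch of a data-only class that relies entirely on the implicitly generated copy operations. `CalibData`, `Gain` and the helper functions are hypothetical stand-ins, since the real `CalibDataSet` definition was not posted. The sketch also separates the two cases raised in the comments: copy-initialization (copy constructor) versus assignment to an existing object (copy assignment operator).

```cpp
#include <cassert>

// Hypothetical stand-in for CalibDataSet: plain values only, no pointers or resources.
struct Gain {
    float scale;
    float offset;
};

struct CalibData {
    int   channelCount;
    float temperature;
    Gain  gains[4];
};

// Member-wise comparison used to check that a copy matches its source.
inline bool sameData(const CalibData& a, const CalibData& b) {
    if (a.channelCount != b.channelCount || a.temperature != b.temperature) {
        return false;
    }
    for (int i = 0; i < 4; ++i) {
        if (a.gains[i].scale != b.gains[i].scale ||
            a.gains[i].offset != b.gains[i].offset) {
            return false;
        }
    }
    return true;
}

// Builds a sample object with made-up values.
inline CalibData makeSample() {
    CalibData d = {4, 21.5f, {{1.0f, 0.0f}, {1.1f, 0.1f}, {0.9f, -0.1f}, {1.0f, 0.2f}}};
    return d;
}

// Copy-initialization: uses the implicitly generated copy constructor.
inline CalibData viaCopyConstructor(const CalibData& src) {
    CalibData copy = src;
    return copy;
}

// Assignment to an already existing object: uses the implicitly generated
// copy assignment operator.
inline CalibData viaAssignment(const CalibData& src) {
    CalibData target = CalibData();  // value-initialized first
    target = src;                    // this line is the actual assignment
    return target;
}
```

A unit test would then check `sameData(a, viaCopyConstructor(a))` and `sameData(a, viaAssignment(a))` separately, so that both generated operations are covered.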

Dean Harding
  • 67,567
  • 11
  • 132
  • 174
  • 9
    This is called the Rule of Three. If your class manages any resource (an indicator that it does is that you release some resource manually in the destructor), you need to make sure copy semantics work. Use the [copy-and-swap idiom](http://stackoverflow.com/questions/3279543/what-is-the-copy-and-swap-idiom). – GManNickG Aug 26 '10 at 04:26
9

The most common reason to explicitly define an assignment operator is to support "remote ownership" -- basically, a class that includes one or more pointers, and owns the resources to which those pointers refer. In such a case, you normally need to define assignment, copying (i.e., copy constructor) and destruction. There are three primary strategies for such cases (sorted by decreasing frequency of use):

  1. Deep copy
  2. Reference counting
  3. Transfer of ownership

Deep copy means allocating a new resource for the target of the assignment/copy. E.g., a string class has a pointer to the content of the string; when you assign it, the assignment allocates a new buffer to hold the new content in the destination, and copies the data from the source to the destination buffer. This is used in most current implementations of a number of standard classes such as std::string and std::vector.
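As a rough illustration of deep copy, here is a sketch of a class that owns a heap array and therefore defines all three of copy constructor, copy assignment and destructor. The `Buffer` class and its member names are made up for this example; it is not how any particular standard library implements `std::string` or `std::vector`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical class that owns a heap array ("remote ownership").
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new char[n]) {
        std::memset(data_, 0, n);
    }

    // Copy constructor: allocate a fresh array and copy the bytes across.
    Buffer(const Buffer& other)
        : size_(other.size_), data_(new char[other.size_]) {
        std::memcpy(data_, other.data_, size_);
    }

    // Copy assignment: allocate the new array before releasing the old one,
    // so a failed allocation leaves *this unchanged.
    Buffer& operator=(const Buffer& other) {
        if (this != &other) {
            char* fresh = new char[other.size_];
            std::memcpy(fresh, other.data_, other.size_);
            delete[] data_;
            data_ = fresh;
            size_ = other.size_;
        }
        return *this;
    }

    // Destructor: release the owned array.
    ~Buffer() { delete[] data_; }

    char& at(std::size_t i) { return data_[i]; }
    char at(std::size_t i) const { return data_[i]; }
    std::size_t size() const { return size_; }
    const char* data() const { return data_; }

private:
    std::size_t size_;
    char* data_;
};
```

With the compiler-generated versions, both objects would end up pointing at the same array and the destructor would free it twice, which is exactly the case where a hand-written operator is needed.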

Reference counting used to be quite common as well. Many (most?) older implementations of std::string used reference counting. In this case, instead of allocating and copying the data for the string, you simply incremented a reference count to indicate the number of string objects referring to a particular data buffer. You only allocated a new buffer when/if the content of a string was modified so it needed to differ from others (i.e., it used copy on write). With multithreading, however, you need to synchronize access to the reference count, which often has a serious impact on performance, so in newer code this is fairly unusual (mostly used when something stores so much data that it's worth potentially wasting a bit of CPU time to avoid such a copy).
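A small sketch of the reference-counting idea with copy on write follows. It leans on `std::shared_ptr` (C++11) for the count instead of a hand-rolled counter, which is a substitution made for brevity and not how the older `std::string` implementations were written. `SharedText` is a hypothetical name, and the `use_count()` check is only meaningful in single-threaded code.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical copy-on-write wrapper: copies share one buffer until one of
// them is modified.
class SharedText {
public:
    explicit SharedText(const std::string& s)
        : data_(std::make_shared<std::string>(s)) {}

    // The compiler-generated copy constructor and assignment operator are
    // what we want here: they copy the shared_ptr, which bumps the count.

    const std::string& get() const { return *data_; }

    // Writing detaches from the shared buffer if anyone else still uses it.
    void set(const std::string& s) {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(s);  // allocate a private buffer
        } else {
            *data_ = s;  // sole owner, modify in place
        }
    }

    // Number of SharedText objects currently sharing this buffer.
    long owners() const { return data_.use_count(); }

private:
    std::shared_ptr<std::string> data_;
};
```

Note that the atomic count inside `std::shared_ptr` is the kind of synchronization cost described above.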

Transfer of ownership is relatively unusual. It's what's done by std::auto_ptr. When you assign or copy something, the source of the assignment/copy is basically destroyed -- the data is transferred from one to the other. This is (or can be) useful, but the semantics are sufficiently different from normal assignment that it's often counterintuitive. At the same time, under the right circumstances, it can provide great efficiency and simplicity. C++0x will make transfer of ownership considerably more manageable by adding a unique_ptr type that makes it more explicit, and also adding rvalue references, which make it easy to implement transfer of ownership for one fairly large class of situations where it can improve performance without leading to semantics that are visibly counterintuitive.
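For comparison, here is roughly what transfer of ownership looks like once `unique_ptr` and rvalue references are available (C++11 and later). `Holder` and `Table` are hypothetical names. The transfer has to be requested with `std::move`, which is what makes it more explicit than `std::auto_ptr`.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical payload type.
struct Table {
    int rows;
};

// Hypothetical owner: movable but not copyable, because its unique_ptr
// member cannot be copied.
class Holder {
public:
    explicit Holder(int rows) : table_(new Table{rows}) {}

    // Defaulted move operations hand the pointer over and leave the
    // source holding nullptr.
    Holder(Holder&&) = default;
    Holder& operator=(Holder&&) = default;

    bool empty() const { return table_ == nullptr; }
    int rows() const { return table_ ? table_->rows : 0; }

private:
    std::unique_ptr<Table> table_;
};
```

Writing `Holder b = a;` does not compile, while `Holder b(std::move(a));` does and leaves `a` empty, so the surprising "source is emptied" behavior can only happen where the code asks for it.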

Going back to the original question, however, if you don't have remote ownership to start with -- i.e., your class doesn't contain any pointers -- chances are good that you shouldn't explicitly define an assignment operator (or dtor or copy ctor). A compiler bug that stopped implicitly defined assignment operators from working would prevent passing any of a huge number of regression tests.
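If you want the build itself to confirm that a class is still "just data", one option with a C++11 or later compiler is a compile-time check using the `<type_traits>` header. The sketch below uses made-up types (`PlainData`, `WithString`). A trivially copyable type is one the compiler may copy byte for byte, which is close to what the question means by POD.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical data-only type: ints, floats and arrays of them.
struct PlainData {
    int   id;
    float values[8];
};

// Hypothetical type with a member that manages its own memory.
struct WithString {
    int         id;
    std::string name;
};

// Fails to compile if someone later adds a member that needs a
// non-trivial copy (for example a std::string or a user-written destructor).
static_assert(std::is_trivially_copyable<PlainData>::value,
              "PlainData must stay trivially copyable");
static_assert(std::is_trivially_copy_assignable<PlainData>::value,
              "PlainData assignment must stay trivial");

// Still correctly assignable through the generated operator, just not trivially.
static_assert(!std::is_trivially_copyable<WithString>::value,
              "std::string member makes the type non-trivial");
static_assert(std::is_copy_assignable<WithString>::value,
              "generated assignment still exists and copies member-wise");
```

Note that `WithString` is still fine with the generated operator, because `std::string` handles its own deep copy. The check only tells you when the class stops being plain data, not that you need to write anything by hand.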

Even if it did somehow get released, your defense against it would be to just not use it. There's no real room for question that such a release would be replaced within a matter of hours. With a medium to large existing project, you don't want to switch to a compiler until it's been in wide use for a while in any case.

Jerry Coffin
  • 437,173
  • 71
  • 570
  • 1,035
2

If the compiler isn't generating the assignment properly, then you have bigger problems to worry about than implementing the assignment overload (like the fact that you have a broken compiler). Unless your class contains pointers, it is not necessary to provide your own overload; however, it is reasonable to request an explicit overload, not because the compiler might break (which is absurd), but rather to document your intention that assignment be permitted and behave in that manner. In C++0x, it will be possible to document intent and save time by using = default for the compiler-generated version.
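For reference, the `= default` form looks roughly like this with a C++11 or later compiler. `Settings` is a hypothetical class used only for illustration.

```cpp
#include <cassert>

// Hypothetical data-only class that states its copy semantics explicitly.
struct Settings {
    int    mode;
    double threshold;

    Settings() = default;

    // Documented intent: member-wise copy is correct for this class, and
    // the compiler is asked to generate it.
    Settings(const Settings&) = default;
    Settings& operator=(const Settings&) = default;
};
```

The declarations show up in the header (and in generated documentation), but there is no hand-written member list to forget to update when a field is added.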

Michael Aaron Safyan
  • 87,518
  • 14
  • 130
  • 194
  • Interesting. I look forward to C++0x – ExpatEgghead Aug 26 '10 at 06:18
  • "it is reasonable to request an explicit overload" - ummm... no. A comment would suffice, is less error prone, and may be more efficient. Current C++ best practice works exactly the other way - a private assignment operator should be used to ensure it isn't intended for public use. C++0x may well improve on that, but that's not relevant today. – Tony Delroy Aug 26 '10 at 06:29
  • @Tony, I generally agree, but it is not unusual to explicitly define it for documentation purposes. Also, when using Doxygen or other automatic documentation generator, a comment usually does not cut it. Also, if you explicitly define it, then you would also unit test it, which would catch breakages if, for example, a pointer field was added. – Michael Aaron Safyan Aug 27 '10 at 01:34
1

If your class has resources that it is supposed to manage, then writing your own assignment operator makes sense; otherwise, rely on the compiler-generated one.

Michael Kohne
  • 11,516
  • 3
  • 43
  • 71
Learner
  • 587
  • 1
  • 5
  • 16
0

I guess the problem of shallow copy versus deep copy may appear if you are dealing with strings. http://www.learncpp.com/cpp-tutorial/912-shallow-vs-deep-copying/

But I think it is always advisable to write assignment overloads for user-defined classes.

ckv
  • 9,567
  • 17
  • 85
  • 137
  • 4
    I'm going to disagree with you here - you should know your classes and write the operators when needed. Writing them unnecessarily makes them code that you have to maintain - if the contents of the class don't need a custom assignment operator, then writing one will create more bugs (long term) than not writing one. – Michael Kohne Aug 26 '10 at 04:50