
Say I have a class Foo, which contains some kind of container, say a vector<Bar *> bars. I want to allow the user to iterate through this container, but I want to be flexible so that I might change to a different container in the future. I'm used to Java, where I could do this

public class Foo
{
    List<Bar> bars;         // may change to a different collection

    // User would use this
    public Iterator<Bar> getIter()
    {
        return bars.iterator();    // can change without user knowing
    } 
}

C++ iterators are designed to look like raw C++ pointers. How do I get the equivalent functionality? I could do the following, which returns the beginning and end of the collection as a pair of iterators that the user can walk themselves.

class Foo
{
    vector<Bar *> bars;

public:
    // user would use this
    std::pair<vector<Bar *>::iterator , vector<Bar *>::iterator > getIter()
    {
        return std::make_pair(bars.begin(), bars.end()); 
    }
};

It works, but I feel I must be doing something wrong.

  1. The function declaration exposes the fact that I'm using a vector. If I change the implementation, I need to change the function declaration. Not a huge deal, but it goes against encapsulation.

  2. Instead of returning a Java-like iterator class that can do its own bounds check, I need to return both the .begin() and .end() of the collection. Seems a bit ugly.

Should I write my own iterator class?

user3240688
  • [This article](https://www.artima.com/cppsource/type_erasure.html) could be quite useful. It develops an `any_iterator` to hide the kinds of implementation details you're worried about. – juanchopanza Aug 16 '15 at 22:11
  • You can create your own iterator class and use it. Just make sure it follows the [iterator requirements](http://en.cppreference.com/w/cpp/concept/Iterator) – Bryan Chen Aug 16 '15 at 22:30
  • Creating an iterator that functions in terms of the implementation's iterator is probably best. It preserves encapsulation and expresses the iterable semantics. So, similar to the Java iterator, it also allows using the class in `for (const auto& x : xs)` loops. – Jason Aug 16 '15 at 22:37
  • _"C++ iterators are designed to look like raw C++ pointers"_ Since when? – Lightness Races in Orbit Aug 16 '15 at 22:56
  • @LightnessRacesinOrbit I think they're talking about the common operator overloads for iterator classes (e.g. `operator*`, `operator+`, `operator-`, etc...) looking similar. Overloading these does not a raw pointer make. – Jason Aug 16 '15 at 23:11
  • @Jason - what is a good source on how to write my own iterator? Josuttis' book? – user3240688 Aug 16 '15 at 23:16
  • I can't think of a single book, although Scott Meyers' *Effective C++* does cover the main ideas. @BryanChen's link should give enough information to implement a valid iterator. – Jason Aug 16 '15 at 23:27
  • Just as you would in Java, create an interface (abstract class) that all the iterators implement, then use that class type for all the iterators – smac89 Aug 16 '15 at 23:49
  • @user3240688: Google "boost iterator" - they have a wonderful library that makes it extremely easy to create your own iterator. – Lightness Races in Orbit Aug 17 '15 at 00:13

2 Answers


You could adapt the vector behaviour and provide the same interface:

class Foo
{
    std::vector<Bar *> bars;

public:
    typedef std::vector<Bar*>::iterator iterator;

    iterator begin() {
        return bars.begin();
    }

    iterator end() {
        return bars.end();
    }
};

Use Foo::iterator as the iterator type outside of the container.
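As a minimal sketch of what client code could look like with this interface: the `Bar::value` member, the `add` helper and the two free functions below are hypothetical and only there to make the example complete. The point is that the client names `Foo::iterator` (or nothing at all, in a range-based for loop) and never mentions `std::vector`.

```cpp
#include <cassert>
#include <vector>

struct Bar { int value; };  // hypothetical payload, not from the question

class Foo
{
    std::vector<Bar*> bars;

public:
    typedef std::vector<Bar*>::iterator iterator;

    // Hypothetical helper to populate the container (non-owning pointers).
    void add(Bar* b) { assert(b != nullptr); bars.push_back(b); }

    iterator begin() { return bars.begin(); }
    iterator end() { return bars.end(); }
};

// Client code: refers only to Foo::iterator, never to std::vector.
int sum_values(Foo& foo)
{
    int total = 0;
    for (Foo::iterator it = foo.begin(); it != foo.end(); ++it)
        total += (*it)->value;
    return total;
}

// begin()/end() members are also all that a range-based for loop needs.
int count_bars(Foo& foo)
{
    int n = 0;
    for (Bar* b : foo) { (void)b; ++n; }
    return n;
}
```

Both functions only increment, compare and dereference the iterator, so they would keep compiling if the typedef later pointed at a `std::deque` or `std::list` iterator.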

However, bear in mind that hiding behind the typedef offers less than it seems. You can swap the internal implementation as long as it provides the same guarantees. For example, if you treat Foo::iterator as a random access iterator, then you cannot swap a vector for a list internally at a later date without a comprehensive refactoring because list iterators are not random access.
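To illustrate the point about guarantees, here is a small sketch (all function names are made up for the example). Client code that writes `it + 2` silently depends on random access, whereas `std::next` only needs a forward iterator, at the price of linear time on a list.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Relies on random access: compiles for vector or deque iterators,
// but not for list iterators, which have no operator+.
template <class It>
It third_random_access(It first)
{
    return first + 2;
}

// Relies only on forward iteration: std::next works for any forward
// iterator (constant time for random access, linear time otherwise).
template <class It>
It third_portable(It first)
{
    return std::next(first, 2);
}

// True when It advertises the random access iterator category.
template <class It>
constexpr bool is_random_access()
{
    return std::is_base_of<
        std::random_access_iterator_tag,
        typename std::iterator_traits<It>::iterator_category>::value;
}

// Usage: only the portable helper accepts a list iterator.
inline void demo()
{
    std::vector<int> v{10, 20, 30};
    std::list<int> l{10, 20, 30};
    assert(*third_random_access(v.begin()) == 30);
    assert(*third_portable(l.begin()) == 30);
    // third_random_access(l.begin()) would not compile.
}
```

If clients of `Foo::iterator` stick to the operations in `third_portable`, a later switch from vector to list at least keeps compiling; whether it keeps behaving the same is a separate question.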

You could refer to Scott Meyers' Effective STL, Item 2 ("Beware the illusion of container-independent code"), for a comprehensive argument as to why it might be a bad idea to assume that you can change the underlying container at any point in the future. One of the more serious points is iterator invalidation. Say you treat your iterators as bidirectional, so that you could swap a vector for a list at some point. An insertion in the middle of a vector invalidates every iterator at or after the insertion point (and all of them if the vector reallocates), while an insertion into a list invalidates none. In the end, the implementation details will leak, and trying to hide them might be a Sisyphean task...
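The invalidation difference can be sketched as follows (the function names are hypothetical). With a list, an iterator obtained before the insertion can still be used afterwards. With a vector, the safe pattern is to remember an index and look the element up again, because an old iterator at or after the insertion point must not be dereferenced.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// std::list: inserting does not invalidate existing iterators.
int list_value_after_insert()
{
    std::list<int> xs{1, 2, 3};
    std::list<int>::iterator last = std::prev(xs.end());  // points at 3
    xs.insert(std::next(xs.begin()), 42);                 // 1 42 2 3
    return *last;                                         // still valid
}

// std::vector: the same insert invalidates iterators at or after the
// insertion point (all of them on reallocation), so this version keeps
// an index instead and adjusts it for the shifted elements.
int vector_value_after_insert()
{
    std::vector<int> xs{1, 2, 3};
    std::size_t last_index = xs.size() - 1;  // index of 3
    xs.insert(xs.begin() + 1, 42);           // 1 42 2 3
    ++last_index;                            // 3 moved one slot to the right
    assert(xs[last_index] == 3);
    return xs[last_index];
}
```

Client code written against the list behaviour would compile unchanged against a vector-backed `Foo::iterator` and then have undefined behaviour, which is exactly the kind of leak described above.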

Maksim Solovjov
  • How does that solve the problem? `Foo::iterator` is still a `std::vector::iterator`. The implementation details are leaked out. – juanchopanza Aug 16 '15 at 22:18
  • It provides an easy way to switch a `vector` for a `deque` or for a custom iterator. As easy as it gets, depending on how OP will treat these iterators – Maksim Solovjov Aug 16 '15 at 22:20
  • The implementation details are indeed leaked, but it should not matter, since you don't have to include vector or mention it in any fashion in the client code. `Foo::iterator` can be made any other random access iterator without rewriting the client code – Maksim Solovjov Aug 16 '15 at 22:22
  • It seems the OP really wants dynamic rather than static polymorphism here - which rather flies in the face of the approach the STL adopts. – marko Aug 16 '15 at 22:24
  • It does matter. The client code depends on the iterator being of a particular type. Changing it might break client code. Of course, the typedef is quite handy, but it is also mainly cosmetic. – juanchopanza Aug 16 '15 at 22:24
  • Changing vector to list would break the client code; however, I can't imagine how changing from vector to deque would. Surely if an iterator provides the same interface then the client code should still work? – Maksim Solovjov Aug 16 '15 at 22:27
  • @BryanChen, you will not get a compiler error, because the `Foo` class would include `vector` and you would include `Foo.h` or something similar. If OP later changes it to `deque`, the interface that is used within client code does not change. You could `pimpl` the vector away, and even then `Foo.h` wouldn't have to include it, and the code would still compile – Maksim Solovjov Aug 16 '15 at 22:30
  • @juanchopanza, actually, after a careful thought, you are right, iterator invalidation would be one thing that could bite even given the same interface... – Maksim Solovjov Aug 16 '15 at 22:45
  • No, not if you state your iterator's semantics clearly and carefully. You would be able to change it to anything else as long as you don't change those semantics. Or, if you do change the semantics, it's the responsibility of the person using your library to update their own code to match. There's no problem here. – Lightness Races in Orbit Aug 16 '15 at 22:58

You are looking for type erasure. Basically you want an iterator with vector erased from it. This is roughly what it looks like:

#include <vector>
#include <memory>
#include <iostream>

template<class T>
class Iterator{ //the class that erases the iterator type
    //private stuff that the user should not care about
    struct Iterator_base{
        virtual void increment() = 0;
        virtual T &dereference() = 0;
        virtual ~Iterator_base() = default;
    };
    std::unique_ptr<Iterator_base> iter;
    template<class Iter>
    class Iterator_helper : public Iterator_base{
        void increment() override{
            ++iter;
        }
        T &dereference() override{
            return *iter;
        }
        Iter iter;
    public:
        Iterator_helper(const Iter &iter) : iter(iter){}
    };
public:
    template<class Iter>
    Iterator(const Iter &iter) : iter(new Iterator_helper<Iter>(iter)){}
    //iterator functions for the user
    Iterator &operator ++(){
        iter->increment();
        return *this;
    }
    T &operator *(){
        return iter->dereference();
    }
};

struct Bar{
    Bar(int i) : i(i){}
    int i;
};

class Foo
{
    std::vector<Bar> bars;

public:
    Foo(){ //just so we have some elements to point to
        bars.emplace_back(1);
        bars.emplace_back(2);
    }
    // user would use this
    Iterator<Bar> begin()
    {
        return bars.begin();
    }
};

int main(){
    Foo f;
    auto it = f.begin();
    std::cout << (*it).i << '\n'; //1
    ++it; //increment
    std::cout << (*it).i << '\n'; //2
    (*it).i++; //dereferencing
    std::cout << (*it).i << '\n'; //3
}

You can now pass any iterator (actually anything that supports pre-increment, dereferencing and copy construction) to Iterator, completely hiding the vector inside. You can even assign an Iterator that has a vector::iterator inside to an Iterator that has a list::iterator inside, though that may not be a good thing.

This is a very bare-bones implementation. You would also want to implement post-increment ++, --, ->, ==, =, <, >, <=, >=, != and possibly []. Once you are done with that, you need to duplicate the code into a Const_Iterator. If you don't want to do that yourself, consider using boost::type_erasure.
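As a rough sketch of the first step in that direction, here is one way equality comparison and copying could be added so that the wrapper works in a range-based for loop. This is not the code from above but a compact variant under my own assumptions: the names `AnyIterator`, `Base`, `Model`, `equals`, `clone`, `sum` and `demo` are all hypothetical, and comparing wrappers that hold different underlying iterator types simply yields "not equal".

```cpp
#include <cassert>
#include <memory>
#include <vector>

template <class T>
class AnyIterator {  // hypothetical name for the type-erased iterator
    struct Base {
        virtual ~Base() = default;
        virtual void increment() = 0;
        virtual T& dereference() = 0;
        virtual bool equals(const Base& other) const = 0;
        virtual std::unique_ptr<Base> clone() const = 0;
    };

    template <class Iter>
    struct Model : Base {
        explicit Model(const Iter& it) : it(it) {}
        void increment() override { ++it; }
        T& dereference() override { return *it; }
        bool equals(const Base& other) const override {
            // Wrappers around a different iterator type never compare equal.
            const Model* o = dynamic_cast<const Model*>(&other);
            return o != nullptr && it == o->it;
        }
        std::unique_ptr<Base> clone() const override {
            return std::unique_ptr<Base>(new Model(it));
        }
        Iter it;
    };

    std::unique_ptr<Base> impl;

public:
    template <class Iter>
    AnyIterator(const Iter& it) : impl(new Model<Iter>(it)) {}

    // Deep copies, so each AnyIterator advances independently.
    AnyIterator(const AnyIterator& other) : impl(other.impl->clone()) {}
    AnyIterator& operator=(const AnyIterator& other) {
        impl = other.impl->clone();
        return *this;
    }

    AnyIterator& operator++() { impl->increment(); return *this; }
    T& operator*() const { return impl->dereference(); }

    friend bool operator==(const AnyIterator& a, const AnyIterator& b) {
        return a.impl->equals(*b.impl);
    }
    friend bool operator!=(const AnyIterator& a, const AnyIterator& b) {
        return !(a == b);
    }
};

struct Bar { int i; };

class Foo {
    std::vector<Bar> bars;  // the container type is invisible to clients
public:
    Foo() : bars{Bar{1}, Bar{2}, Bar{3}} {}
    AnyIterator<Bar> begin() { return bars.begin(); }
    AnyIterator<Bar> end() { return bars.end(); }
};

// Client code: works through begin()/end() only.
int sum(Foo& foo) {
    int total = 0;
    for (Bar& b : foo) total += b.i;
    return total;
}

inline void demo() {
    Foo foo;
    assert(sum(foo) == 6);
    assert(foo.begin() != foo.end());
}
```

Every copy of an `AnyIterator` now costs a heap allocation through `clone`, which adds to the overhead of this approach.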

Also note that you are paying for this encapsulation with unnecessary dynamic memory allocations, cache misses, virtual function calls that probably cannot be inlined and triply redundant code (the same functions in Iterator, Iterator_base and Iterator_helper).

vector is still present in the private part of Foo. You can get rid of that with a pimpl, at the cost of another level of indirection.
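A minimal pimpl sketch, squeezed into one file for brevity: the comments mark what would go into a hypothetical `Foo.h` and `Foo.cpp`. The `add`, `size` and `at` members are stand-ins I made up to keep the example short; in the setup above, `begin()` returning the type-erased iterator would take their place.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>  // in a real split, only Foo.cpp would include this

struct Bar { int i; };

// --- what would live in Foo.h: no mention of std::vector ---
class Foo {
    struct Impl;                 // defined only in the implementation file
    std::unique_ptr<Impl> impl;  // the extra level of indirection
public:
    Foo();
    ~Foo();                      // defined where Impl is a complete type
    void add(int i);
    std::size_t size() const;
    Bar& at(std::size_t index);
};

// --- what would live in Foo.cpp ---
struct Foo::Impl {
    std::vector<Bar> bars;       // free to change without touching Foo.h
};

Foo::Foo() : impl(new Impl) {}
Foo::~Foo() = default;

void Foo::add(int i) { impl->bars.push_back(Bar{i}); }

std::size_t Foo::size() const { return impl->bars.size(); }

Bar& Foo::at(std::size_t index)
{
    assert(index < impl->bars.size());
    return impl->bars[index];
}
```

The destructor is declared in the class and defined after `Impl` because `std::unique_ptr` needs the complete type at the point where it deletes the object.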

I feel like this bit of encapsulation is not worth the cost, but your mileage may vary.

Community
nwp
  • Thank you so much for the detailed explanation. Is type erasure how Java implements it under the hood? – user3240688 Aug 17 '15 at 00:24
  • @user3240688 I don't know, but I guess so. In Java it doesn't cost extra though, since you have dynamic memory allocation anyways and the code redundancy is hidden. And maybe a JVM can do some JIT magic to speed it up. – nwp Aug 17 '15 at 10:16