486

I'm iterating over a vector and need the index the iterator is currently pointing at. AFAIK this can be done in two ways:

  • it - vec.begin()
  • std::distance(vec.begin(), it)

What are the pros and cons of these methods?

cairol

10 Answers

622

I would prefer it - vec.begin() precisely for the opposite reason given by Naveen: so it wouldn't compile if you change the vector into a list. If you call std::distance on a list iterator during every iteration, you could easily end up turning an O(n) algorithm into an O(n^2) algorithm.

Another option, if you don't jump around in the container during iteration, would be to keep the index as a second loop counter.

Note: it is a common name for a container iterator, e.g. std::container_type::iterator it;
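The second-loop-counter idea mentioned above could look roughly like the sketch below. The helper name positions_of and its parameters are made up for illustration; the only point is that the index is advanced in step with the iterator, so no subtraction or std::distance call is needed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: collects the position of every element equal to `target`.
std::vector<std::size_t> positions_of(const std::vector<int>& vec, int target)
{
    std::vector<std::size_t> positions;
    std::size_t index = 0; // second loop counter, advanced together with the iterator
    for (std::vector<int>::const_iterator it = vec.begin(); it != vec.end(); ++it, ++index)
    {
        if (*it == target)
            positions.push_back(index);
    }
    return positions;
}
```

This only stays correct if the loop never skips around in the container, which is the condition stated above.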

TankorSmash
UncleBens
  • 3
    Agreed. I'd say that the minus sign is best, but it would be better to keep a second loop counter than to use std::distance, precisely because this function could be slow. – Steven Sudit Feb 04 '10 at 19:41
  • 33
    @Steinfeld it's an iterator. `std::container_type::iterator it;` – Matt Munson Jul 06 '14 at 16:33
  • 2
    Adding a second loop counter is such an obvious solution that I'm embarrassed I didn't think of it. – Mordred Dec 19 '17 at 23:24
  • @UncleBeans Why can't we use the - operator for a list? – Swapnil Sep 22 '18 at 13:43
  • 4
    @Swapnil because `std::list` does not offer direct access to elements by their position, so if you cannot do `list[5]`, you shouldn't be able to do `list.begin() + 5`. – José Tomás Tocino Nov 05 '18 at 10:50
147

I would prefer std::distance(vec.begin(), it) as it will allow me to change the container without any code changes. For example, if you decide to use std::list, which doesn't provide a random access iterator, instead of std::vector, your code will still compile. Since std::distance picks the optimal method depending on the iterator traits, you won't have any performance degradation with std::vector either.

Naveen
  • 53
    When you're using a container without random access iterators, it's best *not to* compute such distances because it's inefficient – Eli Bendersky Jan 28 '10 at 07:47
  • 7
    @Eli: I agree with that, but in a very special case if it is really required, then still that code will work. – Naveen Jan 28 '10 at 09:38
  • 9
    I think the code should be changed anyway if the container changes - having a std::list variable named `vec` is bad news. If the code were re-written to be generic, taking the container type as a template parameter, that's when we can (and should) talk about handling non-random-access iterators ;-) – Steve Jessop Jan 28 '10 at 12:50
  • 1
    And specialisation for certain containers. – ScaryAardvark Feb 02 '10 at 16:39
  • 25
    @SteveJessop : Having a vector named `vec` is pretty bad news, too. – River Tam Jun 25 '14 at 17:42
  • 1
    @EliBendersky It's better to have working code first, then optimize. Inducing a compilation failure because the code *may* be slow is plainly wrong. – BartoszKP Sep 30 '16 at 17:27
77

As UncleBens and Naveen have shown, there are good reasons for both. Which one is "better" depends on what behavior you want: Do you want to guarantee constant-time behavior, or do you want it to fall back to linear time when necessary?

it - vec.begin() takes constant time, but the operator - is only defined on random access iterators, so the code won't compile at all with list iterators, for example.

std::distance(vec.begin(), it) works for all iterator types, but will only be a constant-time operation if used on random access iterators.

Neither one is "better". Use the one that does what you need.
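To make the trade-off concrete, here is a small sketch of both approaches. The function names index_by_subtraction and index_by_distance are invented for this example; the first accepts only a vector, the second accepts any standard container.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Constant time. Relies on operator-, which only random access iterators provide.
std::ptrdiff_t index_by_subtraction(const std::vector<int>& vec,
                                    std::vector<int>::const_iterator it)
{
    return it - vec.begin();
}

// Compiles for any standard container. Constant time for std::vector,
// but linear time for std::list, because the list has to be walked.
template <typename Container>
std::ptrdiff_t index_by_distance(const Container& c,
                                 typename Container::const_iterator it)
{
    return std::distance(c.begin(), it);
}
```

Calling index_by_distance(lst, it) on a std::list<int> compiles and works, whereas writing it - lst.begin() for that list would be rejected by the compiler.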

jalf
12

I like this one: it - vec.begin(), because to me it clearly says "distance from beginning". With iterators we're used to thinking in terms of arithmetic, so the - sign is the clearest indicator here.

Eli Bendersky
  • 20
    It's more clear to use subtraction to find the distance than to use, quite literally, the word `distance` ? – Travis Gockel Jan 28 '10 at 07:59
  • 5
    @Travis, to me it is. It's a matter of taste and custom. We say `it++` and not something like `std::increment(it)`, don't we? Wouldn't that also count as less clear? – Eli Bendersky Jan 28 '10 at 08:03
  • 3
    The `++` operator is defined as part of the STL sequences as how we increment the iterator. `std::distance` calculates the number of elements between the first and last element. The fact that the `-` operator works is merely a coincidence. – Travis Gockel Jan 28 '10 at 08:09
  • 3
    @MSalters: and yet, we use ++ :-) – Eli Bendersky Jan 28 '10 at 11:11
10

If you have already restricted/hardcoded your algorithm to using a std::vector::iterator and std::vector::iterator only, it doesn't really matter which method you end up using. Your algorithm is already concretized beyond the point where choosing one or the other can make any difference. They both do exactly the same thing. It is just a matter of personal preference. I would personally use explicit subtraction.

If, on the other hand, you want to retain a higher degree of generality in your algorithm, namely, to allow the possibility that some day in the future it might be applied to some other iterator type, then the best method depends on your intent. It depends on how restrictive you want to be with regard to the iterator type that can be used here.

  • If you use the explicit subtraction, your algorithm will be restricted to a rather narrow class of iterators: random-access iterators. (This is what you get now from std::vector)

  • If you use distance, your algorithm will support a much wider class of iterators: input iterators.

Of course, calculating distance for non-random-access iterators is in the general case an inefficient operation (while, again, for random-access ones it is as efficient as subtraction). It is up to you to decide whether your algorithm makes sense for non-random-access iterators, efficiency-wise. If the resultant loss in efficiency is devastating to the point of making your algorithm completely useless, then you had better stick to subtraction, thus prohibiting the inefficient uses and forcing the user to seek alternative solutions for other iterator types. If the efficiency with non-random-access iterators is still in a usable range, then you should use distance and document the fact that the algorithm works better with random-access iterators.
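In a generic algorithm, the two choices might be sketched as follows. The names offset_strict and offset_generic are hypothetical, and the static_assert is an addition of this sketch: plain subtraction already fails to compile for non-random-access iterators, the assertion just produces a clearer error message.

```cpp
#include <iterator>
#include <list>        // only needed by the usage shown in the note below
#include <type_traits>
#include <vector>      // only needed by the usage shown in the note below

// Restrictive variant: accepts random access iterators only.
template <typename Iterator>
typename std::iterator_traits<Iterator>::difference_type
offset_strict(Iterator first, Iterator it)
{
    static_assert(
        std::is_base_of<std::random_access_iterator_tag,
                        typename std::iterator_traits<Iterator>::iterator_category>::value,
        "offset_strict requires random access iterators");
    return it - first; // constant time
}

// Permissive variant: accepts input iterators and anything stronger.
// Works better (constant time) with random access iterators; linear otherwise.
template <typename Iterator>
typename std::iterator_traits<Iterator>::difference_type
offset_generic(Iterator first, Iterator it)
{
    return std::distance(first, it);
}
```

With a std::vector<int> vec, both offset_strict(vec.begin(), it) and offset_generic(vec.begin(), it) compile; with a std::list<int>, only offset_generic does.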

AnT
4

According to http://www.cplusplus.com/reference/std/iterator/distance/, since vec.begin() is a random access iterator, the distance method uses the - operator.

So the answer is that, from a performance point of view, the two are the same, but distance() may be easier to understand for anybody who has to read your code.

Stéphane
3

I'd use the - variant for std::vector only - it's pretty clear what is meant, and the simplicity of the operation (which isn't more than a pointer subtraction) is expressed by the syntax (distance, on the other hand, sounds like Pythagoras on the first reading, doesn't it?). As UncleBens points out, - also acts as a static assertion in case vector is accidentally changed to list.

Also I think it is much more common - I have no numbers to prove it, though. Master argument: it - vec.begin() is shorter in source code - less typing work, less space consumed. As it's clear that the right answer to your question boils down to a matter of taste, this can also be a valid argument.

Alexander Gessler
1

Besides int, float, string, etc., you can store extra data in .second by using a different mapped type, such as:

std::map<unsigned long long int, glm::ivec2> voxels_corners;
std::map<unsigned long long int, glm::ivec2>::iterator it_corners;

or

struct voxel_map {
    int x,i;
};

std::map<unsigned long long int, voxel_map> voxels_corners;
std::map<unsigned long long int, voxel_map>::iterator it_corners;

when

long long unsigned int index_first=some_key; // llu in this case...
int i=0;
voxels_corners.insert(std::make_pair(index_first,glm::ivec2(1,i++)));

or

long long unsigned int index_first=some_key;
int index_counter=0;
voxel_map one;
one.x=1;
one.i=index_counter++;

voxels_corners.insert(std::make_pair(index_first,one));

With the right type or structure you can put anything in .second, including an index number that is incremented on every insert.

instead of

it_corners - voxels_corners.begin()

or

std::distance(voxels_corners.begin(), it_corners)

after

it_corners = voxels_corners.find(index_first+bdif_x+x_z);

the index is simply:

int vertice_index = it_corners->second.y;

when using the glm::ivec2 type

or

int vertice_index = it_corners->second.i;

in case of the structure data type
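A self-contained sketch of this idea, without the glm dependency, might look like the following. Entry, EntryMap and insert_with_index are names made up for the example. Note the assumption in the comments: the stored number is the insertion order, which is not the same as the element's sorted position in the map, and it only stays dense while nothing is erased.

```cpp
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical mapped type: payload plus the index assigned at insertion time.
struct Entry {
    int x;
    std::size_t index;
};

typedef std::map<unsigned long long, Entry> EntryMap;

// Inserts `x` under `key` and records the insertion index in the mapped value.
// Returns false (and stores nothing) if the key is already present.
// Assumes elements are never erased, so m.size() is the next free index.
bool insert_with_index(EntryMap& m, unsigned long long key, int x)
{
    Entry e;
    e.x = x;
    e.index = m.size();
    return m.insert(std::make_pair(key, e)).second;
}
```

After a find(), the index is then read directly as it->second.index, with no iterator arithmetic involved.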

  • When using large amounts of data, the speed gained by reading an index stored via make_pair, instead of computing it - vec.begin() or std::distance(vec.begin(), it), is more than 100 times... makes you think, "which one is better?" Consider storing an index in the .second field along with the other data you want to keep, using another data type / structure. –  Dec 29 '20 at 20:27
0

Here is an example to find "all" occurrences of 10 along with the index. Thought this would be of some help.

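#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

void _find_all_test()
{
    vector<int> ints;
    int val;
    while (cin >> val) ints.push_back(val);

    // Check the result of find() before dereferencing: it returns end() when
    // there is no (further) match, including when the vector is empty.
    vector<int>::iterator it = find(ints.begin(), ints.end(), 10); // assuming 10 as search element
    while (it != ints.end())
    {
        cout << *it << " found at index " << (it - ints.begin()) << endl;
        it = find(it + 1, ints.end(), 10); // resume just past the previous match
    }
}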
0

I just discovered this: https://greek0.net/boost-range/boost-adaptors-indexed.html

    for (const auto & element : str | boost::adaptors::indexed(0)) {
        std::cout << element.index()
                  << " : "
                  << element.value()
                  << std::endl;
    }

Spongman