Wednesday, March 28, 2012

Know your containers: a danger with std::vector

Believe it or not, the following code contains a bug:

int f(int); // function f defined elsewhere

vector<int> vec(1,1); // start with singleton value 1
// Call f on each element. If result is positive, add it to vector.
for (auto i = vec.begin(); i != vec.end(); ++i) {
   const int res = f(*i);
   if (res > 0) {
      vec.push_back(res);
   }
}

Can you spot the bug? (And no, it's not a trick question. f is defined elsewhere, that's not the problem. Neither is it the fact that the algorithm may be non-terminating, we may for example assume that f will only return positive values a finite number of times.)

The problem is this: as we are traversing the vector, we are sometimes adding new elements, which we then want to also apply f on when the time comes. However, adding new elements to a vector may invalidate the iterators. This is because when a vectors capacity is reached, it will need to grow, which means reallocating all elements to another place in memory, which would necessarily invalidate all references and iterators.

One way to fix the problem is to use indices instead of iterators:

for (decltype(vec.size()) i = 0; i != vec.size(); ++i) { // using indices
   const int res = f( vec[i] ); // ok, dereferencing using index


Another way to fix the problem, if you know an upper bound on the number of elements that are going to be added, is to call vector::reserve first. As long as no reallocations take place, you are safe in using iterators when inserting.

Or, you could use std::list, for which no insertion invalidates the iterators:
list<int> vec(1,1);
// rest of code looks the same as above:

for (auto i = vec.begin(); i != vec.end(); ++i) {
// push_back will not invalidate iterator i here, since i iterates a list

Bottom line:
Don't add elements while traversing a vector. Usually, this indicates a design flaw (or possibly a bug) in your code to begin with. If you really have to, reserve enough space so that the vector will not be reallocated (this assumes you have a reasonable upper bound to begin with). Indices could work, although they don't work with STL algorithms. Otherwise, just switch to list (or forward_list).

No comments:

Post a Comment