Chapter 19: The STL Generic Algorithms

Don't hesitate to send in feedback: send an e-mail if you like the C++ Annotations; if you think that important material was omitted; if you find errors or typos in the text or the code examples; or if you just feel like e-mailing. Send your e-mail to Frank B. Brokken.

Please state the document version you're referring to, as found in the title (in this document: 8.1.0~pre2) and please state chapter and paragraph name or number you're referring to.

All received mail is processed conscientiously, and received suggestions for improvements will usually have been processed by the time a new version of the Annotations is released. Except for the incidental case I will normally not acknowledge the receipt of suggestions for improvements. Please don't interpret this as me not appreciating your efforts.

19.1: The Generic Algorithms

In the previous chapter the Standard Template Library (STL) was introduced. An important element of the STL, the generic algorithms, was not covered in that chapter as it covers a fairly extensive part of the STL. Over time the STL has grown considerably, mainly as a result of a growing importance and appreciation of templates. Covering generic algorithm in the STL chapter itself would turn that chapter into an unwieldy one and so the generic algoritmhs were moved to a chapter of their own.

Generic algorithms perform an amazing task: due to the strenght of templates algorithms could be developed that can be applied to a wide range of different data types while maintaining type safety. The prototypical example of this is the sort generic algorithm (cf. section 19.1.58). To contrast: while C requires programmers to write callback functions in which type-unsafe void const * parameters have to be used, internally forcing the programmer to use casts, the STL sort algorithm allows the programmer in many cases to state something akin to

    sort(first-element, last-element)
to sort a series of elements in a type-safe way.

Generic algoritms should be used wherever possible. Avoid the urge to design your own code for commonly encountered algorithms. Make it a habit to first thoroughly search the generic algorithms for an available candidate. The generic algorithms should become your weapon of choice when writing code: acquire full familiarity with them and turn their use into your `second nature'.

The following sections describe the STL's generic algorithms in alphabetical order. For each algorithm the following information is provided:

In the prototypes of the algorithms Type is used to specify a generic data type. Also, the particular type of iterator (see section 18.2) that is required is mentioned, as well as other generic types that might be required (e.g., performing BinaryOperations, like plus<Type>()).

Almost every generic algorithm expects an iterator range [first, last), defining the range of elements on which the algorithm operates. The iterators point to objects or values. When an iterator points to a Type value or object, function objects used by the algorithms usually receive Type const & objects or values: function objects can therefore not modify the objects they receive as their arguments. This does not hold true for modifying generic algorithms, which are (of course) able to modify the objects they operate upon.

Generic algorithms may be categorized. In the C++ Annotations the following categories of generic algorithms are distinguished:

19.1.1: accumulate()

19.1.2: adjacent_difference()

19.1.3: adjacent_find()

19.1.4: binary_search()

19.1.5: copy()

19.1.6: copy_backward()

19.1.7: count()

19.1.8: count_if()

19.1.9: equal()

19.1.10: equal_range()

19.1.11: fill()

19.1.12: fill_n()

19.1.13: find()

19.1.14: find_end()

19.1.15: find_first_of()

19.1.16: find_if()

19.1.17: for_each()

The example also shows that the for_each algorithm may be used with functions defining const and non-const parameters. Also, see section 19.1.63 for differences between the for_each() and transform() generic algorithms.

The for_each() algorithm cannot directly be used (i.e., by passing *this as the function object argument) inside a member function to modify its own object as the for_each() algorithm first creates its own copy of the passed function object. A wrapper class whose constructor accepts a pointer or reference to the current object and possibly to one of its member functions solves this problem. In section 23.8 the construction of such wrapper classes is described.

19.1.18: generate()

19.1.19: generate_n()

19.1.20: includes()

19.1.21: inner_product()

19.1.22: inplace_merge()

19.1.23: iter_swap()

19.1.24: lexicographical_compare()

19.1.25: lower_bound()

19.1.26: max()

19.1.27: max_element()

19.1.28: merge()

19.1.29: min()

19.1.30: min_element()

19.1.31: mismatch()

19.1.32: next_permutation()

19.1.33: nth_element()

19.1.34: partial_sort()

19.1.35: partial_sort_copy()

19.1.36: partial_sum()

19.1.37: partition()

19.1.38: prev_permutation()

19.1.39: random_shuffle()

19.1.40: remove()

19.1.41: remove_copy()

19.1.42: remove_copy_if()

19.1.43: remove_if()

19.1.44: replace()

19.1.45: replace_copy()

19.1.46: replace_copy_if()

19.1.47: replace_if()

19.1.48: reverse()

19.1.49: reverse_copy()

19.1.50: rotate()

19.1.51: rotate_copy()

19.1.52: search()

19.1.53: search_n()

19.1.54: set_difference()

19.1.55: set_intersection()

19.1.56: set_symmetric_difference()

19.1.57: set_union()

19.1.58: sort()

19.1.59: stable_partition()

19.1.60: stable_sort()

Note that the example implements a solution to an often occurring problem: how to sort using multiple hierarchal criteria. The example deserves some additional attention:
  1. First, a typedef is used to reduce the clutter that occurs from the repeated use of pair<string, string>.
  2. Next, operator<< is overloaded to be able to insert a pair into an ostream object. This is merely a service function to make life easy. Note, however, that this function is put in the std namespace. If this namespace wrapping is omitted, it won't be used, as ostream's operator<< operators must be part of the std namespace.
  3. Then, a class sortby is defined, allowing us to construct an anonymous object which receives a pointer to one of the pair data members that are used for sorting. In this case, as both members are string objects, the constructor can easily be defined: its parameter is a pointer to a string member of the class pair<string, string>.
  4. The operator()() member will receive two pair references, and it will then use the pointer to its members, stored in the sortby object, to compare the appropriate fields of the pairs.
  5. In main(), first some data is stored in a vector.
  6. Then the first sorting takes place. The least important criterion must be sorted first, and for this a simple sort() will suffice. Since we want the names to be sorted within cities, the names represent the least important criterion, so we sort by names: sortby(&pss::first).
  7. The next important criterion, the cities, are sorted next. Since the relative ordering of the names will not be altered anymore by stable_sort(), the ties that are observed when cities are sorted are solved in such a way that the existing relative ordering will not be broken. So, we end up getting Goldberg in Eugene before Hampson in Eugene, before Moran in Eugene. To sort by cities, we use another anonymous sortby object: sortby(&pss::second).

19.1.61: swap()

19.1.62: swap_ranges()

19.1.63: transform()

the following differences between the for_each() (section 19.1.17) and transform() generic algorithms should be noted:

19.1.64: unique()

19.1.65: unique_copy()

19.1.66: upper_bound()

19.1.67: Heap algorithms

A heap is a kind of binary tree which can be represented by an array. In the standard heap, the key of an element is not smaller than the key of its children. This kind of heap is called a max heap. A tree in which numbers are keys could be organized as shown in figure 19.

Figure 19 is shown here.
Figure 19: A binary tree representation of a heap


Such a tree may also be organized in an array:

        12, 11, 10, 8, 9, 7, 6, 1, 2, 4, 3, 5
In the following description, keep two pointers into this array in mind: a pointer node indicates the location of the next node of the tree, a pointer child points to the next element which is a child of the node pointer. Initially, node points to the first element, and child points to the second element. Since child now points beyond the array, the remaining nodes have no children. So, nodes 6, 1, 2, 4, 3 and 5 don't have children.

Note that the left and right branches are not ordered: 8 is less than 9, but 7 is larger than 6.

The heap is created by traversing a binary tree level-wise, starting from the top node. The top node is 12, at the zeroth level. At the first level we find 11 and 10. At the second level 6, 7, 8 and 9 are found, etc.

Heaps can be created in containers supporting random access. So, a heap is not, for example, constructed in a list. Heaps can be constructed from an (unsorted) array (using make_heap()). The top-element can be pruned from a heap, followed by reordering the heap (using pop_heap()), a new element can be added to the heap, followed by reordering the heap (using push_heap()), and the elements in a heap can be sorted (using sort_heap(), which invalidates the heap, though).

The following subsections show the prototypes of the heap-algorithms, the final subsection provides a small example in which the heap algorithms are used.

19.1.67.1: The `make_heap()' function

19.1.67.2: The `pop_heap()' function

19.1.67.3: The `push_heap()' function

19.1.67.4: The `sort_heap()' function

19.1.67.5: An example using the heap functions

Here is an example showing the various generic algorithms manipulating heaps:
    #include <algorithm>
    #include <iostream>
    #include <functional>
    #include <iterator>

    void show(int *ia, char const *header)
    {
        std::cout << header << ":\n";
        std::copy(ia, ia + 20, std::ostream_iterator<int>(std::cout, " "));
        std::cout << std::endl;
    }

    using namespace std;

    int main()
    {
        int ia[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
                    11, 12, 13, 14, 15, 16, 17, 18, 19, 20};

        make_heap(ia, ia + 20);
        show(ia, "The values 1-20 in a max-heap");

        pop_heap(ia, ia + 20);
        show(ia, "Removing the first element (now at the end)");

        push_heap(ia, ia + 20);
        show(ia, "Adding 20 (at the end) to the heap again");

        sort_heap(ia, ia + 20);
        show(ia, "Sorting the elements in the heap");


        make_heap(ia, ia + 20, greater<int>());
        show(ia, "The values 1-20 in a heap, using > (and beyond too)");

        pop_heap(ia, ia + 20, greater<int>());
        show(ia, "Removing the first element (now at the end)");

        push_heap(ia, ia + 20, greater<int>());
        show(ia, "Re-adding the removed element");

        sort_heap(ia, ia + 20, greater<int>());
        show(ia, "Sorting the elements in the heap");

        return 0;
    }
    /*
        Generated output:

        The values 1-20 in a max-heap:
        20 19 15 18 11 13 14 17 9 10 2 12 6 3 7 16 8 4 1 5
        Removing the first element (now at the end):
        19 18 15 17 11 13 14 16 9 10 2 12 6 3 7 5 8 4 1 20
        Adding 20 (at the end) to the heap again:
        20 19 15 17 18 13 14 16 9 11 2 12 6 3 7 5 8 4 1 10
        Sorting the elements in the heap:
        1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
        The values 1-20 in a heap, using > (and beyond too):
        1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
        Removing the first element (now at the end):
        2 4 3 8 5 6 7 16 9 10 11 12 13 14 15 20 17 18 19 1
        Re-adding the removed element:
        1 2 3 8 4 6 7 16 9 5 11 12 13 14 15 20 17 18 19 10
        Sorting the elements in the heap:
        20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
    */