Don't hesitate to send in feedback: send an e-mail if you like the C++ Annotations; if you think that important material was omitted; if you find errors or typos in the text or the code examples; or if you just feel like e-mailing. Send your e-mail to Frank B. Brokken.Please state the document version you're referring to, as found in the title (in this document: 8.1.0~pre2) and please state chapter and paragraph name or number you're referring to.
All received mail is processed conscientiously, and received suggestions for improvements will usually have been processed by the time a new version of the Annotations is released. Except for the incidental case I will normally not acknowledge the receipt of suggestions for improvements. Please don't interpret this as me not appreciating your efforts.
The
Standard Template Library
(
STL) is a
general purpose library
consisting of
containers,
generic algorithms,
iterators,
function objects,
allocators,
adaptors and
data structures. The data structures used
in the algorithms are abstract in the sense that the algorithms can be
used on (practically) every data type.
The algorithms can work on these abstract data types due to the fact that they are template based algorithms. In this chapter the construction of templates is not discussed (see chapter 20 for that). Rather, this chapter focuses on the use of these template algorithms.
Several parts of the standard template library have already been discussed in the C++ Annotations. In chapter 12 the abstract containers were discussed, and in section 10.10 function objects were introduced. Also, iterators were mentioned at several places in this document.
The remaining components of the STL will be covered in this and the next chapter. Iterators, adaptors, smart pointers, multi threading and other features of the STL will be discussed in the coming sections. Generic algorithms are covered in the next chapter (19).
Allocators take care of the memory allocation within the STL. The default allocator class suffices for most applications, and is not further discussed in the C++ Annotations.
All elements of the STL are defined in the standard namespace. Therefore, a
using namespace std
or comparable directive is required unless it is
preferred to specify the required namespace explicitly. This occurs in at
least one situation: in header files no using
directive should be used,
so here the std::
scope specification should always be specified when
referring to elements of the STL.
sort()
expecting two iterators defining the range of objects that should be sorted,
as well as a function object calling the appropriate comparison operator for
two objects. Let's take a quick look at this situation. Assume strings are
stored in a vector, and we want to sort the vector in descending order. In
that case, sorting the vector stringVec
is as simple as:
sort(stringVec.begin(), stringVec.end(), greater<std::string>());The last argument is recognized as a constructor: it is an instantiation of the
greater<>()
class template, applied to
strings
. This object is called as a
function object by the sort()
generic algorithm. It will call the
operator>()
of the provided data type
(here
std::string
) whenever its
operator()()
is called. Eventually,
when sort()
returns, the first element of the vector will be the greatest
element.
The operator()()
(
function call operator) itself is not visible
at this point: don't confuse the parentheses in greater<string>()
with
calling operator()()
. When that operator is actually used inside
sort()
, it receives two arguments: two strings to compare for
`greaterness'. Internally, the
operator>()
of the data type to which the
iterators point (i.e., string
) is called by greater<string>
's function
operator (operator()()
) to compare the two objects. Since greater<>
's
function call operator is defined
inline, the call itself is not actually
present in the code. Rather, sort()
calls string::operator>()
,
thinking it called greater<>::operator()()
.
Now that we know that a constructor is passed as argument to (many) generic
algorithms, we can design our own function objects. Assume we want to sort our
vector case-insensitively. How do we proceed? First we note that the default
string::operator<()
(for an incremental sort) is not appropriate, as it
does
case sensitive comparisons. So, we provide our own case_less
class, in which the two strings are compared case insensitively. Using the
standard C function
strcasecmp()
, the following program performs the
trick. It sorts its command-line arguments in ascending alphabetic order:
#include <iostream> #include <string> #include <algorithm> using namespace std; class case_less { public: bool operator()(string const &left, string const &right) const { return strcasecmp(left.c_str(), right.c_str()) < 0; } }; int main(int argc, char **argv) { sort(argv, argv + argc, case_less()); for (int idx = 0; idx < argc; ++idx) cout << argv[idx] << " "; cout << endl; }The default constructor of the
class case_less
is used with
sort()
's final argument. Therefore, the only member function that must be
defined with the class case_less
is the function object operator
operator()()
. Since we know it's called with string
arguments, we
define it to expect two string
arguments, which are used in the
strcasecmp()
function. Furthermore, the operator()()
function is made
inline, so that it does not produce overhead when called by the sort()
function. The sort()
function calls the function object with various
combinations of strings
, i.e., it thinks it does so. However, in fact
it calls strcasecmp()
, due to the inline-nature of
case_less::operator()()
.
The comparison function object is often a predefined function object, since these are available for many commonly used operations. In the following sections the available predefined function objects are presented, together with some examples showing their use. At the end of the section about function objects function adaptors are introduced. Before predefined function objects can be used the following preprocessor directive must have been specified:
#include <functional>Predefined function objects are used predominantly with generic algorithms. Predefined function objects exists for arithmetic, relational, and logical operations. In section 23.5 predefined function objects are developed performing bitwise operations.
plus<Type>
is available. If we set type
to size_t
then the +
operator for size_t
values is used, if we set type
to
string
, then the +
operator for strings is used. For example:
#include <iostream> #include <string> #include <functional> using namespace std; int main(int argc, char **argv) { plus<size_t> uAdd; // function object to add size_ts cout << "3 + 5 = " << uAdd(3, 5) << endl; plus<string> sAdd; // function object to add strings cout << "argv[0] + argv[1] = " << sAdd(argv[0], argv[1]) << endl; } /* Generated output with call: a.out going 3 + 5 = 8 argv[0] + argv[1] = a.outgoing */Why is this useful? Note that the function object can be used with all kinds of data types (not only with the predefined datatypes), in which the particular operator has been overloaded. Assume that we want to perform an operation on a common variable on the one hand and, on the other hand, in turn on each element of an array. E.g., we want to compute the sum of the elements of an array; or we want to concatenate all the strings in a text-array. In situations like these the function objects come in handy. As noted before, the function objects are heavily used in the context of the generic algorithms, so let's take a quick look ahead at one of them.
One of the generic algorithms is called
accumulate()
. It visits all
elements implied by an iterator-range, and performs a requested binary
operation on a common element and each of the elements in the range, returning
the accumulated result after visiting all elements.
For example, the following program accumulates all command line arguments,
and prints the final string:
#include <iostream> #include <string> #include <functional> #include <numeric> using namespace std; int main(int argc, char **argv) { string result = accumulate(argv, argv + argc, string(), plus<string>()); cout << "All concatenated arguments: " << result << endl; }The first two arguments define the (iterator) range of elements to visit, the third argument is
string()
. This anonymous string object provides an
initial value. It could as well have been initialized to
string("All concatenated arguments: ")in which case the
cout
statement could have been a simple
cout << result << endl;Then, the operator to apply is
plus<string>()
. Note here that a
constructor is called: it is not plus<string>
, but rather
plus<string>()
. The final concatenated string is returned.
Now we define our own class Time
, in which the
operator+()
has been overloaded. Again, we can apply the predefined
function object plus
, now tailored to our newly defined datatype, to add
times:
#include <iostream> #include <sstream> #include <string> #include <vector> #include <functional> #include <numeric> using namespace std; class Time { friend ostream &operator<<(ostream &str, Time const &time) { return cout << time.d_days << " days, " << time.d_hours << " hours, " << time.d_minutes << " minutes and " << time.d_seconds << " seconds."; } size_t d_days; size_t d_hours; size_t d_minutes; size_t d_seconds; public: Time(size_t hours, size_t minutes, size_t seconds) : d_days(0), d_hours(hours), d_minutes(minutes), d_seconds(seconds) {} Time &operator+=(Time const &rValue) { d_seconds += rValue.d_seconds; d_minutes += rValue.d_minutes + d_seconds / 60; d_hours += rValue.d_hours + d_minutes / 60; d_days += rValue.d_days + d_hours / 24; d_seconds %= 60; d_minutes %= 60; d_hours %= 24; return *this; } }; Time const operator+(Time const &lValue, Time const &rValue) { return Time(lValue) += rValue; } int main(int argc, char **argv) { vector<Time> tvector; tvector.push_back(Time( 1, 10, 20)); tvector.push_back(Time(10, 30, 40)); tvector.push_back(Time(20, 50, 0)); tvector.push_back(Time(30, 20, 30)); cout << accumulate ( tvector.begin(), tvector.end(), Time(0, 0, 0), plus<Time>() ) << endl; } /* produced output: 2 days, 14 hours, 51 minutes and 30 seconds. */Note that all member functions of
Time
in the above source are inline
functions. This approach was followed in order to keep the example relatively
small and to show explicitly that the operator+=()
function may be an
inline function. On the other hand, in real life Time
's operator+=()
should probably not be made inline, due to its size.
Considering the previous discussion of the plus
function object, the
example is pretty straightforward. The class Time
defines a constructor,
it defines an insertion operator and it defines its own operator+()
,
adding two time objects.
In main()
four Time
objects are stored in a
vector<Time>
object. Then, the accumulate()
generic algorithm is
called to compute the accumulated time. It returns a Time
object, which
is inserted in the cout ostream
object.
While the first example did show the use of a named function object,
the last two examples showed the use of
anonymous objects which were
passed to the (accumulate()
) function.
The following arithmetic objects are available as predefined objects:
plus<>()
: as shown, this object's operator()()
member calls
operator+()
as a
binary operator, passing it its two parameters,
returning operator+()
's return value.
minus<>()
: this object's operator()()
member calls
operator-()
as a binary operator, passing it its two parameters and
returning operator-()
's return value.
multiplies<>()
: this object's operator()()
member calls
operator*()
as a binary operator, passing it its two parameters and
returning operator*()
's return value.
divides<>()
: this object's operator()()
member calls
operator/()
, passing it its two parameters and
returning operator/()
's return value.
modulus<>()
: this object's operator()()
member calls
operator%()
, passing it its two parameters and
returning operator%()
's return value.
negate<>()
: this object's operator()()
member calls
operator-()
as a unary operator, passing it its parameter and
returning the unary operator-()
's return value.
operator-()
follows, in
which the
transform()
generic algorithm is used to toggle the signs of all
elements in an array. The transform()
generic algorithm expects two
iterators, defining the range of objects to be transformed, an iterator
defining the begin of the destination range (which may be the same iterator as
the first argument) and a function object defining a unary operation for the
indicated data type.
#include <iostream> #include <string> #include <functional> #include <algorithm> using namespace std; int main(int argc, char **argv) { int iArr[] = { 1, -2, 3, -4, 5, -6 }; transform(iArr, iArr + 6, iArr, negate<int>()); for (int idx = 0; idx < 6; ++idx) cout << iArr[idx] << ", "; cout << endl; } /* Generated output: -1, 2, -3, 4, -5, 6, */
==, !=, >, >=, <
and <=
. The
following objects are available:
equal_to<>()
: this object's operator()()
member calls
operator==()
as a binary operator, passing it its two parameters and
returning operator==()
's return value.
not_equal_to<>()
: this object's operator()()
member calls
operator!=()
as a binary operator, passing it its two parameters and
returning operator!=()
's return value.
greater<>()
: this object's operator()()
member calls
operator>()
as a binary operator, passing it its two parameters and
returning operator>()
's return value.
greater_equal<>()
: this object's operator()()
member calls
operator>=()
as a binary operator, passing it its two parameters and
returning operator>=()
's return value.
less<>()
: this object's operator()()
member calls
operator<()
as a binary operator, passing it its two parameters and
returning operator<()
's return value.
less_equal<>()
: this object's operator()()
member calls
operator<=()
as a binary operator, passing it its two parameters and
returning operator<=()
's return value.
sort()
is:
#include <iostream> #include <string> #include <functional> #include <algorithm> using namespace std; int main(int argc, char **argv) { sort(argv, argv + argc, greater_equal<string>()); for (int idx = 0; idx < argc; ++idx) cout << argv[idx] << " "; cout << endl; sort(argv, argv + argc, less<string>()); for (int idx = 0; idx < argc; ++idx) cout << argv[idx] << " "; cout << endl; }The
sort()
generic algorithm expects an iterator range and a
comparator of the data type to which the iterators point. The example shows
the
alphabetic sorting of strings and the
reversed sorting of
strings. By passing greater_equal<string>()
the strings are sorted in
decreasing order (the first word will be the 'greatest'), by passing
less<string>()
the strings are sorted in increasing order (the first
word will be the 'smallest').
Note that the type of the elements of argv
is char *
, and that the
relational function object expects a string
. The relational object
greater_equal<string>()
will therefore use the >=
operator of strings,
but will be called with char *
variables. The promotion from char const
*
to string
is performed silently.
and, or,
and not
. The following objects are
available:
logical_and<>()
: this object's operator()()
member calls
operator&&()
as a binary operator, passing it its two parameters and
returning operator&&()
's return value.
logical_or<>()
: this object's operator()()
member calls
operator||()
as a binary operator, passing it its two parameters and
returning operator||()
's return value.
logical_not<>()
: this object's operator()()
member calls
operator!()
as a unary operator, passing it its parameter and
returning the unary operator!()
's return value.
operator!()
is provided in the following trivial
program, in which the
transform()
generic algorithm is used to transform
the logicalvalues stored in an array:
#include <iostream> #include <string> #include <functional> #include <algorithm> using namespace std; int main(int argc, char **argv) { bool bArr[] = {true, true, true, false, false, false}; size_t const bArrSize = sizeof(bArr) / sizeof(bool); for (size_t idx = 0; idx < bArrSize; ++idx) cout << bArr[idx] << " "; cout << endl; transform(bArr, bArr + bArrSize, bArr, logical_not<bool>()); for (size_t idx = 0; idx < bArrSize; ++idx) cout << bArr[idx] << " "; cout << endl; } /* generated output: 1 1 1 0 0 0 0 0 0 1 1 1 */
minus<int>()
function object, which is a
binary function object, the
first argument may be bound to 100, meaning that the resulting value will
always be 100
minus the value of the second argument. Either the first or
the second argument may be bound to a specific value. To bind the first
argument to a specific value, the function object
bind1st()
is used. To
bind the second argument of a binary function to a specific value
bind2nd()
is used. As an example, assume we want to count all elements of
a vector of Person
objects that exceed (according to some criterion) some
reference Person
object. For this situation we pass the following binder
and
relational function object to the
count_if()
generic algorithm:
bind2nd(greater<Person>(), referencePerson)What would such a binder do? First of all, it's a function object, so it needs
operator()()
. Next, it expects two arguments: a reference to another
function object and a fixed operand. Although binders are defined as
templates, it is illustrative to have a look at their implementations,
assuming they were straight functions. Here is such a pseudo-implementation of
a binder:
class bind2nd { FunctionObject const &d_object; Operand const &d_operand; public: bind2nd(FunctionObject const &object, Operand const &operand); ReturnType operator()(Operand const &lvalue); }; inline bind2nd::bind2nd(FunctionObject const &object, Operand const &operand) : d_object(object), d_operand(operand) {} inline ReturnType bind2nd::operator()(Operand const &lvalue) { return d_object(lvalue, d_rvalue); }When its
operator()()
member is called the binder merely passes the
call to the object's operator()()
, providing it with two arguments: the
lvalue
it itself received and the fixed operand it received via its
constructor. Note the simplicity of these kind of classes: all its members can
usually be implemented inline.
The count_if()
generic algorithm visits all the elements in an
iterator range, returning the number of times the
predicate specified as
its final argument returns
true
. Each of the elements of the iterator
range is given to the predicate, which is therefore a
unary function. By
using the binder the binary function object greater()
is adapted to a
unary function object, comparing each of the elements in the range to the
reference person. Here is, to be complete, the call of the count_if()
function:
count_if(pVector.begin(), pVector.end(), bind2nd(greater<Person>(), referencePerson))
not1()
is
the negator used with
unary function objects,
not2()
is the
negator used with
binary function objects.
vector<Person>
vector
not exceeding a certain reference person, we may, among other approaches,
use either of the following alternatives:
count_if(pVector.begin(), pVector.end(), bind2nd(less_equal<Person>(), referencePerson))
not2
combined with the greater()
predicate:
count_if(pVector.begin(), pVector.end(), bind2nd(not2(greater<Person>()), referencePerson))Note that
not2()
is a negator negating the truth value of a binary
operator()()
member: it must be used to wrap the binary predicate
greater<Person>()
, negating its truth value.
not1()
combined with the bind2nd()
predicate:
count_if(pVector.begin(), pVector.end(), not1(bind2nd(greater<Person>(), referencePerson)))Note that
not1()
is a negator negating the truth value of a unary
operator()()
member: it is used to wrap the unary predicate bind2nd()
,
negating its truth value.
The following little example illustrates the use of negator function adaptors, completing the section on function objects:
#include <iostream> #include <functional> #include <algorithm> #include <vector> using namespace std; int main(int argc, char **argv) { int iArr[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; cout << count_if(iArr, iArr + 10, bind2nd(less_equal<int>(), 6)) << endl; cout << count_if(iArr, iArr + 10, bind2nd(not2(greater<int>()), 6)) << endl; cout << count_if(iArr, iArr + 10, not1(bind2nd(greater<int>(), 6))) << endl; } /* produced output: 6 6 6 */
count_if()
:
operator()()
is called;
<=
is performed for int
values.
not2
negator is used to
negate the truth value of the complementary logicalfunction adaptor, three
actions must be performed for each iteration by count_if()
:
operator()()
is called;
operator()()
is called;
>
is performed for int
values.
not1
negator is used to
negate the truth value of the binder, three
actions must be performed for each iteration by count_if()
:
operator()()
is called;
operator()()
is called;
>
is performed for int
values.
g++
compiler on an old, 166 MHz pentium, performing 3,000,000
count_if()
calls for each variant, shows the first approach requiring
about 70% of the time needed by the other two approaches to complete.
However, these differences disappear if the compiler is instructed to
optimize for speed (using the
-O6
compiler
flag). When interpreting these results one should keep in mind that multiple
nested function calls are merged into a single function call if the
implementations of these functions are given inline and if the compiler
follows the suggestion to implement these functions as true inline functions
indeed. If this is happening, the three approaches all merge to a single
operation: the comparison between two int
values. It is likely that the
compiler does so when asked to optimize for speed.
==
and
!=
operators. Note that the ordering operators (e.g., >
, <
)
normally cannot be used.
iter
, *iter
represents the object the
iterator points to (alternatively, iter->
can be used to reach the members
of the object the iterator points to).
++iter
or iter++
advances the iterator to the next
element. The notion of advancing an iterator to the next element is
consequently applied: several containers have a
reversed_iterator type, in
which the iter++
operation actually reaches a
previous element in a
sequence.
vector
and
deque
. For these containers iter + 2
points to the second element
beyond the one to which iter
points.
#include <vector> #include <iostream> using namespace std; int main() { vector<int>::iterator vi; cout << &*vi << endl; // prints 0 }
iterator
) using member functions
begin()
and
end()
and, in the
case of reversed iterators (type reverse_iterator
),
rbegin()
and
rend()
. Standard practice requires the iterator range to be left
inclusive: the notation
[left, right)
indicates that left
is an
iterator pointing to the first element that is to be considered, while
right
is an iterator pointing just beyond the last element to be
used. The iterator-range is said to be
empty when left == right
.
Note that with
empty containers
the begin- and
end-iterators are equal to each other.
The following example shows a situation where all elements of a vector of
strings are written to cout
using the iterator range [begin(),
end())
, and the iterator range [rbegin(), rend())
. The following
example shows that the for-loops
for both ranges are
identical. Furthermore it nicely illustrates how the
auto
keyword can be
used to define the type of the loop control variable instead of using a much
more verbose explicit variable definition like vector<string>::iterator
(see also section 3.3.5):
#include <iostream> #include <vector> #include <string> using namespace std; int main(int argc, char **argv) { vector<string> args(argv, argv + argc); for (auto iter = args.begin(); iter != args.end(); ++iter) cout << *iter << " "; cout << endl; for (auto iter = args.rbegin(); iter != args.rend(); ++iter) cout << *iter << " "; cout << endl; return 0; }
Furthermore, the STL defines const_iterator types to be able to visit
a series of elements in a constant container. Whereas the elements of the
vector in the previous example could have been altered, the elements of the
vector in the next example are immutable, and const_iterator
s are
required:
#include <iostream> #include <vector> #include <string> using namespace std; int main(int argc, char **argv) { vector<string> const args(argv, argv + argc); for ( vector<string>::const_iterator iter = args.begin(); iter != args.end(); ++iter ) cout << *iter << " "; cout << endl; for ( vector<string>::const_reverse_iterator iter = args.rbegin(); iter != args.rend(); ++iter ) cout << *iter << " "; cout << endl; return 0; }The examples also illustrates that plain pointers can be used instead of iterators. The initialization
vector<string> args(argv, argv + argc)
provides the args
vector with a pair of pointer-based iterators: argv
points to the first element to initialize sarg
with, argv + argc
points just beyond the last element to be used, argv++
reaches the next
string. This is a general characteristic of pointers, which is why they too
can be used in situations where iterators
are expected.
The STL defines five types of iterators. These types recur in the generic algorithms, and in order to be able to create a particular type of iterator yourself it is important to know their characteristics. In general, iterators must define:
operator==()
, testing two iterators for equality,
operator++()
, incrementing the iterator, as
prefix operator,
operator*()
, to access the element the iterator refers to,
InputIterators can read from a container. The dereference operator is guaranteed to work asrvalue
in expressions. Instead of an InputIterator it is also possible to (see below) use a Forward-, Bidirectional- or RandomAccessIterator. With the generic algorithms presented in this chapter. Notations likeInputIterator1
andInputIterator2
may be observed as well. In these cases, numbers are used to indicate which iterators `belong together'. E.g., the generic functioninner_product()
has the following prototype:Type inner_product(InputIterator1 first1, InputIterator1 last1, InputIterator2 first2, Type init);HereInputIterator1 first1
andInputIterator1 last1
are a set of input iterators defining one range, whileInputIterator2 first2
defines the beginning of a second range. Analogous notations like these may be observed with other iterator types.
OutputIterators can be used to write to a container. The dereference operator is guaranteed to work as anlvalue
in expressions, but not necessarily asrvalue
. Instead of an OutputIterator it is also possible to use, see below, a Forward-, Bidirectional- or RandomAccessIterator.
ForwardIterators combine InputIterators and OutputIterators. They can be used to traverse containers in one direction, for reading and/or writing. Instead of a ForwardIterator it is also possible to use a Bidirectional- or RandomAccessIterator.
BidirectionalIterators can be used to traverse containers in both directions, for reading and writing. Instead of a BidirectionalIterator it is also possible to use a RandomAccessIterator. For example, to traverse a list or a deque a BidirectionalIterator may be useful.
RandomAccessIterators provide
random access to container
elements. An algorithm such as
sort()
requires a RandomAccessIterator, and
can therefore not be used with lists or maps, which only provide
BidirectionalIterators.
copy()
algorithm
has three parameters, the first two defining the range of visited elements,
and the third parameter defines the first position where the results of the
copy operation should be stored. With the copy()
algorithm the number of
elements that are copied are usually available beforehand, since the number is
usually determined using pointer arithmetic. However, there are situations
where pointer arithmetic cannot be used. Analogously, the number of resulting
elements sometimes differs from the number of elements in the initial
range. The generic algorithm unique_copy()
is a case in
point: the number of elements which are copied to the destination container is
normally not known beforehand.
In situations like these, an
inserter
adaptor function may be used to
create elements in the destination container when they are needed.
There are three types of inserter()
adaptors:
back_inserter()
: calls the container's
push_back()
member to add
new elements at the end of the container. E.g., to copy all elements of
source
in reversed order to the back of destination
:
copy(source.rbegin(), source.rend(), back_inserter(destination));
front_inserter()
calls the container's
push_front()
member to add
new elements at the beginning of the container. E.g., to copy all elements of
source
to the front of the destination container (thereby also reversing
the order of the elements):
copy(source.begin(), source.end(), front_inserter(destination));
inserter()
calls the container's
insert()
member to add new
elements starting at a specified starting point. E.g., to copy all elements of
source
to the destination container, starting at the beginning of
destination
, shifting existing elements beyond the newly inserted
elements:
copy(source.begin(), source.end(), inserter(destination, destination.begin()));
back_inserter()
, this iterator expects the name
of a container having a member push_back()
. This member is called by the
inserter's operator()()
member. When a class (other than the abstract
containers) supports a push_back()
container, its objects can also be
used as arguments of the back_inserter()
if the class defines a
typedef DataType const &const_reference;in its interface, where
DataType const &
is the type of the parameter
of the class's member function push_back()
. For example, the following
program defines a (compilable) skeleton of a class IntStore
, whose objects
can be used as arguments of the back_inserter
iterator:
#include <algorithm> #include <iterator> using namespace std; class Y { public: typedef int const &const_reference; void push_back(int const &) {} }; int main() { int arr[] = {1}; Y y; copy(arr, arr + 1, back_inserter(y)); }
istream_iterator<Type>()
can be used to define an iterator (pair) for
istream
objects. The general form of the istream_iterator<Type>()
iterator is:
istream_iterator<Type> identifier(istream &inStream)Here,
Type
is the type of the data elements that are read from
the istream
stream. Type
may be any type for which operator
>> is
defined with istream
objects.
The default constructor defines the end of the iterator pair, corresponding to end-of-stream. For example,
istream_iterator<string> endOfStream;Note that the actual stream object which was specified for the begin-iterator is not mentioned here.
Using a back_inserter()
and a set of
istream_iterator<>()
adaptors, all strings could be read from cin
as
follows:
#include <algorithm> #include <iterator> #include <string> #include <vector> using namespace std; int main() { vector<string> vs; copy(istream_iterator<string>(cin), istream_iterator<string>(), back_inserter(vs)); for ( vector<string>::iterator from = vs.begin(); from != vs.end(); ++from ) cout << *from << " "; cout << endl; return 0; }In the above example, note the use of the anonymous versions of the
istream_iterator
adaptors. Especially note the use of the anonymous
default constructor. The following (non-anonymous) construction could have
been used instead of istream_iterator<string>()
:
istream_iterator<string> eos; copy(istream_iterator<string>(cin), eos, back_inserter(vs));Before
istream_iterators
can be used the following preprocessor
directive must have been specified:
#include <iterator>This is implied when
iostream
is included.
streambuf
objects.
Before
istreambuf_iterators
can be used the following preprocessor
directive must have been specified:
#include <iterator>The
istreambuf_iterator
is available for reading from streambuf
objects supporting
input operations. The standard operations
that are available for
istream_iterator
objects are also available
for istreambuf_iterators
. There are three
constructors:
istreambuf_iterator<Type>()
:This constructor represents the end-of-stream iterator while extracting values of typeType
from thestreambuf
.
istreambuf_iterator<Type>(istream)
:This constructor constructs anistreambuf_iterator
accessing thestreambuf
of theistream
object, used as the constructor's argument.
istreambuf_iterator<Type>(streambuf *)
:This constructor constructs anistreambuf_iterator
accessing thestreambuf
whose address is used as the constructor's argument.
istreambuf_iterators
and
ostreambuf_iterators
.
ostream_iterator<Type>()
can be used to define a destination iterator
for an
ostream
object. The general forms
of the ostream_iterator<Type>()
iterator are:
ostream_iterator<Type> identifier(ostream &outStream), // and: ostream_iterator<Type> identifier(ostream &outStream, char const *delim);
Type
is the type of the data elements that should be written to the
ostream
stream. Type
may be any type for which operator
<< is defined
in combinations with ostream
objects. The latter form of the
ostream_iterators
separates the individual Type
data elements by
delimiter
strings. The former definition does not use any delimiters.
The following example shows how
istream_iterators
and an ostream_iterator
may
be used to copy information of a file to another file. A subtlety is the
statement in.unsetf(ios::skipws)
: it resets the
ios::skipws
flag. The
consequence of this is that the default behavior of operator
>>, to skip
whitespace, is modified. White space characters are simply returned by the
operator, and the file is copied unrestrictedly. Here is the program:
Before
ostream_iterators
can be used the following preprocessor
directive must have been specified:
#include <iterator>
ostreambuf_iterator
can be used the following preprocessor
directive must have been specified:
#include <iterator>The
ostreambuf_iterator
is available for writing to streambuf
objects supporting
output operations. The standard operations
that are available for
ostream_iterator
objects are also available
for ostreambuf_iterators
. There are two
constructors:
ostreambuf_iterator<Type>(ostream)
:This constructor constructs anostreambuf_iterator
accessing thestreambuf
of theostream
object, used as the constructor's argument, to insert values of typeType
.
ostreambuf_iterator<Type>(streambuf *)
:This constructor constructs anostreambuf_iterator
accessing thestreambuf
whose address is used as the constructor's argument.
istreambuf_iterators
and an ostreambuf_iterator
, showing yet another
way to copy a stream:
#include <iostream> #include <algorithm> #include <iterator> using namespace std; int main() { istreambuf_iterator<char> in(cin.rdbuf()); istreambuf_iterator<char> eof; ostreambuf_iterator<char> out(cout.rdbuf()); copy(in, eof, out); return 0; }
fun()
, a memory leak is created by calling
fun()
: the allocated int
value remains inaccessibly allocated:
void fun() { new int; }To prevent memory leaks strict bookkeeping is required: the programmer has to make sure that the memory pointed to by a pointer is deleted just before the pointer variable goes out of scope. The above example could easily be repaired:
void fun() { delete new int; }Now
fun()
merely wastes some time.
When a pointer variable points to a single value or object, the
bookkeeping requirements may be relaxed when the pointer variable is defined
as a std::unique_ptr
object. Unique_ptrs are objects,
masquerading as pointers. Since they're objects, their destructors are called
when they go out of scope, and because of that, their destructors will take
the responsibility of deleting the dynamically allocated memory.
Before unique_ptrs
can be used the following preprocessor directive must
have been specified:
#include <memory>Normally, an
unique_ptr
object is initialized with a dynamically
created value or object.
The following should be considered when using unique_ptr
s:
unique_ptr
to another move semantics is
used. If move semantics are not available compilation will fail. On the other
hand, if compilation succeeds then the used containers or generic algorithms
support the use of unique_ptr
s. Here is an example:
std::unique_ptr<int> up1(new int); std::unique_ptr<int> up2 = up1; // compilation errorThe second definition fails to compile since
unique_ptr
's copy
constructor is private (the same holds true for the assignment
operator). The unique_ptr
class does offer facilities to initialize
and assign from rvalue references:
class unique_ptr // interface partially shown { public: unique_ptr(unique_ptr &&other); // rvalues bind here private: unique_ptr(const unique_ptr &other); };Consequently, in the above example move semantics should be used. E.g.,
std::unique_ptr<int> up1(new int); std::unique_ptr<int> up2 = std::move(up3);
unique_ptr
object should only point to memory that was made
available dynamically, as only
dynamically allocated memory can be deleted.
unique_ptr
objects should not be allowed to point to the
same block of dynamically allocated memory. The unique_ptr
's interface was
designed to prevent this from happening. Once an unique_ptr
object goes out
of scope, it deletes the memory it points to, immediately changing any other
object also pointing to the allocated memory into a
wild
pointer.
class unique_ptr
defines several member functions
to access the pointer itself or to have the unique_ptr
point to another
block of memory. These member functions and ways to construct unique_ptr
objects are discussed in the next sections.
A unique_ptr
(as well as a shared_ptr
, see section 18.4) can
be used as a safe alternative to the now deprecated auto_ptr
. It also
augments auto_ptr
as it can be used with containers and algorithms; as it
adds customizable deleters and as it adds the possibility to handle arrays.
unique_ptr
objects. Each definition contains the usual <type>
specifier between
angle brackets. Concrete examples are given in the coming sections, but an
overview of the various possibilities is presented here:
unique_ptr
object to point to a block
of memory allocated by the new
operator:
unique_ptr<type> identifier (new-expression [, deleter]);This form is discussed in section 18.3.2.
unique_ptr
object using move
semantics, either supported by the data type itself or forced using
std::move
:
// type must support move semantics or std::move(other) must be used unique_ptr<type> identifier(another unique_ptr for type);This form is discussed in section 18.3.4.
unique_ptr
object that
does not point to a particular block of memory:
unique_ptr<type> identifier;This form is discussed in section 18.3.5.
unique_ptr
object is to provide its constructor with a block of memory allocated
by
operator new
operator. The generic form is:
unique_ptr<type [, deleter_type]> identifier(new-expression [, deleter = deleter_type()]);The second (template) argument (
deleter(_type)
) is optional and may
refer to a class/object handling the destruction of the allocated memory. This
is used, e.g., in situations where a double pointer is allocated and the
destruction must visit each nested pointer to destroy the memory it points
at (see below).
Here is an example initializing an unique_ptr
pointing to a string
object:
unique_ptr<string> strPtr(new string("Hello world"));To initialize an
unique_ptr
to point to a double
value the
following construction can be used:
unique_ptr<double> dPtr(new double(123.456));Note the use of
operator new
in the above expressions. Using new
ensures the dynamic nature of the memory pointed to by the unique_ptr
objects and allows the deletion of the memory once unique_ptr
objects go
out of scope. Also note that type
does not mention the pointer:
the
type used in the unique_ptr
construction
is the same as used in new
expressions.
The next example shows how an explicitly defined deleter may be used to delete a dynamically allocated array of pointers to strings. Instead of using a template value parameter the deleter's constructor could of course also be given a parameter initializing the deleter with the number of elements to delete:
#include <iostream> #include <memory> using namespace std; template <typename Type, size_t size> struct D2 { void operator()(Type * ptr) const { for (size_t idx = 0; idx < size; ++idx) delete ptr[idx]; delete ptr; } }; int main() { unique_ptr<int *, D2<int *, 10>> sp2(new int *[10]); D2<int *, 10> &obj = sp2.get_deleter(); }
In the example allocating an int
values given in section 18.3,
the memory leak can be avoided using an unique_ptr
object:
#include <memory> using namespace std; void fun() { unique_ptr<int> ip(new int); }All member functions available for objects allocated by the
new
expression can be reached via the
unique_ptr
as if it was a plain pointer to the dynamically allocated
object. For example, in the following program the text `C++
' is inserted
behind the word `hello
':
#include <iostream> #include <memory> #include <cstring> using namespace std; int main() { unique_ptr<string> sp(new string("Hello world")); cout << *sp << endl; sp->insert(strlen("Hello "), "C++ "); cout << *sp << endl; } /* produced output: Hello world Hello C++ world */
unique_ptr
s to stored arrays the dereferencing operator makes
little sense. Conversely, when arrays are handled by smart pointers they will
benefit from an index operator.
When using smart pointers (unique_ptr
and shared_ptr
, see section
18.4) to handle arrays the following additional syntax is available:
[]
notation is
used. E.g.,
unique_ptr<int[]> intArr(new int[3]);
intArr[2] = intArr[0]; it() The default deleter will use tt(delete[]).
unique_ptr
may also be initialized
by another unique_ptr
object for the same type provided type
supports
move semantics or move semantics is forced using
std::move
.
The generic form is:
unique_ptr<type> identifier(other unique_ptr object);For example, to initialize an
unique_ptr<string>
, given the variable
sp
defined in the previous section, the following construction can be
used:
unique_ptr<string> sp2(move(sp));This move-construction can be used when the data type specified with
unique_ptr
implements its move-constructor.
Analogously, the assignment operator can
be
used. An unique_ptr
object may be assigned to another unique_ptr
object of the same type, again using move-semantics. For example:
#include <iostream> #include <memory> #include <string> using namespace std; int main() { unique_ptr<string> hello1(new string("Hello world")); unique_ptr<string> hello2(move(hello1)); unique_ptr<string> hello3; hello3 = move(hello2); cout << // *hello1 << endl << // would segfault // *hello2 << endl << // same *hello3 << endl; } /* Produced output: Hello world */Looking at the above example, we see that
hello1
is initialized as described in the previous section.
hello2
is defined, and it receives its value from
hello1
using a
move constructor initialization. This
effectively changes hello1
into a 0-pointer.
hello3
is defined as a default unique_ptr<string>
, but
it receives its value through a move-assignment from hello2
, which then
becomes a 0-pointer too.
hello1
or hello2
would be inserted
into cout
a
segmentation fault would be generated. The reason for
this will now be clear: it is caused by dereferencing 0-pointers. In the end,
only hello3
actually points to a string
.
unique_ptr
object: Without arguments an empty unique_ptr
object is
constructed not pointing to a particular block of memory:
unique_ptr<type> identifier;In this case the underlying pointer is set to
0
(zero). Although the unique_ptr
object itself is not the pointer,
its value can be compared to 0
to see if it has been
initialized. E.g., code like
unique_ptr<int> ip; if (!ip) cout << "0-pointer with a unique_ptr object" << endl;will produce the shown text. Alternatively, the member
get()
is available. This member function,
as well as the other member functions of the class unique_ptr
are
described in the next section.
unique_ptr
:
unique_ptr &unique_ptr<Type>operator=(unique_ptr<Type> &other)
:This operator will transfer the memory pointed to by the rvalueunique_ptr
object to the lvalueunique_ptr
object using move semantics. So, the rvalue object loses the memory it pointed at, and turns into a 0-pointer.
unique_ptr &unique_ptr<Type>operator bool() const
:This operator returnsfalse
if theunique_ptr
does not point to memory (i.e., itsget()
member, see below, returns 0). Otherwise,true
is returned.
Type &unique_ptr<Type>operator*()
:This operator returns a reference to
the information stored in the unique_ptr
object. It acts like a normal
pointer
dereference operator.
Type *unique_ptr<Type>operator->()
:This operator returns a pointer to the information stored in theunique_ptr
object. Through this operator members of a stored object can be selected. For example:unique_ptr<string> sp(new string("hello")); cout << sp->c_str() << endl;
unique_ptr
objects:
Type *unique_ptr<Type>::get()
:This member does the same asoperator->()
: it returns a pointer to the information stored in theunique_ptr
object. This pointer can be inspected: if it's zero theunique_ptr
object does not point to any memory. This member cannot be used to let theunique_ptr
object point to (another) block of memory.
Deleter &unique_ptr<Type>::get_deleter()
:This member returns a
reference to the
deleter class object used by the unique_ptr
.
Type *unique_ptr<Type>::release()
:This member returns a pointer to the information stored in theunique_ptr
object, which loses the memory it pointed at (and changes into a 0-pointer). The member can be used to transfer the information stored in theunique_ptr
object to a plainType
pointer. It is the responsibility of the programmer to delete the memory returned by this member function.
void unique_ptr<Type>::reset(Type *)
:This member may also be called without argument, to delete the memory stored in theunique_ptr
object, or with a pointer to a dynamically allocated block of memory, which will thereupon be the memory accessed by theunique_ptr
object. This member function can be used to assign a new block of memory (new content) to anunique_ptr
object.
void unique_ptr<Type>::swap(unique_ptr<Type> &&)
:This member is used to swap two identically typed unique_ptr
s.
std::auto_ptr<Type>
. This class did
not support
move semantics although when one auto_ptr
object was
assigned to another, the right-hand object lost the information it originally
pointed at.
The std::unique_ptr
, available since the C++0x standard does not have
auto_ptr
's drawbacks and consequently using auto_ptr
is now
deprecated.
The following restrictions
apply to
auto_ptrs
:
auto_ptr
class does not support move semantics;
auto_ptr
object cannot be used to point to
arrays of objects;
auto_ptr
cannot be used as a data type of abstract
containers.
No further coverage of the auto_ptr
class is offered by the C++
Annotations. Existing software should be modified to use unique_ptr
s,
newly designed software should avoid using auto_ptr
objects.
std::unique_ptr
the C++0x standard offers a reference
counting smart pointer by the name of
std::shared_ptr<Type>
.
The shared pointer will automatically destroy its contents once its reference count has decayed to zero.
Different from unique_ptr
s shared_ptr
s support copy constructors and
standard overloaded assignment operators. Since shared_ptr
s share the
memory they point at
move semantics is not a requirement.
As with unique_ptr
a shared_ptr
may refer to a dynamically allocated
array.
shared_ptr
objects. Each definition contains the usual <type>
specifier between
angle brackets. Concrete examples are given in the coming sections, but an
overview of the various possibilities is presented here:
shared_ptr
object to point to a block
of memory allocated by the new
operator:
shared_ptr<type> identifier (new-expression [, deleter]);This form is discussed in section 18.4.2.
shared_ptr
objects point at the same memory, and the
object's reference count is incremented.
shared_ptr
object using move
semantics, either supported by the data type itself or forced using
std::move
:
// type must support move semantics or std::move(other) must be used shared_ptr<type> identifier(a temporary shared_ptr for type);This form is discussed in section 18.4.3.
shared_ptr
object that
does not point to a particular block of memory:
shared_ptr<type> identifier;This form is discussed in section 18.4.4.
shared_ptr
object is to provide its constructor with a block of memory allocated
by
operator new
operator. The generic form is:
shared_ptr<type [, deleter_type]> identifier(new-expression [, deleter = deleter_type()]);The second (template) argument (
deleter(_type)
) is optional and may
refer to a class/object handling the destruction of the allocated memory. It
is used in situations comparable to those encountered with unique_ptr
(cf. section 18.3.2).
Here is an example initializing an shared_ptr
pointing to a string
object:
shared_ptr<string> strPtr(new string("Hello world"));Note the use of
operator new
in the above expression. The type
specified for the shared_ptr
is identical to the type used in new
expression.
All
member functions available for
objects allocated by the new
expression can be reached via the
shared_ptr
as if it was a plain pointer to the dynamically allocated
object. For example, in the following program the text `C++
' is inserted
behind the word `hello
'. This example also shows the use of a copy
constructed shared pointer and it shows that following the copy construction
both objects point at the same information:
#include <iostream> #include <memory> #include <cstring> using namespace std; int main() { shared_ptr<string> sp(new string("Hello world")); shared_ptr<string> sp2(sp); sp->insert(strlen("Hello "), "C++ "); cout << *sp << '\n' << *sp2 << endl; } /* produced output: Hello C++ world Hello C++ world */
shared_ptr
may also be initialized by an r-value reference (a temporary
value) initialized an returned by a function. E.g.,
std::shared_ptr<std::string> &&makeString(char const *text) { return std::move(shared_ptr<string>(new std::string(text))); } std::shared_ptr<std::string> shared(makeString("hello world"));
shared_ptr
object: Without arguments an empty shared_ptr
object is
constructed not pointing to a particular block of memory:
shared_ptr<type> identifier;In this case the underlying pointer is set to
0
(zero). Like a std::unique_ptr
a shared_ptr
object can be
compared to 0
to see if it has been initialized. E.g., code like
shared_ptr<int> ip; if (!ip) cout << "0-pointer with a shared_ptr object" << endl;will produce the shown text. Alternatively, the member
get()
is available. This member function,
as well as the other member functions of the class shared_ptr
are
described in the next section.
shared_ptr
:
shared_ptr &shared_ptr<Type>operator=(shared_ptr<Type> &other)
:This operator reduces the reference count of the left-hand side object, deleting its memory when its count decays to zero, and sets its pointer to the memory pointed at by right-hand side object, incrementing its reference count.
shared_ptr &shared_ptr<Type>operator bool() const
:This operator returnsfalse
if theshared_ptr
does not point to memory (i.e., itsget()
member, see below, returns 0). Otherwise,true
is returned.
Type &shared_ptr<Type>operator*()
:This operator returns a reference to
the information stored in the shared_ptr
object. It acts like a normal
pointer
dereference operator.
Type *shared_ptr<Type>operator->()
:This operator returns a pointer to the information stored in theshared_ptr
object. Through this operator members of a stored object can be selected. For example:shared_ptr<string> sp(new string("hello")); cout << sp->c_str() << endl;
The following
member functions are defined for shared_ptr
objects:
Type *shared_ptr<Type>::get()
:This member does the same asoperator->()
: it returns a pointer to the information stored in theshared_ptr
object. This pointer can be inspected: if it's zero theshared_ptr
object does not point to any memory. This member cannot be used to let theshared_ptr
object point to (another) block of memory.
Deleter &shared_ptr<Type>::get_deleter()
:This member returns a
reference to the
deleter class object used by the shared_ptr
.
Type *shared_ptr<Type>::release()
:This member returns a pointer to the information stored in theshared_ptr
object, which loses the memory it pointed at (and changes into a 0-pointer). The member can be used to transfer the information stored in theshared_ptr
object to a plainType
pointer. It is the responsibility of the programmer to delete the memory returned by this member function.
void shared_ptr<Type>::reset(Type *)
:This member may also be called without argument, to delete the memory stored in theshared_ptr
object, or with a pointer to a dynamically allocated block of memory, which will thereupon be the memory accessed by theshared_ptr
object. This member function can be used to assign a new block of memory (new content) to anshared_ptr
object.
void shared_ptr<Type>::swap(shared_ptr<Type> &&)
:This member is used to swap two identically typed shared_ptr
s.
bool shared_ptr<Type>unique() const
:This member returns.true
if the memory is referenced by the current object only and returnsfalse
otherwise
size_t shared_ptr<Type>use_count() const
:This member returns the number of objects sharing the memory pointed at by their data pointers.
shared_ptr
's main features have been described, consider
the following simple class:
// required #includes class Map { std::map<string, Data> *d_map; public: Map(char const *filename) throw(std::exception); };The class's constructor
Map()
performs the following tasks:
std::map
object;
Map::Map(char const *fname) : d_map(new std::map<std::string, Data>) throw(std::exception) { ifstream istr(fname); if (!istr) throw std::exception("can't open the file"); fillMap(istr); }What's wrong with this implementation? Its main weakness is that it hosts a potential memory leak. The memory leak only occurs when the exception is actually thrown. In all other cases, the function operates perfectly well. When the exception is thrown, the map has just been dynamically allocated. However, even though the class's destructor will dutifully call
delete d_map
, the destructor is actually never called, as the destructor
will only be called to destroy
objects that were constructed completely. Since the constructor terminates in
an exception, its associated object is not constructed completely, and
therefore that object's destructor is never called.
Shared_ptr
s (as well as unique_ptr
s, cf. section 18.3)
may be used to prevent these kinds of problems. By defining d_map
as
std::shared_ptr<std::map<std::string, Data> >it suddenly changes into an object. Now,
Map
's constructor may safely
throw an exception. As d_map
is an object itself, its destructor will be
called by the time the (however incompletely constructed) Map
object goes
out of scope.
As a
rule of thumb: classes should use shared_ptr
or unique_ptr
objects, rather than plain pointers for their pointer data members if there's
any chance that their constructors will end prematurely in an exception.
The Annotations don't aim at introducing the concepts behind multi threading. It is a topic by itself and many good reference sources exist (cf. Nichols, B, et al.'s Pthreads Programming, O'Reilly for a good introduction to multi-threading).
Multi threading facilities are offered through the class
std::thread
. Its
constructor and assignment operator accept a function (object) that will
handle the thread created by the thread
object.
Thread synchronization is handled by objects of the class
std::mutex
and
condition variables are implemented by the class
std::condition_variable
.
In order to use multi threading in C++ programs the Gnu g++
compiler
requires the use of the
-pthread
flag. E.g., to compile a multi-threaded program defined in the source file
multi.cc
the compiler is called as follows:
g++ --std=c++0x -pthread -Wall multi.cc
Threads in C++ are very much under development. It is likely that in the near future features will be added and possibly redefined. The sections below should therefore be read with this in mind.
std::thread
class implements a new thread. It can be constructed empty
(in which case no new thread is started yet) but it may also be given a
function or function object, in which case the thread is started immediately.
Alternatively, a function or function object may be assigned to an
existing thread
object causing a new thread to be launched using that
function (object).
There are several ways to end a launched thread. Currently a thread ends when
the function called from the thread object ends. Since the thread
object not only accepts functions but also function objects as its argument,
a
local context may be passed on to the the thread. Here are two examples
of how a thread may be started: in the first example a function is passed on
to the thread, in the second example a function object:
#include <iostream> #include <thread> #include <cstdlib> using namespace std; void hello() { cout << "hello world\n"; } class Zero { int *d_data; size_t d_size; public: Zero(int *data, size_t size) : d_data(data), d_size(size) {} void operator()(int arg) const { for (int *ptr = d_data + d_size; ptr-- != d_data; ) *ptr = 0; } }; int main() { int data[30]; Zero zero(data, 30); int value = 0; std::thread t1(zero, value); std::thread t2(hello); };
Thread objects do not implement a copy constructor, but a move constructor is provided. Threads may be swapped as well, even if they're actually running a thread. E.g.,
std::thread t1(Thread()); std::thread t2(Thread()); t1.swap(t2);
According to the current specifications of the thread
class its
constructor should also be able to accept additional arguments (in addition to
the function (object) handled by the thread object), but currently that
facility does not appears to be very well implemented. Any objects otherwise
passed to the function call operator could of course also be passed using
separate members or via the object's constructors, which is probably an
acceptable makeshift solution until multiple arguments can be passed to the
thread constructor itself (using perfect forwarding, cf. section
21.5.2).
The thread's overloaded assigment operator can also be used to start a thread. If the current thread object actually runs a thread it is stopped, and the function (object) assigned to the thread object becomes the new running thread. E.g.,
std::thread thr; // no thread runs from thr yet thr = Thread(); // a thread is launched
Threads (among which the thread represented by main
) may be forced to wait
for another threds's completion by calling the other thread's join()
member. E.g., in the following example main
launches two threads and wait
for the completion of both:
int main() { std::thread t1(Thread()); std::thread t2(Thread()); t1.join(); // wait for t1 to complete t2.join(); // same, t2 }
The thread's member detach()
can be called to disassociate the current
thread from its starting thread. Ending a thread other than ending the
activities of the function defining the thread's activities appears currently
very well implemented.
std::mutex
class a
std::recursive_mutex is
offered: when called multiple times by the same thread it will increase its
lock-count. Before other threads may access the protected data the recursive
mutex must be unlocked again that number of times. In addition
std::timed_mutex
and
recursive_timed_mutex
is offered: their locks
will time out after a preset amount of time.
In many situations locks will be released at the end of some action
block. To simplify locking additional template classes
std::unique_lock<>
and
std::lock_guard<>
are provided. As their
constructor locks the data and their destructor unlocks the data they can be
defined as local variables, unlocking their data once their scope terminates.
Here is a simple example showing its use; at the end of safeProcess guard
is destroyed, thereby releasing the lock on data
:
std::mutex dataMutex; Data data; void safeProcess() { std::lock_guard<std::mutex> guard(dataMutex); process(data); }A
std::unique_lock
is used similarly, but is used when timeouts must be
considered as well:
std::timed_mutex dataMutex; Data data; void safeProcess() { std::unique_lock<std::timed_mutex> guard(dataMutex, std::chrono::milliseconds(3)); if (guard) process(data); }In the above example
guard
tries to obtain the lock during three
milliseconds. If guard
's bool
conversion operator returns true
the
lock was obtained and data
can be processed safely.
A common cause of problems with multi threaded programs is the occurrence of
deadlocks. If a deadlock may occur when two locks are required to process
data, but one thread obtains the first lock and another thread obtains the
second lock. The C++0x standard defines a generic
std::lock
function that
may be used to prevent problems like these. The std::lock
function
can be used to lock multiple mutexes in one atomic action. Here is an example:
struct SafeString { std::mutex d_mutex; std::string d_text; }; void calledByThread(SafeString &first, SafeString &second) { std::unique_lock<std::mutex> // 1 lock_first(first.d_mutex, std::defer_lock); std::unique_lock<std::mutex> // 2 lock_second(second.d_mutex, std::defer_lock); std::lock(lock_first, lock_second); // 3 safeProcess(first.d_text, second.d_text); }At 1 and 2 unique_locks are created. Locking is deferred until calling
std::lock
in 3. Having obtained the lock, the two SafeString
text
members can both be safely processed by calledByThread
.
Another problematic issue with threads involves initialization. If multiple threads are running and only the first thread encountering the initialization code should perform the initialization then this problem should not be solved by mutexes. Although the synchronization is accomplished, it will needlessly be accomplished time and again once the initialization phase has been completed. The C++0x standard offers several ways to perform a proper initialization:
g++
compiler and its
discussion is therefore postponen.
#include <iostream> struct Cons { Cons() { std::cout << "Cons called\n"; } }; void called(char const *time) { std::cout << time << "time called() activated\n"; static Cons cons; } int main() { std::cout << "Pre-1\n"; called("first"); called("second"); std::cout << "Pre-2\n"; Cons cons; } /* Generated output: Pre-1 firsttime called() activated Cons called secondtime called() activated Pre-2 Cons called */This feature causes a thread to wait automatically if another thread is still initializing the static data (note that non-static data never cause problems, as each non-static local variables have lifes that are completely restricted to their own threads).
std::call_once
and std::once_flag
result in the one-time execution of
a specified function as illustrated by the next example:
std::string *global; std::once_flag globalFlag; void initializeGlobal() { global = new std::string("Hello world (why not?)"); } void safeUse() { std::call_once(globalFlag, initializeGlobal); process(*global); }
Mutexes may be used in C++0x after including the header file
mutex
.
At some point the producer will therefore have to wait until the client has consumed enough and there's space available in the producer's storage. Similarly, the client will have to wait until the producer has produced some more items.
Locking and polling the amount of available items/storage at fixed time intervals usually isn't a good option as it is in principle a wasteful scheme: threads continue to wait during the full interval even though the condition to continue may already have been met; reducing the interval, on the other hand, isn't an attractive option either as it results in a relative increase of the overhead associated with handling the associated mutexes.
Condition variables may be used to solve these kinds of problems. A thread simply sleeps until it is notified by another thread. In general terms this may be accomplished as follows:
producer loop: - produce the next item - wait until there's room to store the item, then reduce the available room - store the item - increment the number of items in store consumer loop: - wait until there's an item in store, then reduce the number of items in store - remove the item from the store - increment the number of available storage locations - do something with the retrieved item
It is important that the two storage administrative tasks (registering the number of available items and available storage locations) are either performed by the client or by the producer. `Waiting' in this case means:
condition_variable
is used. The variable
containing the actual count is called semaphore
; and it can be protected
by a mutex sem_mutex
. In addition a condition_variable condition
is
defined. The waiting process is defined in the following function down
implemented as follows:
void down() { unique_lock<mutex> lk(sem_mutex); // get the lock while (semaphore == 0) condition.wait(lk); // see 1, below. --semaphore; // dec. semaphore count } // the lock is releasedAt 1 the condition variable's
wait()
member internally releases the
lock, wait for a notification to continue, and re-acquires the lock just
before returning. Consequently, the down
's code always has complete and
unique control over semaphore
.
What about notifying the condition variable? This is handled by the
`increment the number ...' parts in the abovementioned produced and consumer
loops. These parts are defined by the following up()
function, implemented
as follows:
void up() { lock_guard<std::mutex> lk(sem_mutex); // get the lock if (semaphore++ == 0) condition.notify_one(); // see 2, below } // the lock is releasedAt 2
semaphore
is always incremented. However, by using a postfix
increment it can be tested for being zero at the same time and if it was zero,
initially then semaphore
is now one. Consequently, the thread waiting for
semaphore
being unequal to zero may now
continue. Condition.notify_one()
will notify a waiting
thread (see down
's implementation above). If situations where multiple
threads are waiting `
notify_all()
' can be used.
Handling semaphore
can very well be implemented in a class
Semaphore
, offering members down()
and up()
. For a more extensive
discussion of semaphores see
Tanenbaum, A.S. (2006)
Structured Computer Organization, Pearson Prentice-Hall.
Using the semaphore facilities, embedded in a class Semaphore
whose
constructor expects an initial value of its semaphore
data member, the
classic producer and consumer case can now easily be implemented in the
following multi-threaded program (A more elaborate example of the
producer-consumer program is found in the yo/stl/examples/events.cc
file
in the C++ Annotations's source archive):
Semaphore available(10); Semaphore filled(0); std::queue itemQueue; void producer() { size_t item = 0; while (true) { ++item; available.down(); itemQueue.push(item); filled.up(); } } void client() { while (true) { filled.down(); size_t item = itemQueue.front(); itemQueue.pop(); available.up(); process(item); // not implemented here } } int main() { thread produce(producer); thread consume(consumer); produce.join(); return 0; }
To use condition_variable
objects the header file
condition_variable
must be included.
g++
compiler this section will only cover the concept, the proposed
syntax and some examples.
As we've seen the generic algorithms often accept an argument which can either
be a function object or a plain function. Examples are the sort
and
find_if
generic algorithms. When the function called by the generic
algorithm must remember its state a function object is appropriate, otherwise
a plain function will do.
The function or function object is usually not readily available. Often it must be defined in or near the function using the generic algorithm. Often the software engineer will resort to anonymous namespaces in which a class or function is defined that is thereupon used in the function calling the generic algorithm. If that function is itself a member function the need may be felt to access other members of its class from the function called by the generic algorithm. Often this results in a significant amount of code (defining the class), in complex code (to make available software elements that aren't native to the called function (object)), and -at the level of the source file- code that is irrelevant at the current level of specification. Nested classes don't solve these problems and, moreover, nested classes can't be used in templates.
Solutions to the above problems exist. In section 23.8 a general approach is discussed allowing the use of members and/or local variables in the context of a function or function object called from generic algorithms.
However, the solutions given in section 23.8 are in fact makeshift solutions that need to be used as long as the language doesn't offer lambda functions. A lambda function is an anonymous function. Such a function may be defined on the spot, and will exist only during the lifetime of the statement of which it is a part.
Here is an example of the definition of a lambda function:
[](int x, int y) { return x * y; }This particular function expects two
int
arguments and returns their
product. It could be used e.g., in combination with the accumulate
generic
algorithm to compute the product of a series of int
values stored in a
vector:
cout << accumulate(vi.begin(), vi.end(), 1, [](int x, int y) { return x * y; });The above lambda function will use the implicit return ih(return type: implicit) type
decltype(x * y)
. An implicit return type can
only be used if the lambda function has a single statement of the form
return expression;
.
Alternatively, the return type can be explicitly specified using a late-specified return type, (cf. section 3.3.5):
[](int x, int y) -> int { int z = x + y; return z + x; }A lambda function not returning a value (i.e., a void-function) does not have to specify a return type at all.
Variables having the same scope as the lambda function can be accessed from the lambda function using references. This allows passing the local context to a lambda function. Such variables are called a closure. Here is an example:
void showSum(vector<int> &vi) { int total = 0; for_each( vi.begin(), vi.end(), [&total](int x) { total += x; } ); std::cout << total; }The variable
int total
is passed to the lambda function as a reference
([&total]
) and can directly be accessed by the function. Its parameter
list merely defines an int x
, which is initialized in turn by each of the
values stored in vi
. Once the generic algorithm has completed
showSum
's variable total
has received a value equal to the sum of all
the vector's values. It has outlived the lambda function and its value is
displayed.
If a closure variable is defined without the reference symbol it becomes a
simple value which is initialized by the local variable when the lambda
function is passed to the generic algorithm. Usually closure variables are
passed by reference. If all local variables are to be passed by reference
the notation
[&]
can be used (to pass the full closure by value use
[=]
):
void show(vector<int> &vi) { int sum = 0; int prod = 1; for_each( vi.begin(), vi.end(), [&](int x) { sum += x; prod *= x; } ); std::cout << sum << ' ' << prod; }
Combinations of variables passed by value, but some by reference or passed by reference but some by value is possible. In that case the default is followed by a list of variables passed differently. E.g.,
[&, value](int x) { total += x * value; };In this case
total
will be passed by reference, value
by value.
Within a class context, members may define lambda functions as well. In those cases the lambda function has full access to all the class's members. In other words, it is defined as a friend of the class. For example:
class Data { std::vector<std::string> d_names; public: void show() const; { int count = 0; std::for_each(d_names.begin(), d_end(), [this, &count](std::string const &name) { std::cout << ++count << this->capitalized(name) << endl; } ) } private: std::string capitalized(std::string const &name); }Note the use of the
this
pointer: inside the lambda function it must
explicitly be used (so,this->capitalized(name)
is used rather than
capitalized(name)
). In addition, in situations like the above this
will automatically be available when [&]
or [=]
is used. So, the above
lambda function could have been defined as:
[&](std::string const &name) { std::cout << ++count << this->capitalized(name) << endl; }When lambda functions are compared to, e.g., the function wrappers discussed in section 23.8 it probably won't come as a surprise that lambda functions are function objects. It is possible to assign a lambda function to a variable. An example of such an assignment (using
auto
to define the type of the variable) is:
auto lambdaFun = [this]() { this->member(); };
Lambda functions are not yet supported by the g++
compiler.
To accomodate for these situations the C++0x standard introduces polymorphous (function object) wrappers. Polymorphous wrappers can refer to function pointers, member functions or functors, as long as they match in type and number of their parameters.
Polymorphous wrappers are not yet fully supported by the g++
compiler.
Using these distrbutions is based on new randum number generator functions, differing from the traditional rand(3) function provided by the C standard library. These random number generators are used to produce pseudo-random numbers, which are then filtered through a particular distribution to obtain values that are randomly selected from the specified distributino.
Class template | Integral/Floating point | Quality | Speed | Size of state |
linear_congruential | Integral | Medium | Medium | 1 |
subtract_with_carry | Both | Medium | Fast | 25 |
mersenne_twister | Integral | Good | Fast | 624 |
The linear_congruential
random number generator computes
value
i+1 = a * value
i + c % m
a
, the additive constant c
and the modulo value
m
. E.g.,
linear_congruential<int, 10, 3, 13> lc;The
linear_congruential
generator may also be seeded by providing its
constructor with a seeding-argument. E.g., lc(time(0))
.
The subtract_with_carry
random number generator computes
value
i = value
i-s - value
i-r - carry
i-1 % m
m
, the subtractive constants s
and r
,
respectively. E.g.,
subtract_with_carry<int, 13, 3, 13> sc;The
subtract_with_carry
generator may also be seeded by providing its
constructor with a seeding-argument. E.g., sc(time(0))
.
The predefined mersenne_twister mt19937
(predefined using a typedef
defined by the random
header file) is used in the
examples below. It can be constructed using
std::mt19937
mt
or it can
be seeded using an argument (e.g., std::mt19937 mt(time(0))
). Other ways
to initialize the mersenne_twister
are byond the scope of the C++
Annotations (but see Lewis
et al. (
Lewis, P.A.W., Goodman, A.S., and Miller, J.M. (1969), A pseudorandom
number generator for the System/360, IBM Systems Journal, 8, 136-146.) (1969)).
The random number generators may also be seeded by calling their members
seed()
which accepts an unsigned long
or a generator function (e.g.,
lc.seed(lc), lc.seed(mt)
).
The random number generators implement members min()
and max()
returning, respectively, their minimum and maximum values (inclusive). If a
reduced range is required the generators can be nested in a function or class
changing its range.
All distributions offer the following members (result_type being the
type name of the values returned by the distribution, distribution-name
being the type name of the mathematical distribution, e.g.,
bernoulli_distribution
):
result_type operator()(RandomNumberGenerator):
The next randomly generated value is returned.
std::istream &operator<<(std::istream &in,
distribution-name &object):
A parameters of the distribution are extracted from an
std::istream
;
std::ostream &operator<<(std::ostream &out,
bernoulli_distribution const &bd):
A parameters of the distribution are inserted into an
std::ostream
Distributions:
bernoulli_distribution
true
values are generated with a certain probability
p
.bernoulli_distribution(double p = 0.5)
binomial_distribution<IntType = int, RealType = double>
IntType
: the type of the generated random value (by default
int
). RealType
: the probability parameter of the binomial
distribution.binomial_distribution<IntType, RealType>(IntType t,
RealType p =
RealType(0.5))
gamma_distribution<Type = double>
Type
: the parameter of the gamma distribution.gamma_distribution<IntType, Type>(Type alpha = Type(1))
geometric_distribution<IntType = int, RealType = double>
IntType
: the type of the generated random value (by default
int
). RealType
: the probability parameter of the geometric
distribution.geometric_distribution<IntType, RealType>(RealType p =
RealType(0.5))
exponential_distribution<Type = double>
Type
: the parameter of the exponential distribution.exponential_distribution<IntType, Type>(Type lambda =
Type(1))
normal_distribution<Type = double>
Type
: the type of the generated random value (by default
double
). normal_distribution<Type>(ReadType mean = Type(0),
Type sigma = Type(1))
poisson_distribution<IntType = int, RealType = double>
IntType
: the type of the generated random value (by default
int
). RealType
: the type of the parameter of the distribution (by default
double
). mean
.
Constructor:
poisson_distribution<IntType, RealType>(RealType mean =
RealType(1))
uniform_int<Type = int>
Type
: the type of the generated random value (by default
int
). uniform_int<Type>(Type min = 0, Type max = 9)
Type operator()(RandomNumberGenerator), Type upper
uniform_real<Type = real>
Type
: the type of the generated random value (by default
real
). uniform_real<Type>(Type min = Type(0), Type max = Type(1))
GRN * (max - min) + minwhich might not be what you'd expect.
Some of the distributions mentioned below appear not yet to be fully
operational in the STL. This appears to be the case with the
binomial_distribution
, the gamma_distribution
, the
normal_distribution
, and the poisson_distribution
. These distributions
will likely become fulle operational in the near future.
Here is an example showing the outcome of a statistical experiment consisting of throwing an honest coin 20 times:
#include <iostream> #include <random> #include <ctime> using namespace std; int main() { std::mt19937 mt(time(0)); bernoulli_distribution bd; for (int idx = 0; idx < 20; ++idx) cout << (bd(mt) ? "heads" : "tail") << (idx + 1 == 10 ? '\n' : ' '); cout << endl; }