There are a couple of sorting algorithms in C++ from std::sort to lamdbas, each tailored to different use cases. As a programmer, you may or may not want to delve into the depths of sort algorithms. That’s a domain left for experts. Luckily, it’s quite easy to get started with sorting data.
Contents
- 1Sorting methods in C++
- 2Sorting "Hello World!"
- 3Sorting structs and classes
- 3.1Member operator < overload
- 3.2Non-member operator < overload
- 3.3Sort predicate examples
- 3.3.1Lambda as a sort predicate
- 3.3.2Free method (non-member method) sort predicate
- 3.3.3Member method sort predicate
- 3.3.4Sorting with template predicates
- 4Sorting std::list
Sorting methods in C++
With C++, there is a standardized sort method readily available. It’s called std::sort
and works by calling it with a range it should sort, usually a begin
-iterator and an end
-iterator. Normal use cases are sorting a vector or an array, but any container which implements the RandomAccess
-iterator can be sorted. Containers like std::list
can’t be sorted with std::sort
. std::sort
is available in the <algorithm>
header.
If you try to sort std::list
with std::sort
, you’ll get a compile error. The error will be a variation of not being able to use operator -
on the std::list
iterator type. If you want to sort a list, you’ll have to use std::list::sort
.
Using qsort
to sort structs or classes will not be covered.
What algorithm std::sort
will use, is left to the implementation, but the complexity should not exceed O(N log N)
where N
is number of elements to sort. Implementations are usually implementing std::sort
by combining different sort algorithms with different strengths and weaknesses. One such algorithm is IntroSort.
IntroSort is a hybrid sort algorithm, invented in 1997 by David Musser. The goal of IntroSort is to provide a sorting algorithm, which has a fast average performance and an optimal worst case performance. It combines Quicksort and Heapsort, and combines the good parts from both algorithms. Due to this, IntroSort has an average performance and a worst case performance of O(n log n)
, where N
is number of elements.
In theory, sorting numbers and sorting any data is not different. They use the same algorithm, the same range, and the same method. The only difference is the comparison between elements. But that’s standardized too. The real difference is the method comparing the elements.
To sort a collection of any data (also known as POD, Plain Old Data) with STL we need to tell it how to sort these objects. This chapter will go through many of those methods.
Sorting “Hello World!"
Most tutorials about sorting starts with sorting some numbers. But any container with RandomIterator
can be sorted with std::sort
. And as it happens to be, std::string
implements RandomIterator
.
<span class="ot">#include <algorithm></span>
<span class="ot">#include <string></span>
<span class="ot">#include <iostream></span>
std::string hello{ <span class=“st”>“Hello World!"</span> };
std::sort(hello.begin(), hello.end());
std::cout << hello << <span class=“st”>"</span><span class=“ch”>\n</span><span class=“st”>"</span>;
Output will be !HWdellloor
. Notice the space in front, which is the first character.
Sorting structs and classes
If something can be compared by the less-than operator <
, it can be sorted when the data is in a container supporting random access.
Creating a container with your data is simple enough. C++ is all about ‘pay for what you use’.
Given the following data,
<span class="kw">struct</span> some_data
{
some_data() = <span class="kw">default</span>;
some_data(<span class="dt">int</span> val) : a(val) {}
<span class="dt">int</span> a;
};
Creating a container like so is straight forward with an initializer_list
.
std::vector<some_data> elements = { <span class="dv">0</span>,<span class="dv">9</span>,<span class="dv">2</span>,<span class="dv">7</span>,<span class="dv">3</span>,<span class="dv">5</span>,<span class="dv">7</span>,<span class="dv">3</span> };
When you build and run this code, you’ll get a std::vector
with 8 elements of some_data
. When you try to sort this using std::sort
, you’ll get an error stating the compiler could not find a suitable operator <
for comparing two instances of some_data
.
There are three ways of solving this,
- In-class definition, commonly known as member overload.
- Out of class definition, commonly known as non-member overload.
- By a predicate, and that predicate can be any method matching the signature of the predicate.
Member operator < overload
The member operator <
overload is the simplest and easiest if you’re able to change the class/struct itself. This will not work with other classes/structures, whose you have no control over (be classes from other libraries) or when sharing structs between C and C++.
Defining your own operators, is also known as operator overloading. And in this case, it’s relational operator overloading.
Some may argue one must use friend
to overload operators, but that is not true. It’s only necessary to use friend
method if and only if the comparison should have access to members declared private
in the struct or class.
Using some_data
from above, make a method with the following signature in the struct: bool operator<(const some_data & rhs) const
. The abbreviation rhs
means right hand side
, while lhs
is left hand side
.
The struct will then look like:
<span class="kw">struct</span> some_data
{
some_data() = <span class="kw">default</span>;
some_data(<span class="dt">int</span> val) : a(val) {}
<span class="dt">int</span> a;
<span class="dt">bool</span> <span class="kw">operator</span><(<span class="dt">const</span> some_data & rhs) <span class="dt">const</span>
{
<span class="kw">return</span> a < rhs.a;
}
};
A call to std::sort
will then compile and call the inline and overloaded less-than operator.
std::sort(elements.begin(), elements.end());
Non-member operator < overload
When you’re not able to change the class or struct itself, a non-member operator <
overload can be used.
Outside of the class, define this method:
<span class="dt">bool</span> <span class="kw">operator</span><(<span class="dt">const</span> some_data & lhs, <span class="dt">const</span> some_data & rhs)
{
<span class="kw">return</span> lhs.a < rhs.a;
}
Using std::sort
have the exact same syntax, and the results are identical.
std::sort(elements.begin(), elements.end());
Sort predicate examples
In a written or spoken language, a predicate is a part of a sentence, which “states something” or “tells something” about a subject (or a thing). In computer science and C++, a predicate is much of the same thing. Given n inputs, it can tell something about the object or objects being compared .
With sorting, a predicate states if object A comes before object B, thus a search predicate must have two arguments of same type.
Continuing the use of some_data
above, a search predicate could be the operator <
-overload (both member and non-member). However, when using a matching operator <
overload in the same namespace as the struct itself, all sort operations will use that overload. If you want to customize the sort order in certain cases, predicates will let you define a case-by-case sort orders.
Pretend you’re writing a system for a car dealership. A car has certain properties like make, model, year and price.
<span class="kw">struct</span> car
{
std::string make;
std::string model;
<span class="dt">int</span> year;
<span class="dt">double</span> price;
};
Sorting a car against another car does not make any sense, but sorting cars based on what make and model makes sense, also by year and how much the car costs.
There are a couple ways of sorting the car
struct, here are using a lambda or a method.
Lambda as a sort predicate
Using a lambda as a sort predicate is probably the easiest and most efficient way of implementing predicates. The reason behind that logic, is that a lambda is an anonymous method which is implemented at the call site. As a bonus, it’s easier for the compiler to inline and optimize the lambda than a method defined elsewhere. It’s both easier to read and more efficient than the alternate methods.
Using the the car
-struct, a c++ predicate lambda for sorting by make and model may be implemented like this.
<span class="kw">auto</span> sortByMakeAndModel =
[](<span class="dt">const</span> car & a, <span class="dt">const</span> car & b) -> <span class="dt">bool</span>
{
<span class="kw">if</span> (a.make < b.make) <span class="kw">return</span> <span class="kw">true</span>;
<span class="kw">if</span> (a.make > b.make) <span class="kw">return</span> <span class="kw">false</span>;
<span class="kw">if</span> (a.model < b.model) <span class="kw">return</span> <span class="kw">true</span>;
<span class="kw">if</span> (a.model > b.model) <span class="kw">return</span> <span class="kw">false</span>;
<span class="kw">return</span> <span class="kw">false</span>;
};
There are two more ways of creating a predicate with multiple elements to sort with, one where operator <
is always used and the third where everything is a big boolean expression.
Using only operator <
.
<span class="kw">auto</span> sortByMakeAndModel_v2 =
[](<span class="dt">const</span> car & a, <span class="dt">const</span> car & b) -> <span class="dt">bool</span>
{
<span class="kw">if</span> (a.make < b.make) <span class="kw">return</span> <span class="kw">true</span>;
<span class="kw">if</span> (b.make < a.make) <span class="kw">return</span> <span class="kw">false</span>;
<span class="kw">if</span> (a.model < b.model) <span class="kw">return</span> <span class="kw">true</span>;
<span class="kw">if</span> (b.model < a.model) <span class="kw">return</span> <span class="kw">false</span>;
<span class="kw">return</span> <span class="kw">false</span>;
};
Big boolean expression.
<span class="kw">auto</span> sortByMakeAndModel_v3 =
[](<span class="dt">const</span> car & a, <span class="dt">const</span> car & b) -> <span class="dt">bool</span>
{
<span class="kw">return</span>
(
(a.make < b.make) ||
(a.make == b.make && a.model < b.model)
);
};
All three lambdas generate the same sort order and have identical semantics. The difference is how they are compiled to machine code. During my testing in Release builds, the lambdas sortByMakeAndModel
and sortByMakeAndModel_v2
got merged into one lambda, while the third, sortByMakeAndModel_v3
got completely inlined.
Inlining in computer science, means that the contents of the method is taken out and inserted at the call site. It’s as if there is no function, only the contents of the method is copied every place the inlined method is used.
std::sort(first, last, sortByMakeAndModel);
std::sort(first, last, sortByMakeAndModel_v2);
std::sort(first, last, sortByMakeAndModel_v3);
The output in any case is:
Sort by make and model
1. Ford Mondeo 2010 (25000)
2. Porsche 911 1999 (65000)
3. Volvo V70 2007 (20000)
4. Volvo V90 2016 (55000)
Sorting by the other variables, year
and price
are straight forward comparison. And during my testing, both methods were completely inlined.
Sort by year.
<span class="kw">auto</span> sortByYear = [](<span class="dt">const</span> car & a, <span class="dt">const</span> car & b) -> <span class="dt">bool</span>
{
<span class="kw">return</span> a.year < b.year;
};
Used like:
std::sort(first, last, sortByYear);
After sorting by year, the list looks like this.
Sort by year
1. Porsche 911 1999 (65000)
2. Volvo V70 2007 (20000)
3. Ford Mondeo 2010 (25000)
4. Volvo V90 2016 (55000)
Sort by price.
<span class="kw">auto</span> sortByPrice = [](<span class="dt">const</span> car & a, <span class="dt">const</span> car & b) -> <span class="dt">bool</span>
{
<span class="kw">return</span> a.price < b.price;
};
Used like:
std::sort(first, last, sortByPrice);
After sorting by price, the vector looks like this:
Sort by price
1. Volvo V70 2007 (20000)
2. Ford Mondeo 2010 (25000)
3. Volvo V90 2016 (55000)
4. Porsche 911 1999 (65000)
Free method (non-member method) sort predicate
A free method (non-member method) is any method not defined in a struct or a class. It behaves much like a lambda, but can also be present in multiple compilation units. For performance reasons, using a non-member method should be inline. We can give the compiler a hint to inline the method by using inline
.
Sorting the car
structure by year is much like the sortByYear
-lambda.
<span class="kw">inline</span> <span class="dt">bool</span> sortYear(<span class="dt">const</span> car & a, <span class="dt">const</span> car & b)
{
<span class="kw">return</span> a.year < b.year;
}
Using the non-member method is also almost identical to the lambda.
std::sort(first, last, &sortYear);
And as expected, the cars are sorted by year.
Sort by year, non-member method
1. Porsche 911 1999 (65000)
2. Volvo V70 2007 (20000)
3. Ford Mondeo 2010 (25000)
4. Volvo V90 2016 (55000)
Member method sort predicate
Using a member method requires an instance of the class or struct. C++11 introduced std::bind
(previously boost::bind
). It makes it simpler to bind to a method in a class, but care must be taken when constructing the std::bind
object.
We want to use a C++ sort predicate member function in the following class.
<span class="kw">struct</span> car_sort
{
<span class="dt">bool</span> sortYear(<span class="dt">const</span> car & a, <span class="dt">const</span> car & b) <span class="dt">const</span>
{
<span class="kw">return</span> a.year < b.year;
}
};
There must be an instance of the class available, car_sort sort_instance;
, and the sortYear
method must be public.
Then is it possible to bind to the method.
std::sort(first, last, std::bind(&car_sort::sortYear,
sort_instance, std::placeholders::_1,
std::placeholders::_2));
The syntax may look frightening, but there is a reason for each part. The descriptions are on the following source listing, where each element is on it’s own line.
std::sort( <span class="co">// (1) Call to std::sort</span>
first, <span class="co">// (2) First iterator</span>
last, <span class="co">// (3) End iterator</span>
std::bind( <span class="co">// (4) Predicate, which is std::bind</span>
&car_sort::sortYear, <span class="co">// (5) Pointer to member method</span>
sort_instance, <span class="co">// (6) Instance of struct</span>
std::placeholders::_1, <span class="co">// (7) Placeholder for argument 1</span>
std::placeholders::_2 <span class="co">// (8) Placeholder for argument 2</span>
)
);
While 1, 2 and 3 are the same as before, 4-8 are new and requires some explanation. Item 4 tells the compiler the predicate is an object produced by std::bind, and the predicate when called will use method 5, with instance 6, and put the first argument into placeholder _1
(7) and the second argument into placeholder _2
(8).
The output is still valid.
Sort by year, member method
1. Porsche 911 1999 (65000)
2. Volvo V70 2007 (20000)
3. Ford Mondeo 2010 (25000)
4. Volvo V90 2016 (55000)
Using a free method, std::bind
or std::function
is more likely to incur a runtime cost, because the sort predicate method to be called is essentially a function pointer. And as with any pointer, it may be changed during runtime. The compiler may be smart and optimize away the pointer, if and only if it can see the pointer will not be changed and can not be changed during runtime. This is a prime example for dynamic binding.
With lambdas, you’re more inclined to get static binding. That means the function to be called may and will be resolved at compile time. It’s not possible to change the static binding during runtime.
Sorting with template predicates
To the std::sort method we simply supply a template predicate! It’s just a method taking 2 arguments of that type and apply our logic to tell if the first argument is lesser than the last argument. With templates it becomes a little more complex, but still it’s very simple when you know how it works.
<span class="ot">#include <iostream> </span><span class="co">// Provides cout</span>
<span class="ot">#include <vector> </span><span class="co">// Provides our basic container</span>
<span class="ot">#include <algorithm> </span><span class="co">// Provides std::sort</span>
<span class=“co”>// Structure to be sorted</span>
<span class=“kw”>template</span><<span class=“kw”>typename</span> T>
<span class=“kw”>struct</span> generic_data
{
generic_data(size_t i, size_t t, T data)
: id(i)
, time(t),
other_data(data)
{}
size_t id;
size_t time;
T other_data;
};
<span class=“kw”>using</span> A = generic_data<<span class=“dt”>bool</span>>;
<span class=“kw”>using</span> B = generic_data<std::string>;
<span class=“kw”>using</span> C = generic_data<<span class=“dt”>char</span>*>;
<span class=“co”>// Our predicate</span>
<span class=“co”>//</span>
<span class=“kw”>template</span><<span class=“kw”>typename</span> CLASS>
<span class=“dt”>bool</span> timeSortPredicate(<span class=“dt”>const</span> CLASS &a, <span class=“dt”>const</span> CLASS &b)
{
<span class=“kw”>return</span> a.time < b.time;
}
<span class=“kw”>template</span><<span class=“kw”>typename</span> CLASS>
<span class=“dt”>bool</span> idSortPredicate(<span class=“dt”>const</span> CLASS &a, <span class=“dt”>const</span> CLASS &b)
{
<span class=“kw”>return</span> a.id < b.id;
}
<span class=“co”>// Our collection printer</span>
<span class=“co”>//</span>
<span class=“kw”>template</span><<span class=“kw”>typename</span> CLASS>
<span class=“dt”>void</span> print(<span class=“dt”>const</span> CLASS &c)
{
<span class=“kw”>for</span> (<span class=“dt”>const</span> <span class=“kw”>auto</span> & item : c)
{
std::cout << <span class=“st”>“Id: “</span> << item.id <<
<span class=“st”>”, time: “</span> << item.time <<
<span class=“st”>”, data: “</span> << item.other_data << <span class=“st”>"</span><span class=“ch”>\n</span><span class=“st”>"</span>;
}
}
The method template_predicate_sort
is a test method of some sort, showing how to use the templated predicates. The clue is to explicitly tell the compiler what type the predicate should have. In this case, either of type A
, B
or C
.
<span class="dt">void</span> template_predicate_sort()
{
<span class="co">// Construct collections by initalizer lists</span>
std::vector<A> alist = {
{<span class="dv">1</span>, <span class="dv">15</span>, <span class="kw">false</span>},
{<span class="dv">2</span>,<span class="dv">20</span>, <span class="kw">true</span>},
{<span class="dv">3</span>,<span class="dv">14</span>, <span class="kw">false</span>}
};
std::vector<B> blist = {
{<span class="dv">3</span>, <span class="dv">4</span>, <span class="st">"first"</span>},
{<span class="dv">2</span>, <span class="dv">10</span>, <span class="st">"second"</span>},
{<span class="dv">1</span>, <span class="dv">12</span>, <span class="st">"third"</span>}
};
std::vector<C> clist = {
{<span class="dv">11</span>, <span class="dv">0</span>, <span class="st">"char literal 1"</span>},
{<span class="dv">12</span>, <span class="dv">0</span>, <span class="st">"char literal 2"</span>},
{<span class="dv">13</span>, <span class="dv">0</span>, <span class="st">"char literal 3"</span>}
};
<span class="co">// Print stuff</span>
std::cout << <span class="st">"Original collections.</span><span class="ch">\n</span><span class="st">"</span>;
std::cout << <span class="st">"Collection A</span><span class="ch">\n</span><span class="st">"</span>;
print(alist);
std::cout << <span class="st">"Collection B</span><span class="ch">\n</span><span class="st">"</span>;
print(blist);
std::cout << <span class="st">"Collection C</span><span class="ch">\n</span><span class="st">"</span>;
print(clist);
std::cout << <span class="st">"Sorting collections by time</span><span class="ch">\n</span><span class="st">"</span>;
sort(alist.begin(), alist.end(), timeSortPredicate<A>);
sort(blist.begin(), blist.end(), timeSortPredicate<B>);
sort(clist.begin(), clist.end(), timeSortPredicate<C>);
<span class="co">// Print stuff</span>
std::cout << <span class="st">"Collection A</span><span class="ch">\n</span><span class="st">"</span>;
print(alist);
std::cout << <span class="st">"Collection B</span><span class="ch">\n</span><span class="st">"</span>;
print(blist);
std::cout << <span class="st">"Collection C</span><span class="ch">\n</span><span class="st">"</span>;
print(clist);
std::cout << <span class="st">"Sorting collections by id</span><span class="ch">\n</span><span class="st">"</span>;
sort(alist.begin(), alist.end(), idSortPredicate<A>);
sort(blist.begin(), blist.end(), idSortPredicate<B>);
sort(clist.begin(), clist.end(), idSortPredicate<C>);
<span class="co">// Print stuff</span>
std::cout << <span class="st">"Collection A</span><span class="ch">\n</span><span class="st">"</span>;
print(alist);
std::cout << <span class="st">"Collection B</span><span class="ch">\n</span><span class="st">"</span>;
print(blist);
std::cout << <span class="st">"Collection C</span><span class="ch">\n</span><span class="st">"</span>;
print(clist);
}
Compile with:
g++ sortpredicate.cpp -o sortpredicate
The output should be:
Original collections.
Collection A
Id: 1, time: 15, data: 0
Id: 2, time: 20, data: 1
Id: 3, time: 14, data: 0
Collection B
Id: 3, time: 4, data: first
Id: 2, time: 10, data: second
Id: 1, time: 12, data: third
Collection C
Id: 11, time: 0, data: char literal 1
Id: 12, time: 0, data: char literal 2
Id: 13, time: 0, data: char literal 3
Sorting collections by time
Collection A
Id: 3, time: 14, data: 0
Id: 1, time: 15, data: 0
Id: 2, time: 20, data: 1
Collection B
Id: 3, time: 4, data: first
Id: 2, time: 10, data: second
Id: 1, time: 12, data: third
Collection C
Id: 11, time: 0, data: char literal 1
Id: 12, time: 0, data: char literal 2
Id: 13, time: 0, data: char literal 3
Sorting collections by id
Collection A
Id: 1, time: 15, data: 0
Id: 2, time: 20, data: 1
Id: 3, time: 14, data: 0
Collection B
Id: 1, time: 12, data: third
Id: 2, time: 10, data: second
Id: 3, time: 4, data: first
Collection C
Id: 11, time: 0, data: char literal 1
Id: 12, time: 0, data: char literal 2
Id: 13, time: 0, data: char literal 3
Sorting std::list
Sorting a std::list
is almost the same as sorting a vector or a string, except with std::list
the sort method is a member function. Except this difference, the std::list::sort
method doesn’t accept a range, and it comes with two overloads. One without any predicate, and one with a predicate. The usage is identical to what was covered earlier.
Given a list of numbers, just call sort()
with no arguments to sort by the default operator <
.
std::list<<span class="dt">int</span>> numbers = { <span class="dv">0</span>,<span class="dv">9</span>,<span class="dv">1</span>,<span class="dv">2</span>,<span class="dv">8</span>,<span class="dv">4</span>,<span class="dv">6</span> };
numbers.sort();
The output is:
Numbers sorted: 0 1 2 4 6 8 9
The same applies to sorting a struct. For simplicity, we’ll use the car
struct from before and the sortYear
predicate from before. Using lambdas would be
std::list<car> cars = {
{ <span class="st">"Volvo"</span>, <span class="st">"V70"</span>, <span class="dv">2007</span>, <span class="fl">20000.0</span> },
{ <span class="st">"Porsche"</span>, <span class="st">"911"</span>, <span class="dv">1999</span>, <span class="fl">65000.0</span> },
{ <span class="st">"Ford"</span>, <span class="st">"Mondeo"</span>, <span class="dv">2010</span>, <span class="fl">25000.0</span> },
{ <span class="st">"Volvo"</span>, <span class="st">"V90"</span>, <span class="dv">2016</span>, <span class="fl">55000.0</span> }
};
<span class=“co”>// Sort by free method predicate</span>
cars.sort(&sortYear);
<span class=“co”>// Sorting a list with a lambda as predicate</span>
<span class=“kw”>auto</span> sortYearPredicate = [](<span class=“dt”>const</span> car & a, <span class=“dt”>const</span> car & b) -> <span class=“dt”>bool</span>
{
<span class=“kw”>return</span> a.year < b.year;
};
cars.sort(sortYearPredicate);
The output is:
Cars sorted in a list
1. Porsche 911 1999 (65000)
2. Volvo V70 2007 (20000)
3. Ford Mondeo 2010 (25000)
4. Volvo V90 2016 (55000)
See Also
- C++ Concepts: New keywords
- C++ variadic template example
- C++ variadic templates
- Default explicit equality operator
- Mistakes