C++11 Stream Iterators feel like Python

July 9, 2013

The C++11 standard has plenty of new goodies that make a programmer’s life easier without sacrificing rigor.

Here is a neat example from the book “The C++ Programming Language” by Bjarne Stroustrup,
whose 4th edition covers the C++11 standard.

It is found on page 107, Section 4.5.3, Stream Iterators.

Let’s consider the tasks of:

Reading a list of words from an input file
Sorting them
Eliminating duplicates and
Writing them down into an output file

This can be done with the classic Unix shell commands:

cat inputfile | sort | uniq > outputfile

Here we expect “inputfile” to contain a list of words separated by newlines.

How do we write the equivalent as a C++11 program?

Let the code (and Stroustrup) speak:

#include <iterator>
#include <string>
#include <fstream>
#include <iostream>
#include <vector>
#include <algorithm>

int main() {

 std::string inputfilename, outputfilename;

 std::cin >> inputfilename >> outputfilename;

 std::ifstream inputfile { inputfilename };
 std::ofstream outputfile { outputfilename };

 std::istream_iterator< std::string > isitr { inputfile };
 std::ostream_iterator< std::string > ositr { outputfile, "\n" };

 std::istream_iterator< std::string > eos {};

 std::vector< std::string > str { isitr, eos };

 std::sort( str.begin(), str.end() );

 std::unique_copy( str.begin(), str.end(), ositr );

 return !inputfile.eof() || !outputfile;

}

This code

Reads two filenames from the standard input
Opens one file for input
Opens one file for output
Associates one iterator to the input file
Associates one iterator to the output file, along with a separator “\n”
Creates a vector of strings and initializes it from the input iterator pair

At this point the whole file has been read into memory and stored in the vector, one whitespace-separated word per element

Calls std::sort to sort the vector in place (the read already happened while the vector was constructed)
Once sorted, copies the unique entries to the output file through the output iterator
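
The steps above can be tried without touching the file system. What follows is a minimal sketch, not the book's code: it swaps the two file streams for std::istringstream and std::ostringstream, and the function name sort_unique_words is made up for this illustration. It shows that the read happens while the vector is being constructed, and that std::istream_iterator< std::string > splits its input on any whitespace, not only on newlines.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the book): runs the same
// read / sort / unique_copy pipeline on an in-memory string.
std::string sort_unique_words(const std::string& text) {
  std::istringstream input{text};
  std::ostringstream output;

  using istritr = std::istream_iterator<std::string>;

  // Constructing the vector from the iterator pair consumes the whole stream.
  // Words are extracted with operator>>, so any whitespace separates them.
  std::vector<std::string> words{istritr{input}, istritr{}};

  std::sort(words.begin(), words.end());

  // unique_copy only drops consecutive duplicates, hence the sort first.
  std::unique_copy(words.begin(), words.end(),
                   std::ostream_iterator<std::string>{output, "\n"});

  return output.str();
}
```

Calling sort_unique_words("pear apple\npear  fig\n") gives back the three lines apple, fig and pear, in that order.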

This can be written a bit more compactly (same book, page 108) as:

#include <iterator>
#include <string>
#include <fstream>
#include <iostream>
#include <set>
#include <algorithm>

int main(int argc, const char * argv [] ) {

  std::ifstream inputfile  { argv[1] };
  std::ofstream outputfile { argv[2] };

  using istritr = std::istream_iterator< std::string >;
  using ostritr = std::ostream_iterator< std::string >;

  std::set< std::string > words { istritr { inputfile }, istritr {} };

  std::copy( words.begin(), words.end(), ostritr { outputfile, "\n" } );

  return !inputfile.eof() || !outputfile;
}

By using a std::set instead of a std::vector, we get uniqueness and sorted order at the same time.

In particular, both properties are maintained as each new element is inserted into the set.
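
A small sketch of that behaviour, assuming an in-memory list of words rather than a file (the helper name count_rejected is hypothetical): std::set::insert reports whether the element was actually added, so duplicates can be seen being rejected one at a time, while iterating over the set always yields sorted order.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: inserts the words one at a time into `words` and
// counts how many insertions were rejected as duplicates.
std::size_t count_rejected(const std::vector<std::string>& input,
                           std::set<std::string>& words) {
  std::size_t rejected = 0;
  for (const auto& w : input) {
    // insert() returns a pair; .second is false when w was already present.
    if (!words.insert(w).second) {
      ++rejected;
    }
  }
  return rejected;
}
```

Feeding pear, apple, pear, fig into an empty set rejects one insertion, and iterating over the set afterwards visits apple, fig, pear.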

…and for the offended Pythonists out there…

OK, you are right: it is not quite as short as it could be in Python.

Here is an attempt to write the same thing as a Python script:

import sys

infile = open(str(sys.argv[1]))
outfile = open(str(sys.argv[2]), 'w')

outfile.writelines(sorted(list(set( infile.readlines() ))))

infile.close()
outfile.close()

Somehow it seems relevant to cite here the output of the Python command

import this

which prints the Zen of Python, by Tim Peters:

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren’t special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one– and preferably only one –obvious way to do it.
Although that way may not be obvious at first unless you’re Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it’s a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea — let’s do more of those!

There is a lot in here that C++ and Python developers can agree upon.

5 comments to C++11 Stream Iterators feel like Python

Jean-Christophe Fillion-Robin says:

July 9, 2013 at 8:13 am

Very nice.

An extra line can be removed :p See https://github.com/luisibanez/Cxx11/pull/1

Luis Ibanez says:

July 9, 2013 at 8:19 am

You are quite right !

Your improvement has now been merged:
https://github.com/luisibanez/Cxx11/commit/680cdfe72d65314af5f240aa058523fed26af20e

Thanks !

Jean-Christophe Fillion-Robin says:

July 9, 2013 at 8:25 am

And ‘string’ header can also be removed. See https://github.com/luisibanez/Cxx11/pull/2

Gert Wollny says:

July 9, 2013 at 9:52 am

I really love C++11, but the examples above barely show anything new. Stream iterators existed before (see, e.g., Josuttis, “The C++ Standard Library”, Addison-Wesley 1999).

Replace the “using” by “typedef” and the “{}” by “()”, i.e. the initializer lists by the constructors, and you get a well-formed C++98 program that doesn’t look very different, i.e.

#include <iterator>
#include <string>
#include <fstream>
#include <iostream>
#include <set>
#include <algorithm>

int main(int argc, const char * argv [] )
{
  std::ifstream inputfile ( argv[1] );
  std::ofstream outputfile ( argv[2] );

  typedef std::istream_iterator< std::string > istritr;
  typedef std::ostream_iterator< std::string > ostritr;

  std::set< std::string >
    words( (istritr(inputfile)), istritr() );

  std::copy( words.begin(), words.end(),
             (ostritr( outputfile, "\n" )));

  return !inputfile.eof() || !outputfile;
}

regards,

Luis Ibanez says:

July 10, 2013 at 8:17 pm

Gert,

Thanks for the clarification and your C++98 vs C++11 correction.

Your point is well taken.
