[Beowulf] C vs C++ challenge

Andrew Shewmaker shewa at inel.gov
Mon Jan 26 20:17:35 EST 2004


Jakob Oestergaard wrote:

 > Let me throw out a challenge:
 >
 > "Write a program in C that reads from standard input until EOF, and
 >  prints the number of distinct words found in that input"
 >
<snip>
 >
 > I use an RB-tree to store the words I find, and print the number of
 > elements in the final tree.
 >
 > A comparable C solution shouldn't be more than a few thousand lines ;)
 >
 > Oh, right, I cheated and used STL.  Well, STL is a big part of why C++
 > is really handy, but go right ahead and use "glib" or whatever.
 >
 > I still guarantee you two things:
 > 1) Your code will be longer
 > 2) Your program will be slower
 >
 > As always, I love to be proven wrong  ;)

Here's a similar program...word frequency...instead of word count.

http://www.cs.bell-labs.com/cm/cs/pearls/sec151.html

His C++ code is shorter, but his C code is faster.

With a trivial modification, his C code can meet your challenge too.

http://www.cs.bell-labs.com/cm/cs/pearls/wordfreq.c

I downloaded the KJV of the Bible, Shakespeare's Complete Works,
War and Peace, part of the human genome, and a few other files off of
Project Gutenberg.  The above C code was much faster on small files
and twice as fast when I lumped everything together into a 35 MB file
with 460000+ distinct words.

The only test I ran where C++ beat C (by 20+%) was a 187 MB file with over
2.5 million distinct words, where I used PGI's hash_set instead of its
regular set.

Have a great day,

Andrew

-- 
Andrew Shewmaker, Associate Engineer
Phone: 1-208-526-1415
Idaho National Eng. and Environmental Lab.
P.0. Box 1625, M.S. 3605
Idaho Falls, Idaho 83415-3605

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf



More information about the Beowulf mailing list