[Beowulf] C vs C++ challenge
Andrew Shewmaker
shewa at inel.gov
Mon Jan 26 20:17:35 EST 2004
Jakob Oestergaard wrote:
> Let me throw out a challenge:
>
> "Write a program in C that reads from standard input until EOF, and
> prints the number of distinct words found in that input"
>
<snip>
>
> I use an RB-tree to store the words I find, and print the number of
> elements in the final tree.
>
> A comparable C solution shouldn't be more than a few thousand lines ;)
>
> Oh, right, I cheated and used STL. Well, STL is a big part of why C++
> is really handy, but go right ahead and use "glib" or whatever.
>
> I still guarantee you two things:
> 1) Your code will be longer
> 2) Your program will be slower
>
> As always, I love to be proven wrong ;)
Here's a similar program...word frequency...instead of word count.
http://www.cs.bell-labs.com/cm/cs/pearls/sec151.html
His C++ code is shorter, but his C code is faster.
With a trivial modification, his C code can meet your challenge too.
http://www.cs.bell-labs.com/cm/cs/pearls/wordfreq.c
I downloaded the KJV of the Bible, Shakespeare's Complete Works,
War and Peace, part of the human genome, and a few other files off of
Project Gutenberg. The above C code was much faster on small files
and twice as fast when I lumped everything together into a 35 MB file
with 460000+ distinct words.
The only test I ran where C++ beat C (by 20+%) was a 187 MB file with over
2.5 million distinct words, where I used PGI's hash_set instead of its
regular set.
Have a great day,
Andrew
--
Andrew Shewmaker, Associate Engineer
Phone: 1-208-526-1415
Idaho National Eng. and Environmental Lab.
P.0. Box 1625, M.S. 3605
Idaho Falls, Idaho 83415-3605
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
More information about the Beowulf
mailing list