[Beowulf] C vs C++ challenge

Trent Piepho xyzzy at speakeasy.org
Sun Feb 1 05:57:37 EST 2004

> I could easily optimize it more (do the work on a larger buffer at
> once), but I think enough waste heat has been created here.  This is a
> simple 2500+ Athlon XP box (nothing fancy) running 2.4.24-pre3.

Enough time wasted on finding different solutions to a simple problem?  Surely
not.  Let me toss my hat into the ring:

              Awk     Perl       C    My program (C)
wrnpc10.txt  1.771   1.125   0.506     0.164
shaks12.txt  3.055   1.877   0.955     0.243
big.txt     20.339  12.792   5.858     1.196
vbig.txt   101.466  63.770  29.079     5.666

All times are from a dual PIII-1GHz on a ServerWorks board with 1GB of
dual-channel PC133 RAM.  Each time is wall-clock time, best of three runs.
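
For anyone who wants to reproduce "best of three, wall time" from inside a
program rather than with a shell timer, here is a minimal sketch.  It is not
how the numbers above were taken; it assumes a POSIX system with
clock_gettime(), and the names wall_seconds, best_of, and count_call are made
up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Current wall-clock reading in seconds from a monotonic clock (POSIX). */
static double wall_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Run fn(arg) `runs` times and return the shortest elapsed wall time in
 * seconds.  Returns -1.0 if runs <= 0.  (Hypothetical helper.) */
double best_of(int runs, void (*fn)(void *), void *arg)
{
    double best = -1.0;
    for (int i = 0; i < runs; i++) {
        double t0 = wall_seconds();
        fn(arg);
        double dt = wall_seconds() - t0;
        if (best < 0.0 || dt < best)
            best = dt;
    }
    return best;
}

/* Stand-in workload: just counts how many times it was called.
 * Replace it with the code you actually want to time. */
void count_call(void *arg)
{
    ++*(int *)arg;
}
```

Calling best_of(3, count_call, &calls) with an int calls = 0 runs the workload
three times and hands back the fastest run.  Taking the minimum rather than
the mean is a deliberate choice: it filters out runs disturbed by other
activity on the machine.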

The awk version is by Selva Nair, the Perl version by Joe Landman, and the C
version by Robert G. Brown.  The Java version isn't portable enough for me to
run (go Java!) and I didn't see the source for a C++/STL version.  Compiler
used was gcc 2.96, awk was 3.1.0, and perl was 5.6.1.

The actual results for shaks12.txt, which are of course never the same:
version  total   unique
awk     902299    31384
perl              23903
C       902299    37499
My      906912    27321
wc      901325

I considered words to be formed from 0-9, a-z, A-Z, and '.  Everything is
lower-cased.  The shaks12.txt file is complicated by its use of the single
quote both for quotations and for contractions.  I also have the list of
words and counts, sorted no less, but do not print it.
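
To make the word rule concrete, here is a plain sketch of it in C.  This is
not the program timed above, just one straightforward way to apply the same
definition: runs of 0-9, a-z, A-Z, and ' are words, and everything is
lower-cased before counting.  The chained hash table, the FNV-1a hash, the
fixed NBUCKETS size, the 256-character word cap, and all the wordtab_* names
are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 65536u          /* hypothetical table size, power of two */

struct entry {
    struct entry *next;
    size_t count, len;
    char word[];                 /* C99 flexible array member */
};

struct wordtab {
    struct entry *bucket[NBUCKETS];
    size_t total, unique;        /* words seen / distinct words */
};

struct wordtab *wordtab_new(void) { return calloc(1, sizeof(struct wordtab)); }

void wordtab_free(struct wordtab *t)
{
    for (size_t i = 0; t && i < NBUCKETS; i++)
        for (struct entry *e = t->bucket[i], *n; e; e = n) {
            n = e->next;
            free(e);
        }
    free(t);
}

/* Word characters per the rule above: digits, ASCII letters, apostrophe. */
static int is_word_char(unsigned char c)
{
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') ||
           (c >= 'A' && c <= 'Z') || c == '\'';
}

/* Count one already lower-cased word.  Returns 0, or -1 if out of memory. */
static int wordtab_add(struct wordtab *t, const char *w, size_t len)
{
    uint32_t h = 2166136261u;    /* 32-bit FNV-1a */
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)w[i];
        h *= 16777619u;
    }
    struct entry **slot = &t->bucket[h & (NBUCKETS - 1)];
    for (struct entry *e = *slot; e; e = e->next)
        if (e->len == len && memcmp(e->word, w, len) == 0) {
            e->count++;
            t->total++;
            return 0;
        }
    struct entry *e = malloc(sizeof *e + len + 1);
    if (!e)
        return -1;
    memcpy(e->word, w, len);
    e->word[len] = '\0';
    e->len = len;
    e->count = 1;
    e->next = *slot;
    *slot = e;
    t->total++;
    t->unique++;
    return 0;
}

/* Split buf[0..n) into words and count them.  Treats the end of the buffer
 * as a word boundary, so feed whole files rather than arbitrary chunks.
 * Words longer than 256 characters are truncated. */
int wordtab_feed(struct wordtab *t, const char *buf, size_t n)
{
    char word[256];
    size_t len = 0;
    for (size_t i = 0; i <= n; i++) {
        unsigned char c = i < n ? (unsigned char)buf[i] : ' ';
        if (is_word_char(c)) {
            if (c >= 'A' && c <= 'Z')
                c = (unsigned char)(c - 'A' + 'a');
            if (len < sizeof word)
                word[len++] = (char)c;
        } else if (len > 0) {
            if (wordtab_add(t, word, len) != 0)
                return -1;
            len = 0;
        }
    }
    return 0;
}
```

Note that under this rule a leading or trailing quote stays attached to the
word, so 'tis and tis count as different words.  That is exactly the
quotation-versus-contraction ambiguity mentioned above, and one reason the
unique counts differ between versions.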

I'll give you guys a few days, and see if anyone finds a solution before I
reveal my secrets.
