[Beowulf] C vs C++ challenge (java version)
Robert G. Brown
rgb at phy.duke.edu
Thu Jan 29 17:56:41 EST 2004
On Thu, 29 Jan 2004, Joe Landman wrote:
> On Thu, 2004-01-29 at 09:55, dc wrote:
> > file         size (bytes)  C++        j client   j server
> > wrnpc10.txt       3282452  0m2.746s   0m1.941s   0m1.643s
> > shaks12.txt       5582655  0m4.476s   0m3.321s   0m2.842s
> > big.txt          39389424  0m29.120s  0m13.972s  0m12.776s
> > vbig.txt        196947120  2m23.882s  1m5.707s   1m2.350s
> Where did these files come from? Would be nice to try out
> non-C++/non-Java solutions with.
At least shaks12.txt is from Project Gutenberg; a Google search for
"shaks12 txt" returns a direct link to it as the first hit. I haven't
tried the rest of them, but if a tool works for Shakespeare it SHOULD
work for any of the rest. The only dicey point will arise when the file
size starts to compete with available free memory, as one of the rules
of the "competition" is to save the list of words in addressable form.
At least my algorithm should scale boringly linearly up to the point
where the system is forced to clear buffers and cache and then swap.
Theirs appear to as well.
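One way to check the "scales linearly" claim against the quoted table is
to divide each run time by its file size and see how much the per-byte
cost moves. The C sketch below does that for the C++ column only. The
names timing_sample, usec_per_byte, scaling_spread and cxx_samples are
made up for illustration, and 2m23.882s has been converted by hand to
143.882 seconds.

```c
#include <assert.h>
#include <stddef.h>

/* One timing sample: input size in bytes and wall-clock time in seconds. */
typedef struct {
    double bytes;
    double seconds;
} timing_sample;

/* Cost per byte, in microseconds, for a single sample. */
double usec_per_byte(timing_sample s)
{
    assert(s.bytes > 0.0);
    return 1e6 * s.seconds / s.bytes;
}

/* Ratio of the largest to the smallest per-byte cost across n samples.
   A value close to 1.0 is consistent with linear scaling. */
double scaling_spread(const timing_sample *samples, size_t n)
{
    assert(samples != NULL && n > 0);
    double lo = usec_per_byte(samples[0]);
    double hi = lo;
    for (size_t i = 1; i < n; i++) {
        double c = usec_per_byte(samples[i]);
        if (c < lo) lo = c;
        if (c > hi) hi = c;
    }
    return hi / lo;
}

/* C++ column of the quoted table (sizes in bytes, times in seconds). */
const timing_sample cxx_samples[] = {
    {   3282452.0,   2.746 },  /* wrnpc10.txt */
    {   5582655.0,   4.476 },  /* shaks12.txt */
    {  39389424.0,  29.120 },  /* big.txt     */
    { 196947120.0, 143.882 },  /* vbig.txt    */
};
```

For the C++ numbers the per-byte cost drifts from roughly 0.84
microseconds on the smallest file down to roughly 0.73 on the largest,
which is close enough to flat to call linear over a 60x range in size.
Fixed startup overhead is the likely reason the small files look
slightly more expensive per byte, but that is a guess, not something the
table proves.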
So I'd suggest using Shakespeare from Gutenberg and maybe
/usr/share/dict/linux.words as a baseline, although the latter might
vary from version to version of Linux.
I'd also suggest that each contestant run their code at least once or
twice with the normally commented-out debugging lines enabled, to
demonstrate that the code actually is parsing the text and saving the
words (and that what it saves ARE words, or at least word-like tokens)
in an addressable, retrievable form.
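As a rough illustration of the kind of check meant here, the C sketch
below splits a text buffer into word-like tokens, keeps each one in an
indexable array, and can print the first few by index. It is not any
contestant's actual entry. The names word_list, word_list_push,
parse_words, dump_words and word_list_free are hypothetical, the "run of
letters and apostrophes" rule is an assumption about what counts as a
word, and reading the file into memory is left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container: a growable array of heap-allocated words. */
typedef struct {
    char **words;
    size_t count;
    size_t capacity;
} word_list;

/* Append a copy of the len bytes at s. Returns 0 on success, -1 if
   memory runs out. */
int word_list_push(word_list *wl, const char *s, size_t len)
{
    if (wl->count == wl->capacity) {
        size_t ncap = wl->capacity ? wl->capacity * 2 : 1024;
        char **tmp = realloc(wl->words, ncap * sizeof *tmp);
        if (tmp == NULL) return -1;
        wl->words = tmp;
        wl->capacity = ncap;
    }
    char *copy = malloc(len + 1);
    if (copy == NULL) return -1;
    memcpy(copy, s, len);
    copy[len] = '\0';
    wl->words[wl->count++] = copy;
    return 0;
}

/* Save every token that starts with a letter and continues with letters
   or apostrophes. This is a crude rule (assumed here), so a trailing
   apostrophe stays attached to its word. */
int parse_words(const char *text, word_list *wl)
{
    assert(text != NULL && wl != NULL);
    const char *p = text;
    while (*p) {
        while (*p && !isalpha((unsigned char)*p)) p++;   /* skip separators */
        const char *start = p;
        while (*p && (isalpha((unsigned char)*p) || *p == '\'')) p++;
        if (p > start && word_list_push(wl, start, (size_t)(p - start)) != 0)
            return -1;
    }
    return 0;
}

/* The sanity check: print the first n saved words by index, so a human
   can see that they really are words and really are addressable. */
void dump_words(const word_list *wl, size_t n)
{
    for (size_t i = 0; i < n && i < wl->count; i++)
        printf("%zu: %s\n", i, wl->words[i]);
}

/* Release all storage and reset the list to empty. */
void word_list_free(word_list *wl)
{
    for (size_t i = 0; i < wl->count; i++)
        free(wl->words[i]);
    free(wl->words);
    wl->words = NULL;
    wl->count = 0;
    wl->capacity = 0;
}
```

In a timing run the dump_words call would be commented out; enabling it
once or twice, e.g. dump_words(&wl, 20) after parsing, is enough to show
that the timed code is doing the work it claims to do.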
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
Beowulf mailing list, Beowulf at beowulf.org