what is a flop
Daniel Kidger
Daniel.Kidger at quadrics.com
Tue Jun 17 12:28:54 EDT 2003
Dear Saurav,
Actually, the third term should be 'floating point operations per cycle',
usually further refined to mean specifically 64-bit floats
('double-precision').
In the past, most processors took many cycles to do each floating point
operation. Now, with pipelining, most processors can issue at least one
floating point operation every machine cycle (note, though, that very few can
issue divides every cycle).
The P4 can normally do one flop per cycle. It can do two (using its
SSE2 unit), but only if both are multiplies (or both adds) *and* both pairs
of operands are contiguous in memory.
Some architectures have two separate floating point units: one just adds,
the other multiplies (and can add too), and hence they also have a '2' here.
Some architectures, most notably the vector machines, have 16 or more floating
point units.
Many architectures have fused multiply-add units (Alpha, Itanium, MIPS,
Power3, etc.). They can add the result of a multiply directly to another
register. Thus these also have '2' flops per cycle.
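
To see why a fused multiply-add counts as two flops, consider a dot product:
every step is one multiply plus one add. The following is only a counting
sketch in Python (the name dot_with_flop_count is made up for illustration,
and it says nothing about how any real processor schedules the work):

```python
def dot_with_flop_count(x, y):
    """Return (dot product, flops performed, multiply-add steps)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    acc = 0.0
    flops = 0
    muladd_steps = 0
    for a, b in zip(x, y):
        # One multiply and one add: the work a single fused
        # multiply-add instruction would do in one issue slot.
        acc = acc + a * b
        flops += 2
        muladd_steps += 1
    return acc, flops, muladd_steps


result, flops, steps = dot_with_flop_count([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(result, flops, steps)
```

For vectors of length n this gives 2n flops in n multiply-add steps, which is
where the '2' comes from.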
Some architectures have 2 (or more) 'muladd' units. Hence, iirc, Power3/4 and
Itanium2 can yield 4 flops per cycle.
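
Putting the corrected third term into the formula from the original message,
a rough sketch of the peak calculation could look like this in Python. The
function name peak_flops and the cluster figures (16 nodes, 2 CPUs per node,
2.4 GHz) are hypothetical, chosen only to show the arithmetic; the flops per
cycle value has to come from the architecture, as described above.

```python
def peak_flops(nodes, cpus_per_node, flops_per_cycle, clock_hz):
    """Theoretical peak in flops per second.

    nodes * CPUs per node * flops per cycle * cycles per second
    """
    return nodes * cpus_per_node * flops_per_cycle * clock_hz


# Hypothetical cluster (assumed numbers, not taken from any real system):
# 16 nodes, 2 CPUs per node, 2 flops per cycle, 2.4 GHz clock.
peak = peak_flops(nodes=16, cpus_per_node=2, flops_per_cycle=2, clock_hz=2.4e9)
print(f"{peak / 1e9:.1f} GFLOPS")  # prints "153.6 GFLOPS"
```

This is a theoretical peak only; it assumes every floating point unit issues
on every cycle, which real codes rarely achieve.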
Yours,
Daniel.
--------------------------------------------------------------
Dr. Dan Kidger, Quadrics Ltd. daniel.kidger at quadrics.com
One Bridewell St., Bristol, BS1 2AA, UK 0117 915 5505
----------------------- www.quadrics.com --------------------
-----Original Message-----
From: Saurav Gohain [mailto:way2saurav at rediffmail.com]
Sent: 17 June 2003 11:32
To: beowulf at beowulf.org
Subject: what is a flop
Dear Sir/Friends
I am new to clusters and hence asking this question (maybe a silly one for
you all).
In supercomputers and clusters, I find peak performance rated in TFLOPS,
GFLOPS, etc.
Now, the question is what is a flop.
From a Beowulf PDF tutorial, I came to know that it is calculated as
Total Peak Performance= No of Nodes * No of CPU's * Floating point
operations per second * No of Cycles per second.
Although the first two and the last are easily derivable, what about the
third one?
I checked the specs of the Opteron and P4, but there isn't any mention of
the flops.
How will I derive it, then?
In a comparison graph in Intel's website, I found a term SPEC*fp2000
ratings.
Now, these ratings are like 2.4, 3.4, etc.
Is this the number that I should input in the above formula to calculate the
total peak performance in flops of a cluster?
Kindly let me know the facts...
Regards,
Saurav