[Beowulf] Nvidia FERMI/gt300 GPU

Bill Broadley bill at cse.ucdavis.edu
Thu Oct 1 17:42:32 EDT 2009


Craig Tierney wrote:
> Bill Broadley wrote:
>> Impressive:
>> * IEEE floating point, doubles 1/2 as fast as single precision (6 times or
>>   so faster than the gt200).
>> * ECC
> 
> The GDDR5 says it supports ECC, but what is the card going to do?
> Is it ECC just from the memory controller, or is it ECC all the way
> through the chip?  Is it 1-bit correct, 2-bit error message?

Nvidia is pleasingly specific in their white paper:
http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIAFermiComputeArchitectureWhitepaper.pdf

The relevant passages:
 Fermi supports Single-Error Correct Double-Error Detect (SECDED) ECC codes
 that correct any single bit error in hardware as the data is accessed.
 ...
 Fermi’s register files, shared memories, L1 caches, L2 cache, and DRAM memory
 are ECC protected
 ...
 All NVIDIA GPUs include support for the PCI Express standard for CRC check
 with retry at the data link layer. Fermi also supports the similar GDDR5
 standard for CRC check with retry (aka “EDC”) during transmission of data
 across the memory bus.
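
For anyone who wants to see what "correct one bit, detect two" means in
practice, here is a toy sketch. It assumes the textbook construction, a
Hamming(7,4) code plus one overall parity bit, protecting 4 data bits in an
8-bit word. The whitepaper does not say which code Fermi uses or how wide it
is, so this is not Nvidia's implementation, and the function names are mine.

```python
# Toy SECDED: Hamming(7,4) plus one overall parity bit (4 data bits per word).
# Illustration only; the code width and layout are assumptions, not Fermi's.

DATA_POS = (3, 5, 6, 7)   # codeword positions that hold data bits
PARITY_POS = (1, 2, 4)    # codeword positions that hold Hamming parity bits


def syndrome(bits):
    """XOR of the indices (1..7) of all set bits; 0 for a valid codeword."""
    s = 0
    for pos in range(1, 8):
        if bits[pos]:
            s ^= pos
    return s


def encode(nibble):
    """Encode a 4-bit value as a list of 8 bits (index 0 = overall parity)."""
    bits = [0] * 8
    for i, pos in enumerate(DATA_POS):
        bits[pos] = (nibble >> i) & 1
    s = syndrome(bits)
    for p in PARITY_POS:
        bits[p] = 1 if s & p else 0   # choose parity bits so the syndrome is 0
    bits[0] = sum(bits[1:]) % 2       # make the parity of the whole word even
    return bits


def decode(bits):
    """Return (nibble, status); nibble is None if the word is uncorrectable."""
    bits = list(bits)
    s = syndrome(bits)
    odd = sum(bits) % 2
    if odd:
        # Odd overall parity: exactly one bit flipped, and the syndrome is
        # its index (0 means the overall parity bit itself was hit).
        bits[s] ^= 1
        status = "corrected"
    elif s:
        # Even overall parity but nonzero syndrome: two bits flipped.
        return None, "double-bit error detected"
    else:
        status = "ok"
    nibble = sum(bits[pos] << i for i, pos in enumerate(DATA_POS))
    return nibble, status
```

Flipping any one of the 8 bits of an encoded word still decodes to the
original value with status "corrected"; flipping any two returns None, which
is the "report but cannot fix" case Craig was asking about.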

Kudos to Nvidia for being very clear.
