[Beowulf] GPU diagnostics?
landman at scalableinformatics.com
Mon Mar 30 13:10:17 EDT 2009
David Mathog wrote:
> Have any of you CUDA folks produced diagnostic programs you run during
> "burn in" of new GPU based systems, in order to weed out problem units
> before putting them into service? Minimally, something resembling
> memtest86, to be used to find buggy memory associated with the GPU?
> Optimally, it would also more directly exercise the GPU's capabilities.
> I asked on the NV linux forum if there were any official Nvidia graphics
> card diagnostic programs, and nobody there answered with one. This was
> originally with respect to some VDPAU issues, where it looked at first
> like there might be a hardware problem on a small set of systems,
> including mine, although in the end it turned out to be an uninitialized
> variable (it was not my code). There was no objective way to
> demonstrate for VDPAU based software that "this graphics card is
> functioning normally" to help sort this out. I figured the CUDA folks
> should have something like this, else how could you trust the results
> from the GPU calculations?
Vendors have an nVidia-supplied *GEMM-based burn-in test. I have been thinking
about a set of diagnostics that end users can run as a sanity check.
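
An end-user sanity check along those lines could be sketched roughly as follows. This is not the vendor burn-in test, whose internals are not described in this thread. It only illustrates the general idea: run the same matrix multiply repeatedly on fixed inputs, then compare each result against a trusted reference and against the first run. All names here (reference_gemm, burn_in_check, make_flaky_gemm) are hypothetical, and the function under test is a pure-Python stand-in. On a real system it would be assumed to wrap a GPU GEMM call instead.

```python
import random

def reference_gemm(a, b):
    """Plain triple-loop matrix multiply, used as the trusted reference."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(inner)) for j in range(cols)]
            for i in range(rows)]

def burn_in_check(gemm_under_test, size=8, repeats=5, tol=1e-9, seed=0):
    """Call gemm_under_test repeatedly on the same inputs.

    Returns a list of (run, kind, row, col) tuples, one per bad element:
      "vs_reference"   - result differs from the reference by more than tol
      "not_repeatable" - result differs from what the first run produced
    An empty list means no problem was detected.
    """
    rng = random.Random(seed)  # fixed seed so every run sees identical inputs
    a = [[rng.uniform(-1.0, 1.0) for _ in range(size)] for _ in range(size)]
    b = [[rng.uniform(-1.0, 1.0) for _ in range(size)] for _ in range(size)]
    expected = reference_gemm(a, b)

    failures = []
    first = None
    for run in range(repeats):
        got = gemm_under_test(a, b)
        if first is None:
            first = got
        for i in range(size):
            for j in range(size):
                if abs(got[i][j] - expected[i][j]) > tol:
                    failures.append((run, "vs_reference", i, j))
                if got[i][j] != first[i][j]:
                    failures.append((run, "not_repeatable", i, j))
    return failures

def make_flaky_gemm(bad_run):
    """Hypothetical faulty device: corrupts element (0, 0) on call number bad_run."""
    calls = {"n": 0}
    def flaky(a, b):
        c = reference_gemm(a, b)
        if calls["n"] == bad_run:
            c[0][0] += 1.0  # simulate a single bad result
        calls["n"] += 1
        return c
    return flaky

if __name__ == "__main__":
    print("healthy:", burn_in_check(reference_gemm))
    print("flaky:  ", burn_in_check(make_flaky_gemm(bad_run=2)))
```

The repeatability comparison is kept separate from the reference comparison because an intermittent fault can show up as run-to-run differences even when no reference is available. A real burn-in would use much larger matrices and many more repeats than the defaults assumed here.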
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf