Hi Gerry,<br><br>&nbsp; I&#39;m by no means an expert on WRF, so take the following with a grain of salt, but I&#39;m inclined to think that WRF wouldn&#39;t run very well on a cluster of PS3s.&nbsp; The problem is that with &lt; 200MB total, giving ~25MB per SPU, you&#39;re limited to a pretty small number of grid points per SPU, which means they&#39;ll fly through all the computations on those very few grid points, and then very, very slowly communicate through the gigabit network.&nbsp; Even if you can get 2GB on each PS3, that&#39;s still only 256MB per SPU, right?<br>
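<br>&nbsp; For what it&#39;s worth, the per-SPU figures above are just an even split of the node&#39;s memory.&nbsp; A minimal sketch of that arithmetic, assuming 8 SPUs per PS3 (the divisor implied by the 25MB and 256MB numbers) and a hypothetical helper name, <code>memory_per_spu_mb</code>:<br>

```python
def memory_per_spu_mb(total_mb: float, n_spus: int = 8) -> float:
    """Evenly split a node's memory across its SPUs (assumes 8 SPUs per PS3)."""
    if n_spus <= 0:
        raise ValueError("n_spus must be positive")
    return total_mb / n_spus


# Roughly 200 MB usable on a stock PS3, per the estimate above.
stock_mb = memory_per_spu_mb(200)

# Hypothetical 2 GB node (2 * 1024 MB).
big_mb = memory_per_spu_mb(2 * 1024)

print(f"stock PS3: {stock_mb:.0f} MB per SPU")
print(f"2 GB PS3:  {big_mb:.0f} MB per SPU")
```

<br>&nbsp; This ignores whatever the OS and the code itself take up, so the real budget per SPU would be smaller still.<br>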

<br>&nbsp; Again, my WRF experience is admittedly tremendously limited, but a recent 3D run I did with a 300x300x200 domain size required a little over 12GB of RAM, I believe.&nbsp; The code had a few custom modifications, but I doubt that changed the run-time characteristics drastically, and the resulting run took something like 12.8 seconds on 8 processors, and 11.8 seconds on 16 processors.&nbsp; (Two nodes and four nodes in this case.)&nbsp; In other words, doubling the processor count bought less than a 10% improvement, which suggests the run was already dominated by communication rather than computation.&nbsp; Speeding up the calculations through smaller grids and the very fast SPUs just means that the communication would take up, relatively speaking, an even larger share of the run time.<br>
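<br>&nbsp; To put rough numbers on the run described above, here is a back-of-the-envelope sketch.&nbsp; It assumes memory scales linearly with the number of grid points, treats &quot;a little over 12GB&quot; as exactly 12 GiB, and reuses the ~25MB-per-SPU estimate; the function names are made up for illustration and none of this is a WRF measurement beyond the figures quoted.<br>

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def bytes_per_point(ram_bytes: float, nx: int, ny: int, nz: int) -> float:
    """Average memory footprint per grid point (assumes linear scaling)."""
    return ram_bytes / (nx * ny * nz)


def points_that_fit(budget_bytes: float, per_point_bytes: float) -> int:
    """Whole grid points that fit in a given memory budget."""
    return int(budget_bytes // per_point_bytes)


def relative_speedup(t_small: float, t_large: float) -> float:
    """Speedup going from the smaller to the larger processor count."""
    return t_small / t_large


def scaling_efficiency(t_small: float, p_small: int,
                       t_large: float, p_large: int) -> float:
    """Achieved speedup divided by the ideal speedup (p_large / p_small)."""
    return relative_speedup(t_small, t_large) / (p_large / p_small)


# 300x300x200 domain in ~12 GiB (assumed exact for this estimate).
per_point = bytes_per_point(12 * GIB, 300, 300, 200)

# Grid points that would fit in a ~25 MiB SPU budget.
points_25mb = points_that_fit(25 * MIB, per_point)

# 12.8 s on 8 processors vs. 11.8 s on 16 processors.
speedup = relative_speedup(12.8, 11.8)
efficiency = scaling_efficiency(12.8, 8, 11.8, 16)

print(f"~{per_point:.0f} bytes per grid point")
print(f"~{points_25mb} grid points per 25 MiB SPU "
      f"(a cube about {round(points_25mb ** (1 / 3))} points on a side)")
print(f"speedup 8 -> 16 processors: {speedup:.2f}x, "
      f"efficiency: {efficiency:.0%}")
```

<br>&nbsp; Under those assumptions each grid point costs on the order of 700 bytes, a 25MB SPU holds only a few tens of thousands of points, and going from 8 to 16 processors reaches only a little over half of the ideal speedup.<br>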

<br>&nbsp; Since we do have some people who need to run some pretty large WRF models, I&#39;d be happy if this <i>did</i> work, but if you&#39;re interested in novel architectures for WRF, I would think that perhaps a GPU (or FPGA with many FP units) connected to a PCI-Express bus with Infiniband links would be nicer.&nbsp; The IB would hopefully allow you to balance out the extremely fast computations.&nbsp; If I can, once the double precision GPUs are out, I&#39;ll be picking one up for experimentation, but mostly for home-grown codes - WRF may take a bit more effort.&nbsp; The guys at NCAR do seem to have done some work in this area, though, running one of WRF&#39;s physics modules on an NVIDIA 8800 card - you can read about it here:&nbsp; <a href="http://www.mmm.ucar.edu/wrf/WG2/michalakes_lspp.pdf">http://www.mmm.ucar.edu/wrf/WG2/michalakes_lspp.pdf</a><br>

<br>&nbsp; My two cents.&nbsp; :-)<br><br>

(PS.&nbsp; Ooh, now, if one could have a &#39;host system&#39; with a large amount of RAM to pipe in to the GPU, running very large models, I could see that potentially working well as an <i>accelerator</i>.&nbsp; Say, 32-64GB of RAM, of which it deals with 2 x 128MB &#39;tiles&#39; at a time - one being cached and written by the GPU while the other computes - and once all the acceleration is done, use the host to quickly synchronize via IB with other large nodes.&nbsp; But that&#39;s probably a fair amount of work!)<br>

<br><br>&nbsp; - Brian<br><br>Brian Dobbins<br>Yale Engineering HPC<br><br>
