Parallel BLAST

Steve Gaudet SGaudet at
Mon Apr 15 14:11:55 EDT 2002

> -----Original Message-----
> From: William R. Pearson [mailto:wrp at]
> Sent: Sunday, April 14, 2002 10:32 PM
> To: beowulf at
> Subject: Parallel BLAST
>
> > Why is it that BLAST is not available for MPI/PVM?  I would think
> > clusters would be the perfect host for such an application.
> > Is it that there is no need because BLAST is already so fast and
> > no one wants to break the database out onto node-resident disks?
> > Or is it that BLAST is kept running on single-processor or
> > shared-memory machines so that the DB is always in memory, ready to
> > roll without loading, and doing the same for a cluster is not worth
> > it because the same trick is difficult to do on a node given the
> > current way clusters are built?  I assume the same is true for FASTA?
>
> I suspect that BLAST is not available for MPI/PVM because (1) it is
> too fast, and (2) there is not much demand for it.  
> 95% of the time, BLAST is almost an in-memory grep (the other 5% of
> the time it is working on the things it is looking for).  Sequence
> comparison is embarrassingly parallel, and very easily threaded.
> Distributing the sequence databases and collecting results has more
> overhead (there probably aren't many distributed grep programs
> either).  FASTA is 5-10X slower than BLAST, and Smith-Waterman is
> another 5-20X slower than FASTA.  Here, the communications overhead is
> low, and distributed systems work OK for FASTA, and great for
> Smith-Waterman (where the overhead fraction is very small).
> Of course, it is a lot easier to compile a threaded program, and just
> run it, than it is to install and configure the MPI or PVM environment
> and the programs to run in it.  Bioinformatics software is often run
> by computer-savvy biologists, not high-performance computing folks,
> and not having to install and configure PVM/MPI is a big advantage.
> The NCBI probably does not make a PVM/MPI parallel BLAST because there
> is very little demand for it, and it does not meet their computational
> needs.
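
To illustrate the "embarrassingly parallel" point above, here is a minimal
Python sketch of a search that splits the database into chunks, scores each
chunk in its own worker thread, and merges the hits. Everything in it is an
assumption made for illustration: the names (parallel_search, search_chunk,
score) are hypothetical, and the shared k-mer count is a toy stand-in for
real scoring, not the BLAST, FASTA, or Smith-Waterman algorithm.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq


def kmers(seq, k=3):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def score(query, subject, k=3):
    # Toy similarity: number of distinct k-mers shared by the two sequences.
    # This is a placeholder for a real alignment score.
    return len(kmers(query, k) & kmers(subject, k))


def search_chunk(query, chunk):
    """Score the query against every (name, sequence) record in one chunk."""
    return [(score(query, seq), name) for name, seq in chunk]


def parallel_search(query, database, workers=4, top=3):
    # Each worker gets an independent slice of the database; no worker
    # needs anything from another, which is what makes this easy to thread.
    chunks = [database[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(lambda c: search_chunk(query, c), chunks))
    # Collecting results is the only step that touches all chunks.
    hits = [hit for part in partial for hit in part]
    return heapq.nlargest(top, hits)


if __name__ == "__main__":
    db = [("a", "ACGTACGT"), ("b", "TTTTTTTT"), ("c", "ACGTTTTT")]
    print(parallel_search("ACGTACGT", db, workers=2))
```

Python threads will not actually speed up this CPU-bound toy; the point is
only the structure. The per-chunk work is independent, and the cost that a
distributed version adds is in shipping the chunks out and collecting the
hits, which matters more the faster the per-sequence comparison is.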

There's also a commercial version from Turbogenomics.


1) Ready to go, plug-n-play solution for parallel BLAST
2) Expertise and 20+ years of experience in parallel computing
3) Dynamic database splitting feature to take advantage of computers that
have less memory than the size of the database
4) Smart load balancing - achieve linear to superlinear speedup
5) No modification made to the NCBI BLAST algorithm to ensure identical
results with the non-parallel version
6) Easy drop-in update whenever NCBI releases newer versions of their
7) Excellent support
8) 30-day money-back guarantee
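
For readers wondering what database splitting (item 3) might look like in
principle, here is a minimal Python sketch. It is not the Turbogenomics
implementation, whose details are not described here. The function name
split_database and the max_residues budget are hypothetical, and the sketch
assumes that sequence length in residues is an adequate proxy for memory use.

```python
def split_database(records, max_residues):
    """Greedily pack (name, sequence) records into chunks.

    Each chunk holds at most max_residues residues in total, except that
    a single record longer than the budget is placed in a chunk by itself.
    """
    chunks, current, size = [], [], 0
    for name, seq in records:
        # Start a new chunk when adding this record would exceed the budget.
        if current and size + len(seq) > max_residues:
            chunks.append(current)
            current, size = [], 0
        current.append((name, seq))
        size += len(seq)
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    records = [("a", "AAAA"), ("b", "CC"), ("c", "GGG"), ("d", "T")]
    for chunk in split_database(records, max_residues=6):
        print([name for name, _ in chunk])
```

Each chunk can then be searched on a node whose memory holds it. Keeping
every chunk in memory instead of rereading it from disk is one plausible
way a split search could exceed linear speedup.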


Steve Gaudet 
Linux Solutions Engineer
| Turbotek Computer Corp.    tel:603-666-3062 ext. 21             |
| 8025 South Willow St.      fax:603-666-4519                     |
| Building 2, Unit 105       toll free:800-573-5393               |
| Manchester, NH 03103       e-mail:sgaudet at  |
|                            web: |
