Difference between revisions of "Programming Tools"

From Cluster Documentation Project
(added low level links)
(added erlang, haskell, openmp, OpenACC)

Revision as of 12:56, 28 March 2012

Higher Level

  • Sage is a free open-source mathematics software system licensed under the GPL. It combines the power of many existing open-source packages into a common Python-based interface. The Sage Mission is to create a viable free open source alternative to Magma, Maple, Mathematica and Matlab.
  • NumPy is the fundamental package for scientific computing with Python. It contains among other things:
    • a powerful N-dimensional array object
    • sophisticated (broadcasting) functions
    • tools for integrating C/C++ and Fortran code
    • useful linear algebra, Fourier transform, and random number capabilities
      Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data types can be defined, which allows NumPy to integrate quickly and seamlessly with a wide variety of databases. NumPy is licensed under the BSD license, enabling reuse with few restrictions.
  • R is a language and environment for statistical computing and graphics. R provides a wide variety of statistical (linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible. One of R's strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.
  • Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. The library, mostly written in Julia itself, also integrates mature, best-of-breed C and Fortran libraries for linear algebra, random number generation, FFTs, and string processing. More libraries continue to be added over time. Julia programs are organized around defining functions, and overloading them for different combinations of argument types (which can also be user-defined).
  • Erlang is a programming language used to build massively scalable soft real-time systems with requirements on high availability. Some of its uses are in telecoms, banking, e-commerce, computer telephony and instant messaging. Erlang's runtime system has built-in support for concurrency, distribution and fault tolerance.
  • Haskell is an advanced purely-functional programming language. An open-source product of more than twenty years of cutting-edge research, it allows rapid development of robust, concise, correct software. With strong support for integration with other languages, built-in concurrency and parallelism, debuggers, profilers, rich libraries and an active community, Haskell makes it easier to produce flexible, maintainable, high-quality software.

Compiler Enhancements

These enhancements are used with Fortran and C/C++ compilers.

  • OpenMP is a standard for parallel programming on shared-memory systems. It continues to extend its reach beyond pure HPC to include embedded, multicore and real-time systems. A new version is being developed that will include support for accelerators, error handling, thread affinity, tasking extensions and Fortran 2003. Note: OpenMP is not a cluster programming tool. It parallelizes work across the cores of a single multi-core cluster node, not across nodes, and is supported by virtually all compilers.
  • OpenACC is an Application Program Interface (API) that describes a collection of compiler directives to specify loops and regions of code in standard C, C++ and Fortran to be offloaded from a host CPU to an attached accelerator (e.g. GPUs), providing portability across operating systems, host CPUs and accelerators.
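
Both standards work by annotating ordinary loops with compiler directives. The following is a minimal sketch of that idea in C, not taken from either project's documentation: the function names `sum_of_squares` and `saxpy` are hypothetical and chosen only for illustration. A compiler without OpenMP or OpenACC support enabled simply ignores the pragmas and runs the loops serially, so the results are the same either way.

```c
#include <stddef.h>

/* Hypothetical example: sum 0^2 + 1^2 + ... + (n-1)^2 with an OpenMP loop.
 * reduction(+:total) gives each thread a private partial sum and adds
 * the partial sums together when the loop finishes. */
double sum_of_squares(int n)
{
    double total = 0.0;
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; i++) {
        total += (double)i * (double)i;
    }
    return total;
}

/* Hypothetical example: y[i] = a * x[i] + y[i] with an OpenACC loop.
 * copyin(x[0:n]) sends x to the accelerator; copy(y[0:n]) sends y there
 * and brings the updated values back to the host afterwards. */
void saxpy(int n, float a, const float *x, float *y)
{
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

With GCC, the directives are typically enabled with `-fopenmp` and `-fopenacc` respectively; other compilers use different flags, so check the documentation for the compiler installed on the cluster.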

Lower Level Parallel Programming Libraries

These are programming libraries that can be used with Fortran, C/C++, and Java.

  • MPICH2 is a freely available, portable implementation of MPI, the standard for message-passing libraries.
  • MVAPICH2 is an enhanced version of MPICH2 that targets high performance, scalability and fault tolerance for high-end computing systems and servers using InfiniBand, 10GigE/iWARP and RoCE networking technologies.
  • Open MPI is a project combining technologies and resources from several other projects (FT-MPI, LA-MPI, LAM/MPI, and PACX-MPI) in order to build the best MPI library available. It supports runtime selection of the interconnect.
  • The Java Parallel Processing Framework is a suite of software libraries and tools providing convenient ways to parallelize CPU-intensive processing. It is written in the Java programming language and is platform independent.
  • PVM (Parallel Virtual Machine) is a software package that permits a heterogeneous collection of Unix and/or Windows computers hooked together by a network to be used as a single large parallel computer.