Globus Toolkit 4: What's in it for you?

The most significant non-Web services change to GT4 is the new GridFTP server. The GT4 GridFTP server has been rewritten from scratch. (The previous implementation was a modified version of Washington University's wuftpd server.) The new implementation is based on the high-performance Globus XIO communication library and features significantly better performance and capabilities not provided by the older code. It also eliminates a subtle licensing issue that caused concern for some people who wanted to redistribute the older server. The new server is fully compatible with both the GT3.2 server and the published GridFTP protocol, so older clients work with the GT4 server and GT4 clients work with the GT3.2 server.

The GT4 features that have generated the greatest attention are undoubtedly the new Web services capabilities. Like GT3, GT4 includes a programming model and associated tools that allow users to build and host Web services that represent "stateful" resources on the Grid. In other words, users can use Web services development tools and hosting environments to provide Grid interfaces to computation engines, storage systems, legacy applications, instruments and sensors, and other things that make up a Grid application. Once Grid interfaces are provided, users--and their partners and collaborators--can develop all kinds of applications that use them in creative ways.

New Web services features in GT4 include a more stable hosting environment, better performance and efficiency, broader programming language support, and a new state model that implements the WSRF (Web Services Resource Framework) and WSN specifications. (See the March 2005 On the Grid column for more detail on WSRF and WSN and what they mean for the Grid.) All Web services included in GT3.2 are included in GT4, with the same or better capabilities. The WSDL interfaces for these services (used by Web service developers) have changed from their GT3.2 counterparts as a result of the introduction of WSRF and WSN.

GT3.2 provided tools for developing client programs in C or Java and provided Java classes for developing Grid services. GT4 expands both client and server programming support to include C, C++, Java, and Python.

GT4 also includes new system security capabilities. Its message-level security mechanisms protect SOAP messages based on the WS-Security and WS-SecureConversation specifications. Developers can use GT4 to build applications and systems that are compliant with the WS-Interoperability Basic Profile and Basic Security Profile. Transport-level security mechanisms are also supported. A new authorization framework supports a variety of authorization schemes, including the familiar "grid-mapfile" access control list, an access control list defined by a service, a user-supplied authorization handler, and access to an authorization service via the SAML (Security Assertion Markup Language) protocol. Security services distributed with GT4 include the MyProxy online credential repository and the Community Authorization Service (CAS). In addition, GT4 interoperates with the Virtual Organization Management Service (VOMS) and PERMIS.
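To make the "grid-mapfile" idea concrete, here is a minimal Python sketch of the lookup such an access control list implies: a certificate subject name is mapped to one or more local accounts, and a caller whose name is not listed is refused. It assumes the conventional one-entry-per-line layout (a quoted subject name followed by a comma-separated account list). The sample entries and the function names `parse_grid_mapfile` and `authorize` are invented for illustration and are not part of GT4.

```python
import shlex

def parse_grid_mapfile(text):
    """Return a dict mapping certificate subject names to lists of local accounts."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the subject name
        if len(parts) != 2:
            continue  # ignore malformed entries in this sketch
        subject, accounts = parts
        mapping[subject] = accounts.split(",")
    return mapping

def authorize(mapping, subject):
    """Return the first local account for a subject, or None if it is not listed."""
    accounts = mapping.get(subject)
    return accounts[0] if accounts else None

# Hypothetical entries, for illustration only.
SAMPLE = '''
# subject name                          local account(s)
"/O=Grid/OU=Example/CN=Alice Example" alice
"/O=Grid/OU=Example/CN=Bob Example" bob,bobtest
'''

if __name__ == "__main__":
    acl = parse_grid_mapfile(SAMPLE)
    print(authorize(acl, "/O=Grid/OU=Example/CN=Alice Example"))
    print(authorize(acl, "/O=Grid/OU=Example/CN=Unknown User"))
```

The other schemes mentioned above (a service-defined list, a user-supplied handler, a SAML authorization service) replace this static lookup with a different decision source, which is why a pluggable framework is useful.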

Finally, the GT4 Monitoring and Discovery Services ("MDS4") provide significantly enhanced functionality. Every GT4 Web services container is pre-configured with an MDS-Registry service that maintains information about services deployed in that container. MDS-Registry services can also be configured to monitor not only GT4 services but also, via a plug-in interface, any network-accessible resource or service. An extensible display component uses XSLT templates to define custom displays of MDS-Registry contents.

Performance Improvements

The GT4 development team placed a high priority on improving performance and scalability, and the results are impressive. This column was written while GT4 performance tuning efforts were still in progress, but significant improvements had been achieved already.

As noted above, GT4 includes development support for C and C++ clients and services. This support is useful for Web service client programs that start up, do one thing, and then quit. For example, the GRAM job submission client is now written in C. By eliminating the need to start a Java Virtual Machine (JVM) to run this program, start-up cost is reduced by 80 percent.

GridFTP has long been the performance star of the Globus Toolkit. As noted above, GT4 includes a completely new GridFTP server implementation. This server consistently performs at roughly 80 percent of the raw iperf performance on a network. This result has proved true on networks ranging up to one gigabit per second end-to-end, where iperf performed at roughly 940 Mbit/s and GridFTP performed at 750 Mbit/s. In our testing, data transfer rates have always been limited by the performance of the disk subsystem or the network interface card, never by the software. In a load test, a server running on a dual-processor Linux system supported 1800 clients simultaneously, sustaining a combined throughput equivalent to that of an unloaded server, demonstrating that the server scales extremely well under heavy loads.

The new GT4 GridFTP server also supports striping, where the server runs on a cluster with a shared parallel file system and multiple nodes are used for individual transfers. On the TeraGrid's 30 Gbit/s network, a striped transfer using 64 nodes at each end performed at 17 Gbit/s, limited only by the speed of the disk subsystem. (A memory-to-memory striped transfer using 32 nodes at each end sustained a rate of 27 Gbit/s--an amazing 90 percent of the theoretical limit!)
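The percentages quoted for GridFTP are easy to check against the raw figures. The short Python sketch below recomputes each ratio from the numbers given in the two preceding paragraphs; the helper name `efficiency` and the labels are ours, and the only assumption is that the 30 Gbit/s network capacity is the "theoretical limit" the striped results are measured against.

```python
def efficiency(achieved, reference):
    """Fraction of the reference rate that was actually achieved."""
    return achieved / reference

# (label, achieved rate, reference rate), each pair in the same unit.
MEASUREMENTS = [
    ("GridFTP vs. iperf on a 1 Gbit/s path (Mbit/s)", 750, 940),
    ("Striped disk-to-disk, 64 nodes per end (Gbit/s)", 17, 30),
    ("Striped memory-to-memory, 32 nodes per end (Gbit/s)", 27, 30),
]

if __name__ == "__main__":
    for label, achieved, reference in MEASUREMENTS:
        print(f"{label}: {efficiency(achieved, reference):.0%}")
```

The disk-to-disk run comes out at a little over half of the network capacity, which fits the observation that the disk subsystem, not the network or the server software, was the limiting factor.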

Performance of the GRAM job submission service was a major (and justified) criticism of GT3.0 and GT3.2. Performance testing of GT4's GRAM service shows significant improvement. Since the initial GT4 implementation in mid-2004, design improvements and profiling activities have improved performance by more than a factor of ten. The GT4 job submission service supports at least 10,000 concurrent job submissions on a reasonably configured system. The service can process up to 70 independent jobs per minute under normal (multiple user) scenarios. For scenarios in which a single user must submit many jobs at a high rate, a new delegation service streamlines security processing to attain faster rates. In addition, GT4 GRAM uses the RFT service (described next) in place of the old GASS service to manage data staging, eliminating redundant code and also giving the system administrator control over the maximum number of concurrent staging operations.

GT3 introduced the Reliable File Transfer (RFT) service. RFT accepts requests for GridFTP transfers between systems and processes them in sequence until they are complete, allowing the client application to continue working during the transfers. At the time this article was written, the development team had accomplished a 300 percent improvement in the number of requests that the RFT service could handle with default configuration settings for the underlying Java Virtual Machine. (At that time, the service could manage about 21,000 concurrent requests.) The development team conducted tests on the RFT service that required transferring more than 500,000 files. As another example of GT4 service scalability, we note that the Replica Location Service (RLS) has been used by the LIGO Scientific Collaboration for some time now to manage 40 million replicas across 10 sites.
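The queuing behavior described for RFT (accept a request, return control to the client at once, work through the requests in order) can be illustrated with a small Python sketch. This is not the RFT interface: the class `TransferQueue`, its methods, and the `transfer_fn` callback are hypothetical, the "transfer" is a stub rather than a GridFTP call, and the sketch keeps no persistent state and does not retry, so it shows only the queuing idea.

```python
import queue
import threading

class TransferQueue:
    """Accept transfer requests and process them one at a time in the background."""

    def __init__(self, transfer_fn):
        self._transfer_fn = transfer_fn  # hypothetical callable(source, destination)
        self._requests = queue.Queue()
        self.completed = []
        self.failed = []
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def submit(self, source, destination):
        """Queue a request and return immediately so the client can keep working."""
        self._requests.put((source, destination))

    def _run(self):
        while True:
            source, destination = self._requests.get()
            try:
                self._transfer_fn(source, destination)
                self.completed.append((source, destination))
            except Exception as exc:
                # Record the failure and move on to the next request.
                self.failed.append((source, destination, exc))
            finally:
                self._requests.task_done()

    def wait(self):
        """Block until every queued request has been processed."""
        self._requests.join()

if __name__ == "__main__":
    def fake_transfer(source, destination):
        print(f"copy {source} -> {destination}")  # stand-in for a real transfer

    transfers = TransferQueue(fake_transfer)
    transfers.submit("hostA:/data/run1", "hostB:/archive/run1")
    transfers.submit("hostA:/data/run2", "hostB:/archive/run2")
    # The client is free to do other work here while transfers proceed.
    transfers.wait()
    print(len(transfers.completed), "transfers completed")
```

A single worker thread keeps the requests strictly in sequence, as the description above states; a production service would also need to persist its queue and recover from failures to earn the word "reliable".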

The Web services development environment for Java (WS Core) also improved greatly during the GT4 development cycle. From the first implementation in early 2004 to the time this article was written, the messaging latency of the Web services environment (the time required to move a Web service message from the network interface to the service handler and return a response to the network interface) was reduced by 80 percent. The development team changed the default authentication method from WS-Security (a Web services specification) to HTTPS (an older specification). This change significantly improved the time required to authenticate a service request--particularly in streaming scenarios--which had a major impact on all of the GT4 Web services tools. WS-Security continues to be supported for applications that need it.

    Creative Commons License
    ©2005-2012 Copyright Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. The Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.