In case you did not get enough the last time
When this column was originally written, Jeff was taking a break, supposedly writing his dissertation. Personally, I think he was procrastinating - doing "research" in Disneyland, hiking the Himalayas, working on MPI-3 or some other academic endeavor. In the meantime, hello, I'm Brian, and I'll be your host this month. We have lots of flavors on tap here at the House of MPI, including the new, Atkins-friendly, low-carb MPI_TYPE_CREATE_RESIZED.
A Quick Datatype Review
Sidebar: Why not MPI_BYTE?
MPI provides the datatype MPI_BYTE to represent a byte of memory. The MPI implementation will not perform any datatype conversion on the buffer. So why not use MPI_BYTE and avoid all the complexity of datatypes?
Using MPI_BYTE prevents MPI from performing any data conversion (as discussed in last month's article). Data padding and alignment issues, normally completely hidden from the user, must also be taken into account. In C, this is usually manageable because C programmers are used to dealing with padding. Fortran, however, generally handles padding and alignment behind the scenes, so Fortran programmers rarely have to think about them. In either language, using MPI_BYTE forces dangerous assumptions about the sizes of various datatypes.
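To make the padding concern concrete, here is a minimal C sketch. The struct sample record and both function names are hypothetical, chosen only for illustration. The point is that a structure's size can exceed the sum of its members' sizes, which is exactly the detail a raw MPI_BYTE count has to get right by hand.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record, used only for illustration. */
struct sample {
    char   tag;    /* 1 byte */
    double value;  /* usually has to start on an aligned boundary */
};

/* Bytes of padding the compiler inserted into struct sample. */
size_t sample_padding(void) {
    return sizeof(struct sample) - (sizeof(char) + sizeof(double));
}

/* Print the layout the compiler actually chose. */
void print_sample_layout(void) {
    printf("sizeof(struct sample) = %zu\n", sizeof(struct sample));
    printf("offset of value       = %zu\n", offsetof(struct sample, value));
    printf("padding bytes         = %zu\n", sample_padding());
}
```

On a typical x86-64 Linux build this reports a 16-byte structure with 7 bytes of padding, but the numbers are decided by the compiler and ABI, which is the reason not to hard-code them.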
In the last column we examined basic MPI datatypes. Datatypes provide necessary information to the MPI library about data format and location. As we saw last month, MPI provides both basic datatypes (such as MPI_INT) and the ability to create more advanced user-defined datatypes. MPI can use the type information to perform any format conversion, such as endianness or size, necessary to communicate between two peers. Datatypes also simplify sending C structures or arrays of elements.
This month, we expand on our datatype coverage. Without knowledge of the basics of MPI datatypes, this month's column may be harder to follow than previous ones. So find last month's magazine and read the basics of datatypes before getting started. In addition to the performance benefits of letting MPI do the packing and unpacking, datatypes can simplify an application and help ensure messages are received correctly.
How to Avoid Datatypes
Despite what was said in the last column and the remainder of this
column, there are times when using user-defined datatypes is not the
best option. Legacy applications may require explicit buffers for
sending, as was common with libraries before MPI. Data layout and size
may be dynamic during execution of the application, which makes
defining datatypes difficult. For these situations, MPI provides the
ability to explicitly pack noncontiguous data into user-provided
buffers using MPI_PACK, with MPI_UNPACK for
unpacking. Listing 1 shows an example
of using MPI_PACK to send the structure used last month,
rather than creating a matching type.
Listing 1: Building a buffer using MPI_PACK
    #include <mpi.h>

    /* MAX_NAME_LEN and BUFSIZE are assumed to be defined elsewhere;
       BUFSIZE must be large enough to hold the packed structure. */
    struct my_struct {
        int    int_value[10];
        double average;
        char   debug_name[MAX_NAME_LEN];
        int    flag;
    };

    void send_data(struct my_struct data, MPI_Comm comm, int rank) {
        char buf[BUFSIZE];
        int  pos = 0;   /* MPI_Pack advances pos past each packed field */

        MPI_Pack(data.int_value, 10, MPI_INT, buf, BUFSIZE, &pos, comm);
        MPI_Pack(&data.average, 1, MPI_DOUBLE, buf, BUFSIZE, &pos, comm);
        MPI_Pack(data.debug_name, MAX_NAME_LEN, MPI_CHAR, buf, BUFSIZE, &pos, comm);
        MPI_Pack(&data.flag, 1, MPI_INT, buf, BUFSIZE, &pos, comm);

        /* pos now holds the number of packed bytes */
        MPI_Send(buf, pos, MPI_PACKED, rank, 0, comm);
    }
Sending Columns of a Matrix
In C, sending a row of a matrix is easy, as the row is stored in
consecutive bytes of memory. A column is more difficult, as the rest
of the row must be skipped to reach the next element in the
column. The distance between the starts of consecutive column
elements is often called the stride. Without
user-defined datatypes, there are two ways to send a column to another
process: send each element individually or pack the elements into an
array by hand. Listing 2 shows how to avoid the hassle by
creating an MPI datatype.
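Listing 2: Creating a C matrix column datatype
    double buf[10][12];
    MPI_Datatype column;

    /* 10 blocks of 1 double each, with block starts 12 doubles apart */
    MPI_Type_vector(10, 1, 12, MPI_DOUBLE, &column);
    MPI_Type_commit(&column);

    /* Send column 2: start at its first element, buf[0][2] */
    MPI_Send(&buf[0][2], 1, column, 0, 0, MPI_COMM_WORLD);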
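For comparison, the pack-by-hand alternative mentioned above could be sketched as follows. ROWS, COLS, and pack_column are names introduced for this example; the dimensions match the 10 x 12 matrix in Listing 2.

```c
#include <stddef.h>

#define ROWS 10
#define COLS 12

/* Copy column `col` of a row-major ROWS x COLS matrix into the
   contiguous array `out`. Each matrix[i][col] sits COLS doubles
   after matrix[i-1][col]; that distance is the stride. */
void pack_column(double matrix[ROWS][COLS], int col, double out[ROWS]) {
    for (int i = 0; i < ROWS; i++) {
        out[i] = matrix[i][col];
    }
}

/* The stride between consecutive column elements, in bytes. */
size_t column_stride_bytes(void) {
    return COLS * sizeof(double);
}
```

The packed array would then be sent as ROWS elements of MPI_DOUBLE, and the copy has to be repeated before every send. That bookkeeping is what the vector datatype takes off the programmer's hands.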
In the listing above, the type is committed and immediately used. Once committed, the datatype can be reused throughout the program. By adjusting the index in the buffer argument to MPI_SEND, any column in the matrix can be sent. Not only does sending a column with a user-defined datatype take fewer lines of code than packing the buffer by hand, but an MPI implementation also has the option to avoid packing the data before sending. Some communication channels allow "vectored sends," meaning the ability to send from many data locations and receive into many data locations.
Send Only What Is Needed
Thus far, we have looked at ways to send simple datatypes, an entire
matrix, parts of a matrix, and an entire structure. It is also
possible to send only part of a structure. Listing 3 provides an example of
sending selected elements of a structure using datatypes. For example,
in a simple traffic simulation, a local vehicle may only need to know
the position and velocity of a remote vehicle. Locally, fuel and
destination are also tracked.
Listing 3: Using parts of a structure
    struct vehicle {
        double position[3];
        double destination[3];
        double velocity[3];
        double fuel;
    };

    struct vehicle cars[10];
    MPI_Datatype tmp_car_type, car_type;
    int counts[2] = { 3, 3 };
    MPI_Datatype types[2] = { MPI_DOUBLE, MPI_DOUBLE };
    MPI_Aint disps[2];

    /* MPI_Get_address and MPI_Type_create_struct replace MPI_Address and
       MPI_Type_struct, which were removed in MPI-3.0 */
    MPI_Get_address(&cars[0].position, &disps[0]);
    MPI_Get_address(&cars[0].velocity, &disps[1]);
    /* Make both displacements relative to the start of position */
    disps[1] -= disps[0];
    disps[0] = 0;
    MPI_Type_create_struct(2, counts, disps, types, &tmp_car_type);
    /* Set the extent to the full structure so that consecutive elements
       of the type line up with consecutive entries of cars[] */
    MPI_Type_create_resized(tmp_car_type, 0, sizeof(struct vehicle), &car_type);
    MPI_Type_commit(&car_type);
    ...
    MPI_Send(cars, 10, car_type, ...);
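Why the resize step matters can be seen from the structure layout alone. The following plain C sketch (no MPI calls) computes the same displacement that the address calls in Listing 3 compute at run time, and the gap left between the end of the described data and the next array element. The function names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdio.h>

struct vehicle {
    double position[3];
    double destination[3];
    double velocity[3];
    double fuel;
};

/* Displacement of velocity relative to position (disps[1] in Listing 3). */
size_t velocity_disp(void) {
    return offsetof(struct vehicle, velocity) - offsetof(struct vehicle, position);
}

/* Bytes from the start of position to the end of velocity: the span
   the two-block type describes before it is resized. */
size_t described_span(void) {
    return velocity_disp() + 3 * sizeof(double);
}

/* Bytes between the end of that span and the next element of an array
   of struct vehicle; resizing the type makes MPI skip them. */
size_t trailing_gap(void) {
    return sizeof(struct vehicle) - described_span();
}

void print_vehicle_layout(void) {
    printf("velocity displacement: %zu\n", velocity_disp());
    printf("described span:        %zu\n", described_span());
    printf("sizeof(struct vehicle): %zu\n", sizeof(struct vehicle));
    printf("trailing gap:          %zu\n", trailing_gap());
}
```

Because the described span ends before fuel, a count of 10 with the unresized type would start the second element inside cars[0] rather than at cars[1]. Setting the extent to sizeof(struct vehicle) closes that gap.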