Compiling and running rocSHMEM applications

This topic explains how to compile and run rocSHMEM applications.

Compiling and linking with rocSHMEM

rocSHMEM is a library that can be statically linked to your application during compilation with hipcc. For more information, see HIPCC.

When compiling your application with hipcc, you must add the rocSHMEM include path and link against the rocSHMEM library. Because rocSHMEM depends on MPI (Message Passing Interface), you must also pass the MPI include and link arguments to hipcc manually instead of relying on the mpicc wrapper.

When you invoke hipcc directly, without a build system, we recommend running the compile and link steps separately.

Example compile and link commands are provided at the top of the example files in the examples directory:

# Compile
hipcc -c -fgpu-rdc -x hip rocshmem_allreduce_test.cc \
  -I/opt/rocm/include                                \
  -I$ROCSHMEM_INSTALL_DIR/include                    \
  -I$OPENMPI_UCX_INSTALL_DIR/include/

# Link
hipcc -fgpu-rdc --hip-link rocshmem_allreduce_test.o -o rocshmem_allreduce_test \
  $ROCSHMEM_INSTALL_DIR/lib/librocshmem.a                                       \
  $OPENMPI_UCX_INSTALL_DIR/lib/libmpi.so                                        \
  -L/opt/rocm/lib -lamdhip64 -lhsa-runtime64
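
The two steps above can be wrapped in a small script so that the same flags are reused for any source file. The following is a minimal sketch, not part of rocSHMEM: the names ROCM_PATH, DRY_RUN, run, and build_rocshmem_app are illustrative, and the default install paths are placeholders to replace with your own. By default the script only prints the commands, so it can be inspected without a ROCm installation.

```shell
# Sketch: reuse the compile and link flags shown above for any source file.
# ROCM_PATH, DRY_RUN, run, and build_rocshmem_app are illustrative names.
ROCM_PATH="${ROCM_PATH:-/opt/rocm}"
# Placeholder defaults; point these at your actual install directories.
ROCSHMEM_INSTALL_DIR="${ROCSHMEM_INSTALL_DIR:-/path/to/rocshmem}"
OPENMPI_UCX_INSTALL_DIR="${OPENMPI_UCX_INSTALL_DIR:-/path/to/ompi}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands without running them

# Print a command, then execute it unless DRY_RUN is 1.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

# Compile SRC to an object file, then link it; stop if compilation fails.
build_rocshmem_app() {
  src="$1"
  name="${src%.*}"
  run hipcc -c -fgpu-rdc -x hip "$src" \
    -I"$ROCM_PATH/include" \
    -I"$ROCSHMEM_INSTALL_DIR/include" \
    -I"$OPENMPI_UCX_INSTALL_DIR/include" &&
  run hipcc -fgpu-rdc --hip-link "$name.o" -o "$name" \
    "$ROCSHMEM_INSTALL_DIR/lib/librocshmem.a" \
    "$OPENMPI_UCX_INSTALL_DIR/lib/libmpi.so" \
    -L"$ROCM_PATH/lib" -lamdhip64 -lhsa-runtime64
}

build_rocshmem_app "${1:-rocshmem_allreduce_test.cc}"
```

Setting DRY_RUN=0 in the environment makes the script execute the commands instead of printing them.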

If your project uses CMake, see Using CMake with AMD ROCm.

Running a rocSHMEM application

Applications using rocSHMEM typically run as multiple processes, usually one per GPU. An MPI launcher, such as mpiexec from Open MPI, starts the required number of processes. For example, to launch two processes of the getmem example (available when rocSHMEM is compiled from source):

mpiexec --map-by numa --mca pml ucx --mca osc ucx -np 2 ./build/examples/rocshmem_getmem_test

See the Open MPI documentation for more information about mpiexec command line parameters.

Note

Some systems have multiple MPI installations, some of which do not have GPU support enabled. Make sure you use the mpiexec that belongs to the intended MPI library, particularly if you built MPI yourself as described in Building dependencies.
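
One way to guard against picking up the wrong launcher is to check where the mpiexec on your PATH lives before starting a job. The sketch below is an illustration only: check_mpiexec is a hypothetical helper, and it assumes the intended MPI is installed under the OPENMPI_UCX_INSTALL_DIR prefix used in the compile example, with its launcher in the bin subdirectory.

```shell
# Sketch: confirm that the mpiexec found on PATH belongs to the intended MPI.
# check_mpiexec is a hypothetical helper; the bin/ layout is an assumption.
check_mpiexec() {
  expected_prefix="$1"
  found="$(command -v mpiexec 2>/dev/null || true)"
  if [ -z "$found" ]; then
    echo "mpiexec: not found on PATH"
    return 1
  fi
  case "$found" in
    "$expected_prefix"/bin/*)
      echo "mpiexec: OK ($found)" ;;
    *)
      echo "mpiexec: MISMATCH ($found is outside $expected_prefix)"
      return 1 ;;
  esac
}

# The default path is a placeholder; replace it with your install prefix.
check_mpiexec "${OPENMPI_UCX_INSTALL_DIR:-/path/to/ompi}" ||
  echo "Adjust PATH or call mpiexec by its full path before launching."
```

With Open MPI, running ompi_info from the same prefix lists the components that were compiled in, which is one way to look for UCX support.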

Environment variables

You can control the behavior of rocSHMEM by using the following environment variables:

Table 1 Environment variables

| Name | Default value | Description |
|---|---|---|
| ROCSHMEM_HEAP_SIZE | 1 | Defines the size of the rocSHMEM symmetric heap in GB. The heap resides in GPU memory. |
| ROCSHMEM_MAX_NUM_CONTEXTS | 1024 | Defines the maximum number of contexts an application can use. |
| ROCSHMEM_MAX_NUM_TEAMS | 40 | Defines the maximum number of teams an application can use. |
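
As an illustration, these variables can be set in the shell and forwarded to every process at launch. The sketch below reuses the getmem launch command from this topic. The values are arbitrary examples, launch_cmd is an illustrative name, and the heap size is assumed to be read in GB as the table states. The -x option is the Open MPI mpiexec option for exporting an environment variable to the launched processes; other launchers may use a different mechanism.

```shell
# Sketch: set rocSHMEM environment variables and forward them to all ranks.
# The values below are arbitrary examples, not recommendations.
export ROCSHMEM_HEAP_SIZE=2            # symmetric heap size, assumed to be in GB
export ROCSHMEM_MAX_NUM_CONTEXTS=512   # upper bound on contexts
export ROCSHMEM_MAX_NUM_TEAMS=16       # upper bound on teams

# -x VAR asks Open MPI's mpiexec to export VAR to the launched processes.
launch_cmd="mpiexec --map-by numa --mca pml ucx --mca osc ucx \
-x ROCSHMEM_HEAP_SIZE -x ROCSHMEM_MAX_NUM_CONTEXTS -x ROCSHMEM_MAX_NUM_TEAMS \
-np 2 ./build/examples/rocshmem_getmem_test"

# Print the command only; run it with: eval "$launch_cmd"
echo "$launch_cmd"
```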