The upc_all_gather function

    #include <upc.h>
    #include <upc_collective.h>
    void upc_all_gather(shared void * restrict dst,
                        shared const void * restrict src,
                        size_t nbytes,
                        upc_flag_t flags);

The upc_all_gather function copies a block of shared memory that has affinity to the ith thread to the ith block of a shared memory area that has affinity to a single thread. The number of bytes in each block is nbytes.

nbytes must be strictly greater than 0.

The upc_all_gather function treats the src pointer as if it pointed to a shared memory area of nbytes bytes on each thread and therefore had type:

    shared [nbytes] char[nbytes * THREADS]

and it treats the dst pointer as if it pointed to a shared memory area with the type:

    shared [] char[nbytes * THREADS]

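The difference between the two layouts can be illustrated by asking which thread owns a given byte. The sketch below is plain sequential C, not UPC, and the helper names src_owner and dst_owner are hypothetical. It assumes the usual UPC rule that element k of an array with block size B has affinity to thread (k / B) % THREADS, and that an array with indefinite block size (shared []) resides entirely on one thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: thread that owns byte k of an area laid out as
   shared [nbytes] char[nbytes * threads].
   Inside the array bounds (k < nbytes * threads) the modulo never wraps,
   so block i lives on thread i. */
static size_t src_owner(size_t k, size_t nbytes, size_t threads)
{
    assert(nbytes > 0 && threads > 0);
    return (k / nbytes) % threads;
}

/* Hypothetical helper: with an indefinite block size (shared []), every
   byte has affinity to the same thread, passed in here as dst_thread. */
static size_t dst_owner(size_t k, size_t dst_thread)
{
    (void)k; /* the byte index does not affect the owner */
    return dst_thread;
}
```

With nbytes = 4 and three threads, for instance, src_owner maps bytes 0 to 3 to thread 0, bytes 4 to 7 to thread 1, and bytes 8 to 11 to thread 2, while dst_owner returns the same thread for every byte.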
The target of the src pointer must have affinity to thread 0.

The src pointer is treated as if it has phase 0.

For each thread i, the effect is equivalent to copying the block of nbytes bytes pointed to by src that has affinity to thread i to the ith block of nbytes bytes pointed to by dst.

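The per-thread copy described above can be modelled in ordinary C. This is a sequential sketch under stated assumptions, not the library's implementation: gather_model is a hypothetical name, src_blocks[i] stands in for the nbytes-byte block of src with affinity to thread i, and dst stands in for the contiguous area on the single destination thread. Synchronization (the flags argument) is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sequential model of the gather (plain C, not UPC).
   src_blocks[i] : the block of nbytes bytes "owned" by thread i
   dst           : room for nbytes * threads bytes on one thread */
static void gather_model(unsigned char *dst,
                         const unsigned char *const *src_blocks,
                         size_t nbytes, size_t threads)
{
    assert(nbytes > 0); /* nbytes must be strictly greater than 0 */
    for (size_t i = 0; i < threads; i++) {
        /* Thread i's block lands in the ith block of dst. */
        memcpy(dst + i * nbytes, src_blocks[i], nbytes);
    }
}
```

If three threads hold the two-byte blocks {1, 2}, {3, 4} and {5, 6}, the model fills dst with 1, 2, 3, 4, 5, 6, that is, the blocks in thread order.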
Example: upc_all_gather for the static THREADS translation environment.

    #include <upc.h>
    #include <upc_collective.h>
    #define NELEMS 10
    shared [NELEMS] int A[NELEMS*THREADS];
    shared [] int B[NELEMS*THREADS];
    // Initialize A.
    upc_all_gather( B, A, sizeof(int)*NELEMS,
                    UPC_IN_ALLSYNC | UPC_OUT_ALLSYNC );

Example: upc_all_gather for the dynamic THREADS translation environment.

    #include <upc.h>
    #include <upc_collective.h>
    #define NELEMS 10
    shared [NELEMS] int A[NELEMS*THREADS];
    shared [] int *B;
    B = (shared [] int *) upc_all_alloc(1, NELEMS*THREADS*sizeof(int));
    // Initialize A.
    upc_barrier;
    upc_all_gather( B, A, sizeof(int)*NELEMS,
                    UPC_IN_NOSYNC | UPC_OUT_NOSYNC );
    upc_barrier;