upc_all_permute function
#include <upc.h>
#include <upc_collective.h>
void upc_all_permute(shared void * restrict dst,
                     shared const void * restrict src,
                     shared const int * restrict perm,
                     size_t nbytes, upc_flag_t flags);
The upc_all_permute function copies a block of memory from a shared memory area that has affinity to the ith thread to a block of shared memory that has affinity to thread perm[i]. The number of bytes in each block is nbytes.
nbytes must be strictly greater than 0.
perm[0..THREADS-1] must contain THREADS distinct values: 0, 1, ..., THREADS-1.
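
The requirement on perm can be checked before the call. The following is a minimal sketch in plain C, assuming the permutation has first been copied into a private array; the helper name is_valid_perm and its nthreads parameter (standing in for THREADS) are hypothetical and not part of UPC.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper, not part of UPC: returns true when
   perm[0..nthreads-1] holds each of 0, 1, ..., nthreads-1 exactly once. */
bool is_valid_perm(const int *perm, int nthreads)
{
    if (perm == NULL || nthreads <= 0)
        return false;

    /* seen[v] records whether value v has already appeared in perm. */
    bool *seen = calloc((size_t)nthreads, sizeof *seen);
    if (seen == NULL)
        return false;   /* allocation failed: validity cannot be confirmed */

    bool ok = true;
    for (int i = 0; i < nthreads && ok; i++) {
        int v = perm[i];
        if (v < 0 || v >= nthreads || seen[v])
            ok = false;     /* value out of range, or a duplicate */
        else
            seen[v] = true;
    }

    free(seen);
    return ok;
}
```

Since nthreads distinct values in the range 0..nthreads-1 necessarily cover the whole range, rejecting out-of-range values and duplicates is sufficient.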
The upc_all_permute function treats the src pointer and the dst pointer as if each pointed to a shared memory area of nbytes bytes on each thread and therefore had type:
shared [nbytes] char[nbytes * THREADS]
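
Under that layout, byte k of the array lies in block k / nbytes, and blocks are dealt to threads in round-robin order. The sketch below, in plain C, computes the owning thread and the offset within the block; the function names owner_thread and phase_in_block and the nthreads parameter (standing in for THREADS) are hypothetical and used only for illustration.

```c
#include <stddef.h>

/* Thread with affinity to byte k of an array laid out as
   shared [nbytes] char[nbytes * THREADS].
   nthreads is an assumed stand-in for THREADS. */
size_t owner_thread(size_t k, size_t nbytes, size_t nthreads)
{
    return (k / nbytes) % nthreads;
}

/* Offset (phase) of byte k within its block of nbytes bytes. */
size_t phase_in_block(size_t k, size_t nbytes)
{
    return k % nbytes;
}
```

Because the array holds exactly THREADS blocks, each thread owns exactly one block, and the modulo in owner_thread never wraps for a valid index.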
The targets of the src, perm, and dst pointers must have affinity to thread 0.
The src and dst pointers are treated as if they have phase 0.
The effect is equivalent to copying the block of nbytes bytes that has affinity to thread i pointed to by src to the block of nbytes bytes that has affinity to thread perm[i] pointed to by dst.
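
The data movement described above can be modeled sequentially. The following plain C sketch is not UPC and performs no synchronization: it assumes src and dst are ordinary, non-overlapping buffers of nthreads contiguous blocks, where block t stands in for the block with affinity to thread t. The name permute_model and the nthreads parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sequential model only: copies block i of src to block perm[i] of dst.
   dst and src must not overlap; perm must be a permutation of
   0..nthreads-1 (nthreads is an assumed stand-in for THREADS). */
void permute_model(void *dst, const void *src, const int *perm,
                   size_t nbytes, int nthreads)
{
    assert(nbytes > 0);     /* mirrors the requirement on nbytes */

    unsigned char *d = dst;
    const unsigned char *s = src;

    for (int i = 0; i < nthreads; i++) {
        /* Block i of the source lands in block perm[i] of the destination. */
        memcpy(d + (size_t)perm[i] * nbytes,
               s + (size_t)i * nbytes,
               nbytes);
    }
}
```

For example, with three blocks and perm = {2, 0, 1}, block 0 of src ends up in block 2 of dst, block 1 in block 0, and block 2 in block 1.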
upc_all_permute
#include <upc.h>
#include <upc_collective.h>
#define NELEMS 10
shared [NELEMS] int A[NELEMS*THREADS], B[NELEMS*THREADS];
shared int P[THREADS];
// Initialize A and P.
upc_barrier;
upc_all_permute( B, A, P, sizeof(int)*NELEMS,
                 UPC_IN_NOSYNC | UPC_OUT_NOSYNC );
upc_barrier;