MVAPICH hangs in MPI_Send for messages larger than the eager threshold

I have a simple C++ program (MVAPICH) that sends an array of floats. When I use MPI_Send, MPI_Ssend, or MPI_Rsend and the size of the data exceeds the eager threshold (64k in my program), the program hangs during the call to MPI_Send. If the array is smaller than the threshold, the program works fine. The source code is below:

#include "mpi.h"
#include <unistd.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int mype = 0, size = 1;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &mype);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int num = 2048 * 2048;
    float* h_pos = new float[num];
    MPI_Status stat;

    if (mype == 0)
    {
        MPI_Rsend(h_pos, 20000, MPI_FLOAT, 1, 5, MPI_COMM_WORLD);
    }
    if (mype == 1)
    {
        printf("%fkb\n", 20000.0f * sizeof(float) / 1024);
        MPI_Recv(h_pos, 20000, MPI_FLOAT, 0, 5, MPI_COMM_WORLD, &stat);
    }

    MPI_Finalize();
    return 0;
}

MVAPICH2 All Parameters
MV2_COMM_WORLD_LOCAL_RANK : 0
PMI_ID : 0
MPIRUN_RSH_LAUNCH : 0
MPISPAWN_GLOBAL_NPROCS : 2
MPISPAWN_MPIRUN_HOST : g718a
MPISPAWN_MPIRUN_ID : 10800
MPISPAWN_NNODES : 1
MPISPAWN_WORKING_DIR : /home/g718a/new_workspace/mpi_test
USE_LINEAR_SSH : 1
PMI_PORT : g718a:42714
MV2_3DTORUS_SUPPORT : 0
MV2_NUM_SA_QUERY_RETRIES : 20
MV2_NUM_SLS : 8
MV2_DEFAULT_SERVICE_LEVEL : 0
MV2_PATH_SL_QUERY : 0
MV2_USE_QOS : 0
MV2_ALLGATHER_BRUCK_THRESHOLD : 524288
MV2_ALLGATHER_RD_THRESHOLD : 81920
MV2_ALLGATHER_REVERSE_RANKING : 1
MV2_ALLGATHERV_RD_THRESHOLD : 0
MV2_ALLREDUCE_2LEVEL_MSG : 262144
MV2_ALLREDUCE_SHORT_MSG : 2048
MV2_ALLTOALL_MEDIUM_MSG : 16384
MV2_ALLTOALL_SMALL_MSG : 2048
MV2_ALLTOALL_THROTTLE_FACTOR : 4
MV2_BCAST_TWO_LEVEL_SYSTEM_SIZE : 64
MV2_GATHER_SWITCH_PT : 0
MV2_INTRA_SHMEM_REDUCE_MSG : 2048
MV2_KNOMIAL_2LEVEL_BCAST_MESSAGE_SIZE_THRESHOLD : 2048
MV2_KNOMIAL_2LEVEL_BCAST_SYSTEM_SIZE_THRESHOLD : 64
MV2_KNOMIAL_INTER_LEADER_THRESHOLD : 65536
MV2_KNOMIAL_INTER_NODE_FACTOR : 4
MV2_KNOMIAL_INTRA_NODE_FACTOR : 4
MV2_KNOMIAL_INTRA_NODE_THRESHOLD : 131072
MV2_RED_SCAT_LARGE_MSG : 524288
MV2_RED_SCAT_SHORT_MSG : 64
MV2_REDUCE_2LEVEL_MSG : 16384
MV2_REDUCE_SHORT_MSG : 8192
MV2_SCATTER_MEDIUM_MSG : 0
MV2_SCATTER_SMALL_MSG : 0
MV2_SHMEM_ALLREDUCE_MSG : 32768
MV2_SHMEM_COLL_MAX_MSG_SIZE : 131072
MV2_SHMEM_COLL_NUM_COMM : 8
MV2_SHMEM_COLL_NUM_PROCS : 2
MV2_SHMEM_COLL_SPIN_COUNT : 5
MV2_SHMEM_REDUCE_MSG : 4096
MV2_USE_BCAST_SHORT_MSG : 16384
MV2_USE_DIRECT_GATHER : 1
MV2_USE_DIRECT_GATHER_SYSTEM_SIZE_MEDIUM : 1024
MV2_USE_DIRECT_GATHER_SYSTEM_SIZE_SMALL : 384
MV2_USE_DIRECT_SCATTER : 1
MV2_USE_OSU_COLLECTIVES : 1
MV2_USE_OSU_NB_COLLECTIVES : 1
MV2_USE_KNOMIAL_2LEVEL_BCAST : 1
MV2_USE_KNOMIAL_INTER_LEADER_BCAST : 1
MV2_USE_SCATTER_RD_INTER_LEADER_BCAST : 1
MV2_USE_SCATTER_RING_INTER_LEADER_BCAST : 1
MV2_USE_SHMEM_ALLREDUCE : 1
MV2_USE_SHMEM_BARRIER : 1
MV2_USE_SHMEM_BCAST : 1
MV2_USE_SHMEM_COLL : 1
MV2_USE_SHMEM_REDUCE : 1
MV2_USE_TWO_LEVEL_GATHER : 1
MV2_USE_TWO_LEVEL_SCATTER : 1
MV2_USE_XOR_ALLTOALL : 1
MV2_DEFAULT_SRC_PATH_BITS : 0
MV2_DEFAULT_STATIC_RATE : 0
MV2_DEFAULT_TIME_OUT : 67374100
MV2_DEFAULT_MTU : 0
MV2_DEFAULT_PKEY : 0
MV2_DEFAULT_PORT : -1
MV2_DEFAULT_GID_INDEX : 0
MV2_DEFAULT_PSN : 0
MV2_DEFAULT_MAX_RECV_WQE : 128
MV2_DEFAULT_MAX_SEND_WQE : 64
MV2_DEFAULT_MAX_SG_LIST : 1
MV2_DEFAULT_MIN_RNR_TIMER : 12
MV2_DEFAULT_QP_OUS_RD_ATOM : 257
MV2_DEFAULT_RETRY_COUNT : 67900423
MV2_DEFAULT_RNR_RETRY : 202639111
MV2_DEFAULT_MAX_CQ_SIZE : 40000
MV2_DEFAULT_MAX_RDMA_DST_OPS : 4
MV2_INITIAL_PREPOST_DEPTH : 10
MV2_IWARP_MULTIPLE_CQ_THRESHOLD : 32
MV2_NUM_HCAS : 1
MV2_NUM_NODES_IN_JOB : 1
MV2_NUM_PORTS : 1
MV2_NUM_QP_PER_PORT : 1
MV2_MAX_RDMA_CONNECT_ATTEMPTS : 10
MV2_ON_DEMAND_UD_INFO_EXCHANGE : 1
MV2_PREPOST_DEPTH : 64
MV2_HOMOGENEOUS_CLUSTER : 0
MV2_COALESCE_THRESHOLD : 6
MV2_DREG_CACHE_LIMIT : 0
MV2_IBA_EAGER_THRESHOLD : 0
MV2_MAX_INLINE_SIZE : 0
MV2_MAX_R3_PENDING_DATA : 524288
MV2_MED_MSG_RAIL_SHARING_POLICY : 0
MV2_NDREG_ENTRIES : 0
MV2_NUM_RDMA_BUFFER : 0
MV2_NUM_SPINS_BEFORE_LOCK : 2000
MV2_POLLING_LEVEL : 1
MV2_POLLING_SET_LIMIT : -1
MV2_POLLING_SET_THRESHOLD : 256
MV2_R3_NOCACHE_THRESHOLD : 32768
MV2_R3_THRESHOLD : 4096
MV2_RAIL_SHARING_LARGE_MSG_THRESHOLD : 16384
MV2_RAIL_SHARING_MED_MSG_THRESHOLD : 2048
MV2_RAIL_SHARING_POLICY : 4
MV2_RDMA_EAGER_LIMIT : 32
MV2_RDMA_FAST_PATH_BUF_SIZE : 4096
MV2_RDMA_NUM_EXTRA_POLLS : 1
MV2_RNDV_EXT_SENDQ_SIZE : 5
MV2_RNDV_PROTOCOL : 3
MV2_SMALL_MSG_RAIL_SHARING_POLICY : 0
MV2_SPIN_COUNT : 5000
MV2_SRQ_LIMIT : 30
MV2_SRQ_MAX_SIZE : 4096
MV2_SRQ_SIZE : 256
MV2_STRIPING_THRESHOLD : 8192
MV2_USE_COALESCE : 0
MV2_USE_XRC : 0
MV2_VBUF_MAX : -1
MV2_VBUF_POOL_SIZE : 512
MV2_VBUF_SECONDARY_POOL_SIZE : 256
MV2_VBUF_TOTAL_SIZE : 0
MV2_USE_HWLOC_CPU_BINDING : 1
MV2_ENABLE_AFFINITY : 1
MV2_ENABLE_LEASTLOAD : 0
MV2_SMP_BATCH_SIZE : 8
MV2_SMP_EAGERSIZE : 65537
MV2_SMPI_LENGTH_QUEUE : 262144
MV2_SMP_NUM_SEND_BUFFER : 256
MV2_SMP_SEND_BUF_SIZE : 131072
MV2_USE_SHARED_MEM : 1
MV2_CUDA_BLOCK_SIZE : 0
MV2_CUDA_NUM_RNDV_BLOCKS : 8
MV2_CUDA_VECTOR_OPT : 1
MV2_CUDA_KERNEL_OPT : 1
MV2_EAGER_CUDAHOST_REG : 0
MV2_USE_CUDA : 1
MV2_CUDA_NUM_EVENTS : 64
MV2_CUDA_IPC : 1
MV2_CUDA_IPC_THRESHOLD : 0
MV2_CUDA_ENABLE_IPC_CACHE : 0
MV2_CUDA_IPC_MAX_CACHE_ENTRIES : 1
MV2_CUDA_IPC_NUM_STAGE_BUFFERS : 2
MV2_CUDA_IPC_STAGE_BUF_SIZE : 524288
MV2_CUDA_IPC_BUFFERED : 1
MV2_CUDA_IPC_BUFFERED_LIMIT : 33554432
MV2_CUDA_IPC_SYNC_LIMIT : 16384
MV2_CUDA_USE_NAIVE : 1
MV2_CUDA_REGISTER_NAIVE_BUF : 524288
MV2_CUDA_GATHER_NAIVE_LIMIT : 32768
MV2_CUDA_SCATTER_NAIVE_LIMIT : 2048
MV2_CUDA_ALLGATHER_NAIVE_LIMIT : 1048576
MV2_CUDA_ALLGATHERV_NAIVE_LIMIT : 524288
MV2_CUDA_ALLTOALL_NAIVE_LIMIT : 262144
MV2_CUDA_ALLTOALLV_NAIVE_LIMIT : 262144
MV2_CUDA_BCAST_NAIVE_LIMIT : 2097152
MV2_CUDA_GATHERV_NAIVE_LIMIT : 0
MV2_CUDA_SCATTERV_NAIVE_LIMIT : 16384
MV2_CUDA_ALLTOALL_DYNAMIC : 1
MV2_CUDA_ALLGATHER_RD_LIMIT : 1024
MV2_CUDA_ALLGATHER_FGP : 1
MV2_SMP_CUDA_PIPELINE : 1
MV2_CUDA_INIT_CONTEXT : 1
MV2_SHOW_ENV_INFO : 2
MV2_DEFAULT_PUT_GET_LIST_SIZE : 200
MV2_EAGERSIZE_1SC : 0
MV2_GET_FALLBACK_THRESHOLD : 0
MV2_PIN_POOL_SIZE : 2097152
MV2_PUT_FALLBACK_THRESHOLD : 0
MV2_ASYNC_THREAD_STACK_SIZE : 1048576
MV2_THREAD_YIELD_SPIN_THRESHOLD : 5
MV2_USE_HUGEPAGES : 1


And the configuration:

 mpiname -a

MVAPICH2 2.0 Fri Jun 20 20:00:00 EDT 2014 ch3:mrail

Compilation
CC: gcc    -DNDEBUG -DNVALGRIND -O2
CXX: g++   -DNDEBUG -DNVALGRIND
F77: no -L/lib -L/lib  
FC: no  

Configuration
--with-device=ch3:mrail --with-rdma=gen2 --enable-cuda --disable-f77 --disable-fc --disable-mcast

The program is run on two processes:

mpirun_rsh -hostfile hosts -n 2 MV2_USE_CUDA=1 MV2_SHOW_ENV_INFO=2 ./myTest

Any ideas?

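I ran this on my laptop with 781.2 KiB and got no deadlock. I also ran it on Blue Gene/Q with 781.2 KiB, again without any deadlock. So thank you for the short test case, but I'm sorry, I cannot reproduce your problem. Perhaps it is specific to InfiniBand.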


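The general solution in this situation is to post non-blocking sends and receives. I could provide code, but you are asking about ready send and the eager threshold, so I am fairly sure you already know about them and must have a good reason not to use them...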

I just ran your test case with MVAPICH2-2.0 on an InfiniBand system and could not reproduce the hang. Could you post a debug backtrace of the hung processes?

$ gdb attach <PID>
gdb> thread apply all bt
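The MPI standard specifies:

A send that uses the ready communication mode may be started only if the matching receive is already posted. Otherwise, the operation is erroneous and its outcome is undefined.


In this program there is no guarantee that the Recv will be posted before the Rsend, so the operation may fail or hang.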

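To elaborate on this, the code needs to add synchronization after the MPI_Recv is posted and before the MPI_Rsend is called. A call to MPI_Barrier is complete overkill, but it is sufficient. Point-to-point synchronization via MPI_Ssend is probably the minimum synchronization required, but in this case it defeats the purpose of MPI_Rsend.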