Differences in lock, accumulate, and get behavior in the OpenMPI Java bindings

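I have a use case where I need to use Java with MPI on our research cluster. One specific feature I need is nicely covered by the C++ code in the linked answer. I built the C++ code and it works as expected.

I tried to build a Java equivalent of this code, but failed miserably. Even though I functionally replicated what the C++ code does, the Java version does not consistently return the expected results.
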
mpiexec --oversubscribe -n 4 ./test

0 got counter 1
2 got counter 2
1 got counter 3
3 got counter 4
 1  1  1  1 
(Running on my local laptop, hence --oversubscribe.)

When I run the Java equivalent, I do not get anything close to the same result:

mpirun --oversubscribe -n 4 java -cp .:/usr/local/lib/openmpi/mpi.jar CounterTest

0 got counter 1
3 got counter 1
1 got counter 3
2 got counter 2
 1  1  1  1 
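I expect each rank to obtain one and only one counter value. In this run, counter 1 was handed out twice. Once in a blue moon I do get 1 through 4; the order does not matter, only that the counts are unique.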
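The expectation above can be checked mechanically. Here is a minimal sketch that scans the "N got counter M" lines of a run and reports whether any counter value was handed out twice. The class and method names are made up for illustration and are not part of the original code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original post.
class CounterOutputCheck {
    // Matches lines such as "0 got counter 1"
    private static final Pattern LINE = Pattern.compile("^(\\d+) got counter (\\d+)$");

    /** Returns true if every "got counter" line reports a distinct counter value. */
    static boolean countersAreUnique(List<String> lines) {
        Set<Integer> seen = new HashSet<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line.trim());
            if (!m.matches()) {
                continue; // skip the final window dump and any other output
            }
            int counter = Integer.parseInt(m.group(2));
            if (!seen.add(counter)) {
                return false; // the same counter value appeared twice
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The two runs quoted in the question
        List<String> cppRun = Arrays.asList("0 got counter 1", "2 got counter 2",
                "1 got counter 3", "3 got counter 4", " 1  1  1  1 ");
        List<String> javaRun = Arrays.asList("0 got counter 1", "3 got counter 1",
                "1 got counter 3", "2 got counter 2", " 1  1  1  1 ");
        System.out.println("C++ run unique:  " + countersAreUnique(cppRun));
        System.out.println("Java run unique: " + countersAreUnique(javaRun));
    }
}
```

Fed the two runs quoted above, the check accepts the C++ output and rejects the Java output, because counter 1 appears for both rank 0 and rank 3.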

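We run version 2.1.0 on the cluster. On my local laptop I have the OpenMPI 2.1.0 and 3.1.0 streams installed, and I can reproduce both the correct behavior of the C++ program and the incorrect behavior of the Java program. Here is the Counter class I created: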

import java.nio.ByteBuffer;
import java.util.ArrayList;

import mpi.MPI;
import mpi.MPIException;
import mpi.Win;

public class Counter {
    Win win;
    int hostRank;
    int myVal;
    ByteBuffer data;
    int rank;
    int size;

    public Counter(int hostRank) throws MPIException {
        this.setHostRank(hostRank);
        this.setSize(MPI.COMM_WORLD.getSize());
        this.setRank(MPI.COMM_WORLD.getRank());

        if (this.getRank() == hostRank) {
//          this.setData(MPI.newByteBuffer(this.getSize() * Integer.BYTES));
            this.setData(ByteBuffer.allocateDirect(this.getSize() * Integer.BYTES));
            for (int i = 0; i < this.getData().capacity(); i += Integer.BYTES)
                this.getData().putInt(i, 0);
        } else {
//          this.setData(MPI.newByteBuffer(0));
            this.setData(ByteBuffer.allocateDirect(0));
        }   

        this.setWin(new Win(this.getData(), this.getData().capacity(), Integer.BYTES,
                MPI.INFO_NULL, MPI.COMM_WORLD));

        this.setMyVal(0);
    }

    public int increment(int increment) throws MPIException {

        // A list to store all of the values we pull
        ArrayList<Integer> vals = new ArrayList<Integer>();
        for (int i = 0; i < this.getSize(); i++)
            vals.add(i, 0);

        // Need to convert the increment to a buffer
        ByteBuffer incrbuff = ByteBuffer.allocateDirect(Integer.BYTES);
        incrbuff.putInt(increment);

        // Our values are returned to us in a byte buffer
        ByteBuffer valBuff = ByteBuffer.allocateDirect(Integer.BYTES);

//      System.out.printf("Data for RANK %d: ", this.getRank());
        this.getWin().lock(MPI.LOCK_EXCLUSIVE, this.getHostRank(), 0);
        for (int i = 0; i < this.getSize(); i++) {
            // Always ensure that we're at the top of the buffer
            valBuff.position(0);
            if (i == this.getRank()) {
                this.getWin().accumulate(incrbuff, 1, MPI.INT, this.getHostRank(), i, 1, MPI.INT, MPI.SUM);
                // Without this, it comes back all 1s 
                this.getWin().flushLocalAll();
//              System.out.printf(" [%d] ", this.getMyVal() + increment);
            } else {
                this.getWin().get(valBuff, 1, MPI.INT, this.getHostRank(), i, 1, MPI.INT);
                vals.set(i, valBuff.getInt(0));
//              System.out.printf("  %d  ", vals.get(i))
            }
        }
        this.getWin().unlock(this.getHostRank());

        this.setMyVal(this.getMyVal() + increment);
        vals.set(this.getRank(), this.getMyVal());

//      System.out.printf(" <<%d>> \n", vals.stream().mapToInt(Integer::intValue).sum());
//      this.getWin().unlock(this.getHostRank());

        return vals.stream().mapToInt(Integer::intValue).sum();

    }

    public void printCounter() {
        if (this.getRank() == this.getHostRank()) {
            for (int i = 0; i < this.getSize(); i++) {
                System.out.printf(" %d ", this.getData().getInt());
            }
            System.out.println("");
        }
    }

    public void delete() throws MPIException {
        this.getWin().detach(this.getData());
        this.getWin().free();

        this.setData(null);
        this.setHostRank(0);
        this.setMyVal(0);
        this.setRank(0);
        this.setSize(0);
        this.setWin(null);

    }

    private Win getWin() {
        return win;
    }

    private void setWin(Win win) {
        this.win = win;
    }

    private int getHostRank() {
        return hostRank;
    }

    private void setHostRank(int hostrank) {
        this.hostRank = hostrank;
    }

    private int getMyVal() {
        return myVal;
    }

    private void setMyVal(int myval) {
        this.myVal = myval;
    }

    private ByteBuffer getData() {
        return data;
    }

    private void setData(ByteBuffer data) {
        this.data = data;
    }

    private int getRank() {
        return rank;
    }

    private void setRank(int rank) {
        this.rank = rank;
    }

    private int getSize() {
        return size;
    }

    private void setSize(int size) {
        this.size = size;
    }

}
Without the flushLocalAll() call in increment(), every rank gets a counter of 1.

Here is the first part of the test class:

import java.util.Random;

import mpi.*;

public class CounterTest {

    public static void main(String[] args) {

        try {
            MPI.Init(args);
        } catch (MPIException e1) {
            // TODO Auto-generated catch block
            e1.printStackTrace();
        }

        try {
            test1();
//          test2();
        } catch (MPIException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

        try {
            MPI.Finalize();
        } catch (MPIException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

    }

    public static void test1 () throws MPIException {
        Counter c = new Counter(0);
        int rank = MPI.COMM_WORLD.getRank();
        int size = MPI.COMM_WORLD.getSize();

        int result = c.increment(1);
        System.out.printf("%d got counter %d\n", rank, result);

        MPI.COMM_WORLD.barrier();
        c.printCounter();
        c.delete();
        c = null;                       

    }
}
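I have tried various other techniques, such as fences and using groups with MPI_Win_start and MPI_Win_complete, but to no avail. I feel this is as close as I can get to a faithful representation of the C++ code.

What am I missing? Why does this not behave the same as the native C++ code?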

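Edit: I also found that the following call has to be added when running against the actual cluster (which had been down for maintenance for the past two days):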

this.getWin().flush(0);

I think the problem lies in these lines:

this.getWin().get(valBuff, 1, MPI.INT, this.getHostRank(), i, 1, MPI.INT);
vals.set(i, valBuff.getInt(0));
My understanding is that you cannot assume the contents of valBuff are valid until MPI_Win_unlock has been called.
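To make that point concrete, here is a toy model in plain Java, with no MPI involved. It assumes the worst case the rule allows: a get only records the transfer, and the data lands in the destination buffer when unlock runs. The class DeferredGetModel and its methods are invented for illustration. A real implementation may complete a transfer earlier, which would explain why the original code sometimes appeared to work.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Toy model only: no MPI involved, all names are hypothetical.
class DeferredGetModel {
    private final int[] remote;                         // stands in for the window memory on the host rank
    private final List<Runnable> pending = new ArrayList<>();

    DeferredGetModel(int[] remote) {
        this.remote = remote;
    }

    // Models a nonblocking get: the transfer is recorded but not performed yet
    void get(ByteBuffer dest, int index) {
        pending.add(() -> dest.putInt(0, remote[index]));
    }

    // Models unlock: every pending transfer of the epoch is completed here
    void unlock() {
        for (Runnable op : pending) {
            op.run();
        }
        pending.clear();
    }

    // Reads the destination buffer once before and once after unlock()
    static int[] earlyAndLate() {
        DeferredGetModel win = new DeferredGetModel(new int[] {5, 7});
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES);
        win.get(buf, 1);
        int early = buf.getInt(0);  // in this model the transfer has not happened yet
        win.unlock();
        int late = buf.getInt(0);   // now holds remote[1]
        return new int[] {early, late};
    }

    // Two gets into one shared buffer: only the last transfer survives
    static int sharedBufferResult() {
        DeferredGetModel win = new DeferredGetModel(new int[] {5, 7});
        ByteBuffer shared = ByteBuffer.allocate(Integer.BYTES);
        win.get(shared, 0);
        win.get(shared, 1);
        win.unlock();
        return shared.getInt(0);
    }

    public static void main(String[] args) {
        int[] r = earlyAndLate();
        System.out.println("before unlock: " + r[0] + ", after unlock: " + r[1]);
        System.out.println("shared buffer after unlock: " + sharedBufferResult());
    }
}
```

In this model, a read before unlock sees the buffer's initial zero. Reusing one buffer for several gets keeps only the last value. Both effects are consistent with the symptoms in the question, although in real MPI the early contents are simply undefined rather than reliably stale.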

I rewrote the subroutine to use several buffers and to set vals only after MPI_Win_unlock, and with that I was able to get the correct output:

public int increment(int increment) throws MPIException {

    // A list to store all of the values we pull
    ArrayList<Integer> vals = new ArrayList<Integer>();
    for (int i = 0; i < this.getSize(); i++)
        vals.add(i, 0);

    // Need to convert the increment to a buffer
    ByteBuffer incrbuff = ByteBuffer.allocateDirect(Integer.BYTES);
    incrbuff.putInt(increment);

    // Our values are returned to us in several byte buffers
    ByteBuffer valBuff[] = new ByteBuffer[this.getSize()];

    this.getWin().lock(MPI.LOCK_EXCLUSIVE, this.getHostRank(), 0);
    for (int i = 0; i < this.getSize(); i++) {
        if (i == this.getRank()) {
            // Add our increment to our own slot in the host's window
            this.getWin().accumulate(incrbuff, 1, MPI.INT, this.getHostRank(), i, 1, MPI.INT, MPI.SUM);
        } else {
            // One buffer per slot, so that pending gets do not share a destination
            valBuff[i] = ByteBuffer.allocateDirect(Integer.BYTES);
            valBuff[i].position(0);
            this.getWin().get(valBuff[i], 1, MPI.INT, this.getHostRank(), i, 1, MPI.INT);
        }
    }
    // unlock completes the RMA operations issued since lock
    this.getWin().unlock(this.getHostRank());
    // Only now is it safe to read the fetched values
    for (int i = 0; i < this.getSize(); i++) {
        if (i != this.getRank()) {
            vals.set(i, valBuff[i].getInt(0));
        }
    }

    this.setMyVal(this.getMyVal() + increment);
    vals.set(this.getRank(), this.getMyVal());

    return vals.stream().mapToInt(Integer::intValue).sum();

}

FWIW, I tried to use a single buffer of this.getSize() integers, but could not get that to work.
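For reference, here is how one contiguous buffer of getSize() integers can be carved into per-slot views in plain Java. This sketch only shows the java.nio side; it does not establish that the OpenMPI Java bindings accept such views as origin buffers, which the thread leaves open. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper; plain java.nio only, no MPI calls.
class SlotViews {
    /** Splits one buffer into `slots` views of Integer.BYTES bytes each, sharing its memory. */
    static ByteBuffer[] sliceSlots(ByteBuffer whole, int slots) {
        ByteBuffer[] views = new ByteBuffer[slots];
        for (int i = 0; i < slots; i++) {
            // duplicate() so the parent's position and limit stay untouched
            ByteBuffer dup = whole.duplicate();
            dup.position(i * Integer.BYTES);
            dup.limit((i + 1) * Integer.BYTES);
            // slice() resets the byte order, so carry the parent's order over
            views[i] = dup.slice().order(whole.order());
        }
        return views;
    }

    public static void main(String[] args) {
        ByteBuffer whole = ByteBuffer.allocateDirect(3 * Integer.BYTES)
                .order(ByteOrder.nativeOrder());
        ByteBuffer[] views = sliceSlots(whole, 3);
        views[1].putInt(0, 42); // write through the view for slot 1
        System.out.println("slot 1 read from the parent buffer: " + whole.getInt(Integer.BYTES));
    }
}
```

Each view starts at its own offset in the shared memory, so a value written through views[1] is visible at byte offset Integer.BYTES of the parent buffer.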

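Comments:

Can you also post the C++ code?

It is the first link in my post: the only change I made to the C++ code was to comment out the second test. I figured that if I could not pass the first test on the Java side, there was no point wasting time on the other one...

Thanks for taking a look, @GillesGouaillardet!! I converted every piece of the code, but I did not think of that. Probably because I assumed that, since the C++ code writes into vals immediately, I needed to do the same. Starting a session on the cluster to verify.

Not sure I got the updated version. Should valBuff[i] = ByteBuffer.allocateDirect(Integer.BYTES) ... be Integer.BYTES * this.getSize()? And moved outside the loop? It keeps blowing up with an ArrayOutOfBoundException.

Actually it is: ByteBuffer valBuff[] = new ByteBuffer[Integer.BYTES * this.getSize()]; That lets me go beyond 4 ranks. I submitted an edit to the answer; it is awaiting peer review.

The correct fix is ByteBuffer valBuff[] = new ByteBuffer[this.getSize()]; If you go to Dallas for SC'18, you may find me at the RIST booth.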
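The sizing question from the comments comes down to two different lengths: the array needs one element per rank, while each buffer needs Integer.BYTES bytes. Here is a small sketch of that allocation; the helper name is made up and no MPI calls are involved.

```java
import java.nio.ByteBuffer;

// Hypothetical helper illustrating the sizing discussed in the comments.
class BufferArraySizing {
    /** Allocates one int-sized direct buffer per rank. */
    static ByteBuffer[] allocateSlots(int ranks) {
        // Array length is the number of ranks, not a byte count
        ByteBuffer[] slots = new ByteBuffer[ranks];
        for (int i = 0; i < ranks; i++) {
            // Buffer capacity is in bytes: room for exactly one int
            slots[i] = ByteBuffer.allocateDirect(Integer.BYTES);
        }
        return slots;
    }

    public static void main(String[] args) {
        ByteBuffer[] slots = allocateSlots(8);
        System.out.println(slots.length + " buffers of " + slots[7].capacity() + " bytes each");
    }
}
```

Sizing the array as Integer.BYTES * getSize() also avoids the out-of-bounds error, but it reserves four times as many array elements as are ever used.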