Java: EdgeValue is not the same after calling Vertex.getEdgeValue() twice

java algorithm hadoop graph

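I am trying to implement the Spinner graph partitioning algorithm in Giraph. In the first step, my program adds edges to the given input graph so that it becomes undirected, and every vertex picks a random partition. (This partition integer is stored in the VertexValue.) At the end of this initialization step, every vertex sends a message to all of its outgoing edges, containing the vertex ID (a LongWritable) and the partition the vertex chose.

This all works fine. Now, in the step where I am having the problem, every vertex iterates over the received messages and stores the received partition in the EdgeValue of the corresponding edge. (VertexValue is V in Vertex<I, V, E>, and EdgeValue is E in Edge<I, E>.)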

Here are the important parts of my code:

Wrapper classes:

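import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

public class EdgeValue implements Writable {
    private int weight;
    private int partition;

    // No-argument constructor; its defaults (-2, -1) are the values that show up later
    public EdgeValue() {
        this.weight = -2;
        this.partition = -1;
    }
    // Constructors taking 1 and 2 ints and setting weight/partition to the given value
    // Getters and setters for weight and partition

    // Reads the fields in the same order in which write() emits them
    @Override
    public void readFields(DataInput in) throws IOException {
        this.weight = in.readInt();
        this.partition = in.readInt();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(this.weight);
        out.writeInt(this.partition);
    }
}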

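import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.conf.Configurable;
import org.apache.hadoop.io.Writable;

public class SpinnerMessage implements Writable, Configurable {
    private long senderId;
    private int updatePartition;

    public SpinnerMessage() {
        this.senderId = -1;
        this.updatePartition = -1;
    }
    // Constructors taking int and/or LongWritable and setting the fields
    // Getters and setters for senderId and updatePartition
    // (getConf()/setConf(), required by Configurable, are not shown in this excerpt)

    // Reads the fields in the same order in which write() emits them
    @Override
    public void readFields(DataInput in) throws IOException {
        this.senderId = in.readLong();
        this.updatePartition = in.readInt();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(this.senderId);
        out.writeInt(this.updatePartition);
    }
}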

The compute method from the previous step (ran is a Random object):

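However, when I then use, for example,

for(Edge<LongWritable, EdgeValue> e : vertex.getEdges())

or vertex.getEdgeValue(someVertex), the returned EdgeValue has weight -2 and partition -1 (the defaults set by the no-argument constructor).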

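My ideas about what could cause the error:

getEdgeValue(new LongWritable(someLong)) might not work because the argument is a different object from another new LongWritable(someLong) with the same value. However, I have seen this usage in Giraph code, so it seems to be fine; only the long stored in the LongWritable appears to matter.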

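(Most likely cause) Hadoop serialization and deserialization is somehow changing my EdgeValue objects. Hadoop is meant for very large graphs that may not fit into RAM, which is why VertexValue and EdgeValue have to implement Writable. However, after checking some Giraph code online, I implemented readFields() and write() in a way I believe is correct (the important fields are written and read in the same order). (I think this is somehow related to the problem, because the EdgeValue returned by the second call has the field values of the no-argument constructor.)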
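The serialization hypothesis can be checked in isolation with a round trip through a byte array. The following is a minimal sketch that runs with the JDK alone: EdgeValueSketch and RoundTripCheck are hypothetical stand-ins (not part of Giraph or Hadoop) that copy the two fields and the write()/readFields() bodies from the EdgeValue class above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for EdgeValue, without the Hadoop Writable interface,
// so that the check does not need Hadoop on the classpath.
class EdgeValueSketch {
    int weight = -2;     // same defaults as in the question
    int partition = -1;

    EdgeValueSketch() { }

    EdgeValueSketch(int weight, int partition) {
        this.weight = weight;
        this.partition = partition;
    }

    void write(DataOutput out) throws IOException {
        out.writeInt(weight);
        out.writeInt(partition);
    }

    void readFields(DataInput in) throws IOException {
        weight = in.readInt();
        partition = in.readInt();
    }
}

class RoundTripCheck {
    // Serializes the value into a byte array and reads it back into a new object.
    static EdgeValueSketch roundTrip(EdgeValueSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            EdgeValueSketch copy = new EdgeValueSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        EdgeValueSketch copy = roundTrip(new EdgeValueSketch(7, 3));
        System.out.println(copy.weight + " " + copy.partition); // prints "7 3"
    }
}
```

If the fields survive the round trip, as they do here, the write()/readFields() pair itself is unlikely to be what resets the values to -2 and -1.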

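I also read some of the documentation:

E getEdgeValue(I targetVertexId)

Return the value of the first edge with the given target vertex id, or null if there is no such edge. Note: edge value objects returned by this method may be invalidated by the next call. Thus, keeping a reference to an edge value almost always leads to undesired behavior.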

However, this does not apply to me, since I only have one EdgeValue variable, right?

Thanks in advance to everyone who takes the time to help me. (I am using Hadoop 1.2.1 and Giraph 1.2.0.)

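After looking at more Giraph code examples, I found the solution: the Vertex.getEdgeValue() method basically creates a copy of the vertex's EdgeValue. If you change the object it returns, those changes are not written back to disk. To save information in an EdgeValue or VertexValue, you have to use setVertexValue() or setEdgeValue().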
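The effect described here can be reproduced with a small toy model. This is only a sketch under an assumption: ToyVertex, ToyEdgeValue and CopyOnReadDemo are hypothetical classes that imitate a vertex which stores edge values in serialized form and deserializes a fresh copy on every read. It is not Giraph's actual implementation.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Toy edge value with the same two fields as the question's EdgeValue.
class ToyEdgeValue {
    int weight;
    int partition;

    ToyEdgeValue(int weight, int partition) {
        this.weight = weight;
        this.partition = partition;
    }
}

// Toy vertex (an assumption for illustration): edge values are kept as bytes,
// and every read hands out a newly deserialized object.
class ToyVertex {
    private final Map<Long, byte[]> edges = new HashMap<>();

    void addEdge(long targetId, ToyEdgeValue value) {
        edges.put(targetId, encode(value));
    }

    // Returns a fresh copy, or null if there is no edge to targetId.
    ToyEdgeValue getEdgeValue(long targetId) {
        byte[] stored = edges.get(targetId);
        return stored == null ? null : decode(stored);
    }

    // Overwrites the stored bytes if an edge to targetId exists.
    void setEdgeValue(long targetId, ToyEdgeValue value) {
        if (edges.containsKey(targetId)) {
            edges.put(targetId, encode(value));
        }
    }

    private static byte[] encode(ToyEdgeValue v) {
        return ByteBuffer.allocate(8).putInt(v.weight).putInt(v.partition).array();
    }

    private static ToyEdgeValue decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int weight = buf.getInt();
        int partition = buf.getInt();
        return new ToyEdgeValue(weight, partition);
    }
}

class CopyOnReadDemo {
    public static void main(String[] args) {
        ToyVertex vertex = new ToyVertex();
        vertex.addEdge(42L, new ToyEdgeValue(-2, -1));

        // Mutating the returned object only changes a temporary copy.
        vertex.getEdgeValue(42L).partition = 5;
        System.out.println(vertex.getEdgeValue(42L).partition); // prints -1

        // Read, modify, then write back explicitly.
        ToyEdgeValue value = vertex.getEdgeValue(42L);
        value.partition = 5;
        vertex.setEdgeValue(42L, value);
        System.out.println(vertex.getEdgeValue(42L).partition); // prints 5
    }
}
```

Applied to the compute method from the question, the same read-modify-write pattern would mean fetching the value with vertex.getEdgeValue(id), calling setPartition() on it, and then passing it to vertex.setEdgeValue(id, value).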

The compute method from the question, where the ArrayIndexOutOfBoundsException occurs:

public void compute(Vertex<LongWritable, VertexValue, EdgeValue> vertex,
        Iterable<SpinnerMessage> messages) throws IOException {
    // Store the partition received from each neighbor in the corresponding edge
    for (SpinnerMessage m : messages) {
        vertex.getEdgeValue(new LongWritable(m.getSenderWritable().get())).setPartition(m.getUpdatePartition());
    }
    // ... some other code, e.g. initializing the amountOfNeighbors array.
    // Here I get an ArrayIndexOutOfBoundsException since the partition is -1:
    for (Edge<LongWritable, EdgeValue> edge : vertex.getEdges()) {
        EdgeValue curValue = edge.getValue();
        amountOfNeighbors[curValue.getPartition()] += curValue.getWeight();
    }
}