Scala nested for loop over a Stream

scala parallel-processing

I wrote the following Scala code to compute a distance matrix:

def dist(fasta: Stream[FastaRecord], f: (FastaRecord, FastaRecord) => Int) = {
  val inF = fasta.par
  // Applies f to every ordered pair, so both (i, j) and (j, i) are computed
  for (i <- inF; j <- inF)
    yield f(i, j)
}

I think using zipWithIndex may get you what you need:
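def dist(fasta: Stream[FastaRecord], f: (FastaRecord, FastaRecord) => Int) = {
  // Pair each record with its position so the two loops can be compared by index
  val inF = fasta.zipWithIndex.par
  // Keep only pairs with i <= j (the diagonal and one triangle of the matrix)
  for ((x, i) <- inF; (y, j) <- inF; if i <= j)
    yield f(x, y)
}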

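This compiles with the `withFilter` warning quoted at the end of this page. I don't think it is really a problem, but I also don't know how to suppress it...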
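
The `withFilter` warning reported for the snippet above comes from the `if i <= j` guard: a guard in a for-comprehension is desugared into a `withFilter` call, and the compiler falls back to `filter` when the collection does not provide one. A minimal sketch of one way to sidestep it is to write the loops without a guard, using `flatMap` and `collect`. The helper name `distUpper` and the generic type `A` (standing in for `FastaRecord`) are assumptions made for illustration, and the sketch uses a plain `Vector` rather than a parallel collection so that it runs with only the standard library.

```scala
object UpperTriangleSketch {
  // Hypothetical helper, not from the original post:
  // applies f to every pair (x, y) whose positions satisfy i <= j.
  def distUpper[A](items: Seq[A], f: (A, A) => Int): Seq[Int] = {
    // Materialise once and attach each element's position
    val indexed = items.toVector.zipWithIndex
    indexed.flatMap { case (x, i) =>
      // collect filters and maps in a single step, so no for-comprehension
      // guard (and therefore no withFilter call) is involved
      indexed.collect { case (y, j) if i <= j => f(x, y) }
    }
  }
}
```

For three elements this computes six values (the diagonal plus one triangle) instead of nine. On Scala versions where `.par` is available, the same `flatMap`/`collect` shape could in principle be applied to a parallel collection, though that is not tested here.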
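Streams cache their results. So after the first j has traversed the entire stream, it should be as fast as a normal list. That means I think you would be better off evaluating the stream at the start using length, and then doing only half of the computations in parallel with your f function.
I am only commenting so that you are aware of how streams perform once they have been iterated over. I think the bigger problem here is that streams are not optimized for random access, so the inF(i) and inF(j) operations will be slow.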
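
The suggestion in the comment above (force the stream once, then compute only half of the matrix in parallel) can be sketched as follows. This is an illustration under stated assumptions, not the commenter's code: the helper name `distRows` and the generic type `A` are hypothetical, the input is copied into a `Vector` to get fast indexed access instead of calling `length` on the stream, and parallelism comes from one standard-library `Future` per row rather than from `.par`.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

object ForcedVectorSketch {
  // Hypothetical helper, not from the original post:
  // row i holds f(v(i), v(j)) for every j from i to the last index.
  def distRows[A](records: Iterable[A], f: (A, A) => Int): Vector[Vector[Int]] = {
    // Copying into a Vector evaluates a lazy Stream completely, once,
    // and gives fast indexed access for v(i) and v(j)
    val v = records.toVector
    // One Future per row; each row only covers the columns j >= i
    val rows = Future.traverse(v.indices.toVector) { i =>
      Future((i until v.length).map(j => f(v(i), v(j))).toVector)
    }
    // Block until every row has been computed
    Await.result(rows, Duration.Inf)
  }
}
```

A `Stream[FastaRecord]` can be passed directly as `records`, since a `Stream` is an `Iterable`. Splitting the work by row is a simple choice; the rows shrink as i grows, so the load per task is uneven, and a finer-grained split may balance better for large inputs.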

warning: `withFilter' method does not yet exist on scala.collection.parallel.immutable.ParSeq[(FastaRecord, Int)], using `filter' method instead