Is Scala idiomatic coding style just a cool trap for writing inefficient code?

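I sense that the Scala community has a bit of an obsession with writing "concise", "cool", "Scala idiomatic", "one-liner" (if possible) code. This is immediately followed by a comparison to Java/imperative/ugly code.

While this (sometimes) leads to easy-to-understand code, it also leads to inefficient code for 99% of developers. And this is where Java/C++ is not easy to beat.

Consider this simple problem: given a list of integers, remove the greatest element. Ordering does not need to be preserved.

Here is my version of the solution (it may not be the greatest, but it's what the average non-rockstar developer would do).

It's Scala idiomatic, concise, and uses a few nice list functions. It's also very inefficient: it traverses the list at least 3 or 4 times.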
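For reference, here is the idiomatic version described above, essentially as it is quoted further down in this thread, with the traversals marked in comments. The object name RemoveMaxCoolSketch is only a wrapper added here for illustration, and the traversal count is a rough reading of the List operations rather than a measurement.

```scala
object RemoveMaxCoolSketch {
  // Assumes xs is non-empty: xs.max throws on an empty list.
  def removeMaxCool(xs: List[Int]): List[Int] = {
    val max = xs.max                // traversal 1: scan the whole list
    val maxIndex = xs.indexOf(max)  // traversal 2: scan up to the first maximum
    // take copies the prefix (traversal 3), drop walks the prefix again (traversal 4),
    // and ::: copies the prefix once more to prepend it to the suffix
    xs.take(maxIndex) ::: xs.drop(maxIndex + 1)
  }
}
```

Only the first occurrence of the maximum is removed, and the order of the remaining elements is preserved.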

Here is my totally uncool, Java-like solution. It's also what a reasonable Java developer (or Scala novice) would write.

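import scala.collection.mutable.ArrayBuffer

// Single pass: hold the running maximum aside and append everything else.
// Assumes xs is non-empty (xs.head throws on an empty list).
def removeMaxFast(xs: List[Int]): List[Int] = {
    val res = ArrayBuffer[Int]()
    var max = xs.head
    for (x <- xs.tail) {
        if (x > max) {
            // x is the new maximum: the old one goes into the result
            res.append(max)
            max = x
        } else {
            res.append(x)
        }
    }
    res.toList
}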

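Totally non-Scala-idiomatic, non-functional, non-concise, but it's very efficient. It traverses the list only once.

So, if 99% of Java developers write more efficient code than 99% of Scala developers, this is a huge obstacle to wider Scala adoption. Is there a way out of this trap?

I am looking for practical advice to avoid such "inefficiency traps" while keeping the implementation clear and concise.

Clarification: this question comes from a real-life scenario. I had to write a complex algorithm. First I wrote it in Scala, then I "had to" rewrite it in Java. The Java implementation was twice as long and not as clear, but at the same time it was twice as fast. Rewriting the Scala code to be efficient would probably take some time and a somewhat deeper understanding of Scala's internal efficiencies (for vs. map vs. fold, etc.).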

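I don't know how likely it is that later compilers will improve the slower map calls to be as fast as while loops. However: you rarely need high-speed solutions, but if you need them often, you will learn them fast.

Do you know how big your collection has to be for your solution to take a whole second on your machine?

As a one-liner, similar to Daniel C. Sobral's solution: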

((Nil : List[Int], xs(0)) /: xs.tail) ((p, x)=> if (p._2 > x) (x :: p._1, p._2) else ((p._2 :: p._1), x))._1

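But that is hard to read, and I didn't measure its actual performance. The normal pattern is (x /: xs) ((a, b) => /* something */). Here, x and a are pairs of the list so far and the maximum so far, which solves the problem of fitting everything into one line of code, but it isn't very readable. However, you can earn reputation on CodeGolf this way, and maybe someone would like to do a performance measurement.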
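As a sketch of that pattern: (x /: xs)(f) is another spelling of xs.foldLeft(x)(f), so the one-liner above can be unrolled into named pieces. The names OneLinerUnrolled, removeMaxFold, start, acc and kept are made up here for illustration; the logic is meant to mirror the one-liner and assumes a non-empty list.

```scala
object OneLinerUnrolled {
  def removeMaxFold(xs: List[Int]): List[Int] = {
    // State: (elements kept so far, maximum seen so far)
    val start: (List[Int], Int) = (Nil, xs.head)
    val (kept, _) = xs.tail.foldLeft(start) { case ((acc, max), x) =>
      if (max > x) (x :: acc, max) // x is not the maximum: keep it
      else (max :: acc, x)         // x replaces the maximum: keep the old one
    }
    kept // reversed relative to the input, which the question allows
  }
}
```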

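And now, to our big surprise, some measurements. An updated timing method that gets garbage collection out of the way and lets the HotSpot compiler warm up, a main, and many methods from this thread, together in an object named PerfRemMax: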

object PerfRemMax {

  def timed (name: String, xs: List [Int]) (f: List [Int] => List [Int]) = {
    val a = System.currentTimeMillis 
    val res = f (xs)
    val z = System.currentTimeMillis 
    val delta = z-a
    println (name + ": "  + (delta / 1000.0))
    res
  }

def main (args: Array [String]) : Unit = {
  val n = args(0).toInt
  val funs : List [(String, List[Int] => List[Int])] = List (
    "indexOf/take-drop" -> adrian1 _, 
    "arraybuf"      -> adrian2 _, /* out of memory */
    "paradigmatic1"     -> pm1 _, /**/
    "paradigmatic2"     -> pm2 _, 
    // "match" -> uu1 _, /*oom*/
    "tailrec match"     -> uu2 _, 
    "foldLeft"      -> uu3 _,
    "buf-=buf.max"  -> soc1 _, 
    "for/yield"     -> soc2 _,
    "splitAt"       -> daniel1,
    "ListBuffer"    -> daniel2
    )

  val r = util.Random 
  val xs = (for (x <- 1 to n) yield r.nextInt (n)).toList 

// With 1 Mio. as param, it starts with 100 000, 200k, 300k, ... 1Mio. cases. 
// a) warmup
// b) look, where the process gets linear to size  
  funs.foreach (f => {
    (1 to 10) foreach (i => {
        timed (f._1, xs.take (n/10 * i)) (f._2)
        compat.Platform.collectGarbage
    });
    println ()
  })
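}

  // The solutions from this thread (adrian1, adrian2, pm1, pm2, uu2, uu3,
  // soc1, soc2, daniel1, daniel2) go here; they are not reproduced in this listing.
}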

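These numbers aren't completely stable; they depend on the sample size and vary a bit from run to run. For example, for runs from 100k to 1M elements in steps of 100k, the timings for splitAt (in seconds) were as follows: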

splitAt: 0.109
splitAt: 0.118
splitAt: 0.129
splitAt: 0.139
splitAt: 0.157
splitAt: 0.166
splitAt: 0.749
splitAt: 0.752
splitAt: 1.444
splitAt: 1.127

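The initial solution is already pretty fast. splitAt is a modification from Daniel, often faster, but not always.

The measurements were done on a single-core 2 GHz Centrino running xUbuntu Linux, with Scala 2.8 and Sun Java 1.6 (desktop).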

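The two lessons for me are:

Always measure your performance improvements; it is very hard to estimate them if you don't do it on a daily basis.
Writing functional code is not only fun; sometimes the result is even faster.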
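A minimal sketch of the first lesson, assuming nothing beyond the standard library: run a function a few times so the JIT can warm up, then time one more run with System.nanoTime. The names TimingSketch, timeIt and warmups are hypothetical, and a harness like this only gives rough numbers; it is not a substitute for a proper benchmarking tool.

```scala
object TimingSketch {
  // Returns the result of f(input) together with the elapsed time in milliseconds.
  def timeIt[A, B](input: A, warmups: Int = 5)(f: A => B): (B, Double) = {
    // Warm-up runs: results are discarded, they only give the JIT a chance to compile f
    (1 to warmups).foreach(_ => f(input))
    val start = System.nanoTime()
    val res = f(input)
    val elapsedMs = (System.nanoTime() - start) / 1e6
    (res, elapsedMs)
  }

  def main(args: Array[String]): Unit = {
    val r = new scala.util.Random(42)
    val xs = List.fill(100000)(r.nextInt(100000))
    val (res, ms) = timeIt(xs) { l => val m = l.max; l.filterNot(_ == m) }
    println(s"max/filterNot: $ms ms, ${res.size} elements left")
  }
}
```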

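The biggest cause of inefficiency when you're writing a program is worrying about the wrong things. This is usually the wrong thing to worry about. Why?

Developer time is generally much more expensive than CPU time; in fact, there is usually a shortage of the former and a surplus of the latter.

Most code doesn't need to be very efficient, because it will never run on million-item datasets multiple times every second.

Most code does need to be bug-free, and less code means less room for bugs to hide.

Try this: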

// State: (elements kept so far, maximum seen so far)
(myList.foldLeft((List[Int](), None: Option[Int])) {
  case ((_, None),      x) => (List(),               Some(x))
  case ((Nil, Some(m)), x) => (List(math.min(x, m)), Some(math.max(x, m)))
  case ((l, Some(m)),   x) => (math.min(x, m) :: l,  Some(math.max(x, m)))
})._1

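Idiomatic, functional, and it traverses the list only once. It may be somewhat cryptic if you are not used to functional-programming idioms.

Let's try to explain what is happening here. I will try to keep it as simple as possible, at the cost of some rigor.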

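A fold is an operation on a List[A] (that is, a list containing elements of type A) that takes an initial state s0: S (that is, an instance of a type S) and a function f: (S, A) => S (that is, a function that takes the current state and an element from the list and gives the next state; in other words, it updates the state according to the next element).

The operation then iterates over the elements of the list, using each one to update the state according to the given function. In Java, it would be something like: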

interface Function<T, R> { R apply(T t); }
class Pair<A, B> { ... }
<A, State> State fold(List<A> list, State s0, Function<Pair<A, State>, State> f) {
  State s = s0;
  for (A a: list) {
    s = f.apply(new Pair<A, State>(a, s));
  }
  return s;
}
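The same loop can be written in Scala. This is only an illustrative sketch: myFold and FoldSketch are names invented here, and the standard library's foldLeft is what you would normally use. Summing a list then uses the partial sum as the state, starting from 0.

```scala
object FoldSketch {
  // Hand-written equivalent of list.foldLeft(s0)(f)
  def myFold[A, S](list: List[A], s0: S)(f: (S, A) => S): S = {
    var s = s0
    for (a <- list) s = f(s, a) // update the state with each element in turn
    s
  }

  // The state is the partial sum, initialised to 0
  def sum(xs: List[Int]): Int =
    myFold(xs, 0)((partialSum, element) => partialSum + element)
}
```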

Try to write a fold that multiplies the elements of a list, then another one that finds the extreme values (max, min).
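One possible answer to that exercise, as a sketch (the names FoldExercises, product, maxOf and minOf are made up here). For the extreme values the state starts from the first element, so those two functions assume a non-empty list.

```scala
object FoldExercises {
  // State: the partial product, starting from the neutral element 1
  def product(xs: List[Int]): Int = xs.foldLeft(1)(_ * _)

  // State: the largest (or smallest) element seen so far
  def maxOf(xs: List[Int]): Int =
    xs.tail.foldLeft(xs.head)((m, x) => if (x > m) x else m)

  def minOf(xs: List[Int]): Int =
    xs.tail.foldLeft(xs.head)((m, x) => if (x < m) x else m)
}
```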

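Now, the fold presented above is a little more complex, since the state is composed of the new list being created along with the maximum element found so far. The function that updates the state is more or less straightforward once you grasp these concepts. It simply puts the smaller of the current maximum and the current element into the new list, while the other value becomes the current maximum of the updated state.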
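To make the state updates visible, the sketch below pulls the update function out under the hypothetical name step and uses scanLeft, which works like foldLeft but returns every intermediate state. The cases are a condensed form of the fold above (its separate Nil case is covered by the general one); FoldTrace, State and trace are also names invented here.

```scala
object FoldTrace {
  // (elements kept so far, maximum seen so far)
  type State = (List[Int], Option[Int])

  def step(state: State, x: Int): State = state match {
    // First element: nothing to keep yet, x is the maximum so far
    case (_, None) => (Nil, Some(x))
    // Keep the smaller of x and the maximum, remember the larger
    case (l, Some(m)) => (math.min(x, m) :: l, Some(math.max(x, m)))
  }

  def trace(xs: List[Int]): List[State] = {
    val initial: State = (Nil, None)
    xs.scanLeft(initial)(step)
  }
}
```

For List(3, 1, 4) the states are (Nil, None), (Nil, Some(3)), (List(1), Some(3)) and (List(3, 1), Some(4)); the first component of the last state is the result.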

What is a bit ...

myList.fold(0)((partialSum, element) => partialSum + element)

// Given a list of Int
def removeMaxCool(xs: List[Int]): List[Int] = {

  // Find the index of the biggest Int
  val maxIndex = xs.indexOf(xs.max)

  // Then take the ints before and after it, and then concatenate them
  xs.take(maxIndex) ::: xs.drop(maxIndex + 1)
}

def removeMaxCool(xs: List[Int]): List[Int] = {
  // the result is the folding of the tail over the head 
  // and an empty list
  xs.tail.foldLeft(xs.head -> List[Int]()) {

    // Where the accumulated list is increased by the
    // lesser of the current element and the accumulated
    // element, and the accumulated element is the maximum between them
    case ((max, ys), x) => 
      if (x > max) (x, max :: ys)
      else (max, x :: ys)

  // and of which we return only the accumulated list
  }._2
}

def removeMax1( xs: List[Int] ) = {
  def rec( max: Int, rest: List[Int], result: List[Int]): List[Int] = {
    if( rest.isEmpty ) result
    else if( rest.head > max ) rec( rest.head, rest.tail, max :: result)
    else rec( max, rest.tail, rest.head :: result )
  }
  rec( xs.head, xs.tail, List() )
}

def removeMax2( xs: List[Int] ) = {
  val result = xs.tail.foldLeft( xs.head -> List[Int]() ) { 
    (acc,x) =>
      val (max,res) = acc
      if( x > max ) x -> ( max :: res )
      else max -> ( x :: res )
  }
  result._2
}

def removeMax3( xs: List[Int] ) = {
  val max = xs.max
  xs.filterNot( _ == max )
}

def splitAt(n: Int): (Repr, Repr) = {
  val l, r = newBuilder
  l.sizeHintBounded(n, this)
  if (n >= 0) r.sizeHint(this, -n)
  var i = 0
  for (x <- this) {
    (if (i < n) l else r) += x
    i += 1
  }
  (l.result, r.result)
}

def removeMax(xs: List[Int]) = {
  val buf = xs.toBuffer
  buf -= (buf.max)
}

def removeMax(xs: List[Int]) = {
  var max = xs.head
  for ( x <- xs.tail ) 
  yield {
    if (x > max) { val result = max; max = x; result}
    else x
  }
}

val max = xs.max
val (before, _ :: after) = xs.span(_ != max)
before ::: after

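  import scala.collection.mutable.ListBuffer

  // Copies only the elements in front of the maximum and shares the tail after it.
  // Assumes xs is non-empty.
  def shareTail(xs: List[Int]): List[Int] = {
    val res = ListBuffer[Int]()
    var maxTail = xs // sublist starting at the largest element seen so far
    var x = xs
    while (x != Nil) {
      if (x.head > maxTail.head) {
        // Move maxTail forward to the new maximum, copying what it skips
        while (maxTail.head != x.head) {
          res += maxTail.head
          maxTail = maxTail.tail
        }
      }
      x = x.tail
    }
    res.prependToList(maxTail.tail)
  }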

package code.array

object SliceArrays {
  def main(args: Array[String]): Unit = {
    println(removeMaxCool(Vector(1,2,3,100,12,23,44)))
  }
  def removeMaxCool(xs: Vector[Int]) = xs.filter(_ < xs.max)
}

package code.array {
  object SliceArrays extends Object {
    def main(args: Array[String]): Unit = scala.Predef.println(SliceArrays.this.removeMaxCool(scala.`package`.Vector().apply(scala.Predef.wrapIntArray(Array[Int]{1, 2, 3, 100, 12, 23, 44})).$asInstanceOf[scala.collection.immutable.Vector]()));
    def removeMaxCool(xs: scala.collection.immutable.Vector): scala.collection.immutable.Vector = xs.filter({
  ((x$1: Int) => SliceArrays.this.$anonfun$removeMaxCool$1(xs, x$1))
}).$asInstanceOf[scala.collection.immutable.Vector]();
    final <artifact> private[this] def $anonfun$removeMaxCool$1(xs$1: scala.collection.immutable.Vector, x$1: Int): Boolean = x$1.<(scala.Int.unbox(xs$1.max(scala.math.Ordering$Int)));
    def <init>(): code.array.SliceArrays.type = {
      SliceArrays.super.<init>();
      ()
    }
  }
}
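In the desugared output above, xs.max is evaluated inside the anonymous function passed to filter, so it is recomputed for every element. Here is a sketch that computes it once first. The names SliceArraysHoisted and removeMaxHoisted are hypothetical; like the original, it drops every occurrence of the maximum and assumes a non-empty Vector.

```scala
object SliceArraysHoisted {
  def removeMaxHoisted(xs: Vector[Int]): Vector[Int] = {
    val max = xs.max   // one traversal, done once
    xs.filter(_ < max) // second traversal; the closure only captures an Int
  }
}
```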