Scala: min/max of collections containing NaN (handling incomparability in orderings)
I just ran into a nasty bug caused by the following behavior:
scala> List(1.0, 2.0, 3.0, Double.NaN).min
res1: Double = NaN
scala> List(1.0, 2.0, 3.0, Double.NaN).max
res2: Double = NaN
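I understand that for a pairwise comparison it can sometimes be preferable to have max(NaN, 0) = NaN, and this is probably why java.lang.Double.compare follows this convention (there seems to be a standard rule for it). For a collection, however, I really think this is a strange convention. After all, the collection above does contain valid numbers, and these numbers have a clear maximum and minimum. In my opinion, the idea that the maximum number of a collection is "not a number" is a contradiction: NaN is not a number, so it cannot be the maximum or minimum "number" of a collection, unless there are no valid numbers at all, in which case it makes perfect sense for the maximum to be "not a number". Semantically, the min and max functions degenerate into a check of whether the collection contains a NaN. Since there are more appropriate ways to check for NaNs (e.g. collection.find(_.isNaN)), it would be much better to keep a semantically meaningful min/max on collections.
So my question is: what is the best way to get behavior that ignores NaNs? I see two possibilities: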
object NanAwareOrdering extends Ordering[Double] {
  def compare(x: Double, y: Double) = {
    if (x.isNaN) {
      +1 // x is NaN: report y < x without inspecting y
    } else if (y.isNaN) {
      -1 // y is NaN: report x < y without inspecting x
    } else {
      java.lang.Double.compare(x, y)
    }
  }
}
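This means that, depending on whether I want the minimum or the maximum, I would need two different orderings, which rules out using a single implicit val. So my question is: how can I define an ordering that handles both cases at once?
Note that with the default ordering the result even depends on where the NaN sits in the list:
scala> List(1.0, Double.NaN).min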
res1: Double = NaN
scala> List(Double.NaN, 1.0).min
res2: Double = 1.0
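For comparison, the most direct way to ignore NaNs is to skip them during the reduction itself. The following is only a minimal sketch of that idea: the object and method names are made up for illustration, and returning an Option for empty or all-NaN input is a design assumption rather than anything taken from the question.

```scala
object NanSkippingMinMax {
  // Single pass; NaN elements never enter the accumulator.
  // None means: the input was empty or contained only NaN.
  def minIgnoringNaN(xs: Iterable[Double]): Option[Double] =
    xs.foldLeft(Option.empty[Double]) {
      case (acc, x) if x.isNaN => acc
      case (None, x)           => Some(x)
      case (Some(m), x)        => Some(if (x < m) x else m)
    }

  def maxIgnoringNaN(xs: Iterable[Double]): Option[Double] =
    xs.foldLeft(Option.empty[Double]) {
      case (acc, x) if x.isNaN => acc
      case (None, x)           => Some(x)
      case (Some(m), x)        => Some(if (x > m) x else m)
    }
}
```

Because no Ordering is involved, the result does not depend on where the NaN appears in the input. The cost is that every call site has to use these helpers explicitly instead of plain min/max.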
How about bringing an implicit into scope that gives you new min/max methods on the list? Something like:
object NanAwareMinOrdering extends Ordering[Double] {
  def compare(x: Double, y: Double) = {
    if (x.isNaN) {
      +1 // x is NaN: report y < x without inspecting y
    } else if (y.isNaN) {
      -1 // y is NaN: report x < y without inspecting x
    } else {
      java.lang.Double.compare(x, y)
    }
  }
}
object NanAwareMaxOrdering extends Ordering[Double] {
  ....
}
implicit class MinMaxList(list: List[Double]) {
  def min2 = list.min(NanAwareMinOrdering)
  def max2 = list.max(NanAwareMaxOrdering)
}
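The answer leaves the body of NanAwareMaxOrdering elided. One possible reading, shown here purely as an assumption, is to mirror the min ordering so that NaN compares as the smallest value instead of the largest. The names NanLast and NanFirst are hypothetical, and the explicit "both NaN" case is an addition that the original min ordering does not have.

```scala
object NanAwareOrderingsSketch {
  // NaN compares above every number, so min never selects it.
  object NanLast extends Ordering[Double] {
    def compare(x: Double, y: Double): Int =
      if (x.isNaN && y.isNaN) 0
      else if (x.isNaN) 1
      else if (y.isNaN) -1
      else java.lang.Double.compare(x, y)
  }

  // Mirror image: NaN compares below every number, so max never selects it.
  object NanFirst extends Ordering[Double] {
    def compare(x: Double, y: Double): Int =
      if (x.isNaN && y.isNaN) 0
      else if (x.isNaN) -1
      else if (y.isNaN) 1
      else java.lang.Double.compare(x, y)
  }

  // Hypothetical extension methods that pick the matching ordering per operation.
  implicit class NanAwareListOps(xs: List[Double]) {
    def minIgnoringNaN: Double = xs.min(NanLast)
    def maxIgnoringNaN: Double = xs.max(NanFirst)
  }
}
```

Each ordering on its own is a consistent total order; the trick is only that min and max use different ones. A list that contains nothing but NaN still yields NaN, and an empty list still throws, as with the standard min/max.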
Given
val a = List(1.0, 2.0, 3.0, Double.NaN)
sort it:
a.sortWith {_ >_ }
res: List[Double] = List(3.0, 2.0, 1.0, NaN)
so the NaN value is relegated to the end; thus, for max:
a.sortWith {_ >_ }.head
res: Double = 3.0
Likewise:
a.sortWith {_ < _ }
res: List[Double] = List(1.0, 2.0, 3.0, NaN)
and so for min:
a.sortWith {_ < _ }.head
res: Double = 1.0
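This answer is only meant to explain the issue; @monkjack's answer probably provides the best practical solution.
Now, since Scala offers the possibility to pass such an ordering implicitly, isn't it a natural desire to pass an ordering that can handle "incomparability" according to our requirements?
Ordering in Scala represents only total orderings, i.e. orderings in which all elements are comparable. There is a PartialOrdering[T], but it has a couple of problems:
It is not actually used anywhere in the standard library.
If you try to implement max/maxBy/etc. taking a PartialOrdering, you quickly find that this is not possible in general, except in cases like Float/Double, where some elements are not comparable with anything while all other elements are comparable with each other (so you can decide to simply ignore the incomparable elements).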
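The special case mentioned for Float/Double can be sketched as follows. This is an illustration under stated assumptions, not library functionality: IeeePartialOrdering and partialMax are hypothetical names, and "incomparable with anything" is detected by checking whether an element is comparable with itself.

```scala
import scala.math.PartialOrdering

object PartialMaxSketch {
  // Hypothetical partial ordering: NaN is incomparable with everything, itself included.
  object IeeePartialOrdering extends PartialOrdering[Double] {
    def tryCompare(x: Double, y: Double): Option[Int] =
      if (x.isNaN || y.isNaN) None
      else Some(if (x < y) -1 else if (x > y) 1 else 0)

    // The primitive <= is already false whenever NaN is involved.
    def lteq(x: Double, y: Double): Boolean = x <= y
  }

  // Max that skips every element that is not even comparable with itself.
  // This only works because all remaining elements are mutually comparable.
  def partialMax(xs: Iterable[Double])(po: PartialOrdering[Double]): Option[Double] =
    xs.foldLeft(Option.empty[Double]) { (best, x) =>
      if (po.tryCompare(x, x).isEmpty) best
      else best match {
        case Some(b) if po.lteq(x, b) => best
        case _                        => Some(x)
      }
    }
}
```

For a general partial order this strategy breaks down, because two elements can each be comparable with themselves and still be incomparable with each other.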
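Disclaimer: I am adding my own answer in case anyone else is interested in more details on this issue.
Some theory...
This problem turned out to be more complex than I expected. As Alexey Romanov already pointed out, the notion of incomparability would require the max/min functions to take a partial order. Unfortunately, Alexey is also right that a general max/min function based on a partial order does not make sense: think of a partial order that defines relations only within certain groups, while the groups themselves are completely independent of each other (for example, the elements {a, b, c, d} with only the two relations a < b and c < d).
So, because partial orders are too general/complex, the min/max functions take an Ordering. Unfortunately, a total order does not allow for the notion of incomparability. Looking at the three defining properties of a total order, it is clear that "ignoring NaN" is formally impossible:
If a ≤ b and b ≤ a then a = b (antisymmetry)
If a ≤ b and b ≤ c then a ≤ c (transitivity)
a ≤ b or b ≤ a (totality)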
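The totality property is the one that IEEE comparisons on Double break as soon as NaN is involved. A small check makes this concrete; the object and method names are hypothetical and the primitive <= operator stands in for the order relation.

```scala
object TotalityCheck {
  // Totality demands that a <= b or b <= a holds for every pair.
  def totalityHolds(a: Double, b: Double): Boolean = a <= b || b <= a

  def main(args: Array[String]): Unit = {
    println(totalityHolds(1.0, 2.0))               // true
    println(totalityHolds(1.0, Double.NaN))        // false
    println(totalityHolds(Double.NaN, Double.NaN)) // false
  }
}
```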
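...and practice...
So when trying to implement an Ordering that delivers the desired min/max behavior, it is clear that we have to violate something (and bear the consequences). The implementation of min/max/minBy/maxBy in TraversableOnce follows this pattern (for min):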
reduceLeft((x, y) => if (cmp.lteq(x, y)) x else y)
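To see why this reduction pattern makes the result depend on the position of the NaN, here is a minimal sketch. It assumes the primitive <= on Double as a stand-in for cmp.lteq, which returns false for every comparison involving NaN; naiveMin is a hypothetical helper name.

```scala
object ReduceOrderDemo {
  // Same shape as the reduction above, with <= in place of cmp.lteq.
  def naiveMin(xs: List[Double]): Double =
    xs.reduceLeft((x, y) => if (x <= y) x else y)

  def main(args: Array[String]): Unit = {
    // 1.0 <= NaN is false, so the right operand (NaN) is kept.
    println(naiveMin(List(1.0, Double.NaN))) // NaN
    // NaN <= 1.0 is false, so the right operand (1.0) is kept.
    println(naiveMin(List(Double.NaN, 1.0))) // 1.0
  }
}
```

Whenever the comparison is false, the right operand survives, so a NaN only "wins" when it is on the right-hand side of a step.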
This suggests biasing the comparison so that:
x <comparison_operator> NaN is always true, to keep x in the reduction
NaN <comparison_operator> x is always false, to inject x into the reduction
object BiasedOrdering extends Ordering[Double] {
  // this is inconsistent, but the same goes for Double.Ordering
  def compare(x: Double, y: Double) = java.lang.Double.compare(x, y)

  // Shared logic: NaN on the left is always false, NaN on the right is always true,
  // bothNaN is the result when both are NaN, otherwise the test is applied to compare.
  private def biased(x: Double, y: Double, bothNaN: Boolean)(test: Int => Boolean): Boolean =
    if (x.isNaN && !y.isNaN) false
    else if (!x.isNaN && y.isNaN) true
    else if (x.isNaN && y.isNaN) bothNaN
    else test(compare(x, y))

  override def lteq(x: Double, y: Double): Boolean = biased(x, y, bothNaN = true)(_ <= 0)
  override def gteq(x: Double, y: Double): Boolean = biased(x, y, bothNaN = true)(_ >= 0)
  override def lt(x: Double, y: Double): Boolean = biased(x, y, bothNaN = false)(_ < 0)
  override def gt(x: Double, y: Double): Boolean = biased(x, y, bothNaN = false)(_ > 0)
  override def equiv(x: Double, y: Double): Boolean = biased(x, y, bothNaN = true)(_ == 0)
}
object OrderingDerivedFromCompare extends Ordering[Double] {
  def compare(x: Double, y: Double) = {
    java.lang.Double.compare(x, y)
  }
}
Ordering.Double 0.0 > NaN = false
Ordering.Double 0.0 >= NaN = false
Ordering.Double 0.0 == NaN = false
Ordering.Double 0.0 <= NaN = false
Ordering.Double 0.0 < NaN = false
OrderingDerivedFromCompare 0.0 > NaN = false
OrderingDerivedFromCompare 0.0 >= NaN = false
OrderingDerivedFromCompare 0.0 == NaN = false
OrderingDerivedFromCompare 0.0 <= NaN = true
OrderingDerivedFromCompare 0.0 < NaN = true
BiasedOrdering 0.0 > NaN = true
BiasedOrdering 0.0 >= NaN = true
BiasedOrdering 0.0 == NaN = true
BiasedOrdering 0.0 <= NaN = true
BiasedOrdering 0.0 < NaN = true
Ordering.Double NaN > 0.0 = false
Ordering.Double NaN >= 0.0 = false
Ordering.Double NaN == 0.0 = false
Ordering.Double NaN <= 0.0 = false
Ordering.Double NaN < 0.0 = false
OrderingDerivedFromCompare NaN > 0.0 = true
OrderingDerivedFromCompare NaN >= 0.0 = true
OrderingDerivedFromCompare NaN == 0.0 = false
OrderingDerivedFromCompare NaN <= 0.0 = false
OrderingDerivedFromCompare NaN < 0.0 = false
BiasedOrdering NaN > 0.0 = false
BiasedOrdering NaN >= 0.0 = false
BiasedOrdering NaN == 0.0 = false
BiasedOrdering NaN <= 0.0 = false
BiasedOrdering NaN < 0.0 = false
Ordering.Double NaN > NaN = false
Ordering.Double NaN >= NaN = false
Ordering.Double NaN == NaN = false
Ordering.Double NaN <= NaN = false
Ordering.Double NaN < NaN = false
OrderingDerivedFromCompare NaN > NaN = false
OrderingDerivedFromCompare NaN >= NaN = true
OrderingDerivedFromCompare NaN == NaN = true
OrderingDerivedFromCompare NaN <= NaN = true
OrderingDerivedFromCompare NaN < NaN = false
BiasedOrdering NaN > NaN = false
BiasedOrdering NaN >= NaN = true
BiasedOrdering NaN == NaN = true
BiasedOrdering NaN <= NaN = true
BiasedOrdering NaN < NaN = false
OrderingDerivedFromCompare List(1.0, 2.0, 3.0, Double.NaN).min = 1.0
OrderingDerivedFromCompare List(Double.NaN, 1.0, 2.0, 3.0).min = 1.0
OrderingDerivedFromCompare List(1.0, 2.0, 3.0, Double.NaN).max = NaN
OrderingDerivedFromCompare List(Double.NaN, 1.0, 2.0, 3.0).max = NaN
Ordering.Double List(1.0, 2.0, 3.0, Double.NaN).min = NaN
Ordering.Double List(Double.NaN, 1.0, 2.0, 3.0).min = 1.0
Ordering.Double List(1.0, 2.0, 3.0, Double.NaN).max = NaN
Ordering.Double List(Double.NaN, 1.0, 2.0, 3.0).max = 3.0
BiasedOrdering List(1.0, 2.0, 3.0, Double.NaN).min = 1.0
BiasedOrdering List(Double.NaN, 1.0, 2.0, 3.0).min = 1.0
BiasedOrdering List(1.0, 2.0, 3.0, Double.NaN).max = 3.0
BiasedOrdering List(Double.NaN, 1.0, 2.0, 3.0).max = 3.0
Ordering.Double.compare(0.0, Double.NaN) == -1 // indicating 0.0 < NaN
Ordering.Double.lt (0.0, Double.NaN) == false // contradiction
Ordering.Double List(1.0, 2.0, 3.0, Double.NaN).sorted = List(1.0, 2.0, 3.0, NaN)
OrderingDerivedFromCompare List(1.0, 2.0, 3.0, Double.NaN).sorted = List(1.0, 2.0, 3.0, NaN)
BiasedOrdering List(1.0, 2.0, 3.0, Double.NaN).sorted = List(1.0, 2.0, 3.0, NaN)
Ordering.Double List(Double.NaN, 1.0, 2.0, 3.0).sorted = List(1.0, 2.0, 3.0, NaN)
OrderingDerivedFromCompare List(Double.NaN, 1.0, 2.0, 3.0).sorted = List(1.0, 2.0, 3.0, NaN)
BiasedOrdering List(Double.NaN, 1.0, 2.0, 3.0).sorted = List(1.0, 2.0, 3.0, NaN)
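implicit class MinMaxNanAware(t: TraversableOnce[Double]) {
  // NaN is keyed as +Infinity (for min) or -Infinity (for max), so it never beats a finite number
  def nanAwareMin = t.minBy(x => if (x.isNaN) Double.PositiveInfinity else x)
  def nanAwareMax = t.maxBy(x => if (x.isNaN) Double.NegativeInfinity else x)
}
// and now we can simply use
val list = List(1.0, 2.0, 3.0, Double.NaN)
val goodMin = list.nanAwareMin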