Fast complex number arithmetic in Clojure

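I implemented some basic complex number arithmetic in Clojure and noticed that it is about 10 times slower than roughly equivalent Java code, even with type hints.

Compare: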

(defn plus [[^double x1 ^double y1] [^double x2 ^double y2]]
    [(+ x1 x2) (+ y1 y2)])

(defn times [[^double x1 ^double y1] [^double x2 ^double y2]]
    [(- (* x1 x2) (* y1 y2)) (+ (* x1 y2) (* y1 x2))])

(time (dorun (repeatedly 100000 #(plus [1 0] [0 1]))))
(time (dorun (repeatedly 100000 #(times [1 0] [0 1])))) 
Output:

"Elapsed time: 69.429796 msecs"
"Elapsed time: 72.232479 msecs"

with the output of the Java version:

Time taken: 6 millis
Time taken: 6 millis

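In fact, the type hints seem to make no difference: if I remove them, I get roughly the same results. What is really strange is that if I run the Clojure script without the REPL, the results are even slower: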

"Elapsed time: 137.337782 msecs"
"Elapsed time: 214.213993 msecs"
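So my questions are: how can I get close to the performance of the Java code? And why do the expressions take longer to evaluate when Clojure is run without the REPL?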

UPDATE ==============

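Great: using deftype with type hints in both the deftype and the defns, and using dotimes rather than repeatedly, gives performance as good as or better than the Java version. Thanks to both of you.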

(deftype complex [^double real ^double imag])

(defn plus [^complex z1 ^complex z2]
  (let [x1 (double (.real z1))
        y1 (double (.imag z1))
        x2 (double (.real z2))
        y2 (double (.imag z2))]
    (complex. (+ x1 x2) (+ y1 y2))))

(defn times [^complex z1 ^complex z2]
  (let [x1 (double (.real z1))
        y1 (double (.imag z1))
        x2 (double (.real z2))
        y2 (double (.imag z2))]
    (complex. (- (* x1 x2) (* y1 y2)) (+ (* x1 y2) (* y1 x2)))))

(println "Warm up")
(time (dorun (repeatedly 100000 #(plus (complex. 1 0) (complex. 0 1)))))
(time (dorun (repeatedly 100000 #(times (complex. 1 0) (complex. 0 1)))))
(time (dorun (repeatedly 100000 #(plus (complex. 1 0) (complex. 0 1)))))
(time (dorun (repeatedly 100000 #(times (complex. 1 0) (complex. 0 1)))))
(time (dorun (repeatedly 100000 #(plus (complex. 1 0) (complex. 0 1)))))
(time (dorun (repeatedly 100000 #(times (complex. 1 0) (complex. 0 1)))))

(println "Try with dorun")
(time (dorun (repeatedly 100000 #(plus (complex. 1 0) (complex. 0 1)))))
(time (dorun (repeatedly 100000 #(times (complex. 1 0) (complex. 0 1)))))

(println "Try with dotimes")
(time (dotimes [_ 100000]
        (plus (complex. 1 0) (complex. 0 1))))

(time (dotimes [_ 100000]
        (times (complex. 1 0) (complex. 0 1))))
Output:

Warm up
"Elapsed time: 92.805664 msecs"
"Elapsed time: 164.929421 msecs"
"Elapsed time: 23.799012 msecs"
"Elapsed time: 32.841624 msecs"
"Elapsed time: 20.886101 msecs"
"Elapsed time: 18.872783 msecs"
Try with dorun
"Elapsed time: 19.238403 msecs"
"Elapsed time: 17.856938 msecs"
Try with dotimes
"Elapsed time: 5.165658 msecs"
"Elapsed time: 5.209027 msecs"

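The likely reasons for the slow performance are:

  • Clojure vectors are inherently more heavyweight than Java double[] arrays, so creating and reading them carries a fair amount of extra overhead.
  • You are boxing the doubles, both when they are passed as arguments to your functions and when they are put into vectors. Boxing and unboxing are relatively expensive in this kind of low-level numerical code.
  • The type hints (^double) are not helping you: although you can use primitive type hints on normal Clojure function parameters, they do not work on vectors.
For more details, see this blog post.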
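The difference between a primitive-hinted parameter and a destructured vector can be sketched as follows. The names plus-prim and plus-vec are hypothetical and used only for illustration; this is a minimal sketch, assuming Clojure 1.3 or later.

```clojure
;; Sketch: where a ^double hint actually yields a primitive.
;; plus-prim and plus-vec are hypothetical names, not from the original posts.

;; Hints on plain (non-destructured) parameters are honored: the compiled
;; function implements a primitive interface and receives unboxed doubles.
(defn plus-prim ^double [^double a ^double b]
  (+ a b))

;; Destructuring a vector fetches each element with nth, which returns
;; Object, so the elements are stored and fetched as boxed values.
(defn plus-vec [[x1 y1] [x2 y2]]
  [(+ x1 x2) (+ y1 y2)])

(println (plus-prim 1.0 2.0))                         ; prints 3.0
(println (plus-vec [1.0 0.0] [0.0 1.0]))              ; prints [1.0 1.0]
(println (instance? clojure.lang.IFn$DDD plus-prim))  ; prints true (double, double -> double)
(println (class (first [1.0 0.0])))                   ; prints java.lang.Double (boxed)
```

The last line shows that a double read back out of a vector is a boxed java.lang.Double, which is the overhead the bullets above describe.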

If you really want fast complex numbers in Clojure, you will probably need to implement them using deftype, for example:

(deftype Complex [^double real ^double imag])
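Then define all your complex-number functions in terms of this type. That lets you use primitive arithmetic throughout, and should give performance roughly equivalent to well-written Java code.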
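As a rough illustration of defining functions on such a type, here is a minimal sketch. The helpers conjugate and abs-squared are hypothetical names chosen for this example, not part of the answer; the point is only that fields read from a hinted deftype instance are primitive doubles.

```clojure
;; Sketch only: conjugate and abs-squared are hypothetical helper names.
(deftype Complex [^double real ^double imag])

;; The ^Complex hint lets the compiler read the fields directly,
;; without reflection, as primitive doubles.
(defn conjugate [^Complex z]
  (Complex. (.real z) (- (.imag z))))

;; The ^double return hint keeps the result unboxed for callers.
(defn abs-squared ^double [^Complex z]
  (let [x (.real z)
        y (.imag z)]
    (+ (* x x) (* y y))))

(let [z (Complex. 3.0 4.0)]
  (println (abs-squared z))          ; prints 25.0
  (println (.imag (conjugate z))))   ; prints -4.0
```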

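  • I know little about benchmarking, but it seems you need to warm up the JVM before starting the test. When you run it in the REPL, the JVM is already warmed up; when you run it as a script, it is not.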

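  • In Java, you run all the loops inside a single method, and no methods other than plus and times are called. In Clojure, you create an anonymous function and use repeatedly to call it, which takes some time. You can replace it with dotimes.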

My attempt:

(println "Warm up")
(time (dorun (repeatedly 100000 #(plus [1 0] [0 1]))))
(time (dorun (repeatedly 100000 #(times [1 0] [0 1]))))
(time (dorun (repeatedly 100000 #(plus [1 0] [0 1]))))
(time (dorun (repeatedly 100000 #(times [1 0] [0 1]))))
(time (dorun (repeatedly 100000 #(plus [1 0] [0 1]))))
(time (dorun (repeatedly 100000 #(times [1 0] [0 1]))))

(println "Try with dorun")
(time (dorun (repeatedly 100000 #(plus [1 0] [0 1]))))
(time (dorun (repeatedly 100000 #(times [1 0] [0 1]))))

(println "Try with dotimes")
(time (dotimes [_ 100000]
        (plus [1 0] [0 1])))

(time (dotimes [_ 100000]
        (times [1 0] [0 1])))
Results:

Warm up
"Elapsed time: 367.569195 msecs"
"Elapsed time: 493.547628 msecs"
"Elapsed time: 116.832979 msecs"
"Elapsed time: 46.862176 msecs"
"Elapsed time: 27.805174 msecs"
"Elapsed time: 28.584179 msecs"
Try with dorun
"Elapsed time: 26.540489 msecs"
"Elapsed time: 27.64626 msecs"
Try with dotimes
"Elapsed time: 7.3792 msecs"
"Elapsed time: 5.940705 msecs"
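The warm-up and dotimes steps can be folded into one small helper. This is only a sketch: bench-ms is a hypothetical name, System/nanoTime is used instead of the time macro so that the result is returned as a number, and a single warm-up pass is an assumption rather than a rigorous benchmarking method.

```clojure
;; Hypothetical helper, not from the original posts: calls f n times as a
;; warm-up, then times n further calls with dotimes and returns milliseconds.
(defn bench-ms [f ^long n]
  (dotimes [_ n] (f))                     ; warm-up pass, gives the JIT a chance to compile f
  (let [start (System/nanoTime)]
    (dotimes [_ n] (f))                   ; timed pass, no lazy sequence involved
    (/ (- (System/nanoTime) start) 1e6))) ; nanoseconds -> milliseconds

;; The vector-based plus from the question, repeated so the sketch is self-contained.
(defn plus [[x1 y1] [x2 y2]]
  [(+ x1 x2) (+ y1 y2)])

(println "plus after warm-up:" (bench-ms #(plus [1 0] [0 1]) 100000) "msecs")
```

The printed figure will vary from machine to machine and from run to run.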

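Have you tried setting *warn-on-reflection* to see whether any reflection is sneaking in?

@DaoWen: No, I have never used that setting. I just ran the script again with (set! *warn-on-reflection* true) at the top, and no warnings were printed to stdout, so that means no reflection is being used, right? I just want to make sure I am using it correctly.

I think defrecord is the recommended choice for simple types like this.

@DaoWen - I may be wrong, but I believe you will get better performance from deftype: it has (slightly) less overhead than defrecord. defrecord implements full map-like behavior and is better suited to "business object data", whereas deftype is better suited to slightly lower-level data types.

Thanks, I did wonder about deftype/defrecord but thought they might introduce more overhead. I will try deftype (along with what is in the blog post) and report back.

@mikera - You are certainly right that deftype is lower-level, but I do not think using defrecord instead of deftype necessarily adds overhead. Implementing extra interfaces (e.g. IPersistentMap) does not hurt you if you never call their methods. Using deftype instead of defrecord prevents you from doing keyword lookups and destructuring on instances, which can be useful in the less performance-critical parts of your code.

^:static has nothing to do with type hints, and has not done anything since at least 1.2. As of 1.3 you can put primitive type hints on function arguments; however, the OP is not doing that, because his functions take vectors rather than primitives, so the doubles have to be boxed to fit. Other than that, I agree with your eventual suggestion to use deftype.

Thanks, that makes sense.

I got similar results.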
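The deftype versus defrecord trade-off raised in the comments can be sketched like this. ComplexT and ComplexR are hypothetical names for illustration; the sketch shows only the difference in map-like behavior, not any difference in speed.

```clojure
;; ComplexT and ComplexR are hypothetical names, not from the original posts.
(deftype ComplexT [^double real ^double imag])
(defrecord ComplexR [^double real ^double imag])

(let [t (ComplexT. 1.0 2.0)
      r (ComplexR. 1.0 2.0)]
  (println (.real t))            ; prints 1.0, direct field access on the deftype
  (println (.real r))            ; prints 1.0, direct field access on the record
  (println (:imag r))            ; prints 2.0, keyword lookup works on records
  (println (:imag t))            ; prints nil, a plain deftype is not map-like
  (let [{:keys [real imag]} r]   ; map destructuring also works on records
    (println real imag)))        ; prints 1.0 2.0
```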