Fast complex number arithmetic in Clojure
I implemented some basic complex number arithmetic in Clojure and noticed that it is about 10 times slower than roughly equivalent Java code, even with type hints. Compare:
(defn plus [[^double x1 ^double y1] [^double x2 ^double y2]]
[(+ x1 x2) (+ y1 y2)])
(defn times [[^double x1 ^double y1] [^double x2 ^double y2]]
[(- (* x1 x2) (* y1 y2)) (+ (* x1 y2) (* y1 x2))])
(time (dorun (repeatedly 100000 #(plus [1 0] [0 1]))))
(time (dorun (repeatedly 100000 #(times [1 0] [0 1]))))
Output:
"Elapsed time: 69.429796 msecs"
"Elapsed time: 72.232479 msecs"
versus the output of the Java version:
Time taken: 6 millis
Time taken: 6 millis
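In fact, the type hints don't seem to make any difference: if I remove them, I get roughly the same results. What's really strange is that if I run the Clojure script without a REPL, the results get slower: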
"Elapsed time: 137.337782 msecs"
"Elapsed time: 214.213993 msecs"
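So my questions are: how can I get close to the performance of the Java code? And why do the expressions take longer to evaluate when Clojure is run without a REPL?
UPDATE ==============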
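Great: using deftype with type hints in both the deftype and the defns, and using dotimes rather than repeatedly, gives performance as good as or better than the Java version. Thanks to both of you.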
(deftype complex [^double real ^double imag])
(defn plus [^complex z1 ^complex z2]
(let [x1 (double (.real z1))
y1 (double (.imag z1))
x2 (double (.real z2))
y2 (double (.imag z2))]
(complex. (+ x1 x2) (+ y1 y2))))
(defn times [^complex z1 ^complex z2]
(let [x1 (double (.real z1))
y1 (double (.imag z1))
x2 (double (.real z2))
y2 (double (.imag z2))]
(complex. (- (* x1 x2) (* y1 y2)) (+ (* x1 y2) (* y1 x2)))))
(println "Warm up")
(time (dorun (repeatedly 100000 #(plus (complex. 1 0) (complex. 0 1)))))
(time (dorun (repeatedly 100000 #(times (complex. 1 0) (complex. 0 1)))))
(time (dorun (repeatedly 100000 #(plus (complex. 1 0) (complex. 0 1)))))
(time (dorun (repeatedly 100000 #(times (complex. 1 0) (complex. 0 1)))))
(time (dorun (repeatedly 100000 #(plus (complex. 1 0) (complex. 0 1)))))
(time (dorun (repeatedly 100000 #(times (complex. 1 0) (complex. 0 1)))))
(println "Try with dorun")
(time (dorun (repeatedly 100000 #(plus (complex. 1 0) (complex. 0 1)))))
(time (dorun (repeatedly 100000 #(times (complex. 1 0) (complex. 0 1)))))
(println "Try with dotimes")
(time (dotimes [_ 100000]
(plus (complex. 1 0) (complex. 0 1))))
(time (dotimes [_ 100000]
(times (complex. 1 0) (complex. 0 1))))
Output:
Warm up
"Elapsed time: 92.805664 msecs"
"Elapsed time: 164.929421 msecs"
"Elapsed time: 23.799012 msecs"
"Elapsed time: 32.841624 msecs"
"Elapsed time: 20.886101 msecs"
"Elapsed time: 18.872783 msecs"
Try with dorun
"Elapsed time: 19.238403 msecs"
"Elapsed time: 17.856938 msecs"
Try with dotimes
"Elapsed time: 5.165658 msecs"
"Elapsed time: 5.209027 msecs"
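The likely reasons for your slow performance are:
- Clojure vectors are inherently more heavyweight than Java double[] arrays, so creating and reading vectors carries a fair amount of extra overhead.
- You are boxing doubles, both when they are passed as arguments to your functions and when they are put into vectors. Boxing and unboxing are relatively expensive in this kind of low-level numerical code.
- The type hints (^double) are not helping you: while you can use primitive type hints on normal Clojure functions, they have no effect on the elements of a vector.
If you want fast complex numbers, implement them with deftype, for example: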
(deftype Complex [^double real ^double imag])
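Then define all your complex functions using this type. This will let you use primitive arithmetic throughout, and should perform roughly on par with well-written Java code.
- I don't know much about benchmarking, but it seems you need to warm up the JVM before you start the test. When you run it in the REPL, the JVM is already warmed up; when you run it as a script, it isn't yet.
- In Java, you run all the loops in one method, and no methods other than plus and times are called. In Clojure, you create an anonymous function and call it repeatedly, which takes some time. You can replace that with dotimes.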
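The two points above (warm up the JVM first, then loop with dotimes instead of walking a lazy sequence of anonymous-function calls) can be folded into a small timing helper. This is only a minimal sketch: bench is a hypothetical name, the iteration counts are arbitrary, and the JIT may still optimize away work whose result is discarded, so treat the numbers as rough.

```clojure
;; Hypothetical helper (not from the original posts): call f untimed for a
;; while so the JIT has a chance to compile it, then time a second batch.
(defn bench
  "Calls f `warmup` times untimed, then `n` times timed.
  Returns the elapsed time of the timed batch in milliseconds."
  [f warmup n]
  (dotimes [_ warmup] (f))
  (let [start (System/nanoTime)]
    (dotimes [_ n] (f))
    (/ (- (System/nanoTime) start) 1e6)))

;; Vector-based plus as in the question, used here only as a workload.
(defn plus [[x1 y1] [x2 y2]]
  [(+ x1 x2) (+ y1 y2)])

(println "plus:" (bench #(plus [1 0] [0 1]) 300000 100000) "ms")
```

The script below does the same thing by hand: it repeats the timed expressions until the numbers settle, then compares dorun/repeatedly against dotimes.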
(println "Warm up")
(time (dorun (repeatedly 100000 #(plus [1 0] [0 1]))))
(time (dorun (repeatedly 100000 #(times [1 0] [0 1]))))
(time (dorun (repeatedly 100000 #(plus [1 0] [0 1]))))
(time (dorun (repeatedly 100000 #(times [1 0] [0 1]))))
(time (dorun (repeatedly 100000 #(plus [1 0] [0 1]))))
(time (dorun (repeatedly 100000 #(times [1 0] [0 1]))))
(println "Try with dorun")
(time (dorun (repeatedly 100000 #(plus [1 0] [0 1]))))
(time (dorun (repeatedly 100000 #(times [1 0] [0 1]))))
(println "Try with dotimes")
(time (dotimes [_ 100000]
(plus [1 0] [0 1])))
(time (dotimes [_ 100000]
(times [1 0] [0 1])))
Results:
Warm up
"Elapsed time: 367.569195 msecs"
"Elapsed time: 493.547628 msecs"
"Elapsed time: 116.832979 msecs"
"Elapsed time: 46.862176 msecs"
"Elapsed time: 27.805174 msecs"
"Elapsed time: 28.584179 msecs"
Try with dorun
"Elapsed time: 26.540489 msecs"
"Elapsed time: 27.64626 msecs"
Try with dotimes
"Elapsed time: 7.3792 msecs"
"Elapsed time: 5.940705 msecs"
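Have you tried turning on reflection warnings to see whether any reflection is sneaking in?
@DaoWen: No, I had never used that setting. I just ran the script again with (set! *warn-on-reflection* true) at the top, and no warnings were printed to stdout, so that means no reflection is being used, right?
Just wanted to make sure I'm using it correctly. I think it's recommended for simple types like this.
@DaoWen - I may be wrong, but I believe you'll get better performance from deftype: it has (slightly) less overhead than defrecord. defrecord implements full map-like behavior and is better suited to "business object" data, while deftype is better suited to somewhat lower-level data types.
Thanks, I had wondered about deftype/defrecord but thought they might introduce more overhead. I'll give deftype (and the material in the blog post) a try and report back.
@mikera - You're definitely right that deftype is lower level, but I don't think using defrecord instead of deftype necessarily incurs extra overhead. Implementing the extra interfaces (e.g. IPersistentMap) won't hurt you if you never call any of their methods. Using deftype instead of defrecord prevents you from doing keyword lookup and destructuring on instances, which can be useful in the less performance-critical parts of your code.
^:static has nothing to do with type hints, and hasn't since at least 1.2. Since 1.3 you can use primitive type hints on function arguments; however, the OP isn't doing that, because he accepts vectors rather than primitives, so the doubles have to be boxed to fit. Other than that, I agree with your final suggestion to use deftype.
Thanks, that makes sense. I got similar results.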
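To illustrate the distinction drawn in the comments, here is a minimal sketch of primitive hints on direct function arguments (Clojure 1.3+), combined with a hinted deftype and reflection warnings switched on. The names norm-sq, abs-sq and Cplx are hypothetical and chosen only for this example; they do not appear in the original posts.

```clojure
;; Ask the compiler to report any call it has to resolve via reflection.
(set! *warn-on-reflection* true)

;; Primitive hints on the arguments and on the return value. The doubles
;; can be passed unboxed because they are direct arguments, not elements
;; of a vector.
(defn norm-sq ^double [^double re ^double im]
  (+ (* re re) (* im im)))

;; Hypothetical type, named Cplx to avoid clashing with the types above.
(deftype Cplx [^double re ^double im])

;; The ^Cplx hint lets (.re z) and (.im z) compile to direct field reads.
;; Without the hint, the compiler would print a reflection warning here.
(defn abs-sq ^double [^Cplx z]
  (norm-sq (.re z) (.im z)))

(println (abs-sq (Cplx. 3.0 4.0)))   ; prints 25.0
```

By contrast, hints placed inside a destructuring form such as [[^double x1 ^double y1] ...] cannot avoid boxing, because the values are pulled out of a vector, which stores objects.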