Haskell: profiling a trivial loop


I'm trying to understand how to use profiling. Below is a solution to the USACO 2013 "Line of Sight" problem.

{-# LANGUAGE BangPatterns #-}
import Data.Array.Unboxed
import Data.List
import Data.Int

angle !a | a > 2 * pi = a - 2 * pi
angle !a | a < 0      = a + 2 * pi
angle !a              = a

tans :: Int64 -> [[Int64]] -> UArray (Int,Int) Double
tans r cs = listArray ((0,0), (length cs - 1, 1)) $ concatMap f cs where
  f :: [Int64] -> [Double]
  f [x,y] = [angle a2, angle a1] where
    phi | y == 0    = if x < 0 then pi else 0.0
        | otherwise = (fromIntegral $ signum y) * (acos $ (fromIntegral x) / d)
    d = sqrt $ fromIntegral $ x*x + y*y
    z = sqrt $ fromIntegral $ x*x + y*y - r*r
    a1 = phi + (acos $ (fromIntegral r)/d)
    a2 = phi - (acos $ (fromIntegral r)/d)

overlap !a1 !a2 !a1' !a2'
   | a1 < a2 && a1' < a2' = a1 <= a2' && a1' <= a2
   | a1 > a2 && a1' > a2' = overlap (a1 - 2*pi) a2 (a1' - 2*pi) a2'
   | a1 > a2 && a1' <= pi = overlap (a1 - 2*pi) a2 a1'          a2'
   | a1 > a2              = overlap a1 (a2 + 2*pi) a1'          a2'
   | a1 <= pi             = overlap a1          a2 (a1' - 2*pi) a2'
   | otherwise            = overlap a1          a2 a1'          (a2' + 2 * pi)

solve cows = length $ [ 1
                      | i <- [0..n]
                      , j <- [i+1..n]
                      , let a1 = cows ! (i,0)
                      , let a2 = cows ! (i,1)
                      , let a1' = cows ! (j,0)
                      , let a2' = cows ! (j,1)
                      , overlap a1 a2 a1' a2' ] where
  ((0,0),(n,1)) = bounds cows

main = do
         ls <- getContents
         let ([n, r]: cows ) = map (map read . words) $ lines ls
         print $ solve $ tans r cows

GHC is not able to fuse the length computation with the construction of the list, i.e. it allocates list cells.

If solve is rewritten as an explicit loop, the allocation disappears:

solve cows = n `seq` go 0 0 1 n
  where
    (_,(n,_)) = bounds cows
    go count i j n | i > n = count
                   | j > n = go count (i+1) (i+2) n
                   | overlap (cows ! (i,0)) (cows ! (i,1)) (cows ! (j,0)) (cows ! (j,1))
                       = go (count + 1) i (j + 1) n
                   | otherwise = go count i (j + 1) n

As for why allocation is attributed to a1' and a2', I don't know.

CPU usage is dominated by the go function, which probably means the array accesses. overlap accounts for only about 15% of the total run time.
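
One way to check the guess that the array reads, rather than the overlap test, dominate is to give each its own cost centre. The following is a minimal sketch, not the code from the question: intervalsOverlap is a simplified stand-in for overlap (no wrap-around handling), and readAt, countOverlaps and the cost-centre names are hypothetical. Assuming the file is called Main.hs, it could be built with `ghc -O2 -prof -rtsopts Main.hs` and run with `./Main +RTS -p`, which writes the report to Main.prof.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.Array.Unboxed (UArray, bounds, listArray, (!))

-- Simplified stand-in for the question's overlap: closed intervals,
-- no handling of angles that wrap around 2*pi.
intervalsOverlap :: Double -> Double -> Double -> Double -> Bool
intervalsOverlap a1 a2 b1 b2 = a1 <= b2 && b1 <= a2

-- Counts overlapping pairs (i, j) with i < j, charging array reads and
-- the overlap test to two separate manual cost centres.
countOverlaps :: UArray (Int, Int) Double -> Int
countOverlaps arr = go 0 0 1
  where
    (_, (n, _)) = bounds arr

    -- Every array read goes through this helper, so all reads are
    -- reported under the single cost centre "arrayRead".
    readAt :: Int -> Int -> Double
    readAt i k = {-# SCC "arrayRead" #-} (arr ! (i, k))

    go :: Int -> Int -> Int -> Int
    go !count !i !j
      | i > n = count
      | j > n = go count (i + 1) (i + 2)
      | otherwise =
          let !a1 = readAt i 0
              !a2 = readAt i 1
              !b1 = readAt j 0
              !b2 = readAt j 1
              hit = {-# SCC "overlapTest" #-} intervalsOverlap a1 a2 b1 b2
          in go (if hit then count + 1 else count) i (j + 1)

main :: IO ()
main = do
  -- Three intervals: [0,1], [0.5,2], [3,4]
  let sample = listArray ((0, 0), (2, 1)) [0, 1, 0.5, 2, 3, 4]
  print (countOverlaps sample)
```

Cost centres can get in the way of optimisations such as inlining, so the percentages in the report are best read as a rough guide and confirmed by timing an unprofiled build.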

Edit: here is a (less readable) version in which two of the array accesses are moved out of the inner loop:

solve !cows = n `seq` go 0 0
 where
  (_,(n,_)) = bounds cows
  go !count !i | i >= n = count
               | otherwise = go2 count i (i+1) (cows ! (i,0)) (cows ! (i,1))
  go2 !count !i !j !a1 !a2 | j > n = go count (i+1)
                           | overlap a1 a2 (cows ! (j,0)) (cows ! (j,1))
                                  = go2 (count+1) i (j+1) a1 a2
                           | otherwise = go2 count i (j+1) a1 a2

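
I have some problems with this. 1. Array accesses cost 4 times as much as a series of tests? 2. If that is all, why does it run for 3 seconds? (For 6.in it takes over 15 seconds, and that is only a few million array accesses and comparisons.) 3. Allocation is still attributed to a1, a2, a1' and a2' (I kept them as bindings instead of passing the arguments directly). 4. Rewriting with "go" did not speed up the program.

You did not ask how to speed up the program, only where the allocation and the time go. Although the data in 5.in should fit in the L2 cache, accessing it still takes a significant amount of time compared with a few branches (assuming a high branch prediction rate).

Note that my version is actually slower than the original, because the array accesses with index i were not moved out of the inner j loop. I have added a faster version. Also, compile floating-point code with -fllvm.

A C version is only about 3 times faster. To get the Haskell version any faster you would need low-level code tuning.

By the way, profiling the C version shows 80% of the time in overlap; but I would like the proof of "where the problem is" to show up as a change in execution speed :) OK, I will try -fllvm.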
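
As an illustration of the kind of low-level tuning mentioned above, the bounds-checked (!) can be swapped for unsafeAt from Data.Array.Base, which indexes by linear offset and does no bounds check. This is a sketch under assumptions rather than a measured improvement: it assumes the array bounds are exactly ((0,0),(n,1)), so element (i,k) sits at offset 2*i + k, and intervalsOverlap is again a simplified stand-in for overlap. The names countOverlapsUnchecked, at and inner are hypothetical.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.Array.Base (unsafeAt)
import Data.Array.Unboxed (UArray, bounds, listArray)

-- Simplified stand-in for the question's overlap (no wrap-around).
intervalsOverlap :: Double -> Double -> Double -> Double -> Bool
intervalsOverlap a1 a2 b1 b2 = a1 <= b2 && b1 <= a2

-- Same loop shape as the faster version above, with unchecked reads.
countOverlapsUnchecked :: UArray (Int, Int) Double -> Int
countOverlapsUnchecked arr = go 0 0
  where
    (_, (n, _)) = bounds arr

    -- ASSUMPTION: bounds are ((0,0),(n,1)), so (i,k) maps to 2*i + k.
    -- unsafeAt performs no bounds check; a wrong offset is undefined behaviour.
    at :: Int -> Int -> Double
    at i k = arr `unsafeAt` (2 * i + k)

    go :: Int -> Int -> Int
    go !count !i
      | i >= n    = count
      | otherwise = inner count (i + 1) (at i 0) (at i 1)
      where
        -- Inner j loop; the two reads for row i are hoisted into a1 and a2.
        inner :: Int -> Int -> Double -> Double -> Int
        inner !c !j !a1 !a2
          | j > n = go c (i + 1)
          | intervalsOverlap a1 a2 (at j 0) (at j 1) = inner (c + 1) (j + 1) a1 a2
          | otherwise = inner c (j + 1) a1 a2

main :: IO ()
main = do
  -- Three intervals: [0,1], [0.5,2], [3,4]
  let sample = listArray ((0, 0), (2, 1)) [0, 1, 0.5, 2, 3, 4]
  print (countOverlapsUnchecked sample)
```

Whether this helps at all depends on how much of the cost really is bounds checking and index arithmetic, so it is worth timing both variants on 5.in and 6.in, with and without -fllvm.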