Why is building a Haskell String from Data.Text so slow?


So I had a Location data type:

data Location = Location {
    title :: String
  , description :: String
  }

instance Show Location where
  show l = title l ++ "\n"
         ++ replicate (length $ title l) '-' ++ "\n"
         ++ description l

Then I changed it to use Data.Text:

{-# LANGUAGE OverloadedStrings #-}
import           Data.Text (Text)
import qualified Data.Text as T

data Location = Location {
    title :: Text
  , description :: Text
  }

instance Show Location where
  show l = T.unpack $
    title l <> "\n"
    <> T.replicate (T.length $ title l) "-" <> "\n"
    <> description l

The String implementation clocks in at 34ns; the Data.Text implementation is about five times slower, at 170ns.

How do I get Data.Text to be as fast as String?

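Edit: silly mistakes

I'm not sure how this happened, but I cannot reproduce the original speed difference: I now get 28ns for String and 24ns for Text.

For the more aggressive bench "length.show" (whnf (length . show) l) benchmark, I get 467ns for String and 3954ns for Text.

If I use a very basic lazy builder, without the replicated dashes: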

import qualified Data.Text.Lazy.Builder as Bldr

instance Show Location where
  show l = show $
    Bldr.fromText (title l) <> Bldr.singleton '\n'
  --  <> Bldr.fromText (T.replicate (T.length $ title l) "-") <> Bldr.singleton '\n'
    <> Bldr.fromText (description l)


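With the original, plain show benchmark, this gives me 19ns. But this is a bug: using show to convert a builder to a String escapes the newlines. If I replace it with LT.unpack $ Bldr.toLazyText, where LT is a qualified import of Data.Text.Lazy, then I get 192ns.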
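The difference between the two conversions can be seen in a small sketch. It assumes the text package is available; the `render` and `escaped` helpers are illustrative names that do not appear in the original post, and the dashes line is put back in.

```haskell
import           Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Lazy as LT
import qualified Data.Text.Lazy.Builder as Bldr

data Location = Location
  { title       :: Text
  , description :: Text
  }

-- Hypothetical helper: assemble all pieces as a single Builder.
render :: Location -> Bldr.Builder
render l =
     Bldr.fromText (title l) <> Bldr.singleton '\n'
  <> Bldr.fromText (T.replicate (T.length (title l)) (T.singleton '-'))
  <> Bldr.singleton '\n'
  <> Bldr.fromText (description l)

-- Builder's own Show instance quotes and escapes its contents,
-- so a newline shows up as the two characters '\\' and 'n'.
escaped :: Location -> String
escaped = show . render

-- Running the builder and unpacking the lazy Text keeps real newlines.
instance Show Location where
  show = LT.unpack . Bldr.toLazyText . render
```

For a Location with title "Home", `show` yields three lines of plain text, while `escaped` yields a single quoted line. The 19ns figure came from the escaping variant, so it is not comparable with the others.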

I'm testing on a Mac laptop, and I suspect my timings are badly distorted by machine noise. Thanks for the guidance.

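You can't make it as fast, but you can speed it up some.

Appending

Text is represented as an array. This makes <> rather slow, because a new array has to be allocated and each Text copied into it. You can fix this by converting each piece to a String first and then concatenating them. I imagine Text probably also offers an efficient way to concatenate multiple texts at once (as mentioned, you can use a lazy builder), but for this purpose that will be slower. Another good option might be the lazy version of Text, which probably supports efficient concatenation.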
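A minimal sketch of the two suggestions above, assuming the text package. The function names `showViaString` and `showViaConcat` are made up for illustration, and no timings are claimed for either; they would need to be benchmarked.

```haskell
import           Data.Text (Text)
import qualified Data.Text as T

data Location = Location
  { title       :: Text
  , description :: Text
  }

-- Option 1: unpack each piece to a String, then join with lazy (++).
-- No intermediate Text arrays are allocated for the joins.
showViaString :: Location -> String
showViaString l =
  T.unpack (title l) ++ "\n"
    ++ replicate (T.length (title l)) '-' ++ "\n"
    ++ T.unpack (description l)

-- Option 2: concatenate all pieces at once with T.concat, which builds
-- one result array instead of one per (<>), then unpack a single time.
showViaConcat :: Location -> String
showViaConcat l = T.unpack (T.concat
  [ title l
  , T.singleton '\n'
  , T.replicate (T.length (title l)) (T.singleton '-')
  , T.singleton '\n'
  , description l
  ])
```

Both functions produce the same String as the original instance, so either could serve as the body of show.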
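Sharing

In the String-based implementation, the description field does not have to be copied at all. It is simply shared between the Location and the result of showing that Location. There is no way to accomplish this with the Text version.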
In the String case you are not fully evaluating all of the string operations: (++) and replicate.

If you change your benchmark to:
benchmarks = [ bench "show" (whnf (length.show) l) ]

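you will see that the String case takes around 520 ns, approximately 10 times longer.

One comment quoted "Efficient construction of lazy Text values" (the description of the Data.Text.Lazy.Builder module).

Doing this, I found that the String version takes 700 ns and the Text version 2600 ns, so the problem is clearly in the implementation of show rather than in evaluation, although calling length clearly helps ensure that everything is fully evaluated.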
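The point about partial evaluation can be illustrated without a benchmark harness. The sketch below uses the String-based Location from the question; the `partial` value and the helper names are hypothetical and exist only to show that forcing the result to weak head normal form never touches most of the string.

```haskell
data Location = Location
  { title       :: String
  , description :: String
  }

instance Show Location where
  show l = title l ++ "\n"
        ++ replicate (length (title l)) '-' ++ "\n"
        ++ description l

-- Hypothetical value whose description blows up if it is ever evaluated.
partial :: Location
partial = Location
  { title       = "Home"
  , description = error "description was forced"
  }

-- Weak head normal form only needs the first (:) cell of the result,
-- so the dashes and the description are never computed here.
firstChar :: Char
firstChar = case show partial of
  (c : _) -> c
  []      -> ' '

-- length walks every cell, so (++) and replicate are fully evaluated.
fullLength :: Location -> Int
fullLength = length . show
```

Evaluating `firstChar` succeeds even though the description is an error, which is what a whnf benchmark of show alone measures. `fullLength partial` would raise the error, because length has to reach the description.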