Haskell: replacing newlines in a ByteString
I want a function that takes a ByteString and replaces the newlines \n and \n\r with commas, but I can't think of a good way to do it.
import qualified Data.ByteString as BS
import Data.Char (ord)
import Data.Word (Word8)
endlWord8 = fromIntegral $ ord '\n' :: Word8
replace :: BS.ByteString -> BS.ByteString
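I thought about using BS.map, but since I can't pattern match on Word8 I don't see how to use it. Another option would be BS.split followed by joining the pieces with a Word8 comma, but that sounds slow and inelegant. Any ideas?

Use Data.ByteString.Char8 to get rid of the annoying conversions between Word8 and Char. Performance should not change because of that. Also use B.span rather than B.split, since you want to replace the \n\r combination as well, not just \n.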
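For comparison, the split-and-join idea from the question can be made to cope with \n\r as well. The following is only a rough sketch under my own assumptions, not the approach recommended in this answer: replaceNewlinesSplit and dropCR are hypothetical names, and the code treats a '\r' that directly follows a '\n' as part of the same line break.

```haskell
import qualified Data.ByteString.Char8 as B

-- Hypothetical helper: drop one leading '\r', if there is one.
dropCR :: B.ByteString -> B.ByteString
dropCR piece = case B.uncons piece of
  Just ('\r', rest) -> rest
  _                 -> piece

-- Split on '\n'. Every piece after the first used to follow a '\n',
-- so a leading '\r' on it is the second half of a "\n\r" pair and is
-- removed. The pieces are then joined with a single comma.
replaceNewlinesSplit :: B.ByteString -> B.ByteString
replaceNewlinesSplit s = case B.split '\n' s of
  []       -> B.empty
  (p : ps) -> B.intercalate (B.singleton ',') (p : map dropCR ps)
```

With this sketch, the input "a\nb\n\rc" becomes "a,b,c". Whether it is actually slower than a B.span loop would have to be measured.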
My own (possibly clumsy) attempt at doing it this way:
module Test where

import Data.Monoid ((<>))
import Data.ByteString.Char8 (ByteString)
import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Builder as Build
import qualified Data.ByteString.Lazy as LB

-- Consume a leading "\n" or "\n\r" and report the comma that replaces it.
eatNewline :: ByteString -> (Maybe Char, ByteString)
eatNewline string
  | B.null string = (Nothing, string)
  | B.head string == '\n' && B.null (B.tail string) = (Just ',', B.empty)
  | B.head string == '\n' && B.head (B.tail string) /= '\r' = (Just ',', B.drop 1 string)
  | B.head string == '\n' && B.head (B.tail string) == '\r' = (Just ',', B.drop 2 string)
  | otherwise = (Nothing, string)

replaceNewlines :: ByteString -> ByteString
replaceNewlines = LB.toStrict . Build.toLazyByteString . go mempty
  where
    -- Copy everything up to the next '\n', then replace the line break.
    go :: Build.Builder -> ByteString -> Build.Builder
    go builder string =
      let (chunk, rest) = B.span (/= '\n') string
          (c, rest1) = eatNewline rest
          maybeComma = maybe mempty Build.char8 c
      in if B.null rest1
           then builder <> Build.byteString chunk <> maybeComma
           else go (builder <> Build.byteString chunk <> maybeComma) rest1
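The go loop above threads a Builder accumulator and appends to it on every step. As far as I know, appending Builders is just function composition, so that should be cheap, but a variant without an accumulator avoids the question altogether. The following is only a sketch of that idea under the same newline rules. The names replaceNewlinesNoAcc, eat and dropCR are hypothetical and not part of any library.

```haskell
import qualified Data.ByteString.Builder as Build
import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Lazy as LB

-- Hypothetical variant: no accumulator, the Builder is produced
-- front to back by plain recursion.
replaceNewlinesNoAcc :: B.ByteString -> B.ByteString
replaceNewlinesNoAcc = LB.toStrict . Build.toLazyByteString . go
  where
    -- Emit everything up to the next '\n', then handle the line break.
    go :: B.ByteString -> Build.Builder
    go s =
      let (chunk, rest) = B.span (/= '\n') s
      in Build.byteString chunk <> eat rest

    -- 'rest' is either empty or starts with '\n' (guaranteed by B.span).
    eat :: B.ByteString -> Build.Builder
    eat rest = case B.uncons rest of
      Nothing           -> mempty
      Just (_, afterLF) -> Build.char8 ',' <> go (dropCR afterLF)

    -- Drop the '\r' of a "\n\r" pair, if present.
    dropCR :: B.ByteString -> B.ByteString
    dropCR t = case B.uncons t of
      Just ('\r', t') -> t'
      _               -> t
```

B.uncons is used instead of B.head and B.tail so that no partial function is ever called on an empty string.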
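Hopefully mappend for Data.ByteString.Builder is not linear in the number of times mappend was used to build one of its operands; otherwise the go loop in replaceNewlines would be a quadratic algorithm.

Use Data.ByteString.Char8 to get rid of the annoying conversions between Word8 and Char, back and forth. Performance should not change because of that. Otherwise, BS.split doesn't really help, since you also want to replace \n\r combinations? I would write a recursive function using BS.span (/='\n'). Not elegant, but hopefully not too slow.

Thanks, that works! If you write it up as an answer, I will accept it.

I couldn't edit my previous comment, so I hope it isn't too confusing to put it here: @krom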