How to split a string in Haskell?

string haskell

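Is there a standard way to split a string in Haskell?

lines and words work great for splitting on a space or newline, but surely there is a standard way to split on a comma?

I couldn't find it on Hoogle.

To be specific, I'm looking for something where

split "," "my,comma,separated,list"

returns

["my","comma","separated","list"]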

Try this one:

import Data.List (unfoldr)

separateBy :: Eq a => a -> [a] -> [[a]]
separateBy chr = unfoldr sep where
  sep [] = Nothing
  sep l  = Just . fmap (drop 1) . break (== chr) $ l

It only works for a single character, but should be easy to extend.
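One way the extension to a multi-element delimiter might look is sketched below. The name separateByList is made up for illustration, and the sketch assumes a non-empty delimiter (an empty one is rejected with an error). Like the single-character version, it drops a single trailing delimiter.

```haskell
import Data.List (isPrefixOf, unfoldr)

-- Hypothetical generalisation of separateBy: the delimiter is a whole list.
-- Assumption: the delimiter is non-empty; an empty one is rejected up front.
separateByList :: Eq a => [a] -> [a] -> [[a]]
separateByList []    = error "separateByList: empty delimiter"
separateByList delim = unfoldr sep
  where
    -- Stop when the input is exhausted, otherwise emit the next chunk.
    sep [] = Nothing
    sep l  = Just (go l)

    -- Collect elements until the delimiter is a prefix of what is left,
    -- then return the chunk and the input after the delimiter.
    go [] = ([], [])
    go rest@(x:xs)
      | delim `isPrefixOf` rest = ([], drop (length delim) rest)
      | otherwise               = let (chunk, remaining) = go xs
                                  in (x : chunk, remaining)
```

For instance, separateByList ", " "my, comma, separated, list" evaluates to ["my","comma","separated","list"].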

There is a package for this called split.

Use it like this:

ghci> import Data.List.Split
ghci> splitOn "," "my,comma,separated,list"
["my","comma","separated","list"]

It also comes with many other functions for splitting on matching delimiters or on several delimiters.

In the module Text.Regex (provided by the regex-compat package, which was part of the Haskell Platform), there is a function:

splitRegex :: Regex -> String -> [String]

It splits a string based on a regular expression; see the module documentation for the API.

Remember that you can look up the definition of Prelude functions.

Looking there, the definition of words is:
words   :: String -> [String]
words s =  case dropWhile Char.isSpace s of
                      "" -> []
                      s' -> w : words s''
                            where (w, s'') = break Char.isSpace s'

So, change it into a function that takes a predicate:
wordsWhen     :: (Char -> Bool) -> String -> [String]
wordsWhen p s =  case dropWhile p s of
                      "" -> []
                      s' -> w : wordsWhen p s''
                            where (w, s'') = break p s'

Then call it with whatever predicate you want:
main = print $ wordsWhen (==',') "break,this,string,at,commas"
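Because wordsWhen is derived from words, it inherits the way words treats separators: runs of delimiters are collapsed and empty fields are dropped. The sketch below repeats the definition so that it compiles on its own and adds a small check (agreesWithWords is a hypothetical helper name) that the two functions agree when the predicate is isSpace.

```haskell
import Data.Char (isSpace)

-- Same definition as above, repeated so the snippet is self-contained.
wordsWhen :: (Char -> Bool) -> String -> [String]
wordsWhen p s = case dropWhile p s of
  "" -> []
  s' -> w : wordsWhen p s''
    where (w, s'') = break p s'

-- Hypothetical helper: with isSpace as the predicate, wordsWhen should
-- behave exactly like the Prelude's words.
agreesWithWords :: String -> Bool
agreesWithWords s = wordsWhen isSpace s == words s
```

One consequence is that wordsWhen (== ',') "a,,b," evaluates to ["a","b"], so this approach is not suitable when empty fields must be preserved.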

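I don't know how to add a comment on Steve's answer, but I would like to recommend the library documentation, and in particular the list functions documented there.

As a reference, it is much better than just reading the plain Haskell report.
In general, a fold with a rule for when to start a new sublist would also solve this.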
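The fold idea can be sketched roughly as follows. The name splitFold is hypothetical, and this is only one possible reading of "a rule for when to start a new sublist": a right fold whose accumulator always holds at least one sublist.

```haskell
-- Hypothetical name; a sketch of the fold-based idea, not a library function.
splitFold :: Eq a => a -> [a] -> [[a]]
splitFold delim = foldr step [[]]
  where
    -- On a delimiter, open a new empty sublist in front of the accumulator;
    -- otherwise push the element onto the current first sublist.
    step x acc@(current : rest)
      | x == delim = [] : acc
      | otherwise  = (x : current) : rest
    -- Unreachable in practice: the accumulator starts non-empty and stays so.
    step _ [] = [[]]
```

Unlike the words-based approaches, this version keeps empty fields: splitFold ',' "a,,b," evaluates to ["a","","b",""], and the empty string yields [""].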
I started learning Haskell yesterday, so correct me if I'm wrong, but:
split :: Eq a => a -> [a] -> [[a]]
split x y = func x y [[]]
    where
        func x [] z = reverse $ map (reverse) z
        func x (y:ys) (z:zs) = if y==x then 
            func x ys ([]:(z:zs)) 
        else 
            func x ys ((y:z):zs)

gives:
*Main> split ' ' "this is a test"
["this","is","a","test"]

Or maybe you wanted
*Main> splitWithStr  " and " "this and is and a and test"
["this","is","a","test"]

which would be:
splitWithStr :: Eq a => [a] -> [a] -> [[a]]
splitWithStr x y = func x y [[]]
    where
        func x [] z = reverse $ map (reverse) z
        func x (y:ys) (z:zs) = if (take (length x) (y:ys)) == x then
            func x (drop (length x) (y:ys)) ([]:(z:zs))
        else
            func x ys ((y:z):zs)

If you use Data.Text, there is splitOn, which ships with the Haskell Platform.
For example:
import qualified Data.Text as T
main = print $ T.splitOn (T.pack " ") (T.pack "this is a test")
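The explicit T.pack calls can be avoided with the OverloadedStrings extension. The following is a minimal sketch, assuming the text package is available to your compiler; splitOnSpace is a hypothetical wrapper name used only for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T

-- With OverloadedStrings, string literals can be used directly as Text,
-- so the delimiter does not need to be wrapped in T.pack.
splitOnSpace :: T.Text -> [T.Text]
splitOnSpace = T.splitOn " "
```

Note that T.splitOn keeps empty fields, so two adjacent delimiters produce an empty Text between them.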

Or use Data.List.Split, from the split package:
[me@localhost]$ ghci
Prelude> import Data.List.Split
Prelude Data.List.Split> let l = splitOn "," "1,2,3,4"
Prelude Data.List.Split> :t l
l :: [[Char]]
Prelude Data.List.Split> l
["1","2","3","4"]
Prelude Data.List.Split> let { convert :: [String] -> [Integer]; convert = map read }
Prelude Data.List.Split> let l2 = convert l
Prelude Data.List.Split> :t l2
l2 :: [Integer]
Prelude Data.List.Split> l2
[1,2,3,4]

For example, a single trailing delimiter will be dropped:
split ';' "a;bb;ccc;;d;"
> ["a","bb","ccc","","d"]
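The definition behind that example is not shown here. A minimal sketch that reproduces the output above, assuming a span-based split on a single element, could look like this. The name splitDropTrailing is hypothetical and is not taken from any library.

```haskell
-- Hypothetical reconstruction: split on one delimiter element using span.
-- Empty fields in the middle are kept; a single trailing delimiter is dropped
-- because the recursion stops as soon as the remaining input is empty.
splitDropTrailing :: Eq a => a -> [a] -> [[a]]
splitDropTrailing _ [] = []
splitDropTrailing d s  = field : splitDropTrailing d (drop 1 rest)
  where
    -- field is everything up to the next delimiter; rest starts at it.
    (field, rest) = span (/= d) s
```

With this definition, splitDropTrailing ';' "a;bb;ccc;;d;" evaluates to ["a","bb","ccc","","d"].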

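In addition to the efficient, pre-built functions given in the other answers, I'll add my own. They are simply part of a repertoire of Haskell functions I wrote while learning the language in my own time: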
-- Correct but inefficient implementation
wordsBy :: String -> Char -> [String]
wordsBy s c = reverse (go s []) where
    go s' ws = case (dropWhile (\c' -> c' == c) s') of
        "" -> ws
        rem -> go ((dropWhile (\c' -> c' /= c) rem)) ((takeWhile (\c' -> c' /= c) rem) : ws)

-- Breaks up by predicate function to allow for more complex conditions (\c -> c == ',' || c == ';')
wordsByF :: String -> (Char -> Bool) -> [String]
wordsByF s f = reverse (go s []) where
    go s' ws = case ((dropWhile (\c' -> f c')) s') of
        "" -> ws
        rem -> go ((dropWhile (\c' -> (f c') == False)) rem) (((takeWhile (\c' -> (f c') == False)) rem) : ws)

The solutions are at least tail-recursive, so they won't cause a stack overflow.

Example in ghci:
>  import qualified Text.Regex as R
>  R.splitRegex (R.mkRegex "x") "2x3x777"
>  ["2","3","777"]

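Without importing anything, you can directly substitute a space for the separator character, since the separator that words splits on is a space. Something like:
words [if c == ',' then ' ' else c|c <- "my,comma,separated,list"]

or

words (let f ',' = ' '; f c = c in map f "my,comma,separated,list")

You can make this into a function with parameters. You can also drop the single character-to-match parameter and match any of several characters instead, as in:
 [if elem c ";,.:-+@!$#?" then ' ' else c|c <-"my,comma;separated!list"]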
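Turned into a function with parameters, the substitution trick might look like the sketch below. The name splitOnChars is hypothetical. The caveats follow directly from how words works: spaces already present in the input also act as separators, and empty fields disappear.

```haskell
-- Hypothetical helper: replace every delimiter character with a space,
-- then let words do the splitting.
-- Caveat: existing whitespace in the input also splits, and empty fields
-- are dropped, because words collapses runs of whitespace.
splitOnChars :: [Char] -> String -> [String]
splitOnChars delims s = words [if c `elem` delims then ' ' else c | c <- s]
```

For example, splitOnChars ";,.:-+@!$#?" "my,comma;separated!list" evaluates to ["my","comma","separated","list"].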

I find this simpler to understand:
split :: Char -> String -> [String]
split c xs = case break (==c) xs of 
  (ls, "") -> [ls]
  (ls, x:rs) -> ls : split c rs

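I would really like to see such a function in a future release of Data.List or even Prelude. It is so common, and annoying when it is not available for code golf.

Cool, I was not aware of this package. This is the ultimate split package, as it gives a lot of control over the operation (trimming whitespace in the results, keeping separators in the result, removing consecutive separators, and so on). There are so many ways of splitting lists that no single split function can answer every need; you really do need this kind of package.

Otherwise, if external packages are acceptable, MissingH also provides a split function. That package also provides plenty of other "nice-to-have" functions, and I find that quite a few packages depend on it.

As of the most recent release, the split package is now part of the Haskell Platform.

Import Data.List.Split (splitOn) and go to town. splitOn :: Eq a => [a] -> [a] -> [[a]]

@RussAbbott The split package is included in the Haskell Platform when you download it, but it is not automatically loaded when building your project. Add split to the build-depends list in your cabal file. For example, if your project is called hello, then in the hello.cabal file, below the executable hello line, put a line like "  build-depends: base, split" (note the two-space indent). Then build using the cabal build command.

I was looking for a built-in split, having been spoiled by languages with well-developed libraries. But thanks anyway.

You wrote this in June, so I assume you have moved on in your journey :) As an exercise, try rewriting this function without reverse or length, since using them incurs an algorithmic complexity penalty and also prevents the function from being applied to an infinite list. Have fun!

@RussAbbott You probably need to depend on the text package or install it. That belongs in another question, though.

Couldn't match type 'T.Text' with 'Char'. Expected type: [Char]. Actual type: [T.Text].

Please don't use regular expressions to split strings. Thank you.

@kirelagin, why this comment? I'm learning Haskell, and I'd like to know the rationale behind it.

@Andrey, is there a reason why I can't even run the first line in my ghci?

@EnricoMariaDeAngelis Regular expressions are a powerful tool for string matching. It makes sense to use them when you are matching something non-trivial. If you just want to split a string on something as trivial as another fixed string, there is absolutely no need for regular expressions; they will only make the code more complex and probably slower.

"Please don't use regular expressions to split strings." WTF, why not??? Splitting a string with a regular expression is a perfectly reasonable thing to do. There are plenty of trivial cases where a string needs to be split but the delimiter isn't always exactly the same.

Could not find module 'Text.Regex'. Perhaps you meant Text.Read (from base-4.10.1.0).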