Haskell: searching a string

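I found a nice exercise in a book and am trying to solve it. I am trying to write a function called pointer with the signature pointer :: String -> Int. It takes text containing "pointers" of the form [Int] and returns the total number of pointers found.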

The text that the pointer function will examine looks like this:

txt :: String
txt = "[1] and [2] are friends who grew up together who " ++
      "went to the same school and got the same degrees." ++
      "They eventually opened up a store named [2] which was pretty successful."
On the command line, we would run the code like this:

> pointer txt 
3
Here 3 is the number of pointers found.

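What I understand so far:

  • I know that words breaks a string into a list of words. For example:

    words "Where are all of these apples?"

    ["Where","are","all","of","these","apples?"]

  • I know that filter selects the elements of a list that satisfy a predicate. For example:

    filter (>3) [1,5,6,4,3]

  • I get that length returns the length of a list.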
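
The three functions listed above can be checked in isolation with a small sketch like the one below. The names sentence, sentenceWords, bigNumbers and wordCount are made up for illustration and do not come from the book.

```haskell
-- Illustrative only: `sentence` is a made-up example string.
sentence :: String
sentence = "Where are all of these apples?"

-- words splits on whitespace; punctuation stays attached to its word.
sentenceWords :: [String]
sentenceWords = words sentence
-- ["Where","are","all","of","these","apples?"]

-- filter keeps only the elements that satisfy the predicate.
bigNumbers :: [Int]
bigNumbers = filter (> 3) [1, 5, 6, 4, 3]
-- [5,6,4]

-- length counts the elements of a list.
wordCount :: Int
wordCount = length sentenceWords
-- 6
```

Note that words leaves the question mark attached to "apples?", which matters later if a pointer is followed directly by punctuation.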

What I think I need to do:

Step 1) Look at txt and break it down into single words, giving one long list of words.
Step 2) Use filter to examine the list for [1] or [2]. Every pointer that is found is kept in the resulting list.
Step 3) Call the length function on the resulting list.
The problem I am facing:


I am having a hard time taking everything I know and turning it into an implementation.

Here is a hypothetical ghci session:

ghci> words txt
[ "[1]", "and", "[2]", "are", "friends", "who", ...]

ghci> filter (\w -> w == "[1]" || w == "[2]") (words txt)
[ "[1]", "[2]", "[2]" ]

ghci> length ( filter (\w -> w == "[1]" || w == "[2]") (words txt) )
3
You can use the $ operator to make the last expression more readable:

length $ filter (\w -> w == "[1]" || w == "[2]") $ words txt

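If you want to find every pattern of the form [Int] in the string, for example [3], [465] and so on, rather than only [1] and [2], the simplest approach is to use a regular expression: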

{-# LANGUAGE NoOverloadedStrings #-}

import Text.Regex.Posix

txt :: String
txt = "[1] and [2] are friends who grew up together who " ++
      "went to the same school and got the same degrees." ++
      "They eventually opened up a store named [2] which was pretty successful."

pointer :: String -> Int
pointer source = source =~ "\\[[0-9]{1,}\\]"
We can now run:

pointer txt
> 3

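The pattern-matching definition pointer ('[':_:']':xs) = ..., listed at the end of this page, works for single-digit "pointers". It accepts any single character between the brackets, so it does not match [465].

This could be handled more robustly with a parser combinator library, but that may be overkill.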
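
As a rough illustration of the parser-combinator idea, here is a sketch built on Text.ParserCombinators.ReadP from base. That module is picked here only because it needs no extra package; it is not necessarily the library the answer had in mind. The names pointerP and pointers are hypothetical.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP (ReadP, between, char, munch1, readP_to_S)

-- Parses one pointer such as "[12]" and returns the number inside.
pointerP :: ReadP Int
pointerP = read <$> between (char '[') (char ']') (munch1 isDigit)

-- Tries the parser at every position of the input. On success it
-- continues after the consumed text; otherwise it drops one character.
pointers :: String -> [Int]
pointers [] = []
pointers s@(_ : rest) =
  case readP_to_S pointerP s of
    ((n, remaining) : _) -> n : pointers remaining
    []                   -> pointers rest

-- Counting the pointers is then just the length of that list.
pointer :: String -> Int
pointer = length . pointers
```

Compared with a plain count, this version also yields the pointer values themselves, which may or may not be worth the extra machinery for this exercise.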

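So you just need to count how many times the words [1] and [2] occur in the string? Write a function f :: String -> Bool that returns True if its input string is a "pointer". Then your function is exactly what you described: length . filter f . words.

Since OverloadedStrings is normally off by default, you don't need {-# LANGUAGE NoOverloadedStrings #-} unless the project's .cabal file turns it on by default.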
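
The suggestion to write a predicate f :: String -> Bool and compose it as length . filter f . words could be sketched as follows. The predicate is called isPointer here, which is a made-up name, and it is assumed that a pointer is a non-empty run of digits between square brackets.

```haskell
import Data.Char (isDigit)

-- Hypothetical predicate: True for words like "[1]" or "[465]",
-- False for anything else (including "[]" and "[a]").
isPointer :: String -> Bool
isPointer ('[' : rest) =
  case span isDigit rest of
    (digits, "]") -> not (null digits)
    _             -> False
isPointer _ = False

-- Split into words, keep the pointers, count them.
pointer :: String -> Int
pointer = length . filter isPointer . words
```

Because words splits only on whitespace, a pointer that is directly followed by punctuation, such as "[2],", is not counted by this version.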
pointer :: String -> Int
pointer ('[':_:']':xs) = 1 + pointer xs
pointer (_:        xs) = pointer xs
pointer _              = 0

pointer txt
> 3
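
The single-digit pattern match above can be extended to multi-digit pointers without a regex library. The following is a minimal sketch, assuming a pointer is one or more digits between square brackets; it uses span and isDigit from base and is not part of the original answer.

```haskell
import Data.Char (isDigit)

-- Counts occurrences of "[<digits>]" anywhere in the string.
pointer :: String -> Int
pointer ('[' : xs) =
  case span isDigit xs of
    -- At least one digit, immediately followed by a closing bracket.
    (_ : _, ']' : rest) -> 1 + pointer rest
    -- Not a pointer: keep scanning right after the '['.
    _                   -> pointer xs
pointer (_ : xs) = pointer xs
pointer []       = 0
```

Unlike the words-based approach, this scans character by character, so a pointer followed by punctuation is still counted.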