Parsing indentation-based languages with Haskell's Parsec
I'm trying to parse an indentation-based language (think Python, Haskell itself, Boo, YAML) in Haskell using Parsec. I've seen the IndentParser library, and it looks like a perfect match, but I can't figure out how to turn my TokenParser into an indentation parser. Here's the code I have so far:
import qualified Text.ParserCombinators.Parsec.Token as T
import qualified Text.ParserCombinators.Parsec.IndentParser.Token as IT
lexer = T.makeTokenParser mylangDef
ident = IT.identifier lexer
This throws the error:
parser2.hs:29:28:
Couldn't match expected type `IT.TokenParser st'
against inferred type `T.GenTokenParser s u m'
In the first argument of `IT.identifier', namely `lexer'
In the expression: IT.identifier lexer
In the definition of `ident': ident = IT.identifier lexer
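What am I doing wrong? How should I create an IT.TokenParser? Or is IndentParser broken and to be avoided?

Here is a set of parser combinators I put together for Parsec 3 that can be used for Haskell-style layout, which might be of use to you. The key considerations are that laidout starts and runs a layout rule, and that you should use the space and spaced combinators provided here rather than the stock Parsec combinators for the same purpose. Because of the way layout and comments interact, I had to merge comment parsing into the tokenizer.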
{-# LANGUAGE FlexibleContexts, FlexibleInstances, MultiParamTypeClasses #-}
module Text.Parsec.Layout
    ( laidout          -- repeat a parser in layout, separated by (virtual) semicolons
    , space            -- consumes one or more spaces, comments, and onside newlines in a layout rule
    , maybeFollowedBy
    , spaced           -- (`maybeFollowedBy` space)
    , LayoutEnv        -- type needed to describe parsers
    , defaultLayoutEnv -- a fresh layout
    , semi             -- semicolon or virtual semicolon
    ) where
import Control.Applicative ((<$>))
import Control.Monad (guard)
import Data.Char (isSpace)
import Text.Parsec.Combinator
import Text.Parsec.Pos
import Text.Parsec.Prim hiding (State)
import Text.Parsec.Char hiding (space)
data LayoutContext = NoLayout | Layout Int deriving (Eq,Ord,Show)
data LayoutEnv = Env
    { envLayout :: [LayoutContext]
    , envBol :: Bool -- if true, must run offside calculation
    }
defaultLayoutEnv :: LayoutEnv
defaultLayoutEnv = Env [] True
pushContext :: Stream s m c => LayoutContext -> ParsecT s LayoutEnv m ()
pushContext ctx = modifyState $ \env -> env { envLayout = ctx:envLayout env }
popContext :: Stream s m c => String -> ParsecT s LayoutEnv m ()
popContext loc = do
    (_:xs) <- envLayout <$> getState
    modifyState $ \env' -> env' { envLayout = xs }
  <|> unexpected ("empty context for " ++ loc)
getIndentation :: Stream s m c => ParsecT s LayoutEnv m Int
getIndentation = depth . envLayout <$> getState where
    depth :: [LayoutContext] -> Int
    depth (Layout n:_) = n
    depth _ = 0
pushCurrentContext :: Stream s m c => ParsecT s LayoutEnv m ()
pushCurrentContext = do
    indent <- getIndentation
    col <- sourceColumn <$> getPosition
    pushContext . Layout $ max (indent+1) col
maybeFollowedBy :: Stream s m c => ParsecT s u m a -> ParsecT s u m b -> ParsecT s u m a
t `maybeFollowedBy` x = do t' <- t; optional x; return t'
spaced :: Stream s m Char => ParsecT s LayoutEnv m a -> ParsecT s LayoutEnv m a
spaced t = t `maybeFollowedBy` space
data Layout = VSemi | VBrace | Other Char deriving (Eq,Ord,Show)
-- TODO: Parse C-style #line pragmas out here
layout :: Stream s m Char => ParsecT s LayoutEnv m Layout
layout = try $ do
    bol <- envBol <$> getState
    whitespace False (cont bol)
  where
    cont :: Stream s m Char => Bool -> Bool -> ParsecT s LayoutEnv m Layout
    cont True = offside
    cont False = onside
    -- TODO: Parse nestable {-# LINE ... #-} pragmas in here
    whitespace :: Stream s m Char =>
        Bool -> (Bool -> ParsecT s LayoutEnv m Layout) -> ParsecT s LayoutEnv m Layout
    whitespace x k =
            try (string "{-" >> nested k >>= whitespace True)
        <|> try comment
        <|> do newline; whitespace True offside
        <|> do tab; whitespace True k
        <|> do (satisfy isSpace <?> "space"); whitespace True k
        <|> k x
    comment :: Stream s m Char => ParsecT s LayoutEnv m Layout
    comment = do
        string "--"
        many (satisfy ('\n'/=))
        newline
        whitespace True offside
    nested :: Stream s m Char =>
        (Bool -> ParsecT s LayoutEnv m Layout) ->
        ParsecT s LayoutEnv m (Bool -> ParsecT s LayoutEnv m Layout)
    nested k =
            try (do string "-}"; return k)
        <|> try (do string "{-"; k' <- nested k; nested k')
        <|> do newline; nested offside
        <|> do anyChar; nested k
    offside :: Stream s m Char => Bool -> ParsecT s LayoutEnv m Layout
    offside x = do
        p <- getPosition
        pos <- compare (sourceColumn p) <$> getIndentation
        case pos of
            LT -> do
                popContext "the offside rule"
                modifyState $ \env -> env { envBol = True }
                return VBrace
            EQ -> return VSemi
            GT -> onside x
    -- we remained onside.
    -- If we skipped any comments, or moved to a new line and stayed onside, we return a single ' ',
    -- otherwise we provide the next char
    onside :: Stream s m Char => Bool -> ParsecT s LayoutEnv m Layout
    onside True = return $ Other ' '
    onside False = do
        modifyState $ \env -> env { envBol = False }
        Other <$> anyChar
layoutSatisfies :: Stream s m Char => (Layout -> Bool) -> ParsecT s LayoutEnv m ()
layoutSatisfies p = guard . p =<< layout
virtual_lbrace :: Stream s m Char => ParsecT s LayoutEnv m ()
virtual_lbrace = pushCurrentContext
virtual_rbrace :: Stream s m Char => ParsecT s LayoutEnv m ()
virtual_rbrace = try (layoutSatisfies (VBrace ==) <?> "outdent")
-- recognize a run of one or more spaces including onside carriage returns in layout
space :: Stream s m Char => ParsecT s LayoutEnv m String
space = do
    try $ layoutSatisfies (Other ' ' ==)
    return " "
  <?> "space"
-- recognize a semicolon including a virtual semicolon in layout
semi :: Stream s m Char => ParsecT s LayoutEnv m String
semi = do
    try $ layoutSatisfies p
    return ";"
  <?> "semi-colon"
  where
    p VSemi = True
    p (Other ';') = True
    p _ = False
lbrace :: Stream s m Char => ParsecT s LayoutEnv m String
lbrace = do
    char '{'
    pushContext NoLayout
    return "{"
rbrace :: Stream s m Char => ParsecT s LayoutEnv m String
rbrace = do
    char '}'
    popContext "a right brace"
    return "}"
laidout :: Stream s m Char => ParsecT s LayoutEnv m a -> ParsecT s LayoutEnv m [a]
laidout p = try (braced statements) <|> vbraced statements where
    braced = between (spaced lbrace) (spaced rbrace)
    vbraced = between (spaced virtual_lbrace) (spaced virtual_rbrace)
    statements = p `sepBy` spaced semi
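
The column comparison at the heart of offside and pushCurrentContext can be looked at in isolation. What follows is only a sketch that uses nothing beyond base and mirrors that logic as pure functions. The names Decision, indentation, decide, openBlock and classifyLines are made up for illustration and are not part of the module above or of Parsec, and LayoutContext is redeclared so the snippet stands alone.

```haskell
-- Pure sketch of the offside decision (illustrative names, not library API).
data LayoutContext = NoLayout | Layout Int deriving (Eq, Show)

data Decision
  = CloseBlock -- column is left of the block: a virtual '}' is due
  | NextItem   -- column equals the block's column: a virtual ';' is due
  | Continue   -- column is to the right: still inside the current item
  deriving (Eq, Show)

-- Indentation of the innermost context. An explicit-brace context
-- (NoLayout) or an empty stack counts as column 0.
indentation :: [LayoutContext] -> Int
indentation (Layout n : _) = n
indentation _              = 0

-- Compare a 1-based column against the current indentation.
decide :: [LayoutContext] -> Int -> Decision
decide ctxs col = case compare col (indentation ctxs) of
  LT -> CloseBlock
  EQ -> NextItem
  GT -> Continue

-- Open a new implicit block at the given column. The new block is forced
-- to sit strictly to the right of the enclosing one.
openBlock :: [LayoutContext] -> Int -> [LayoutContext]
openBlock ctxs col = Layout (max (indentation ctxs + 1) col) : ctxs

-- Classify each line by the column of its first non-space character,
-- against a fixed context stack (no pushing or popping here).
classifyLines :: [LayoutContext] -> [String] -> [Decision]
classifyLines ctxs = map (decide ctxs . firstColumn)
  where
    firstColumn l = 1 + length (takeWhile (== ' ') l)
```

For instance, with a block opened at column 3, classifyLines [Layout 3] ["  x = 1", "    + 2", "y"] classifies the three lines as a new item, a continuation, and a block close. The real parser also has to thread this through comments and whitespace, which the sketch ignores.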
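
It looks like you're using Parsec 3 here, while IndentParser expects Parsec 2. Your example compiles for me with -package parsec-2.1.0.1.
So IndentParser isn't necessarily broken, but the author should have been more specific about versions in the list of dependencies. It's possible to have both versions of Parsec installed, so there's no reason not to use IndentParser unless you're committed to Parsec 3 for other reasons.
UPDATE: Actually, no changes to the source are needed to get IndentParser working with Parsec 3. The problem we were both having seems to come from cabal install applying a default preference for Parsec 2. You can simply reinstall IndentParser with an explicit constraint on the Parsec version: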
cabal install IndentParser --reinstall --constraint="parsec >= 3"
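Alternatively, you can download the source and build and install it yourself.

You, sir, are awesome! Thank you very much. How did you know I was using Parsec 3, though? Just a guess? Because I think my example could have been…
I'm afraid my detective work here wasn't actually very exciting: I compiled your code with Parsec 3, got an error similar to yours, and then tried Parsec 2, which worked.
By the way, it doesn't look like it would be very hard to get IndentParser working with Parsec 3; you might consider giving it a try if you find IndentParser useful.
I'm only just learning Haskell right now, though; I'm afraid I'd get lost in a foreign codebase like that.
Could you give an example? I tried, but I couldn't get bindings to accept something as simple as unlines ["x = y", "a = b"].
I currently believe the source above is broken, but I haven't had a chance to revisit it.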