Parsing BibTeX with bison


I am a beginner. I want to parse a BibTeX file using flex/bison. A sample BibTeX file is:

@Book{a1,
author="amook",
Title="ASR",
Publisher="oxf",
Year="2010",
Add="UK",
Edition="1",
}
@Article{a2,
Author="Rudra Banerjee",
Title={FeNiMo},
Publisher={P{\"R}B},
Issue="12",
Page="36690",
Year="2011",
Add="UK",
Edition="1",
}

To parse this, I wrote the following code:

%{
#include <stdio.h>
#include <stdlib.h>
%}

%{
char yylval;
int YEAR,i;
//char array_author[1000];
%}
%x author
%x title
%x pub
%x year
%%
@                               printf("\nNEWENTRY\n");
[a-zA-Z][a-zA-Z0-9]*            {printf("%s",yytext);
                                        BEGIN(INITIAL);}
author=                         {BEGIN(author);}
<author>\"[a-zA-Z\/.]+\"        {printf("%s",yytext);
                                        BEGIN(INITIAL);}
year=                           {BEGIN(year);}
<year>\"[0-9]+\"                {printf("%s",yytext);
                                        BEGIN(INITIAL);}
title=                          {BEGIN(title);}
<title>\"[a-zA-Z\/.]+\"         {printf("%s",yytext);
                                        BEGIN(INITIAL);}
publisher=                      {BEGIN(pub);}
<pub>\"[a-zA-Z\/.]+\"           {printf("%s",yytext);
                                        BEGIN(INITIAL);}
[a-zA-Z0-9\/.-]+=        printf("ENTRY TYPE ");
\"                      printf("QUOTE ");
\{                      printf("LCB ");
\}                      printf(" RCB");
;                       printf("SEMICOLON ");
\n                      printf("\n");
%%

int main(){
  yylex();
//char array_author[1000];
//printf("%d%s",&i,array_author[i]);
i++;
return 0;
}


The problem is that I want to separate each key and its value into different variables and store them somewhere (possibly in an array).

Can I get some insight into how to do this?

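If I had seen this question a year ago, I would have commented at the time so that the question could be improved. The code provided is not a parser; it is only a set of regular expressions written for flex. Using regular expressions to scan the input file for tokens is just one part of building a parser. No grammar or structure for BibTeX files has been defined for bison.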
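To make "defining a grammar" concrete, here is a minimal sketch of the kind of structure a bison grammar would have to describe. It is written as a hand-rolled recursive-descent parser in plain C rather than as bison rules, and the grammar in the comment is my own simplified assumption based on the sample in the question, not the full BibTeX syntax. The names `struct Entry`, `parse_entry`, `parse_name`, `parse_value` and `expect` are all hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Assumed, simplified grammar (not the complete BibTeX syntax):
 *   entry  : '@' NAME '{' NAME ',' fields '}'
 *   fields : (empty) | field | field ',' fields
 *   field  : NAME '=' VALUE
 *   VALUE  : '"' text '"'  |  '{' text with balanced braces '}'
 */
enum { MAX_FIELDS = 16, MAX_LEN = 64 };

struct Entry {
    char type[MAX_LEN];                /* e.g. "Book" */
    char key[MAX_LEN];                 /* e.g. "a1"   */
    char names[MAX_FIELDS][MAX_LEN];   /* field names  */
    char values[MAX_FIELDS][MAX_LEN];  /* field values */
    size_t nfields;
};

static const char *skip_ws(const char *p)
{
    while (isspace((unsigned char)*p))
        p++;
    return p;
}

/* Match one literal character after optional whitespace, else NULL. */
static const char *expect(const char *p, char c)
{
    p = skip_ws(p);
    return *p == c ? p + 1 : NULL;
}

/* NAME: one or more alphanumerics. A name longer than MAX_LEN - 1
   is cut short, which makes the surrounding parse fail. */
static const char *parse_name(const char *p, char *dst)
{
    size_t n = 0;
    p = skip_ws(p);
    while (isalnum((unsigned char)*p) && n < MAX_LEN - 1)
        dst[n++] = *p++;
    dst[n] = '\0';
    return n ? p : NULL;
}

/* VALUE: quoted or braced text. The outer delimiters are dropped and
   overlong values are silently truncated. */
static const char *parse_value(const char *p, char *dst)
{
    size_t n = 0;
    int depth = 0;                     /* nesting of inner braces */
    p = skip_ws(p);
    char open = *p;
    if (open != '"' && open != '{')
        return NULL;
    for (p++; *p; p++) {
        if (*p == '{') depth++;
        else if (*p == '}' && depth > 0) depth--;
        else if (*p == '}' && open == '{') break;
        else if (*p == '"' && open == '"' && depth == 0) break;
        if (n < MAX_LEN - 1)
            dst[n++] = *p;
    }
    dst[n] = '\0';
    return *p ? p + 1 : NULL;          /* NULL if the value never closes */
}

/* entry: returns the position after the closing '}', or NULL on error. */
const char *parse_entry(const char *p, struct Entry *e)
{
    e->nfields = 0;
    if (!(p = expect(p, '@')) || !(p = parse_name(p, e->type)) ||
        !(p = expect(p, '{')) || !(p = parse_name(p, e->key)) ||
        !(p = expect(p, ',')))
        return NULL;
    for (;;) {
        const char *q = expect(p, '}');
        if (q)
            return q;                  /* end of entry */
        if (e->nfields == MAX_FIELDS)
            return NULL;
        if (!(p = parse_name(p, e->names[e->nfields])) ||
            !(p = expect(p, '=')) ||
            !(p = parse_value(p, e->values[e->nfields])))
            return NULL;
        e->nfields++;
        q = expect(p, ',');
        if (!q)
            return expect(p, '}');     /* no comma: entry must end here */
        p = q;
    }
}
```

Calling `parse_entry` repeatedly on the returned pointer walks through a file entry by entry. Each rule in the comment would map to one bison production, with flex supplying the `NAME` and `VALUE` tokens.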

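If you only need to separate the keys from the values, tools such as awk and sed are much easier to use than flex. One thing I would point out is that a value always follows an equals sign. That makes the values easy to identify without any special grammar tricks.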
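The equals-sign observation can be sketched without any parser generator at all. The following is a rough illustration in C (chosen to match the question, though awk or sed would be shorter). It assumes every value is wrapped in double quotes or in balanced braces, as in the sample, and that a quoted value contains no inner quote character; `struct Field`, `split_fields` and `copy_span` are hypothetical names.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_LEN 128

/* One key/value pair, e.g. key "Title" and value "ASR". */
struct Field {
    char key[MAX_LEN];
    char val[MAX_LEN];
};

/* Copy [start, end) into dst, truncating to MAX_LEN - 1 bytes. */
static void copy_span(char *dst, const char *start, const char *end)
{
    size_t n = (size_t)(end - start);
    if (n >= MAX_LEN)
        n = MAX_LEN - 1;
    memcpy(dst, start, n);
    dst[n] = '\0';
}

/* Find key="..." and key={...} pairs in text and store up to max of
   them in out. Returns the number of pairs stored. */
size_t split_fields(const char *text, struct Field *out, size_t max)
{
    size_t count = 0;
    const char *eq = text;

    while (count < max && (eq = strchr(eq, '=')) != NULL) {
        /* Walk backwards over the identifier in front of '='. */
        const char *kend = eq;
        while (kend > text && isspace((unsigned char)kend[-1]))
            kend--;
        const char *kstart = kend;
        while (kstart > text && isalnum((unsigned char)kstart[-1]))
            kstart--;

        /* Walk forwards to the opening delimiter of the value. */
        const char *v = eq + 1;
        while (isspace((unsigned char)*v))
            v++;

        /* Locate the matching closing delimiter. */
        const char *vend = NULL;
        if (*v == '"') {
            vend = strchr(v + 1, '"');
        } else if (*v == '{') {
            int depth = 0;
            for (const char *p = v; *p; p++) {
                if (*p == '{') depth++;
                else if (*p == '}' && --depth == 0) { vend = p; break; }
            }
        }
        if (vend == NULL || kstart == kend) {
            eq++;                      /* not a usable pair; keep scanning */
            continue;
        }

        copy_span(out[count].key, kstart, kend);
        copy_span(out[count].val, v + 1, vend);
        count++;
        eq = vend + 1;                 /* resume after the value */
    }
    return count;
}
```

Passing the text of one entry and an array of `struct Field` fills the array with the pairs in order of appearance, which is roughly the "store it in an array" result the question asks for. It does not check the surrounding `@Type{key, ...}` structure, so malformed input is not reported.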

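Since we have no information about why the BibTeX file needs to be parsed, or what the end goal of the exercise is, it is hard to say which approach would be best.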

Edit: This is a duplicate question. The OP asked it again, and that second question was answered.
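Possible duplicate. Actually, both questions were asked by me... (this one came first; it got no answer, probably because I framed the question badly, so I asked it again). I can't delete it... I don't like deleting answered questions. Added a close vote.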