C++: Using iterator functions in a compiler

Tags: c++, regex, compiler-construction, iterator, lexical-analysis

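I am trying to build a small lexical scanner that tokenizes text and determines the type of each token. The output should be a text file in which each line holds the line number, the token, and the token type. If a token is not accepted by any of the regular expressions, the scanner should report a meaningful error showing the line number, the token, and the error. I am using the C++ regex library, and I am now trying to use the iterator functions in the code below, but I don't know how to use them here.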

#include <iostream>
#include <string>
#include <regex>
#include <sstream>
#include <fstream>
using namespace std;

int main()
{
    ofstream myfile;
    myfile.open("mytext1.txt");
    myfile << " int 33.2 + bla 059 3 " << endl;
    myfile << " void nn + fbla 09 3 " << endl;
    myfile << " int float + bsla 09 3.2 " << endl;
    myfile.close();

    string s;
    regex keywords("int|if|else|while|float|return|void|breack|for");
    regex id("[[:alpha:]]+[[:d:]]*[[:alpha:]]*", regex_constants::icase);
    regex  integer("[[:d:]]+");
    regex  floatt("[[:d:]]+[.]+[[:d:]]+");
    regex symbolls("[&&]|[||]|[<=]|[>=]|[==]|[<]|[>]|[!=]|[=]|[(]|[)]|[{]|[}]|[;]|[,]|[.]|[+]|[-]|[*]|[/]|[/*]|[*/]");
    regex comment("//[[:w:]]*");
    ifstream myfile2("mytext1.txt");

    //int linenum= 1;
    if (myfile2.is_open())
    {
        while (getline(myfile2, s, ' '))
        {
            cout << s << ",";
            //cout <<linenum<< s << ",";

            bool match = regex_match(s, floatt);
            if (match) cout << "float number" << endl;
            match = regex_match(s, integer);
            if (match)cout << "integer number" << endl;
            match = regex_match(s, keywords);
            if (match) { cout << "keywords" << endl; goto a; }
            match = regex_match(s, id);
            if (match) cout << "identifer" << endl;
        a:  match = regex_match(s, comment);
            if (match) cout << "comment" << endl;
            match = regex_match(s, symbolls);
            if (match) cout << "symbolls" << endl;
        }
    }
    myfile2.close();

    system("pause");
    return 0;
}

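The symbolls regex does not do what you think it does.
Problems:

- Metacharacters that are meant as literals need to be escaped.

- A character class without a quantifier matches only a single character, so [<=] matches either < or =, never the two-character operator <=.

- Alternatives are tried from left to right (for example, a|aw will only match the a in awe).

The fix is to put the longest alternative first: aw|a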
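The last two points can be checked with a small experiment. The following is only a sketch: the helper names first_match and matches_whole are made up for this illustration and are not part of <regex>; the default ECMAScript grammar of std::regex is assumed.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: text of the first match of `pattern` found anywhere
// in `text`, or an empty string when nothing matches.
std::string first_match(const std::string& pattern, const std::string& text) {
    const std::regex re(pattern);  // default grammar: ECMAScript
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str() : std::string();
}

// Hypothetical helper: true only when `pattern` matches all of `text`.
bool matches_whole(const std::string& pattern, const std::string& text) {
    return std::regex_match(text, std::regex(pattern));
}
```

With these helpers, first_match("a|aw", "awe") gives "a" while first_match("aw|a", "awe") gives "aw", and matches_whole("[<=]", "<=") is false because the class consumes only one character. The same ordering effect is why <= has to come before < in the symbols pattern.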

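Helpful hint: avoid the POSIX classes unless you really need them.
As for the function regex_match(), in C++ it requires the regex to match the entire target string.
To find a substring, use regex_search().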
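A scanner usually wants something in between: a match that must begin at the current position but may stop before the end of the input. One way to get that, shown here as a sketch rather than the only option, is to call regex_search() with the std::regex_constants::match_continuous flag. The function name prefix_length is an assumption made for this example.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: length of the match of `re` that starts exactly at
// the beginning of `text`, or 0 when there is no such match.
std::size_t prefix_length(const std::regex& re, const std::string& text) {
    std::smatch m;
    // match_continuous rejects matches that do not start at the first character.
    const bool found = std::regex_search(
        text, m, re, std::regex_constants::match_continuous);
    return found ? static_cast<std::size_t>(m.length(0)) : 0;
}
```

For the integer pattern [0-9]+, regex_match() rejects 33.2 as a whole, a plain regex_search() on x33 finds 33 at offset 1, and prefix_length() reports 2 for 33.2 and 0 for x33.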

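In general, writing a parser is fairly involved, because every character has to be processed in order. At any stage, each token should put the logic into a state in which only certain characters or other tokens are accepted.
Anyway, good luck.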
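To illustrate the idea of states that accept only certain characters, here is a minimal hand-written sketch for numbers only. The names NumKind and classify_number are hypothetical. Unlike the floatt pattern, which uses [.]+, this sketch assumes exactly one dot is allowed, so 3..2 is rejected.

```cpp
#include <cctype>
#include <string>

enum class NumKind { Invalid, Integer, Float };

// Hypothetical helper: walks the token one character at a time; each state
// accepts only the characters that may legally follow.
NumKind classify_number(const std::string& s) {
    enum class State { Start, IntPart, Dot, Fraction };
    State state = State::Start;
    for (unsigned char c : s) {
        const bool digit = std::isdigit(c) != 0;
        switch (state) {
        case State::Start:     // must begin with a digit
            if (!digit) return NumKind::Invalid;
            state = State::IntPart;
            break;
        case State::IntPart:   // more digits, or a single dot
            if (c == '.') state = State::Dot;
            else if (!digit) return NumKind::Invalid;
            break;
        case State::Dot:       // a digit must follow the dot
            if (!digit) return NumKind::Invalid;
            state = State::Fraction;
            break;
        case State::Fraction:  // only digits from here on
            if (!digit) return NumKind::Invalid;
            break;
        }
    }
    if (state == State::IntPart) return NumKind::Integer;
    if (state == State::Fraction) return NumKind::Float;
    return NumKind::Invalid;   // empty input, or input ending in a dot
}
```

The regex-based scanner does the same job implicitly; writing it out by hand mainly shows why an input such as 3. is an error: it ends in a state that still expects a digit.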

Here are some modified regexes:
  keywords  "int|if|else|while|float|return|void|breack|for"
  -----------
       int
    |  if
    |  else
    |  while
    |  float
    |  return
    |  void
    |  breack
    |  for

  id  "[a-zA-Z]+[0-9]*[a-zA-Z]*"
  -----------
   [a-zA-Z]+ [0-9]* [a-zA-Z]* 

  integer  "[0-9]+"
  -----------
   [0-9]+ 

  floatt  "[0-9]+[.]+[0-9]+"
  -----------
   [0-9]+ [.]+ [0-9]+ 

  symbols  "[&]{2}|[|]{2}|<=|>=|[=]{2}|!=|/\\*|\\*/|[<>(){};,.+*/=-]"
  -----------
       [&]{2} 
    |  [|]{2} 
    |  <= 
    |  >= 
    |  [=]{2} 
    |  != 
    |  /\* 
    |  \*/ 
    |  [<>(){};,.+*/=-] 


  comment  "//[a-zA-Z0-9_]*"
  -----------
    // [a-zA-Z0-9_]* 
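
As for the iterator part of the question, one possible approach (a sketch under assumptions, not the only way to do it) is to read the file line by line, walk each line with std::sregex_iterator to pick out the whitespace-separated tokens, and classify each token with regex_match() against the patterns above. The names Token, classify, scan and scan_text are made up for this example. The keyword is spelled break here, on the assumption that breack in the listing is a typo.

```cpp
#include <istream>
#include <regex>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

struct Token {
    int line;          // 1-based line number
    std::string text;  // the token as it appears in the input
    std::string type;  // "error" when no pattern accepts the token
};

// Rules are tried in order. Keywords come before identifiers because every
// keyword also matches the identifier pattern.
std::string classify(const std::string& tok) {
    static const std::vector<std::pair<std::string, std::regex>> rules = {
        {"keyword",    std::regex("int|if|else|while|float|return|void|break|for")},
        {"identifier", std::regex("[a-zA-Z]+[0-9]*[a-zA-Z]*")},
        {"float",      std::regex("[0-9]+[.]+[0-9]+")},
        {"integer",    std::regex("[0-9]+")},
        {"symbol",     std::regex("[&]{2}|[|]{2}|<=|>=|[=]{2}|!=|/\\*|\\*/|[<>(){};,.+*/=-]")},
        {"comment",    std::regex("//[a-zA-Z0-9_]*")},
    };
    for (const auto& rule : rules) {
        if (std::regex_match(tok, rule.second)) return rule.first;
    }
    return "error";
}

// Reads `in` line by line and iterates over every run of non-space characters.
std::vector<Token> scan(std::istream& in) {
    static const std::regex non_space("\\S+");
    std::vector<Token> tokens;
    std::string line;
    for (int line_no = 1; std::getline(in, line); ++line_no) {
        const std::sregex_iterator end;  // default-constructed: end of sequence
        for (std::sregex_iterator it(line.cbegin(), line.cend(), non_space);
             it != end; ++it) {
            tokens.push_back({line_no, it->str(), classify(it->str())});
        }
    }
    return tokens;
}

// Convenience wrapper for scanning text that is already in memory.
std::vector<Token> scan_text(const std::string& text) {
    std::istringstream in(text);
    return scan(in);
}
```

To produce the requested output file, open an ifstream on mytext1.txt, pass it to scan(), and write line, text and type of every returned Token to an ofstream, one token per line. Tokens whose type is "error" carry the line number and the offending text needed for the error message. Note that this sketch, like the original code, only recognizes tokens separated by whitespace; input such as a<=b would need a scanner that matches at the current position instead.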
