C# Regex help: my regex pattern will match an invalid dictionary
I hope you can help me. I am using C# .NET 4.0 and I want to validate a file structure like this:
const string dataFileScr = @"
Start 0
{
Next = 1
Author = rk
Date = 2011-03-10
/* Description = simple */
}
PZ 11
{
IA_return()
}
GDC 7
{
Message = 6
Message = 7
Message = 8
Message = 8
RepeatCount = 2
ErrorMessage = 10
ErrorMessage = 11
onKey[5] = 6
onKey[6] = 4
onKey[9] = 11
}
";
So far I have managed to build this regex pattern:
const string patternFileScr = @"
^
((?:\[|\s)*
(?<Section>[^\]\r\n]*)
(?:\])*
(?:[\r\n]{0,}|\Z))
(
(?:\{) ### !! improve for .ini file, dont take {
(?:[\r\n]{0,}|\Z)
( # Begin capture groups (Key Value Pairs)
(?!\}|\[) # Stop capture groups if a } is found; new section
(?:\s)* # Line with space
(?<Key>[^=]*?) # Any text before the =, matched few as possible
(?:[\s]*=[\s]*) # Get the = now
(?<Value>[^\r\n]*) # Get everything that is not an Line Changes
(?:[\r\n]{0,})
)* # End Capture groups
(?:[\r\n]{0,})
(?:\})?
(?:[\r\n\s]{0,}|\Z)
)*
";
In other words:
Section1
{
key1=value1
key2=value2
}
Section2
{
key1=value1
key2=value2
}
The values end up in a dictionary like this, but there are problems:
DictDataFileScr["GDC 7"]["Message"] = "6|7|8|8"
DictDataFileScr["GDC 7"]["ErrorMessage"] = "10|11"
....
[Section1]
key1 = value1
key2 = value2
[Section2]
key1 = value1
key2 = value2
...
....
PZ 11
{
IA_return()
}
.....
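#2. It does not work for .ini files such as the [Section1] / [Section2] example above.
That will not work because, as the regex is written, a { is required after [Section].
Your regex will match if you have the following: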
[Section]
{
key = value
}
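To make the point about the required { concrete, here is a minimal sketch. The two patterns are hypothetical and far smaller than the one in the question: both expect a [Section] header followed by a first key = value line, and they differ only in whether the { in between is mandatory.

```csharp
using System;
using System.Text.RegularExpressions;

static class BraceRequirementDemo
{
    // Hypothetical patterns for illustration only, not the pattern from the question.
    // BraceRequired insists on '{' after the header; BraceOptional tolerates its absence.
    public const string BraceRequired = @"\A\s*\[(?<Section>[^\]\r\n]+)\]\s*\{\s*(?<Key>\w+)\s*=";
    public const string BraceOptional = @"\A\s*\[(?<Section>[^\]\r\n]+)\]\s*\{?\s*(?<Key>\w+)\s*=";

    static void Main()
    {
        string ini = "[Section1]\nkey1 = value1\n";
        string braced = "[Section1]\n{\nkey1 = value1\n}\n";

        Console.WriteLine(Regex.IsMatch(ini, BraceRequired));     // False: no '{' after the header
        Console.WriteLine(Regex.IsMatch(braced, BraceRequired));  // True
        Console.WriteLine(Regex.IsMatch(ini, BraceOptional));     // True
        Console.WriteLine(Regex.IsMatch(braced, BraceOptional));  // True
    }
}
```

Making the brace optional is only one way to accept both layouts. It does not check that an opening { is later closed by a }.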
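Here is an example in Perl. Perl does not have named capture arrays, possibly because of backtracking.
Maybe you can pick something out of the regex. It assumes there is no nesting of {} brackets. Edit: never content to leave well enough alone, here is a revised version.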
use strict;
use warnings;
my $str = '
Start 0
{
Next = 1
Author = rk
Date = 2011-03-10
/* Description = simple
*/
}
asdfasdf
PZ 11
{
IA_return()
}
[ section 5 ]
this = that
[ section 6 ]
this_ = _that{hello() hhh = bbb}
TOC{}
GDC 7
{
Message = 6
Message = 7
Message = 8
Message = 8
RepeatCount = 2
ErrorMessage = 10
ErrorMessage = 11
onKey[5] = 6
onKey[6] = 4
onKey[9] = 11
}
';
use re 'eval';
my $rx = qr/
\s*
( \[ [^\S\n]* )? # Grp 1 optional ini section delimiter '['
(?<Section> \w+ (?:[^\S\n]+ \w+)* ) # Grp 2 'Section'
(?(1) [^\S\n]* \] |) # Condition, if we matched '[' then look for ']'
\s*
(?<Body> # Grp 3 'Body' (for display only)
(?(1)| \{ ) # Condition, if we're not a ini section then look for '{'
(?{ print "Section: '$+{Section}'\n" }) # SECTION debug print, remove in production
(?: # _grp_
\s* # whitespace
(?: # _grp_
\/\* .*? \*\/ # some comments
| # OR ..
# Grp 4 'Key' (tested with print, Perl doesn't have named capture arrays)
(?<Key> \w[\w\[\]]* (?:[^\S\n]+ [\w\[\]]+)* )
[^\S\n]* = [^\S\n]* # =
(?<Value> [^\n]* ) # Grp 5 'Value' (tested with print)
(?{ print " k\/v: '$+{Key}' = '$+{Value}'\n" }) # KEY,VALUE debug print, remove in production
| # OR ..
(?(1)| [^{}\n]* ) # any chars except newline and [{}] on the condition we're not a ini section
) # _grpend_
\s* # whitespace
)* # _grpend_ do 0 or more times
(?(1)| \} ) # Condition, if we're not a ini section then look for '}'
)
/x;
while ($str =~ /$rx/xsg)
{
print "\n";
print "Body:\n'$+{Body}'\n";
print "=========================================\n";
}
__END__
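Do yourself and your sanity a favor and learn how to use GPLex and GPPG. They are the closest thing C# has to Lex and Yacc (or Flex and Bison, if you prefer), which are the right tools for this job.
Regular expressions are a great tool for robust string matching, but when you want to match the structure of a string you need a grammar, and that is what parsers are for. GPLex takes a set of regular expressions and generates a very fast lexer. GPPG takes a grammar you write and generates a very fast parser.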
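To show what checking structure rather than strings can look like, here is a minimal hand-written line scanner in C#. It is not GPLex or GPPG output and uses neither tool's syntax. The class name, the one-token-per-line layout, and the error messages are all assumptions made for this sketch.

```csharp
using System;
using System.Collections.Generic;

static class LineParserSketch
{
    // Accepts blocks of the form: a name line, a "{" line, body lines, a "}" line.
    // Returns the body lines per section name and throws FormatException on bad structure.
    // Assumption: "{" and "}" each sit on a line of their own and blocks do not nest.
    public static Dictionary<string, List<string>> Parse(string text)
    {
        var sections = new Dictionary<string, List<string>>();
        string pendingName = null;    // section name seen, waiting for its "{"
        List<string> current = null;  // body of the open block, or null when outside one

        foreach (string raw in text.Split('\n'))
        {
            string line = raw.Trim();  // also drops a trailing '\r'
            if (line.Length == 0) continue;

            if (current == null)
            {
                if (line == "{")
                {
                    if (pendingName == null)
                        throw new FormatException("'{' without a section name");
                    current = new List<string>();
                    sections[pendingName] = current;
                    pendingName = null;
                }
                else if (line == "}")
                {
                    throw new FormatException("'}' without a matching '{'");
                }
                else
                {
                    if (pendingName != null)
                        throw new FormatException("Section '" + pendingName + "' has no body");
                    pendingName = line;
                }
            }
            else if (line == "}")
            {
                current = null;  // block closed
            }
            else if (line == "{")
            {
                throw new FormatException("Nested '{' is not supported");
            }
            else
            {
                current.Add(line);
            }
        }

        if (current != null) throw new FormatException("Missing closing '}'");
        if (pendingName != null)
            throw new FormatException("Section '" + pendingName + "' has no body");
        return sections;
    }

    static void Main()
    {
        var parsed = Parse("PZ 11\n{\nIA_return()\n}\n");
        Console.WriteLine(parsed["PZ 11"][0]);  // IA_return()
    }
}
```

Unlike a single regex match, the scanner reports why an input is rejected, for example a missing closing brace. A generated parser would report that kind of error from the grammar instead.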
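Trust me, learn how to use these tools, or any similar ones. You will be glad you did.
Here is a complete rework of the regex in C#. It makes some assumptions; tell me if one or all of them are wrong.
Regex options: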
RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace | RegexOptions.Compiled | RegexOptions.Singleline
Test input:
const string dataFileScr = @"
Start 0
{
Next = 1
Author = rk
Date = 2011-03-10
/* Description = simple */
}
PZ 11
{
IA_return()
}
GDC 7
{
Message = 6
Message = 7
Message = 8
Message = 8
RepeatCount = 2
ErrorMessage = 10
ErrorMessage = 11
onKey[5] = 6
onKey[6] = 4
onKey[9] = 11
}
[Section1]
key1 = value1
key2 = value2
[Section2]
key1 = value1
key2 = value2
";
Rewritten regex:
const string patternFileScr = @"
(?<Section> (?# Start of a non ini file section)
(?<SectionName>[\w ]+)\s* (?# Capture section name)
{ (?# Match but don't capture beginning of section)
(?<SectionBody> (?# Capture section body. Section body can be empty)
(?<SectionLine>\s* (?# Capture zero or more lines in the section body)
(?: (?# A line can be either a key/value pair, a comment or a function call)
(?<KeyValuePair>(?<Key>[\w\[\]]+)\s*=\s*(?<Value>[\w-]*)) (?# Capture key/value pair. Key and value are sub-captured separately)
|
(?<Comment>/\*.+?\*/) (?# Capture comment)
|
(?<FunctionCall>[\w]+\(\)) (?# Capture function call. A function can't have parameters though)
)\s* (?# Match but don't capture whitespace)
)* (?# Zero or more lines, as mentioned in the comments above)
)
} (?# Match but don't capture end of section)
)
|
(?<Section> (?# Start of an ini file section)
\[(?<SectionName>[\w ]+)\] (?# Capture section name)
(?<SectionBody> (?# Capture section body. Section body can be empty)
(?<SectionLine> (?# Capture zero or more lines in the section body. Only key/value pairs allowed.)
\s*(?<KeyValuePair>(?<Key>[\w\[\]]+)\s*=\s*(?<Value>[\w-]+))\s* (?# Capture key/value pair. Key and value are sub-captured separately)
)* (?# Zero or more lines, as mentioned in the comments above)
)
)
";
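Discussion
The regex is built to match either a non-INI file section (1) or an INI file section (2).
(1) Non-INI file sections
These sections consist of a section name followed by a body enclosed in { and }. The section name can contain letters, digits, or spaces. The section body consists of zero or more lines. A line can be a key/value pair (key = value), a comment (/* here is a comment */), or a function call without parameters (my_function()).
(2) INI file sections
These sections consist of a section name enclosed in [ and ], followed by zero or more key/value pairs. Each key/value pair is on its own line.
If you could reduce your case to a few lines, maybe people could help you better by posting other examples.
Oh, care to tell me why
\s*(\[[^\S\n]*)?(?<Section>\w+(?:[^\S\n]+\w+)*)(?(1)[^\S\n]*\]|)\s*(?<Body>(?(1)|\{)(?:\s*(?:\/\*.*?\*\/|(?<Key>\w[\w\[\]]*(?:[^\S\n]+[\w\[\]]+)*)[^\S\n]*=[^\S\n]*(?<Value>[^\n]*)|(?(1)|[^{}\n]*))\s*)*(?(1)|\}))
on a single line (with '.' also matching newline) doesn't work for you, man? I mean, I'm throwing you a bone here. I read about collections online, and this can definitely be done. I could be hired to do the most challenging thing you can imagine. The substructure of this regex is sublime. Its flow is simple and powerful, if you know what you are looking at.
No, it does not stop at the closing }. Because of his character classes, his regex goes right past the last }. I ran a test: put my example section at the end of r4ph's code and that section is matched. Without the { it is not.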
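Coming back to the dictionary the question asks for (values such as "6|7|8|8"): .NET keeps every capture of a repeated group in Group.Captures, so the Key and Value captures can be paired by index. The sketch below uses a deliberately simplified, hypothetical pattern that only handles brace-delimited sections whose bodies contain key = value lines. It is not either of the patterns posted above, and the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class SectionDictionaryDemo
{
    // Hypothetical, simplified pattern: Name { key = value ... }
    // Comments and function calls in the body are not handled here.
    const string Pattern = @"
        (?<Section>\w+(?:[ ]\w+)*) \s* \{                                 # section name, then opening brace
        (?: \s* (?<Key>[\w\[\]]+) [ \t]* = [ \t]* (?<Value>[^\r\n]*) )*   # repeated key = value lines
        \s* \}                                                            # closing brace
    ";

    public static Dictionary<string, Dictionary<string, string>> Parse(string text)
    {
        var result = new Dictionary<string, Dictionary<string, string>>();
        foreach (Match m in Regex.Matches(text, Pattern, RegexOptions.IgnorePatternWhitespace))
        {
            var section = new Dictionary<string, string>();
            // Key and Value are captured once per loop iteration, so index i pairs them up.
            CaptureCollection keys = m.Groups["Key"].Captures;
            CaptureCollection values = m.Groups["Value"].Captures;
            for (int i = 0; i < keys.Count; i++)
            {
                string key = keys[i].Value;
                string value = values[i].Value.Trim();
                string existing;
                // Repeated keys are joined with '|', as in the question's example.
                section[key] = section.TryGetValue(key, out existing) ? existing + "|" + value : value;
            }
            result[m.Groups["Section"].Value] = section;
        }
        return result;
    }

    static void Main()
    {
        const string data = "GDC 7\n{\nMessage = 6\nMessage = 7\nMessage = 8\nMessage = 8\nErrorMessage = 10\nErrorMessage = 11\nonKey[5] = 6\n}\n";
        var dict = Parse(data);
        Console.WriteLine(dict["GDC 7"]["Message"]);       // 6|7|8|8
        Console.WriteLine(dict["GDC 7"]["ErrorMessage"]);  // 10|11
    }
}
```

Pairing by index relies on Key and Value always being captured together in the same iteration. If the body could also contain comments or function calls, the captures would have to be grouped per line some other way.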