RegEx works with Kotlin, but doesn't work as expected when using Dart
The regex works fine in Kotlin code:
val text = "Today, scientists confirmed the worst possible outcome: the massive asteroid will collide with Earth"
val encodeRegex = Regex("""'s|'t|'re|'ve|'m|'ll|'d| ?\p{L}+| ?\p{N}+| ?[^\s\p{L}\p{N}]+|\s+(?!\S)|\s+""")
val x = encodeRegex.findAll(text).map { result ->
    result.value
}
print(x.toList())
Output:
[Today, ,, scientists, confirmed, the, worst, possible, outcome, :, the, massive, asteroid, will, collide, with, Earth]
I tried to use the same regexp in Flutter, but it doesn't work as expected.
Dart code:
final RegExp encodeRegex = RegExp(
  r"""'s|'t|'re|'ve|'m|'ll|'d| ?\p{L}+| ?\p{N}+| ?[^\s\p{L}\p{N}]+|\s+(?!\S)|\s+""",
);
final text = 'Today, scientists confirmed the worst possible outcome: the massive asteroid will collide with Earth';
final tokens = encodeRegex
    .allMatches(text)
    .map((element) => element.group(0))
    .toList();
print('${tokens}');
Output:
[Today,, scientists, confirmed, the, worst, , ossible, outcome:, the, massive, asteroid, will, collide, with, Earth]
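The problem is that, by default, a Dart RegExp does not match Unicode categories such as \p{L}. You need to pass unicode: true to the RegExp constructor for them to be matched. Try: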
void main() {
  final RegExp encodeRegex = RegExp(
    r"""'s|'t|'re|'ve|'m|'ll|'d| ?\p{L}+| ?\p{N}+| ?[^\s\p{L}\p{N}]+|\s+(?!\S)|\s+""",
    unicode: true,
  );
  final text = 'Today, scientists confirmed the worst possible outcome: the massive asteroid will collide with Earth';
  final tokens = encodeRegex
      .allMatches(text)
      .map((element) => element.group(0))
      .toList();
  print('${tokens}');
}
It works in DartPad.
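If unicode is not enabled, \p{L} and \p{N} are not treated as Unicode property escapes. Instead, \p is read as an escaped literal p and {L} / {N} as ordinary characters, so the pattern matches the literal text p{L} and p{N}, and the negated character class excludes the letters p, L and N. That is why the p of possible is missing from the Dart output above.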
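To see the difference in isolation, here is a minimal sketch that runs a cut-down version of the pattern with and without the flag. The helper name matchAll and the shortened pattern are assumptions made for illustration; they are not part of the original question.

```dart
// Hypothetical helper: returns every match of `pattern` in `text`,
// compiling the RegExp with or without Unicode mode.
List<String> matchAll(String pattern, String text, {required bool unicode}) {
  final regex = RegExp(pattern, unicode: unicode);
  return regex.allMatches(text).map((m) => m.group(0)!).toList();
}

void main() {
  // Cut-down version of the tokenizer pattern: letters, non-letters, whitespace.
  const pattern = r' ?\p{L}+| ?[^\s\p{L}]+|\s+';

  // Unicode mode: \p{L} means "any letter", so each word is one token.
  print(matchAll(pattern, 'worst possible', unicode: true));

  // Legacy mode: \p is a literal 'p', so 'p' is excluded from the
  // negated class and no alternative can match it.
  print(matchAll(pattern, 'worst possible', unicode: false));
}
```

With unicode: true the two tokens are "worst" and " possible". Without it, the matches are "worst", a lone space, and "ossible", which mirrors the broken output in the question.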