Regex: Tokenising a mathematical (infix) expression in VBA

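I need to tokenise a mathematical expression using VBA. I have a working solution, but I am looking for a more efficient way to do it (possibly RegExp).

My current solution: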

Function TokeniseTheString(str As String) As String()

Dim Operators() As String
' Array of Operators:
Operators = Split("+,-,/,*,^,<=,>=,<,>,=", ",")

' add special characters around all "(", ")" and ","
str = Replace(str, "(", Chr(1) & "(" & Chr(1))
str = Replace(str, ")", Chr(1) & ")" & Chr(1))
str = Replace(str, ",", Chr(1) & "," & Chr(1))

Dim i As Long
' add special characters around all operators
For i = LBound(Operators) To UBound(Operators)
    str = Replace(str, Operators(i), Chr(1) & Operators(i) & Chr(1))
Next i

' for <= and >=, there will now be two special characters between them instead of being one token
' to change <  = back to <=, for example
For i = LBound(Operators) To UBound(Operators)
    If Len(Operators(i)) = 2 Then
        str = Replace(str, Left(Operators(i), 1) & Chr(1) & Chr(1) & Right(Operators(i), 1), Operators(i))
    End If
Next i

' if there was a "(", ")", "," or operator next to each other, there will be two special characters next to each other
Do While InStr(str, Chr(1) & Chr(1)) > 0
    str = Replace(str, Chr(1) & Chr(1), Chr(1))
Loop
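' Remove special character at the end of the string:
If Right(str, 1) = Chr(1) Then str = Left(str, Len(str) - 1)
' Remove special character at the start of the string (e.g. when it begins with "(" or "-"),
' otherwise Split returns an empty first token:
If Left(str, 1) = Chr(1) Then str = Mid(str, 2)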

TokeniseTheString = Split(str, Chr(1))

End Function
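
To make the expected output concrete, here is a small usage sketch. `DemoTokenise` is a hypothetical name that does not appear in the question, and the sketch assumes `TokeniseTheString` lives in the same module. Note that the function does not strip whitespace, so spaces in the input stay inside the neighbouring tokens.

```vba
' Illustrative only: DemoTokenise is a hypothetical helper, not part of the question.
Sub DemoTokenise()
    Dim tokens() As String

    tokens = TokeniseTheString("SUM(A1,B2)>=10")

    ' Join with a visible separator so the token boundaries are easy to read
    Debug.Print Join(tokens, " | ")   ' prints: SUM | ( | A1 | , | B2 | ) | >= | 10
End Sub
```

Run it from the Immediate window with `DemoTokenise`.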
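I had never seen regular expressions before and tried to apply them in VBA. The problem I ran into is that the RegExp object in VBA does not allow positive lookbehind.

I would appreciate any solution that is more efficient than the one above.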

As suggested by @Florent B, the following function gives the same result using RegExp:

Function TokenRegex(str As String) As String()
' Early binding: needs a reference to "Microsoft VBScript Regular Expressions 5.5"
Dim objRegEx As New RegExp
Dim strPattern As String

strPattern = "(""(?:""""|[^""])*""|[^\s()+\-\/*^<>=,]+|<=|>=|\S)\s*"
With objRegEx
    .Global = True
    .MultiLine = False
    .IgnoreCase = True
    .Pattern = strPattern
End With

str = objRegEx.Replace(str, "$1" & ChrW(-1))
If Right(str, 1) = ChrW(-1) Then str = Left(str, Len(str) - 1)
TokenRegex = Split(str, ChrW(-1))

End Function
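
Since the pattern is dense, the sketch below splits it into its four alternatives and names each one. This is not the answer's code: `TokenMatches` and the constant names are hypothetical, it uses late binding via `CreateObject("VBScript.RegExp")` so that no library reference is needed, and it collects the first capture group of each match with `Execute` rather than using the replace-and-split approach. One consequence of that swap is that leading whitespace is dropped instead of being kept in the first token.

```vba
' Hypothetical variant for illustration: same pattern, built from named parts.
Function TokenMatches(ByVal str As String) As String()
    ' A double-quoted literal; a doubled quote ("") inside it is an escaped quote
    Const QUOTED As String = """(?:""""|[^""])*"""
    ' A run of characters that are not whitespace, brackets, operators or commas
    Const OPERAND As String = "[^\s()+\-\/*^<>=,]+"
    ' Two-character operators, listed before the single-character fallback
    Const TWO_CHAR As String = "<=|>="
    ' Fallback: any single non-whitespace character (brackets, commas, + - * / ^ < > =)
    Const ANY_CHAR As String = "\S"

    Dim re As Object, matches As Object
    Dim tokens() As String
    Dim i As Long

    Set re = CreateObject("VBScript.RegExp")   ' late binding
    re.Global = True
    ' Group 1 is the token; the trailing \s* swallows whitespace after it
    re.Pattern = "(" & QUOTED & "|" & OPERAND & "|" & TWO_CHAR & "|" & ANY_CHAR & ")\s*"

    Set matches = re.Execute(str)
    If matches.Count = 0 Then
        TokenMatches = Split(vbNullString)     ' empty array
        Exit Function
    End If

    ReDim tokens(0 To matches.Count - 1)
    For i = 0 To matches.Count - 1
        tokens(i) = matches.Item(i).SubMatches(0)
    Next i
    TokenMatches = tokens
End Function
```

For example, `Join(TokenMatches("SUM(A1, B2) >= 10"), "|")` should give `SUM|(|A1|,|B2|)|>=|10`. The order of the alternatives matters: if `\S` came before `<=|>=`, each comparison operator would be split into two single-character tokens.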

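As a first simple step without a regex, you could limit the second loop to matching ">=" and "<=". A RegEx would simplify the code, but converting the string to a byte buffer with
Dim buffer() As Byte: buffer = str
and looping over each character with Select Case would be cheaper. If you want to try a RegEx, use the pattern
("(?:""|[^"])*"|[^\s()+\-\/*^<>=,]+|<=|>=|\S)\s*
and replace/split on the matches:
tokens = Split(re.Replace(str, "$1" & ChrW(-1)), ChrW(-1))

Why do you think you need a positive lookbehind? What expression did you use?

Thanks @FlorentB. I have posted a function as an answer that gives the same result as my original solution. Would you mind explaining the different parts of the pattern?

For an explanation of the regex, see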
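
The byte-buffer suggestion from the comments can be sketched roughly as follows. This is an assumption about what was meant, not code from the thread: `TokeniseByBuffer` and its local names are hypothetical, quoted string literals are not handled, and whitespace is left inside operand tokens as in the original function. Its behaviour on edge cases has not been compared exhaustively against the two functions above.

```vba
' Hypothetical single-pass tokeniser over the UTF-16 bytes of the string.
Function TokeniseByBuffer(ByVal str As String) As String()
    Dim buffer() As Byte
    Dim tokens() As String
    Dim count As Long      ' number of tokens stored so far
    Dim pos As Long        ' 1-based index of the current character
    Dim start As Long      ' 1-based index where the pending operand began
    Dim opLen As Long      ' length of the operator at pos (0 = not an operator)
    Dim code As Long

    If Len(str) = 0 Then
        TokeniseByBuffer = Split(vbNullString)   ' empty array
        Exit Function
    End If

    buffer = str                      ' two bytes per character (UTF-16LE)
    ReDim tokens(0 To Len(str) - 1)   ' upper bound: one token per character
    pos = 1
    start = 1

    Do While pos <= Len(str)
        code = buffer(2 * pos - 2) + 256& * buffer(2 * pos - 1)
        Select Case code
            Case 60, 62                                 ' < >
                opLen = 1
                If pos < Len(str) Then
                    ' followed by "=" (code 61): treat <= or >= as one token
                    If buffer(2 * pos) = 61 And buffer(2 * pos + 1) = 0 Then opLen = 2
                End If
            Case 40, 41, 42, 43, 44, 45, 47, 61, 94     ' ( ) * + , - / = ^
                opLen = 1
            Case Else
                opLen = 0
        End Select

        If opLen > 0 Then
            ' flush the operand collected before this operator, if any
            If pos > start Then
                tokens(count) = Mid$(str, start, pos - start)
                count = count + 1
            End If
            tokens(count) = Mid$(str, pos, opLen)
            count = count + 1
            pos = pos + opLen
            start = pos
        Else
            pos = pos + 1
        End If
    Loop

    ' flush a trailing operand
    If pos > start Then
        tokens(count) = Mid$(str, start, pos - start)
        count = count + 1
    End If

    ReDim Preserve tokens(0 To count - 1)
    TokeniseByBuffer = tokens
End Function
```

For example, `Join(TokeniseByBuffer("SUM(A1,B2)>=10"), "|")` should give `SUM|(|A1|,|B2|)|>=|10`. The string is scanned once and no intermediate strings are built by `Replace`, which is where any saving would come from; whether it is actually faster for a given workload would need to be measured.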