Removing the hyphen from AppleScript's text item delimiters


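I'm writing a calculator in AppleScript (please don't tell me I shouldn't, I know AppleScript isn't designed for this). But when it comes to equations, I'm having trouble separating the words from each other, because I can't split the equation into its parts when a negative variable is involved: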

set function to "-3x"
return word 1 of function
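This returns "3x" because the hyphen is a text item delimiter, but I want it to return "-3x". Is there any way to remove the hyphen from the text item delimiters, or any other way to keep the hyphen in the string?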


Thanks a lot.

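You can write a calculator in AppleScript if you like, but you need to go about it the same way you would in any other language: 1. use a tokenizer to split the input text into a list of tokens, 2. feed those tokens to a parser that assembles them into an abstract syntax tree, and 3. evaluate that tree to produce a result.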

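For what you're doing, you could probably write the tokenizer as a regular expression (assuming you don't mind dropping down to NSRegularExpression via the AppleScript-ObjC bridge). For parsing, I recommend reading up on Pratt parsers, which are easy to implement yet powerful enough to support prefix, infix, and postfix operators as well as operator precedence. For evaluation, a simple recursive AST-walking algorithm is probably sufficient, but one step at a time.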
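As a rough sketch of the regular-expression route (an illustration, not a drop-in tokenizer), the handler below uses NSRegularExpression through AppleScript-ObjC to pull tokens out of a string. The handler name tokenizeWithRegex and the pattern are assumptions made up for this example, and it assumes OS X 10.10 or later, where "use framework" works in ordinary scripts.

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

-- hypothetical helper: returns a list of text tokens matched by thePattern
to tokenizeWithRegex(theText, thePattern)
    set theString to current application's NSString's stringWithString:theText
    set theRegex to current application's NSRegularExpression's regularExpressionWithPattern:thePattern options:0 |error|:(missing value)
    if theRegex is missing value then error "Invalid pattern: " & thePattern
    set theMatches to theRegex's matchesInString:theString options:0 range:{0, theString's |length|()}
    set tokensList to {}
    repeat with aMatch in theMatches
        -- rangeAtIndex:0 is the range of the whole match
        set end of tokensList to (theString's substringWithRange:(aMatch's rangeAtIndex:0)) as text
    end repeat
    return tokensList
end tokenizeWithRegex

-- illustrative pattern: optionally signed number, or word, or single operator/paren
tokenizeWithRegex("-3x + 2", "-?[0-9]+(\\.[0-9]+)?|[a-z]+|[-+*/()]")
```

With that pattern, "-3x + 2" should come back as the tokens -3, x, + and 2, so the leading hyphen stays attached to the number. Two caveats: characters the pattern doesn't match are silently skipped rather than reported as errors, and "3-2" would be read as 3 followed by -2, so telling a minus sign from a subtraction operator is better left to the parser.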

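These are all well-solved problems, so you won't have any trouble finding tutorials and other information online. (Plenty of it is rubbish, of course, so be prepared to spend some time learning to tell the good from the bad.)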


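Your one problem is that none of it is written specifically for AppleScript, so be prepared to work through material written for other languages (Python, Java, etc.) and translate it into your own. Wading through all the programmer-speak will take some effort and patience, but it's very doable (I originally cut my teeth on AppleScript and now write my own automation scripting languages), and it's a great learning exercise that will help build your skills.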

To give you an idea, here is a simple tokenizer for a very simple Lisp-like language:

-- token types
property StartList : "START"
property EndList : "END"
property ANumber : "NUMBER"
property AWord : "WORD"

-- recognized token chars
property _startlist : "("
property _endlist : ")"
property _number : "+-.1234567890"
property _word : "abcdefghijklmnopqrstuvwxyz"
property _whitespace : space & tab & linefeed & return


to tokenizeCode(theCode)
    considering diacriticals, hyphens, punctuation and white space but ignoring case and numeric strings
        set i to 1
        set l to theCode's length
        set tokensList to {}
        repeat while i ≤ l
            set c to character i of theCode
            if c is _startlist then
                set end of tokensList to {tokenType:StartList, tokenText:c}
                set i to i + 1
            else if c is _endlist then
                set end of tokensList to {tokenType:EndList, tokenText:c}
                set i to i + 1
            else if c is in _number then
                set tokenText to ""
                repeat while i ≤ l and character i of theCode is in _number -- check the bound first so a number at the end of input doesn't index past it
                    set tokenText to tokenText & character i of theCode
                    set i to i + 1
                end repeat
                set end of tokensList to {tokenType:ANumber, tokenText:tokenText}
            else if c is in _word then
                set tokenText to ""
                repeat while i ≤ l and character i of theCode is in _word -- check the bound first so a word at the end of input doesn't index past it
                    set tokenText to tokenText & character i of theCode
                    set i to i + 1
                end repeat
                set end of tokensList to {tokenType:AWord, tokenText:tokenText}
            else if c is in _whitespace then -- skip over white space
                repeat while i ≤ l and character i of theCode is in _whitespace -- check the bound first so trailing white space doesn't index past the end
                    set i to i + 1
                end repeat
            else
                error "Unknown character: '" & c & "'"
            end if
        end repeat
        return tokensList
    end considering
end tokenizeCode
The grammar rules for this language are as follows:

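  • A number expression contains one or more digits, "+" or "-" signs, and/or decimal points. (The code above doesn't currently check that the token is a valid number, e.g. it'll happily accept nonsensical input such as "0.1.2-3+", but that's easy to add.)

  • A word expression contains one or more letters (a-z).

  • A list expression begins with "(" and ends with ")". The first token in a list expression must be the name of the operator to apply; it may be followed by zero or more additional expressions representing its operands.

  • Any unrecognized characters are treated as an error.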
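One possible way to add the missing number check is to let AppleScript's own text-to-number coercion decide. This is a minimal sketch: the handler name isValidNumber is made up for illustration, and it assumes the coercion accepts "." as the decimal separator, which may depend on the system's region settings.

```applescript
-- hypothetical helper: true if tokenText can be coerced to a number
to isValidNumber(tokenText)
    try
        tokenText as number
        return true
    on error -- the coercion failed, so the token is not a usable number
        return false
    end try
end isValidNumber

isValidNumber("-2") --> true
isValidNumber("0.1.2-3+") --> false
```

The tokenizer could call this just before adding a NUMBER token and raise an error when it returns false, so malformed input is reported at the point where it was read.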

For example, let's use it to tokenize the mathematical expression "3 + (2.5 * -2)", which is written like this in prefix notation:

set programText to "(add 3 (multiply 2.5 -2))"

set programTokens to tokenizeCode(programText)

--> {{tokenType:"START", tokenText:"("}, 
     {tokenType:"WORD", tokenText:"add"}, 
     {tokenType:"NUMBER", tokenText:"3"}, 
     {tokenType:"START", tokenText:"("}, 
     {tokenType:"WORD", tokenText:"multiply"}, 
     {tokenType:"NUMBER", tokenText:"2.5"}, 
     {tokenType:"NUMBER", tokenText:"-2"}, 
     {tokenType:"END", tokenText:")"}, 
     {tokenType:"END", tokenText:")"}}
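Once the text has been split into a list of tokens, the next step is to feed that list to a parser, which assembles it into an abstract syntax tree that fully describes the structure of the program.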


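Like I said, there's a learning curve to this stuff, but once you've grasped the basic principles you can write it in your sleep. Ask and I'll add an example of how to parse these tokens into a usable form later.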

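Following on from before, here is a parser that converts the tokenizer's output into a tree-based data structure describing the program's logic: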

-- token types
property StartList : "START"
property EndList : "END"
property ANumber : "NUMBER"
property AWord : "WORD"


-------
-- handlers called by Parser to construct Abstract Syntax Tree nodes,
-- simplified here for demonstration purposes

to makeOperation(operatorName, operandsList)
    return {operatorName:operatorName, operandsList:operandsList}
end makeOperation

to makeWord(wordText)
    return wordText
end makeWord

to makeNumber(numberText)
    return numberText as number
end makeNumber


-------
-- Parser

to makeParser(programTokens)
    script ProgramParser

        property currentToken : missing value

        to advanceToNextToken()
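            if programTokens is {} then
                -- currentToken is a record (or missing value before the first token), so report its text rather than concatenating the record itself
                if currentToken is missing value then error "Found unexpected end of program: no tokens to parse."
                error "Found unexpected end of program after '" & currentToken's tokenText & "'."
            end if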
            set currentToken to first item of programTokens
            set programTokens to rest of programTokens
            return
        end advanceToNextToken

        --

        to parseOperation() -- parses an '(OPERATOR [OPERANDS ...])' list expression
            advanceToNextToken()
            if currentToken's tokenType is AWord then -- parse 'OPERATOR'
                set operatorName to currentToken's tokenText
                set operandsList to {}
                advanceToNextToken()
                repeat while currentToken's tokenType is not EndList -- parse 'OPERAND(S)'
                    if currentToken's tokenType is StartList then
                        set end of operandsList to parseOperation()
                    else if currentToken's tokenType is AWord then
                        set end of operandsList to makeWord(currentToken's tokenText)
                    else if currentToken's tokenType is ANumber then
                        set end of operandsList to makeNumber(currentToken's tokenText)
                    else
                        error "Expected word, number, or list expression but found '" & currentToken's tokenText & "' instead."
                    end if
                    advanceToNextToken()
                end repeat
                return makeOperation(operatorName, operandsList)
            else
                error "Expected operator name but found '" & currentToken's tokenText & "' instead."
            end if
        end parseOperation

        to parseProgram() -- parses the entire program
            advanceToNextToken()
            if currentToken's tokenType is StartList then
                return parseOperation()
            else
                error "Found unexpected '" & currentToken's tokenText & "' at start of program."
            end if
        end parseProgram

    end script
end makeParser


-------
-- parse the tokens list produced by the tokenizer into an Abstract Syntax Tree

set programTokens to {{tokenType:"START", tokenText:"("}, ¬
    {tokenType:"WORD", tokenText:"add"}, ¬
    {tokenType:"NUMBER", tokenText:"3"}, ¬
    {tokenType:"START", tokenText:"("}, ¬
    {tokenType:"WORD", tokenText:"multiply"}, ¬
    {tokenType:"NUMBER", tokenText:"2.5"}, ¬
    {tokenType:"NUMBER", tokenText:"-2"}, ¬
    {tokenType:"END", tokenText:")"}, ¬
    {tokenType:"END", tokenText:")"}}


set parserObject to makeParser(programTokens)

set abstractSyntaxTree to parserObject's parseProgram()
--> {operatorName:"add", operandsList:{3, {operatorName:"multiply", operandsList:{2.5, -2}}}}
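
The ProgramParser object is a very, very simple recursive descent parser: a collection of handlers, each of which knows how to turn a sequence of tokens into a particular data structure. In fact, the Lisp-y syntax used here is so simple that it really only needs two handlers: parseProgram, which gets everything under way, and parseOperation, which knows how to read the tokens that make up an (OPERATOR_NAME [OPERAND1 OPERAND2 ...]) list and turn them into a record describing a single operation (add, multiply, etc.) to perform.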


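The nice thing about an AST, especially a very simple, regular one like this, is that you can manipulate it as data. For example, given the program (multiply x y) and a definition of y = (add x 1), you could walk the AST and replace every mention of y with its definition, in this case giving (multiply x (add x 1)). That is, you can do not only arithmetic calculations (algorithmic programming) but algebraic manipulations (symbolic programming) too. That's a bit heady for here, but I'll see about putting together a simple arithmetic evaluator for later.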
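To make the substitution idea concrete, here is a minimal sketch that walks the record-based tree produced by the test parser above and swaps a named word for its definition. The handler name substituteWord is an assumption made up for this example; it relies only on the operatorName and operandsList labels already used by makeOperation.

```applescript
-- hypothetical helper: returns a copy of theNode with every word equal to
-- theName replaced by theDefinition
to substituteWord(theNode, theName, theDefinition)
    if class of theNode is record then
        -- an operation: rebuild it, substituting inside each operand
        set newOperands to {}
        repeat with operandRef in theNode's operandsList
            set end of newOperands to substituteWord(contents of operandRef, theName, theDefinition)
        end repeat
        return {operatorName:theNode's operatorName, operandsList:newOperands}
    else if theNode is theName then
        return theDefinition -- a word that matches the name being replaced
    else
        return theNode -- a number, or some other word
    end if
end substituteWord

set theProgram to {operatorName:"multiply", operandsList:{"x", "y"}} -- (multiply x y)
set yDefinition to {operatorName:"add", operandsList:{"x", 1}} -- (add x 1)
substituteWord(theProgram, "y", yDefinition)
--> {operatorName:"multiply", operandsList:{"x", {operatorName:"add", operandsList:{"x", 1}}}}
```

The handler builds a new tree rather than modifying the original in place, which keeps the untouched program available for further rewrites.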

Finally, here is a simple evaluator for the parser's output:

to makeOperation(operatorName, operandsList)
    if operatorName is "add" then
        script AddOperationNode
            to eval(env)
                if operandsList's length ≠ 2 then error "Wrong number of operands."
                return ((operandsList's item 1)'s eval(env)) + ((operandsList's item 2)'s eval(env))
            end eval
        end script
    else if operatorName is "multiply" then
        script MultiplyOperationNode
            to eval(env)
                if operandsList's length ≠ 2 then error "Wrong number of operands."
                return ((operandsList's item 1)'s eval(env)) * ((operandsList's item 2)'s eval(env))
            end eval
        end script
    -- define more operations here as needed...
    else
        error "Unknown operator: '" & operatorName & "'"
    end if
end makeOperation


to makeWord(wordText)
    script WordNode
        to eval(env)
            return env's getValue(wordText)'s eval(env)
        end eval
    end script
end makeWord


to makeNumber(numberText)
    script NumberNode
        to eval(env)
            return numberText as number
        end eval
    end script
end makeNumber


to makeEnvironment()
    script EnvironmentObject
        property _storedValues : {}
        --
        to setValue(theKey, theValue)
            -- theKey : text
            -- theValue : script
            repeat with aRef in _storedValues
                if aRef's k is theKey then
                    set aRef's v to theValue
                    return
                end if
            end repeat
            set end of _storedValues to {k:theKey, v:theValue}
            return
        end setValue
        --
        to getValue(theKey)
            repeat with aRef in _storedValues
                if aRef's k is theKey then return aRef's v
            end repeat
            error "'" & theKey & "' is undefined." number -1728
        end getValue
        --
    end script
end makeEnvironment


to runProgram(programText, theEnvironment)
    set programTokens to tokenizeCode(programText)
    set abstractSyntaxTree to makeParser(programTokens)'s parseProgram()
    return abstractSyntaxTree's eval(theEnvironment)
end runProgram
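
This replaces the make… handlers used to test the parser with new handlers that construct full-blown objects representing each type of structure that can make up an abstract syntax tree: numbers, words, and operations. Each object defines an eval handler that knows how to evaluate that particular structure: in NumberNode it just returns the number, in WordNode it retrieves and evaluates the structure stored under that name, in AddOperationNode it evaluates each operand and then sums them, and so on.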

For example, to evaluate our original 3 + 2.5 * -2 program:

set theEnvironment to makeEnvironment()
runProgram("(add 3 (multiply 2.5 -2))", theEnvironment)
--> -2.0

In addition, an EnvironmentObject is used to store named values. For example, to store a value named "x" for use by a program:

set theEnvironment to makeEnvironment()
theEnvironment's setValue("x", makeNumber(5))
runProgram("(add 3 x)", theEnvironment)
--> 8

Obviously this will take more work to turn it into a proper calculator: a full set of operator definitions, better error reporting, and so on.
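
As one possible direction for filling out the operator set, the sketch below defines a single node type that covers several two-operand operators and reports slightly more specific errors. The handler name makeBinaryOperation and the operator names "subtract" and "divide" are assumptions made up for this example; makeNumber is repeated from the evaluator so the snippet runs on its own.

```applescript
-- hypothetical helper: one node type for several two-operand operators
to makeBinaryOperation(operatorName, operandsList)
    script BinaryOperationNode
        to eval(env)
            if operandsList's length ≠ 2 then error "'" & operatorName & "' expects exactly 2 operands."
            set leftValue to (operandsList's item 1)'s eval(env)
            set rightValue to (operandsList's item 2)'s eval(env)
            if operatorName is "subtract" then
                return leftValue - rightValue
            else if operatorName is "divide" then
                if rightValue = 0 then error "Division by zero."
                return leftValue / rightValue
            else
                error "Unknown operator: '" & operatorName & "'"
            end if
        end eval
    end script
    return BinaryOperationNode
end makeBinaryOperation

-- same as the evaluator's makeNumber, repeated to keep this example self-contained
to makeNumber(numberText)
    script NumberNode
        to eval(env)
            return numberText as number
        end eval
    end script
    return NumberNode
end makeNumber

set theNode to makeBinaryOperation("subtract", {makeNumber(7), makeNumber(2)})
theNode's eval(missing value) -- no environment is needed when there are no words
--> 5
```

To hook this into the evaluator, makeOperation's final else branch could hand unrecognized operator names to makeBinaryOperation instead of raising an error straight away.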