iOS NSLinguisticTagger: filtering out specific tokens by tag type

ios swift macos

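I'm trying to filter out specific tokens based on their tag. When I run my code, I get the output below. I only want to retrieve the adjectives and print them. Is there a simple way to do this?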

Hello: NSLinguisticTag(_rawValue: Interjection)
World: NSLinguisticTag(_rawValue: Noun)
this: NSLinguisticTag(_rawValue: Determiner)
is: NSLinguisticTag(_rawValue: Verb)
my: NSLinguisticTag(_rawValue: Determiner)
main: NSLinguisticTag(_rawValue: Adjective)
goal: NSLinguisticTag(_rawValue: Noun)

tokenizeText(inputedText: "Hello World this is my main goal, to take these words and find the adjectives, verbs and nouns")

You simply check whether the tag is .adjective inside the enumerateTags closure and continue only if it is:

import Foundation

let sentence = "The yellow cat hunts the little gray mouse around the block"
let options: NSLinguisticTagger.Options = [.omitWhitespace, .omitPunctuation, .joinNames]
let schemes = NSLinguisticTagger.availableTagSchemes(forLanguage: "en")
let tagger = NSLinguisticTagger(tagSchemes: schemes, options: Int(options.rawValue))
tagger.string = sentence
// NSRange counts UTF-16 code units, so use utf16.count rather than count
let fullRange = NSRange(location: 0, length: sentence.utf16.count)
tagger.enumerateTags(in: fullRange, scheme: .nameTypeOrLexicalClass, options: options) { (tag, tokenRange, _, _) in
    // Skip every token that is not tagged as an adjective
    guard tag == .adjective, let adjectiveRange = Range(tokenRange, in: sentence) else { return }
    let adjectiveToken = sentence[adjectiveRange]
    print(adjectiveToken)
}

This prints:

yellow
little
gray

Edit

If you want the tokens for more than one tag type, you can store the tokens in a dictionary, using the tag as the key:

import Foundation

let sentence = "The yellow cat hunts the little gray mouse around the block"
let options: NSLinguisticTagger.Options = [.omitWhitespace, .omitPunctuation, .joinNames]
let schemes = NSLinguisticTagger.availableTagSchemes(forLanguage: "en")
let tagger = NSLinguisticTagger(tagSchemes: schemes, options: Int(options.rawValue))
tagger.string = sentence
// Tokens grouped by the tag they were assigned
var tokens: [NSLinguisticTag: [String]] = [:]
// NSRange counts UTF-16 code units, so use utf16.count rather than count
tagger.enumerateTags(in: NSRange(location: 0, length: sentence.utf16.count), scheme: .nameTypeOrLexicalClass, options: options) { (tag, tokenRange, _, _) in
    guard let tag = tag, let range = Range(tokenRange, in: sentence) else { return }
    let token = String(sentence[range])
    // Append to the existing array for this tag, or start a new one
    tokens[tag, default: []].append(token)
}
print(tokens[.adjective])
print(tokens[.noun])

This prints:

Optional(["yellow", "little", "gray"])
Optional(["cat", "mouse", "block"])
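
If you would rather use filter and map than a dictionary, one option is to collect every (token, tag) pair in a single pass and work on the resulting array afterwards. The following is only a sketch: the helper name taggedTokens(in:options:) and the wanted set are made up for illustration and are not part of NSLinguisticTagger.

```swift
import Foundation

// Hypothetical helper (not part of NSLinguisticTagger): walks the text once
// and returns every token together with the tag it was assigned.
func taggedTokens(
    in text: String,
    options: NSLinguisticTagger.Options = [.omitWhitespace, .omitPunctuation, .joinNames]
) -> [(token: String, tag: NSLinguisticTag)] {
    let tagger = NSLinguisticTagger(tagSchemes: [.nameTypeOrLexicalClass], options: Int(options.rawValue))
    tagger.string = text
    var pairs: [(token: String, tag: NSLinguisticTag)] = []
    // NSRange counts UTF-16 code units, hence utf16.count
    let fullRange = NSRange(location: 0, length: text.utf16.count)
    tagger.enumerateTags(in: fullRange, scheme: .nameTypeOrLexicalClass, options: options) { tag, tokenRange, _, _ in
        guard let tag = tag, let range = Range(tokenRange, in: text) else { return }
        pairs.append((token: String(text[range]), tag: tag))
    }
    return pairs
}

let pairs = taggedTokens(in: "The yellow cat hunts the little gray mouse around the block")

// Once the pairs are in an ordinary array, filter and map work as usual.
let wanted: Set<NSLinguisticTag> = [.adjective, .noun]
let adjectivesAndNouns = pairs.filter { wanted.contains($0.tag) }.map { $0.token }
print(adjectivesAndNouns)
```

Unlike the dictionary, the array keeps the tokens in sentence order, which helps if you want to rebuild text from the filtered result.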

Edit 2

If you want to be able to remove tokens with certain tags from a text, you can write an extension like this:

import Foundation

extension NSLinguisticTagger {
    /// Returns `text` with every token whose tag is listed in `unwantedTags` removed.
    func eliminate(unwantedTags: [NSLinguisticTag], from text: String, options: NSLinguisticTagger.Options) -> String {
        string = text
        var textWithoutUnwantedTags = ""
        // NSRange counts UTF-16 code units, hence utf16.count
        enumerateTags(in: NSRange(location: 0, length: text.utf16.count), scheme: .nameTypeOrLexicalClass, options: options) { (tag, tokenRange, _, _) in
            guard
                let tag = tag,
                !unwantedTags.contains(tag),
                let range = Range(tokenRange, in: text)
                else { return }
            let token = String(text[range])
            // Rejoin the remaining tokens with single spaces
            textWithoutUnwantedTags += " \(token)"
        }

        // Drop the leading space added before the first token
        return textWithoutUnwantedTags.trimmingCharacters(in: .whitespaces)
    }
}

Then you can use it like this:

let sentence = "The yellow cat hunts the little gray mouse around the block"
let options: NSLinguisticTagger.Options = [.omitWhitespace, .omitPunctuation, .joinNames]
let schemes = NSLinguisticTagger.availableTagSchemes(forLanguage: "en")
let tagger = NSLinguisticTagger(tagSchemes: schemes, options: Int(options.rawValue))

let sentenceWithoutAdjectives = tagger.eliminate(unwantedTags: [.adjective], from: sentence, options: options)
print(sentenceWithoutAdjectives)

This prints:

The cat hunts the mouse around the block

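So what happens if I want to add nouns, verbs, and so on? Isn't there a way to put the text into a map and pull the tokens out accordingly? Or do I have to do this separately for every tag I want to extract?

Unfortunately, because NSLinguisticTagger's implementation is block-based, you can't apply filter or map directly to the tags. But you can enumerate once and store the tokens by tag. Take a look at the edited answer for an example.

Apologies for all the questions. Basically, my goal is to print the sentence with the tokens belonging to certain tags removed. Taking your sentence, if I wanted to drop the adjectives, it would print everything except yellow and gray, and that's it. If I also wanted to exclude nouns, could I pass nouns in the unwantedTags parameter?

Sure, you can filter out any tags you want. Just put them in the unwantedTags array parameter. You can also filter out several tags at once. For example, this filters out nouns and adjectives: tagger.eliminate(unwantedTags: [.adjective, .noun], from: sentence, options: options)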