Python: look backwards for the first match in a list of lists

python regex list nlp

I have the following list:

[
['the', 'the +Det'],
['dog', 'dog +N +A-right'],
['ran', 'run +V +past'],
['at', 'at +P'], 
['me', 'I +N +G-left'],
['and', 'and +Cnj'],
['the', 'the +Det'],
['ball', 'ball +N +G-right'],
['was', 'was +C'],
['kicked', 'kick +V +past']
['by', 'by +P']
['me', 'I +N +A-left']

]

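Basically, what I want to do is:

1. Iterate through the list of lists.
2. Find all instances of +G-left, +A-left, +G-right and +A-right.
3. If you see +G-left or +A-left, look backwards for the first list that contains the element +V. Append the first element (index 0) of the list containing +G-left or +A-left to the end of the list containing +V, with the +G-left or +A-left tag attached, then continue and repeat.
4. If you see +G-right or +A-right, look forwards for the first list that contains the element +V. Append the first element (index 0) of the list containing +G-right or +A-right to the end of the list containing +V, with the +G-right or +A-right tag attached, then continue and repeat.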

So, for my example above, the desired state is:

[
['the', 'the +Det'],
['dog', 'dog +N +A-right'],
['ran', 'run +V +past', 'dog+A-right', 'me+G-left'],
['at', 'at +P'], 
['me', 'I +N +G-left'],
['and', 'and +Cnj'],
['the', 'the +Det'],
['ball', 'ball +N +G-right'],
['was', 'was +C'],
['kicked', 'kick +V +past', 'ball+G-right', 'me+A-left']
['by', 'by +P']
['me', 'I +N +A-left']
]

I think the right way to do this is to use re, so:

import re

gleft = re.compile(r"G-left")
gright = re.compile(r"G-right")
aleft = re.compile(r"A-left")
aright = re.compile(r"A-right")

and then something like:

for item in list:
    if aleft.match(item[1]):
        somehow work backwards to find the +V tag
            whatever.insert(-1, item[0]) #can you concatenate a string here to add +A-left

    if aright.match(item[1]):
        somehow work forwards to find the +V tag
            whatever.insert(-1, item[0]) #can you concatenate a string here to add +A-right

And the same thing again, but for the G tags.

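Hopefully someone can point me in the right direction. I believe I have broken the steps down correctly; I am just new to Python and don't know its syntax yet.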
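One detail in the pseudocode above is worth checking before going further: a compiled pattern's match() only tries to match at the start of the string, while search() scans the whole string. Since the tags sit at the end of each analysis string, the match() calls would presumably never succeed. A minimal sketch, using one analysis string from the list as sample input:

```python
import re

aleft = re.compile(r"A-left")
analysis = 'I +N +A-left'   # analysis string taken from the example list

# match() only tries position 0, so a tag at the end of the string is missed.
print(aleft.match(analysis))            # None

# search() scans the whole string and finds the tag.
print(aleft.search(analysis).group())   # A-left

# For a constant tag, a plain substring test does the same job without re.
print('A-left' in analysis)             # True
```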

This could probably be simplified with helper functions, but otherwise try the following, which doesn't need regular expressions:

wls = [your list of lists, above, fixed (some commas are missing)]
for i, wl in enumerate(wls):
    # The direction tag is the last token of the analysis string, e.g. '+A-right'.
    tag = wl[1].split(' ')[-1]
    if '-right' in tag:
        candidates = wls[i+1:]            # entries after this one, nearest first
    elif '-left' in tag:
        candidates = reversed(wls[:i])    # entries before this one, nearest first
    else:
        continue
    for wt in candidates:
        # Test only the analysis string, so items appended earlier are never rescanned.
        if '+V' in wt[1]:
            wt.append(wl[0] + tag)
            break                         # stop at the first (nearest) verb

wls
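The helper-function simplification mentioned at the start of this answer might look like the sketch below. The function names nearest_verb and attach_arguments are hypothetical, and the sketch assumes that the direction tag is always the last space-separated token of the analysis string and that '+V' appears as a token of its own.

```python
def nearest_verb(entries, start, step):
    """Return the first entry with a '+V' token, walking from index
    `start` in direction `step` (+1 forwards, -1 backwards)."""
    i = start + step
    while 0 <= i < len(entries):
        if '+V' in entries[i][1].split():
            return entries[i]
        i += step
    return None


def attach_arguments(entries):
    """Append 'word+TAG' to the nearest verb for each -left/-right tag."""
    # Snapshot (index, word, last tag) first so appended items are never rescanned.
    tagged = [(i, e[0], e[1].split()[-1]) for i, e in enumerate(entries)]
    for i, word, tag in tagged:
        if tag.endswith('-right'):
            verb = nearest_verb(entries, i, +1)   # look forwards
        elif tag.endswith('-left'):
            verb = nearest_verb(entries, i, -1)   # look backwards
        else:
            continue
        if verb is not None:
            verb.append(word + tag)
    return entries


wls = [
    ['the', 'the +Det'],
    ['dog', 'dog +N +A-right'],
    ['ran', 'run +V +past'],
    ['at', 'at +P'],
    ['me', 'I +N +G-left'],
    ['and', 'and +Cnj'],
    ['the', 'the +Det'],
    ['ball', 'ball +N +G-right'],
    ['was', 'was +C'],
    ['kicked', 'kick +V +past'],
    ['by', 'by +P'],
    ['me', 'I +N +A-left'],
]

attach_arguments(wls)
print(wls[2])   # ['ran', 'run +V +past', 'dog+A-right', 'me+G-left']
print(wls[9])   # ['kicked', 'kick +V +past', 'ball+G-right', 'me+A-left']
```

Collecting the tags before modifying anything keeps the strings appended to a verb (such as 'dog+A-right') from being mistaken for new tags later in the loop.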

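I find having this many symbols around really dizzying. Wouldn't it be easier to explain with a simpler example? The regex to find all the instances is: \+ , but since they are all constants you don't need a regex; use substr or something similar.

@Austin, I believe I have made it much simpler. Basically, I want to put the verb's subject and object into the verb's analysis. The language I'm working on has no software packages, so I can't use a pre-fab treebank.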