Python: extracting departure and arrival from a list

python list

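I am trying to extract some parameters from lists whose structure and length vary. The parameters are the departure address and the arrival address of a route. Each list is built from a natural-language sentence, so it does not follow any fixed template: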

1st example : ['go', 'Buzenval', 'from', 'Chatelet']
2nd example : ['How', 'go', 'street', 'Saint', 'Augustin', 'from', 'Buzenval']
3rd example : ['go', 'from', '33', 'street', 'Republique', 'to', '12','street','Napoleon']

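For each case I have managed to build a second, very similar list in which the departure and arrival are replaced by the literal words "departure" and "arrival". For the examples above I get: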

1st example : ['go', 'arrival', 'from', 'departure']
2nd example : ['How', 'go', 'arrival', 'from', 'departure']
3rd example : ['go', 'from', 'departure', 'to', 'arrival']

Now that I have both lists, I want to identify the departure and the arrival:

1st example : departure = ['Chatelet'], arrival = ['Buzenval']
2nd example : departure =  ['Buzenval'], arrival = ['street','Saint','Augustin']
3rd example : departure = ['33','street','Republique'], arrival = ['12','street','Napoleon']

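In other words, the parameters are exactly the elements that differ between the two lists, but I also need to tell which one is the departure and which one is the arrival. I think a regex could help me here, but I don't know how to go about it.

Thanks for your help.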

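Regex would certainly help here, but I tried a simpler approach. It works as long as the pattern you describe holds for every input. I used your first example; you can apply the same logic to the others and adapt the code:

Code: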

first = ['go', 'Buzenval', 'from', 'Chatelet'] # First Example
start = first.index('go')
end = first.index('from')
arrival = first[start+1:end]   # words between 'go' and 'from'
departure = first[end+1:]      # words after 'from'
print("Departure: {0} , Arrival: {1}".format(departure,arrival))

Output:

Departure: ['Chatelet'] , Arrival: ['Buzenval']
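
Since the question asks about regex, here is a minimal sketch of one way to use it, assuming the template list (with the literal words "departure" and "arrival") is available for each sentence. The helper names `template_to_regex` and `extract_with_regex` are made up for illustration and are not part of the original answer. Each placeholder becomes a named group and every other word is matched literally:

```python
import re

def template_to_regex(template):
    """Turn a template such as ['go', 'arrival', 'from', 'departure'] into a regex."""
    parts = []
    for token in template:
        if token in ("departure", "arrival"):
            # Placeholder: capture one or more words, as few as possible.
            parts.append(r"(?P<%s>.+?)" % token)
        else:
            # Ordinary word: must match literally.
            parts.append(re.escape(token))
    return re.compile("^" + " ".join(parts) + "$")

def extract_with_regex(words, template):
    """Return (departure, arrival) as lists of words, or None if nothing matches."""
    match = template_to_regex(template).match(" ".join(words))
    if match is None:
        return None
    return match.group("departure").split(), match.group("arrival").split()

print(extract_with_regex(
    ['go', 'from', '33', 'street', 'Republique', 'to', '12', 'street', 'Napoleon'],
    ['go', 'from', 'departure', 'to', 'arrival']))
```

This assumes that no word contains a space and that an address does not itself contain the keyword that follows its placeholder (for example a street name containing "to"); in that case the non-greedy group would stop too early.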

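I found a solution that handles your three examples. One thing you should change is the variable names; I did not know what to call them. (This is the old version, which is slow and hard to understand. The later version is better.)

Both versions do exactly the same thing: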

for example in ((['go', 'Buzenval', 'from', 'Chatelet'],
                 ['go', 'arrival', 'from', 'departure']
                 ),
                (['How', 'go', 'street', 'Saint', 'Augustin', 'from', 'Buzenval'],
                 ['How', 'go', 'arrival', 'from', 'departure']
                 ),
                (['go', 'from', '33', 'street', 'Republique', 'to', '12', 'street', 'Napoleon'],
                 ['go', 'from', 'departure', 'to', 'arrival']
                 )):
    print(extract_places(*example))

Both versions print:

(['Buzenval'], ['Chatelet'])
(['street', 'Saint', 'Augustin'], ['Buzenval'])
(['12', 'street', 'Napoleon'], ['33', 'street', 'Republique'])
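
The body of `extract_places` is not included above. As a rough sketch only, and not necessarily the answerer's implementation, the following version reproduces the printed output under the assumption that the second list contains exactly the words of the first list, with each place collapsed into "arrival" or "departure". It walks both lists in parallel and lets the next literal word of the template mark the end of a place:

```python
def extract_places(words, template):
    """Return (arrival, departure) by aligning the sentence with its template."""
    places = {"arrival": [], "departure": []}
    i = 0  # current position in `words`
    for j, slot in enumerate(template):
        if slot in places:
            # The next literal word in the template (if any) ends this place.
            stop = template[j + 1] if j + 1 < len(template) else None
            while i < len(words) and words[i] != stop:
                places[slot].append(words[i])
                i += 1
        else:
            # Literal word shared by both lists: skip it in the sentence.
            i += 1
    return places["arrival"], places["departure"]
```

The sketch does not handle two placeholders that are directly adjacent in the template, or a place that contains the word which follows it in the template.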

An example from the Python interpreter:

>>> import itertools
>>> key = None
>>> arr = ['go', 'from', '33', 'street', 'Republique', 'to', '12','street','Napoleon']
>>>
>>> for k, group in itertools.groupby(arr, lambda x: x in ['go', 'to','from']):
...     if k:
...         key = list(group)[-1]
...         continue
...     if key is not None:
...         if key == 'from':
...             tag = 'departure'
...         else:
...             tag = 'arrival'
...         print(tag, list(group))
...     key = None
...
departure ['33', 'street', 'Republique']
arrival ['12', 'street', 'Napoleon']

This should work for you:

l1 =  ['go', 'Buzenval', 'from', 'Chatelet']
l2 =  ['How', 'go', 'street', 'Saint', 'Augustin', 'from', 'Buzenval']
l3 =  ['go', 'from', '33', 'street', 'Republique', 'to', '12','street','Napoleon']

def get_locations (inlist):
    marker = 0
    end_dep = 0
    start_dep = 0

    for word in inlist:
        if word =="go":
            if inlist[marker+1] != "from":
                end_dep = marker +1
            else:
                start_dep = marker +2

        if word =="from" and start_dep == 0:
            start_dep = marker + 1

        if word == "to":
            end_dep = marker + 1
        marker +=1

    if end_dep > start_dep:
        start_loc = inlist[start_dep:end_dep-1]
        end_loc = inlist[end_dep:]

    else:
        start_loc = inlist [start_dep:]
        end_loc = inlist[end_dep: start_dep -1]

    return start_loc, end_loc

directions = get_locations (l3) #change to l1 / l2 to see other outputs

print( "departure = " + str(directions[0]))
print( "arrival = " + str(directions[1]))

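Hi Sam, thanks for your reply! This does work for the examples I gave, but unfortunately "from" and "to" have many synonyms in my language. I am not even sure I will always get the word "go", because sometimes the sentence is "what is the route from x to y"... But if I can't solve this with regex, I will accept your solution.

Perhaps you will have to inspect your data and then maintain a list or dictionary of those words. A regular expression also follows a pattern; it cannot handle arbitrary input.

Hi Megalng, thank you very much! If I understand correctly, I have to put into "keywords" every word that can appear in both lists, right? If so, and I set keywords = list(set(names).intersection(modes)), I should be able to generalize your code to many use cases.

@BenjaminBB I added a new version. Is one of them easier for you to understand than the other?

@BenjaminBB The second version is also ten times faster.

I like the second one a lot! Speed matters for my program, so the faster the better. Thanks for taking the time to work on this!

This was fun. I like tasks like this.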
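
To illustrate the suggestion of maintaining a list of such words, here is a minimal sketch that extends the `itertools.groupby` approach with keyword sets. The function name `tag_places` and the synonym sets are hypothetical placeholders, not taken from the thread; a real application would fill them with the synonyms found in its own data:

```python
import itertools

# Hypothetical keyword sets; extend them with the synonyms of your language.
DEPARTURE_WORDS = {"from", "leaving"}
ARRIVAL_WORDS = {"go", "to", "towards"}
KEYWORDS = DEPARTURE_WORDS | ARRIVAL_WORDS

def tag_places(words):
    """Map 'departure' / 'arrival' to the words that follow a matching keyword."""
    result = {}
    key = None
    for is_keyword, group in itertools.groupby(words, lambda w: w.lower() in KEYWORDS):
        group = list(group)
        if is_keyword:
            # With consecutive keywords ('go', 'from'), the last one decides.
            key = group[-1].lower()
        elif key is not None:
            tag = "departure" if key in DEPARTURE_WORDS else "arrival"
            result[tag] = group
            key = None
        # Words that appear before any keyword are ignored.
    return result

print(tag_places(['What', 'is', 'the', 'route', 'from', 'Chatelet', 'to', 'Buzenval']))
```

Because words before the first keyword are skipped, this also copes with sentences that do not contain "go" at all, as long as a departure or arrival keyword precedes each place.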
