Number of words preceding a phrase in a Python string

python list

Suppose I have a list of phrases:

list = ['new york', 'school', 'new']

and a string:

text = 'i am going to a school in new york and therefore i have to buy a new uniform to go to new york'

I want to find the number of words that precede each phrase (counting only its first occurrence), i.e. the output should be:

new york = 7
school = 5
new = 7

Do you know how I can do this efficiently?

A naive approach, with no performance or NLP considerations:

lst = ['new york', 'school', 'new']  # do not use 'list' as a name
text = 'i am going to a school in new york and therefore i have to buy a new uniform to go to new york'

{p: len(text[:text.find(p)].strip().split()) for p in lst}
# {'new york': 7, 'school': 5, 'new': 7}
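One caveat with the one-liner above: str.find returns -1 when the phrase is absent, so text[:-1] would silently produce a count over almost the whole text. A minimal sketch of a guarded variant follows; the helper name words_before and the extra phrase 'boston' are hypothetical, added only for illustration.

```python
def words_before(text, phrase):
    """Count the words before the first occurrence of phrase.

    Returns None when the phrase does not occur (hypothetical helper,
    not part of the original answer).
    """
    pos = text.find(phrase)
    if pos == -1:
        # str.find signals "not found" with -1; do not slice with it
        return None
    # split() with no argument collapses runs of whitespace
    return len(text[:pos].split())


lst = ['new york', 'school', 'new', 'boston']  # 'boston' is absent on purpose
text = 'i am going to a school in new york and therefore i have to buy a new uniform to go to new york'

result = {p: words_before(text, p) for p in lst}
print(result)
# {'new york': 7, 'school': 5, 'new': 7, 'boston': None}
```

Returning None keeps a missing phrase distinguishable from a phrase at the very start of the text, which yields 0.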

Using str.count and str.index:

lst = ['new york', 'school', 'new']
text = 'i am going to a school in new york and therefore i have to buy a new uniform to go to new york'

for x in lst:
    print(f"{x} = {text.count(' ', 0, text.index(x))}")

# new york = 7
# school = 5
# new = 7

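count counts the spaces in text from the start up to the first occurrence of the phrase, which is the same as the number of words preceding that phrase (assuming the words are separated by single spaces).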

lst = ['new york', 'school', 'new']
text = 'i am going to a school in new york and therefore i have to buy a new uniform to go to new york'

This prints each phrase you are searching for together with its count:

for x in lst:
    print(x + ": " + str(len(text[0:text.index(x)].split(' ')) - 1))

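First, you need something to tokenize your text and the elements of your list. Are you using some function/module as a tokenizer?

Shouldn't new also be 7?

No, I'm just splitting the text on spaces.

Yes, it should be 7. I fixed that.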
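Since the comments raise tokenization, here is a minimal sketch of a token-based variant, assuming plain whitespace splitting is an acceptable tokenizer. The substring-based answers above match 'new' inside a word such as 'renew' and depend on single spaces; comparing token sequences avoids both. The function name tokens_before and the sample sentence are hypothetical.

```python
def tokens_before(text, phrase):
    """Return the index of the first token-aligned match of phrase in text.

    That index equals the number of words preceding the phrase.
    Returns None if the phrase never occurs as a whole-token sequence.
    (Hypothetical helper; whitespace splitting is an assumed tokenizer.)
    """
    words = text.split()
    target = phrase.split()
    n = len(target)
    # slide a window of n tokens over the text
    for i in range(len(words) - n + 1):
        if words[i:i + n] == target:
            return i
    return None


sample = 'renew the lease in  new york'  # note the double space

print(tokens_before(sample, 'new'))                 # 4
print(sample.count(' ', 0, sample.index('new')))    # 0, matches inside 'renew'
```

On the text from the question this gives the same results as the other answers (7, 5 and 7). It scans the token list once per phrase, so for many phrases or long texts a smarter index would be needed.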