Python: Find the distance between phrases in a text

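I have a question: how can I count the number of words between phrases in a text? For example, I have the following text:

Elon Musk is a technology entrepreneur and investor. He is the founder, CEO, and lead designer of SpaceX. Elon Musk has stated that the goals of SpaceX, Tesla, and SolarCity revolve around his vision to change the world and humanity.

I want to count how many words there are between "Elon Musk" and "SpaceX", return something like a list of numbers, and then find the average word distance. For example, [15, 6].

I know that for single words we can split the text into words. But how do I handle phrases?

Comment: You can split the text on periods, exclamation marks, and question marks.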









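As user Dominique mentioned, there are a lot of small details you have to account for. I have made a simple program that finds the distance between two words. You want to find the distance between "Elon Musk" and "SpaceX". Why not just find the distance between "Musk" and "SpaceX"?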



Example (Python 3):

# Initial sentence
phrase = 'Elon Musk is a technology entrepreneur and investor. He is the founder, CEO, and lead designer of SpaceX. Elon Musk has stated that the goals of SpaceX, Tesla, and SolarCity revolve around his vision to change the world and humanity.'

# Removes common punctuation characters
phrase = ''.join(character for character in phrase if character not in ('!', '.' , ':' , ',', '"')) # Insert punctuation you want removed

# Creates a list of split words
word_list = phrase.split()

# Words you want to find the distance between (word_1 comes first in the sentence, then word_2)
word_1 = 'Musk'
word_2 = 'SpaceX'

# Calculates the distance between word_1 and word_2
# (list.index returns the first match, so only the first occurrence of each word is used)
distance = word_list.index(word_2) - word_list.index(word_1)

# Prints the number of words strictly between word_1 and word_2
print(f'Distance between "{word_1}" and "{word_2}" is {distance - 1} words.')
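The program above only looks at the first occurrence of each single word. As a rough sketch of how the same idea could be extended to multi-word phrases and to every occurrence, the version below tokenizes the text and compares slices of the token list. The helper names (`tokenize`, `find_phrase`, `phrase_distances`) are made up for this illustration, and the tokenizer is an assumption: it simply treats any run of word characters or apostrophes as a word.

```python
import re
from statistics import mean


def tokenize(text):
    # Assumption: a "word" is a run of word characters or apostrophes;
    # punctuation such as commas and periods is dropped.
    return re.findall(r"[\w']+", text)


def find_phrase(tokens, phrase_tokens, start=0):
    """Index of the first match of phrase_tokens at or after start, or -1."""
    n = len(phrase_tokens)
    for i in range(start, len(tokens) - n + 1):
        if tokens[i:i + n] == phrase_tokens:
            return i
    return -1


def phrase_distances(text, phrase_1, phrase_2):
    """Words between each phrase_1 and the next phrase_2 that follows it."""
    tokens = tokenize(text)
    first, second = tokenize(phrase_1), tokenize(phrase_2)
    distances = []
    pos = 0
    while True:
        i = find_phrase(tokens, first, pos)
        if i == -1:
            break
        end_of_first = i + len(first)
        j = find_phrase(tokens, second, end_of_first)
        if j == -1:
            break
        distances.append(j - end_of_first)
        # Continue searching after this occurrence of phrase_1
        pos = end_of_first
    return distances


text = ('Elon Musk is a technology entrepreneur and investor. He is the '
        'founder, CEO, and lead designer of SpaceX. Elon Musk has stated '
        'that the goals of SpaceX, Tesla, and SolarCity revolve around his '
        'vision to change the world and humanity.')

result = phrase_distances(text, 'Elon Musk', 'SpaceX')
print(result)        # [15, 6]
print(mean(result))  # 10.5
```

Matching is case-sensitive here; lowercasing both the text and the phrases before tokenizing would be one way to relax that.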




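def find_distance(sentence, word1, word2):
    distances = []
    while sentence != "":
        # Keep only the text after the next occurrence of word1
        _, _, sentence = sentence.partition(word1)
        # Text between that occurrence of word1 and the next word2
        text, _, _ = sentence.partition(word2)
        if text != "":
            # Count the words in between
            distances.append(len(text.split()))
    return distances

print(find_distance(phrase, "Elon Musk", "SpaceX"))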

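Note that the behavior for cases like "Elon Musk is a technology Elon Musk entrepreneur ..." has to be defined. Which occurrence do you want to pick? The first one or the second one?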
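To make that choice explicit, here is one possible sketch in which the caller decides which occurrence counts. The function names and the `pick` parameter are hypothetical, and the code assumes punctuation has already been removed so that `str.split()` yields clean words.

```python
def find_all(tokens, phrase):
    """Start indices of every occurrence of phrase (a list of words) in tokens."""
    n = len(phrase)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase]


def distances(text, phrase_1, phrase_2, pick="nearest"):
    """Words between phrase_1 and each following phrase_2.

    pick="first"   uses the earliest phrase_1 since the previous phrase_2.
    pick="nearest" uses the phrase_1 closest to the current phrase_2.
    """
    tokens = text.split()
    first, second = phrase_1.split(), phrase_2.split()
    starts_1 = find_all(tokens, first)
    result = []
    previous_end = 0
    for j in find_all(tokens, second):
        # Occurrences of phrase_1 lying between the previous phrase_2 and this one
        candidates = [i for i in starts_1
                      if previous_end <= i and i + len(first) <= j]
        if candidates:
            i = candidates[0] if pick == "first" else candidates[-1]
            result.append(j - (i + len(first)))
        previous_end = j + len(second)
    return result


sample = "Elon Musk is a technology Elon Musk entrepreneur at SpaceX"
print(distances(sample, "Elon Musk", "SpaceX", pick="first"))    # [7]
print(distances(sample, "Elon Musk", "SpaceX", pick="nearest"))  # [2]
```

With `pick="first"` the second "Elon Musk" is counted as two ordinary words in between; with `pick="nearest"` it restarts the count.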

