Find elements in a list and store the nearby elements in a list - Python
I have a list of tokenized words. I am searching it for certain words and storing the 3 elements on either side of each word that is found. The code is:
words_to_find -- the list of words to look for
tokens -- the large list in which I have to find the words
for x in words_to_find:
    if x in tokens:
        print "Matched word is", x
        indexing = tokens.index(x)
        print "This is index :", indexing
        count = 0
        lower_limit = indexing - 3
        upper_limit = indexing + 3
        print "Limits are", lower_limit,upper_limit
        for i in tokens:
            if count >= lower_limit and count <= upper_limit:
                print "I have entered the if condition"
                print "Count is : ",count
                wording = tokens[count]
                neighbours.append(wording)
            else:
                count +=1
                break
            count +=1
        final_neighbour.append(neighbours)
        print "I am in access here", final_neighbour
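The neighbours are different for every word, so reset the neighbours list to empty for each word. Also, count should start at indexing - 3 (the lower limit) if that is greater than or equal to 0, and at 0 otherwise, because what you need are the three words before and the three words after the word that was found.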
for x in words_to_find:
    neighbours=[] # the neighbour for the new word will change, therefore make it null!
    if x in tokens:
        print "Matched word is", x
        indexing = tokens.index(x)
        print "This is index :", indexing
        lower_limit = indexing - 3
        upper_limit = indexing + 3
        count = lower_limit if lower_limit >=0 else 0 # lower_limit starts from the index-3 of the word found!
        print "Limits are", lower_limit,upper_limit,count
        for i in tokens:
            if count >= lower_limit and count <= upper_limit:
                print "I have entered the if condition"
                print "Count is : ",count
                wording = tokens[count]
                neighbours.append(wording)
            else:
                count +=1
                break
            count +=1
        final_neighbour.append(neighbours)
        print "I am in access here", final_neighbour
Suggestion
You can use list slicing to get the 3 words before and after the matched word. This also gives the required output:
lower_limit = lower_limit if lower_limit >=0 else 0
neighbours.append(tokens[lower_limit:upper_limit+1])
That is:
final_neighbour=[]
for x in words_to_find:
    neighbours=[] # the neighbour for the new word will change, therefore make it null!
    if x in tokens:
        print "Matched word is", x
        indexing = tokens.index(x)
        print "This is index :", indexing
        lower_limit = indexing - 3
        upper_limit = indexing + 3
        lower_limit = lower_limit if lower_limit >=0 else 0 # lower_limit starts from the index-3 of the word found!
        print "Limits are", lower_limit,upper_limit
        neighbours.append(tokens[lower_limit:upper_limit+1])
        final_neighbour.append(neighbours)
        print "I am in access here", final_neighbour
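A side note on the slice bounds used above, as a minimal sketch: Python clips a slice end that runs past the end of the list, so upper_limit+1 needs no guard, but a negative start counts from the end of the list, which is why lower_limit has to be clamped to 0. The helper name window and the sample list are made up for illustration, and the sketch uses print() so it also runs on Python 3.

```python
def window(tokens, index, size=3):
    """Return tokens[index] plus up to `size` tokens on each side."""
    lower = max(index - size, 0)   # clamp: a negative start would wrap around
    upper = index + size + 1       # may exceed len(tokens); slicing clips it
    return tokens[lower:upper]

sample = ["a", "b", "c", "d", "e", "f", "g", "h"]
print(window(sample, 1))      # ['a', 'b', 'c', 'd', 'e']
print(window(sample, 6))      # ['d', 'e', 'f', 'g', 'h']
print(sample[1 - 3:1 + 4])    # [] because the start -2 means "second from the end"
```

The last line shows what goes wrong without the clamp: the slice silently comes back empty instead of raising an error.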
Hope it helps.
In the for loop there is the following line:
neighbours.append(wording)
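What is "neighbours"?
You should initialise it before the append statement (outside the inner loop, preferably at the beginning of the code where you define tokens and words_to_find), as follows:
neighbours = []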
We can use slicing to get the neighbours instead of iterating with a count:
# -*- coding: utf-8 -*-
tokens = [u'प्रीमियम',u'एंड',u'गिव',u'फ्रॉम',u'महाराष्ट्रा',u'मुंबई',u'इंश्योरेंस',u'कंपनी',u'फॉर',u'दिस']
words_to_find = [u'फ्रॉम',u'महाराष्ट्रा']
final_neighbours = {}
for i in words_to_find:
    if i in tokens:
        print "Matched word : ",i
        idx = tokens.index(i)
        print "this is index : ",idx
        idx_lb = max(idx-3, 0)  # clamp so a negative start does not wrap around
        idx_ub = idx+4          # slice end is exclusive, so +4 covers 3 words after
        print "Limits : ",idx_lb,idx_ub
        only_neighbours = tokens[idx_lb : idx_ub]
        only_neighbours.remove(i)  # drop the matched word itself
        final_neighbours[i]= only_neighbours
for k,v in final_neighbours.items():
    print "\nKey:",k
    print "Values:"
    for i in v:
        print i,
Output:
Matched word : फ्रॉम
this is index : 3
Limits : 0 7
Matched word : महाराष्ट्रा
this is index : 4
Limits : 1 8
Key: महाराष्ट्रा
Values:
एंड गिव फ्रॉम मुंबई इंश्योरेंस कंपनी
Key: फ्रॉम
Values:
प्रीमियम एंड गिव महाराष्ट्रा मुंबई इंश्योरेंस
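One limitation worth noting: tokens.index(i) returns only the first position of a word, so a word that occurs more than once gets a single window. A possible variant, sketched below, walks the list with enumerate and collects one window per occurrence. The function name neighbours_of and the English sample data are assumptions for illustration, not part of the original answer, and the sketch is written in Python 3 syntax.

```python
def neighbours_of(tokens, words_to_find, size=3):
    """Map each wanted word to a list of neighbour windows, one per occurrence."""
    wanted = set(words_to_find)
    result = {}
    for idx, token in enumerate(tokens):
        if token not in wanted:
            continue
        before = tokens[max(idx - size, 0):idx]   # up to `size` tokens before
        after = tokens[idx + 1:idx + 1 + size]    # up to `size` tokens after
        result.setdefault(token, []).append(before + after)
    return result

tokens = ["to", "be", "or", "not", "to", "be", "that", "is"]
for word, windows in neighbours_of(tokens, ["to"]).items():
    print(word, windows)
```

Slicing before and after the match separately also avoids the remove() call, which would delete the wrong element if the same word appeared earlier inside the window.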
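When you say "problem", what is the problem? You haven't told us what it is. What isn't working here? What result did you get?
@JennerFelton, I get the correct indices for the lower and upper limits of the nearby elements, but the final neighbours list is empty. I am trying to append the nearby words for each found word to the neighbours list and build one list covering all the words found. Please help me do this, or suggest a better way.
Sample input and output would be helpful. What are words_to_find and tokens?
words_to_find = [u'प्रीमियम',u'एंड',u'गिव',u'फ्रॉम',u'महाराष्ट्रा',u'मुंबई',u'इंश्योरेंस',u'कंपनी',u'फॉर',u'दिस']
Please provide a sample input and the desired output for that input.
The list comprehension turns it into a list of lists: [[u'\u091f\u0942',u'\u0930\u093f\u0915\u0949\u0930\u094d\u0921',u'\u092f\u094b\u0930',u'\u092a\u094d\u0930\u0940\u092e\u093f\u092e',u'\u092f\u0921',u'\u092f\u0935'] Also, since I am dealing with Hindi words, they are stored as unicode. Is there a direct way to view the Hindi words? When I access the elements one by one they print in Hindi, but I am looking for a better way.
This method works well. Thanks. My only concern is that I have Hindi words and I have to convert them to unicode to use them. My output now looks like this: [..., u'\u0938\u0930', u'\u092f\u094b\u0930']. Is there any way to see the Hindi words?
If you print the words individually, you can see the Hindi words.
You have edited my code. Thanks for the help. Can you explain the logic behind it? Is unicode converted to ASCII when the words are printed individually? But Hindi letters are not supported in ASCII.
Unicode is a superset of ASCII. To learn about Unicode, check this. Hindi characters belong to the Devanagari Unicode block.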
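To illustrate the point from the comments: in Python 2, printing a list shows the repr of each element, which for unicode strings is the \uXXXX escape form, while printing a single string shows the characters themselves. One way to view a whole list at once, sketched here, is to join it into one string before printing. The variable names are illustrative, the two words are taken from the example tokens, and the output assumes a terminal that can display UTF-8.

```python
# -*- coding: utf-8 -*-
words = [u'फ्रॉम', u'मुंबई']

# Joining builds one string, so print shows the characters, not the escapes.
line = u", ".join(words)
print(line)

# ASCII is a subset of Unicode: plain English text encodes to ASCII,
# but Devanagari characters have no ASCII representation.
ascii_bytes = u"from".encode("ascii")
try:
    words[0].encode("ascii")
    hindi_is_ascii = True
except UnicodeEncodeError:
    hindi_is_ascii = False
print(hindi_is_ascii)  # False
```

So nothing is converted to ASCII when a word is printed on its own; the text stays Unicode and is encoded for the terminal at output time.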