Python: searching web text for each word in a list
I have to write code that collects text from a website chosen by the user and searches that text for three selected words. It must then output each word along with the number of times it appears on the website.
My attempt at writing this program leaves me with an output that tells me that 0 of the words listed are present on the webtext even when I know they do appear. Does anyone have an idea as to how to make it work?
import requests

def main():
    Asentence="This,is,a,sentence,of,some,kind!"
    print(type(Asentence))
    print(Asentence)
    ListOfWords=Asentence.split(",")
    print(type(ListOfWords))
    print(ListOfWords)
    print(ListOfWords[0])
    print(ListOfWords[-1])
    print(ListOfWords[3])
    SomeOtherList=["Sally", "Fred"]
    print(type(SomeOtherList))
    print(SomeOtherList)
    print(SomeOtherList[0])
    for thing in SomeOtherList:
        print(thing)
    n= eval(input("How many websites would you like to enter? :"))
    while n > 0:
        Word()
        n=n-1
#------------------------------------------
def Word():
    answer=input("please enter the websites to examine in the http format ")
    response=requests.get(answer)
    txt = response.text
    print(txt)
    mywords=Firstpart(list)
    num=FindAWord(txt,mywords)
    print("There are", num, "words called",mywords)
#----------------------------------------
def FindAWord(TheWebText,word):
    print(TheWebText)
    print(type(TheWebText))
    MyList=TheWebText.split(sep=" ")
    print(MyList[0:100])
    count=0
    for item in MyList:
        if(item==word in Firstpart(list)):
            print(item)
            count=count+1
    return count
#----------------------------------
def Firstpart(list):
    wordchoice=[]
    firstword=input("Please enter the first word you would like to look for")
    wordchoice.append(firstword)
    secondword=input("Please enter the second word you would like to look for")
    wordchoice.append(secondword)
    thirdword=input("Please enter the third word you would like to look for")
    wordchoice.append(thirdword)
    return wordchoice

main()
Thank you so much in advance.
You can use Counter from the collections module to help you:
import requests
from collections import Counter

def main():
    url = input('Please enter the url to the website you want to search: ')
    if 'http' not in url:
        url = 'http://' + url
    words = []
    for i in range(1,4):
        words.append(input('Please enter word number {}: '.format(i)))
    resp = requests.get(url)
    counter = Counter(resp.text.split())
    for word in words:
        print(word, 'found', counter[word], 'times')

if __name__ == '__main__':
    main()
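One caveat with splitting on whitespace: resp.text is raw HTML, so tokens keep punctuation and markup attached (for example "Python," or "<p>Python"), and the match is case-sensitive. Below is a minimal sketch of one way to normalise the text first, assuming that case-insensitive whole-word matching is what you want. The helper name count_words and the sample string are made up for illustration, and the regex is only a rough tokenizer, not an HTML parser.

```python
import re
from collections import Counter

def count_words(text, words):
    """Count case-insensitive whole-word occurrences of each word in text."""
    # \w+ extracts runs of letters/digits/underscores, dropping punctuation
    tokens = re.findall(r"\w+", text.lower())
    counter = Counter(tokens)
    # Look each requested word up in lowercase, but report it as typed
    return {word: counter[word.lower()] for word in words}

# Hypothetical sample standing in for resp.text
sample = "<p>Python is fun. I like python, and PYTHON likes me!</p>"
print(count_words(sample, ["Python", "fun", "java"]))
```

Note that tag names such as "p" still end up in the token list, so if the markup itself matters you would probably want to strip it with a real HTML parser before counting.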
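Joakim has given you a way to make your code easier to read and understand, but first I'll give you the reason it isn't working.
In the Word() function, the variable mywords is the list of words entered by the user. When you pass it to the FindAWord function, you are giving it a list, not a single word. Then, when you compare if(item==word) (the "in Firstpart(list)" part of that line really shouldn't be there), you are checking whether a single word is equal to a list.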
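A quick way to see the problem, as a small sketch with made-up sample values: Python chains comparison operators, so an expression of the form a == b in c is evaluated as (a == b) and (b in c). When b is a list and a is a string, the first half is never true.

```python
item = "python"
mywords = ["python", "list", "text"]  # stands in for what Firstpart() returns

# A string never equals a list, even if the list contains that string
equals_list = item == mywords
print(equals_list)   # False

# Chained comparison: (item == mywords) and (mywords in mywords)
chained = item == mywords in mywords
print(chained)       # False

# A membership test is what actually asks "is this word one of mine?"
membership = item in mywords
print(membership)    # True
```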
You can fix that part by doing the following:
def Word():
    answer=input("please enter the websites to examine in the http format ")
    response=requests.get(answer)
    txt = response.text
    print(txt)
    mywords=Firstpart(list)
    for word in mywords:
        num=FindAWord(txt,word)
        print("There are", num, "words called",word)

def FindAWord(TheWebText,word):
    print(TheWebText)
    print(type(TheWebText))
    MyList=TheWebText.split(sep=" ")
    print(MyList[0:100])
    count=0
    for item in MyList:
        if(item==word):
            print(item)
            count=count+1
    return count
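You should really focus on making your variable names more descriptive, to help you (and others) read the code more easily. As you can see, you named the parameter in FindAWord as word, which is singular and gives the impression that it is a single word. Instead, it was a list of words. If it had been users_words or something similar, you would have seen immediately that if(item==users_words) contains a mistake.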
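As an illustration of the naming advice, here is a sketch of the same counting logic with more descriptive names. The names count_word, count_each_word, web_text, target_word and users_words are only suggestions and are not part of the original code. The network and input() calls are left out so the functions can be tried on a plain string, and split() with no argument is used instead of split(sep=" ") so that newlines and tabs also separate words.

```python
def count_word(web_text, target_word):
    """Return how many whitespace-separated tokens equal target_word exactly."""
    return sum(1 for token in web_text.split() if token == target_word)

def count_each_word(web_text, users_words):
    """Return a dict mapping each word in users_words to its count."""
    return {word: count_word(web_text, word) for word in users_words}

# Hypothetical sample text standing in for response.text
sample_text = "spam eggs spam ham"
for word, num in count_each_word(sample_text, ["spam", "ham", "toast"]).items():
    print("There are", num, "words called", word)
```

With a singular name for the single word and a plural name for the list, passing the whole list where one word is expected stands out at the call site.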