
How to scrape data from a web page using BeautifulSoup in Python

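I am trying to scrape data from the link given below, and I need to save it to a CSV file.

I do get the names of all the movies, but in the wrong format. This is what ends up in the CSV file: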

T h e " " S h a w s h a n k " " R e d e m p t i o n

T h e " " G o d f a t h e r

T h e " " G o d f a t h e r : " " P a r t " " I I

T h e " " D a r k " " K n i g h t

1 2 " " A n g r y " " M e n

S c h i n d l e r ' s " " L i s t
It should be:

The Shawshank Redemption

The Godfather

The Godfather: Part II

The Dark Knight
I tried:

from bs4 import BeautifulSoup
import requests
import csv

url = 'https://www.imdb.com/chart/top'
res = requests.get(url)
soup = BeautifulSoup(res.text)
movie = soup.find_all(class_='titleColumn')

for names in movie:
    for name in names.find_all('a'):
        movies=list(name.text)
        # print(movies)

        # IN CSV
        with open('TopMovies.csv', 'a') as csvFile:
            writer = csv.writer(csvFile, delimiter = ' ')
            writer.writerow(movies)
        csvFile.close()
        print(movies)

print("Successfully inserted")
Please let me know if anything in my code needs to change.

Thanks

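The problem is in the line
movies=list(name.text)
which creates a list whose items are the individual characters of the string name.text. The csv writer then writes each character as a separate field, and it quotes every space as " " because a space is also the delimiter.

Instead of list(), you can use a list comprehension, which also replaces the inner for loop:
movies=[name.text for name in names.find_all('a')]

This creates TopMovies.csv correctly.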
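
The difference can be reproduced without touching the network. The following is a minimal sketch, not part of the original answer: the sample title and the helper name to_csv_line are made up for illustration, and an in-memory io.StringIO buffer stands in for the real file.

```python
import csv
import io

title = "The Godfather"  # sample value, stands in for name.text

# list() iterates over the string, yielding one item per character.
chars = list(title)

# Wrapping the string in a list keeps it as a single field.
fields = [title]


def to_csv_line(row):
    """Write one row with a space delimiter and return the resulting text."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=' ').writerow(row)
    return buffer.getvalue()


# One field per character; the space character is quoted because it
# collides with the delimiter.
print(repr(to_csv_line(chars)))

# One field for the whole title; it is quoted as a unit for the same reason.
print(repr(to_csv_line(fields)))
```

The first call reproduces the broken pattern from the question, while the second keeps the title in one piece. The title is still wrapped in quotes because it contains the space delimiter.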


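Alternatively, keep the inner loop and replace
movies=list(name.text)
with
movies=[name.text]
to get the expected output. A bare movies=name.text is not enough: writerow() iterates over its argument, so a plain string would still be split into characters.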
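
As a rough illustration of why writerow() needs a list rather than a string, here is a small sketch. It is an assumption-laden example rather than the asker's code: the titles list is sample data, an io.StringIO buffer replaces TopMovies.csv, and the default comma delimiter is used instead of the space from the question so that titles containing spaces are not quoted.

```python
import csv
import io

# Sample titles standing in for the scraped data (assumption for illustration).
titles = ["The Shawshank Redemption", "The Godfather"]

buffer = io.StringIO()        # in-memory stand-in for TopMovies.csv
writer = csv.writer(buffer)   # default delimiter is a comma

# Each row is a one-element list, so every title is a single field.
writer.writerows([title] for title in titles)
csv_text = buffer.getvalue()

# A bare string is iterated character by character, just like list(string).
bare = io.StringIO()
csv.writer(bare).writerow("Up")
bare_text = bare.getvalue()

print(csv_text)
print(repr(bare_text))
```

With a comma delimiter, only titles that themselves contain a comma would be quoted, which is usually closer to what a spreadsheet program expects.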
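The full script with the list comprehension applied:

from bs4 import BeautifulSoup
import requests
import csv

url = 'https://www.imdb.com/chart/top'
res = requests.get(url)
# Name the parser explicitly to avoid bs4's "no parser specified" warning
soup = BeautifulSoup(res.text, 'html.parser')
movie = soup.find_all(class_='titleColumn')

# Open the file once; newline='' lets the csv module control line endings
with open('TopMovies.csv', 'a', newline='') as csvFile:
    writer = csv.writer(csvFile, delimiter=' ')
    for names in movie:
        # One list item per link, so each title stays a single field
        movies = [name.text for name in names.find_all('a')]
        writer.writerow(movies)
        print(movies)

print("Successfully inserted")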