How to scrape data from a web page using BeautifulSoup in Python

python web web-scraping

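I am trying to scrape data from the page whose URL appears in my code below, and I need to save it to a CSV file.

I get all the movie names, but in the wrong format. This is what ends up in the CSV file: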

T h e " " S h a w s h a n k " " R e d e m p t i o n

T h e " " G o d f a t h e r

T h e " " G o d f a t h e r : " " P a r t " " I I

T h e " " D a r k " " K n i g h t

1 2 " " A n g r y " " M e n

S c h i n d l e r ' s " " L i s t

It should be:

The Shawshank Redemption

The Godfather

The Godfather: Part II

The Dark Knight

What I tried:

from bs4 import BeautifulSoup
import requests
import csv

url = 'https://www.imdb.com/chart/top'
res = requests.get(url)
soup = BeautifulSoup(res.text)
movie = soup.find_all(class_='titleColumn')

for names in movie:
    for name in names.find_all('a'):
        movies=list(name.text)
        # print(movies)

        # IN CSV
        with open('TopMovies.csv', 'a') as csvFile:
            writer = csv.writer(csvFile, delimiter = ' ')
            writer.writerow(movies)
        csvFile.close()
        print(movies)

print("Successfully inserted")

Please let me know what I should change in my code.

Thanks

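The problem is in the line

movies=list(name.text)

which creates a list where each item is a single character of the string name.text.

You can use a list comprehension in place of the inner for loop and list():

movies = [name.text for name in names.find_all('a')]

This creates TopMovies.csv correctly.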
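To see why list() produces the spaced-out titles, here is a minimal sketch that needs no network access. The helper row_as_written is a hypothetical name introduced only for this illustration; it writes one row to an in-memory buffer with the same space delimiter as the question's code.

```python
import csv
import io


def row_as_written(row, delimiter=' '):
    """Return the text csv.writer produces for a single row."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=delimiter).writerow(row)
    return buffer.getvalue()


title = "The Godfather"

# list() splits the string into characters, so every character becomes
# its own field; the space character is quoted because it equals the delimiter.
split_row = row_as_written(list(title))

# A one-element list keeps the title together as a single field.
whole_row = row_as_written([title])

print(repr(split_row))
print(repr(whole_row))
```

The first result reproduces the garbled line from the question, while the second holds the title as one quoted field, which a spreadsheet program should show as a single cell.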

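Just replace

movies=list(name.text)

with

movies=[name.text]

to get the expected output. The brackets matter: writerow expects a sequence of fields, so a bare string such as name.text would still be split into characters.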
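One detail worth checking when applying this replacement: csv.writer.writerow iterates over whatever it receives, so a bare string is treated as a sequence of one-character fields. The sketch below compares the two cases in memory; write_one is a hypothetical helper used only for this example.

```python
import csv
import io


def write_one(row):
    """Write a single row with a space delimiter and return the resulting text."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=' ').writerow(row)
    return buffer.getvalue()


title = "12 Angry Men"

# A bare string is iterated character by character.
as_string = write_one(title)

# Wrapping the string in a list makes it one field.
as_list = write_one([title])

print(repr(as_string))
print(repr(as_list))
```

Passing the string directly gives the same spaced-out result as list(name.text), so the one-element list is the form that keeps each title intact.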

Full code using the list comprehension:

from bs4 import BeautifulSoup
import requests
import csv

url = 'https://www.imdb.com/chart/top'
res = requests.get(url)
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(res.text, 'html.parser')
movie = soup.find_all(class_='titleColumn')

for names in movie:
    # One list item per link text, so each title stays a whole field
    movies = [name.text for name in names.find_all('a')]

    # Append the row to the CSV file; the with block closes the file
    with open('TopMovies.csv', 'a', newline='') as csvFile:
        writer = csv.writer(csvFile, delimiter=' ')
        writer.writerow(movies)
    print(movies)

print("Successfully inserted")
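As a possible variation, the file can be opened once and all rows written together, which avoids reopening it for every movie and avoids appending duplicates on a second run. This is only a sketch under assumptions: the titles are hardcoded from the question instead of scraped, the file goes to a temporary directory, the default comma delimiter is used, and the "title" header row is an addition that is not in the original code.

```python
import csv
import os
import tempfile

# Stand-in for the scraped titles (taken from the question's expected output).
titles = [
    "The Shawshank Redemption",
    "The Godfather",
    "The Godfather: Part II",
    "The Dark Knight",
]

# Temporary location so the sketch does not overwrite a real file.
path = os.path.join(tempfile.mkdtemp(), "TopMovies.csv")

# Mode 'w' starts a fresh file; newline='' lets the csv module control line endings.
with open(path, "w", newline="", encoding="utf-8") as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(["title"])  # assumed header row
    writer.writerows([title] for title in titles)  # one single-field row per title

# Read the file back to confirm each title survived as one field.
with open(path, newline="", encoding="utf-8") as csv_file:
    rows = list(csv.reader(csv_file))

print(rows)
```

In the scraping script, the titles list would be built from the soup.find_all results before the file is opened.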