Loading and iterating over URLs from a CSV file in Python

python web-scraping

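Please help. I have URLs in a CSV file with 100 rows and 1 column. I want to use Python to load the data from row 1 through row 100. How do I write the code?

However, when I run it, the loop only works once, for one of the rows. It does not reach the end of the URLs in the CSV and does not continue to the next URL.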

disc_information = html.find('div', class_='alert alert-info global-promo').text.strip().strip('\n')
AttributeError: 'NoneType' object has no attribute 'text'

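How should I handle the error that occurs when an HTML element is not found?

Below is the Python code I am using. Please help me make the scraping loop run to the end of the URL list.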

from bs4 import BeautifulSoup
import requests
import pandas as pd
import csv
import pandas


with open('Url Torch.csv','rt') as f:
  data = csv.reader(f, delimiter=',')
  for row in data:
      URL_GO = row[2]

def variable_Scrape(url):
    try:
        cookies = dict(cookie="............")
        request = requests.get(url, cookies=cookies)
        html = BeautifulSoup(request.content, 'html.parser')
        title = html.find('div', class_='title').text.strip().strip('\n')
        desc = html.find('div', class_='content').text
        link = html.find_all('img', class_='lazyload slide-item owl-lazy')
        normal_price = html.find('div', class_='amount public').text.strip().strip('\n')
        disc_information = html.find('div', class_='alert alert-info global-promo').text.strip().strip('\n')

    except AttributeError as e:
        print(e)
        #ConnectionAbortedError
        return False
    else:
        print(title)
        #print(desc)
        #print(link)
    finally:
        print(title)
        print(desc)
        print(link)
        print('Finally.....')
variable_Scrape(URL_GO)

Without seeing your CSV file it is hard to give an exact answer, but try the following:

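import csv

# 'with' closes the file automatically when the block ends
with open('you_file.csv', newline='') as f:
    csv_f = csv.reader(f)
    for row in csv_f:
        print(row[0])  # the first (and only) column holds the URL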

Here is the code:

import csv

data = []  #create an empty list to store rows on it
with open('emails.csv') as csv_file:
    reader = csv.reader(csv_file)
    for row in reader:
        data.append(row) #add each row to the list
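
Since the question describes a file with a single column of URLs, the reading step could also return plain strings instead of whole rows. The following is only a sketch under that assumption; `load_urls`, its `column` parameter, and its `has_header` flag are hypothetical names introduced here for illustration, not part of the original code.

```python
import csv


def load_urls(path, column=0, has_header=False):
    """Return the non-empty values of one CSV column as a list of strings."""
    urls = []
    with open(path, newline='', encoding='utf-8') as csv_file:
        reader = csv.reader(csv_file)
        if has_header:
            next(reader, None)  # skip the header row, if the file has one
        for row in reader:
            # Ignore blank lines and rows that are too short for this column
            if len(row) > column and row[column].strip():
                urls.append(row[column].strip())
    return urls
```

With the file from the question this would be called as `load_urls('Url Torch.csv')`. Note that the question reads `row[2]`, which is the third column; in a one-column file the URL is at index 0.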

Following your comment about moving on to the next iteration when a URL fails:

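import requests

for row in data:   # data is the list of rows read from the CSV
    try:
        url = row[0]  # the first column holds the URL
        # do your work here (requests, BeautifulSoup), for example:
        r = requests.get(url)
    except Exception as e:
        # report the error and go on to the next row (next URL)
        print(row, e)
        continue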
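
The important point is that the scraping call sits inside the loop, so that every URL is processed and one failure does not end the run. A minimal sketch of that pattern follows. `scrape_all` and `fake_scrape` are hypothetical names used only for this illustration; in real use the second argument would be a function that calls `requests.get` and parses the page, such as `variable_Scrape` from the question.

```python
def scrape_all(urls, scrape_one):
    """Call scrape_one(url) for every URL and keep going when one fails."""
    results = {}
    failures = {}
    for url in urls:
        try:
            results[url] = scrape_one(url)
        except Exception as exc:
            # Broad on purpose: one bad page must not stop the whole run
            failures[url] = exc
            continue
    return results, failures


def fake_scrape(url):
    """Stand-in for a real scraper; fails for URLs containing 'bad'."""
    if "bad" in url:
        raise AttributeError("'NoneType' object has no attribute 'text'")
    return {"title": "page at " + url}


results, failures = scrape_all(
    ["http://a.example", "http://bad.example", "http://c.example"],
    fake_scrape,
)
print(sorted(results))   # URLs that were scraped successfully
print(sorted(failures))  # URLs that raised an error
```

Collecting the failures instead of silently passing makes it possible to retry or inspect the problematic URLs afterwards.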

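That works, but the CSV loop still only runs for one row. I have updated the code.

OK, I will look, but I need more clarification: your CSV file contains a collection of URLs, and you want to scrape each of them in a loop using requests and BeautifulSoup? Also, does your CSV file have a header row?

That problem is solved, but a new one has appeared. This is the error: Traceback (most recent call last): File "e:***V-1.0.py", line 51, in print(----) NameError: name '-----' is not defined. How can I skip over it when an error occurs?

The URL loop over the CSV file now works. However, when the loop runs on one of the URL rows, some HTML elements do not exist on the page but are still parsed into Python variables, so an error occurs: name 'NameError' is not defined. Traceback (most recent call last): File "e:\path.py", line 51, in print(NameError) NameError: name 'NameError' is not defined. I would like to ask how to skip HTML elements that are not found, so that parsing continues with the next statement.

Please provide one. What exactly is the problem? There are already resources available on how to catch and handle errors.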
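
On the follow-up about elements that are missing on some pages: BeautifulSoup's `find` returns `None` when nothing matches, and calling `.text` on `None` raises the `AttributeError` shown in the question. One possible approach, sketched here with the hypothetical helpers `text_or_none` and `parse_page` and a made-up sample page, is to check for `None` per element so that every field always gets a value and later `print` calls cannot hit an undefined name.

```python
from bs4 import BeautifulSoup


def text_or_none(soup, tag, css_class):
    """Return the stripped text of the first matching element, or None."""
    element = soup.find(tag, class_=css_class)
    if element is None:
        return None  # element is missing on this page; the caller decides what to do
    return element.get_text(strip=True)


def parse_page(html_text):
    """Extract the fields from the question; missing ones become None."""
    soup = BeautifulSoup(html_text, 'html.parser')
    return {
        'title': text_or_none(soup, 'div', 'title'),
        'normal_price': text_or_none(soup, 'div', 'amount public'),
        'disc_information': text_or_none(soup, 'div', 'alert alert-info global-promo'),
    }


# Made-up page without the promo block, to show the missing-element case
sample = '<div class="title"> Torch </div><div class="amount public">100</div>'
print(parse_page(sample))
```

Because every key is always present in the returned dictionary, a page without a promo block simply yields `None` for `disc_information` and parsing continues with the next field.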