Python ValueError: dict contains fields not in fieldnames - Webscraping
Starting from a list of keywords, I want to query a web page and extract the tags for each item in the list. The results should be saved to a CSV file. However, I get this error: ValueError: dict contains fields not in fieldnames. My understanding is that the dictionary produced by the query contains more keys than fieldnames, although the number of keys in the dictionary varies from one list item to another. The code is as follows:
import csv
from selenium import webdriver
from time import sleep
from parsel import Selector
from selenium.webdriver.common.keys import Keys
from collections import defaultdict
####### reading from the input file ##########
columns = defaultdict(list) # each value in each column is appended to a list
# get the list of keywords from the csv file
with open('query.csv', 'r') as csvfile:
    reader = csv.DictReader(csvfile) # read rows into a dictionary format
    for row in reader: # read a row as {column1: value1, column2: value2,...}
        for (k, v) in row.items(): # go over each column name and value
            columns[k].append(v) # append the value into the appropriate list
# the list containing all of the keywords
search_query_list = columns['Keyword']
########## start scraping ###############
rb_results = []
# create a driver and let it open google chrome
driver = webdriver.Chrome("chromedriver")
# get linkedin website
driver.get('https://www.redbubble.com/')
sleep(0.5)
for i in range(len(search_query_list)):
    next_query = search_query_list[i]
    # get RB website
    driver.get('https://www.redbubble.com/')
    # get the search by its id
    search_bar = driver.find_element_by_name("query")
    sleep(0.5)
    # enter the query to the search bar
    search_bar.send_keys(next_query)
    # press enter
    search_bar.send_keys(Keys.RETURN)
    sleep(1)
    # from parsel's selector get the page source
    sel1 = Selector(text=driver.page_source)
    sleep(0.5)
    # first shirt //
    continue_link = driver.find_element_by_class_name('shared-components-ShopSearchSkeleton-ShopSearchSkeleton__composedComponentWrapper--1s_CI').click()
    sleep(1)
    sel2 = Selector(text=driver.page_source)
    sleep(0.5)
    ################## get TAGS ###############
    # Check tags for all products
    try:
        # get the tags for the search query
        tags_rb = driver.find_element_by_class_name("shared-components-Tags-Tags__listContent--oLdDf").extract_first()
        # if number of products is found print it and search for the prime
        # print the number of products found
        if tags_rb == None:
            rb_results.append("0")
        else:
            rb_results = str(tags_rb)
    except:
        rb_results.append("error")
###### writing part ########
with open ("rb_results.csv","w", newline='') as resultFile:
    writer = csv.DictWriter(resultFile, fieldnames=["Rb Results"],delimiter='\t')
    writer.writeheader()
    writer.writerows({'Rb Tags': item} for item in rb_results)
resultFile.close()
How can I resolve this error message: ValueError: dict contains fields not in fieldnames: 'Rb Tags'
Thanks a lot.
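The error can be reproduced without Selenium, which makes the cause easier to see. The following is a minimal sketch, not the asker's actual script: the helper `write_rows` and the sample values in `results` are made up for illustration, and the CSV is written to an in-memory buffer instead of rb_results.csv.

```python
import csv
import io


def write_rows(rows, fieldnames):
    """Write dict rows to an in-memory tab-separated CSV and return its text.

    Hypothetical helper for this illustration only.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up stand-ins for the scraped values.
results = ["tag one", "0", "error"]

# The row key "Rb Tags" is not listed in fieldnames, so DictWriter raises.
try:
    write_rows([{"Rb Tags": item} for item in results], fieldnames=["Rb Results"])
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)

# Using the same key in the rows and in fieldnames writes the rows normally.
matching_output = write_rows(
    [{"Rb Results": item} for item in results], fieldnames=["Rb Results"]
)

print(mismatch_error)
print(matching_output)
```

The number of rows does not matter here: DictWriter checks the keys of every row dict against fieldnames and, by default, raises as soon as it meets a key that is not listed.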
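The keys used in writerows need to be listed somewhere in fieldnames. Right now fieldnames=["Rb Results"], and "Rb Tags" is not in that list. Assuming you only need one key, you should replace "Rb Tags" with "Rb Results" (or the other way around).
Thanks a lot! It worked! The problem I am facing now is that it should extract all the tags with the class shared-components-Tags-Tags__listContent--oLdDf (from the website). Instead, I see that it always ends up in the last branch, i.e. except: rb_results.append("error"). Maybe this is a newbie question, since I am fairly new to this, but how can I make sure the CSV captures all the tags contained in that class name?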
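One likely cause of the except branch always running, offered as a guess and not a confirmed diagnosis: in the question's code, find_element_by_class_name returns a single Selenium WebElement, while extract_first() is a method of parsel selectors. Calling it on a WebElement raises AttributeError, and the bare except: turns that into "error" (a missing element would end up in the same branch). The sketch below illustrates this with `FakeElement`, a hypothetical stand-in that mimics only the `.text` attribute of a WebElement, so it runs without a browser; `collect_tags` is likewise an illustrative helper.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement.

    Like a real WebElement it exposes .text and has no extract_first().
    """

    def __init__(self, text):
        self.text = text


def collect_tags(elements):
    """Return the stripped text of every element, skipping empty ones."""
    return [el.text.strip() for el in elements if el.text.strip()]


# Calling a parsel-style method on an element object fails; catching
# Exception by name shows which error a bare "except:" would have hidden.
element = FakeElement("cat")
try:
    element.extract_first()
    failure = None
except Exception as exc:
    failure = type(exc).__name__

# Reading .text from a list of elements collects every tag instead of one.
elements = [FakeElement("cat"), FakeElement(" funny "), FakeElement("")]
tags = collect_tags(elements)

print(failure)
print(tags)
```

With a real driver, the list of elements would come from the plural lookup, driver.find_elements_by_class_name(...) in the Selenium version the question uses, or driver.find_elements(By.CLASS_NAME, ...) in Selenium 4. Each run's tags would then be appended to rb_results, since assigning rb_results = str(tags_rb) replaces the list with a string.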