Python Selenium: how do I store data in specific columns of a CSV file?
python, csv, web-scraping
I have two print outputs that I want to write to a single CSV file, in column A and column B. My problem is that when I print both at the end (the first and the second print together), I get only one element, repeated many times, I guess because it is not inside the loop.
print((text), (link[0:-9]))
Result:
LMFCIIC PWFERT-BK
LMFCIIC PMFEP-BK
LMFCIIC LMF8CC-BL
LMFCIIC PMFEP-GY
LMFCIIC ASPCP-NV
LMFCIIC LWBASK-PK
LMFCIIC LWBATA-PK
LMFCIIC LWBATOP-PK
LMFCIIC LMF8CC-RD
My first print looks like this; I want to write it to column A:
PWFERT-BK
PMFEP-BK
LMF8CC-BL
PMFEP-GY
ASPCP-NV
LWBASK-PK
LWBATA-PK
LWBATOP-PK
LMF8CC-RD
My second print looks like this; I want to write it to column B:
LMFCIIC
LWBASK
LWBATA
LWBATOP
LMFCIIC
This is my full code:
from bs4 import BeautifulSoup
from selenium import webdriver
import html5lib
import time
import requests

driver_path = '/usr/local/bin/chromedriver 2'
driver = webdriver.Chrome(driver_path)
driver.implicitly_wait(10)

driver.get('https://www.tenniswarehouse-europe.com/zzz/producttracker_bl.html?ccode=SWIMG030')
try:
    iframe = driver.find_elements_by_tag_name('iframe')
    for i in range(0, len(iframe)):
        f = driver.find_elements_by_tag_name('iframe')[i]
        driver.switch_to.frame(i)
        # your work to extract link
        text = driver.find_element_by_tag_name('body').text
        text = text.replace("Code: ","")
        text = text.replace("No Copy Images to TW Server","")
        print(text)
        driver.switch_to_default_content()
finally:
    driver.quit()

resp = requests.get('https://www.tenniswarehouse-europe.com/zzz/producttracker_bl.html?ccode=SWIMG030')
soup = BeautifulSoup(resp.text,"lxml")
for frame in soup.findAll('img'):
    link = (frame['src'])
    link = link.split('=')[1]
    print ((link[0:-9]))
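- I used www.example.com because my network cannot access the link.
When you call driver.switch_to.frame(i), you are basically accessing the iframe's HTML element. As with a normal HTML page, you can access its inner elements too.
From your previous question, the iframe looks like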
<body>
<a href="http://www.test2.com" target="_blank">
<img src="https://img2.test2.com/LWBAD-1.jpg"></a>
<br/>Code: LWBAD
You can read the image's src and store it in the CSV file.
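As an offline illustration of what sits inside one of these frames, the snippet above can be parsed with the standard library alone. This is only a sketch, not the Selenium approach used in this answer: the FrameParser class is a hypothetical helper, and the closing </body> tag is added here so the fragment is complete.

```python
from html.parser import HTMLParser


class FrameParser(HTMLParser):
    """Hypothetical helper: collects the first <img> src and the visible text."""

    def __init__(self):
        super().__init__()
        self.img_src = None
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Remember the src attribute of the image inside the frame.
        if tag == "img" and self.img_src is None:
            self.img_src = dict(attrs).get("src")

    def handle_data(self, data):
        # Keep only non-empty text nodes such as "Code: LWBAD".
        if data.strip():
            self.text_parts.append(data.strip())


# The iframe body quoted above, with a closing </body> added (assumption).
frame_html = """<body>
<a href="http://www.test2.com" target="_blank">
<img src="https://img2.test2.com/LWBAD-1.jpg"></a>
<br/>Code: LWBAD
</body>"""

parser = FrameParser()
parser.feed(frame_html)
parser.close()

# Same clean-up the answer applies to the body text.
code = " ".join(parser.text_parts).replace("Code: ", "")
img_src = parser.img_src
print(code, img_src)
```

The two values, code and img_src, are what end up in the two CSV columns.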
Code:
from bs4 import BeautifulSoup
from selenium import webdriver
import html5lib
import time
import requests
import csv

driver_path = '/usr/local/bin/chromedriver 2'
driver = webdriver.Chrome(driver_path)
driver.implicitly_wait(10)

driver.get('https://www.example.com')

# find_elements_by_tag_name is the Selenium 3 API used throughout this thread
iframe = driver.find_elements_by_tag_name('iframe')
images = driver.find_elements_by_tag_name('img')

with open('file_name.csv', 'w', newline='') as csvfile:
    field_names = ['text', 'src']
    writer = csv.DictWriter(csvfile, fieldnames=field_names)
    writer.writeheader()  # header row: text,src

    for i in range(0, len(iframe)):
        f = driver.find_elements_by_tag_name('iframe')[i]
        img_src = images[i].get_attribute('src')

        # do the src splitting here
        img_src = img_src.split('=')[1]

        driver.switch_to.frame(i)
        text = driver.find_element_by_tag_name('body').text
        text = text.replace("Code: ", "")
        text = text.replace("No Copy Images to TW Server", "")
        print(text)

        # write both values in the same iteration so they share one row
        writer.writerow({'text': text, 'src': img_src})
        driver.switch_to.default_content()

driver.quit()
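If the two outputs are produced by separate loops, as in the question, one option is to collect each into a list and pair them up afterwards. The following is a minimal sketch under that assumption, using a few sample values from the question; the function name write_two_columns, the file name two_columns.csv and the header names are hypothetical.

```python
import csv
from itertools import zip_longest


def write_two_columns(path, column_a, column_b, header=("code", "src")):
    """Write two lists side by side as columns A and B of a CSV file."""
    with open(path, "w", newline="") as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(header)
        # zip_longest pads the shorter list with '' so no value is dropped.
        for a, b in zip_longest(column_a, column_b, fillvalue=""):
            writer.writerow([a, b])


# Sample values copied from the two outputs shown in the question.
codes = ["PWFERT-BK", "PMFEP-BK", "LMF8CC-BL"]
links = ["LMFCIIC", "LWBASK", "LWBATA"]

write_two_columns("two_columns.csv", codes, links)
```

Printing text and link after the loops have finished only shows the last value each variable held, which may explain the repeated element; appending to a list inside each loop keeps every value.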
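You can use the pandas library; it has a method, df.to_csv("filename.csv"), that you can use to save the data to a CSV file.
Don't print to CSV by hand. Use the existing tools instead of reinventing the wheel.
@zvone - would you mind telling me where I should change the code so that I am not reinventing the wheel? Could anyone point me in the right direction, please?
@Andie31 please fix the indentation in the last 4 lines of the code. If you want a further explanation of the csv writer, you can refer to my previous answer.
I tried to piece the first part of the code together and ran into a problem :( AttributeError: module 'csv' has no attribute 'DictWriter' on line 15. Would you mind checking it?
Use import csv.
I did :(
You forgot to include iframe = driver.find_elements_by_tag_name('iframe').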