Python: update values in an existing Excel column
I have an Excel file like this. I am using this code to scrape new data from the web for columns D and E:
import csv
import time
from selenium.webdriver.chrome.options import Options
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
urls = ['https://www.linkedin.com/in/felipe-fs',
        'https://www.linkedin.com/in/lucascacao',
        'https://www.linkedin.com/in/silvia-florido-107a2355',
        'https://www.linkedin.com/in/alesillva',
        'https://www.linkedin.com/in/marcellogpassos',
        'https://www.linkedin.com/in/ana-luiza-fidelis-de-sousa',
        'https://www.linkedin.com/in/thiagoanjos',
        'https://www.linkedin.com/in/eduardoneves',
        'https://www.linkedin.com/in/gabriel-de-santana-weizenmann-73aab7116',
        'https://www.linkedin.com/in/felipebluiz']
header_added = False
timestr = time.strftime("%Y%m%d-%H%M%S")
chrome_options = Options()
chrome_options.add_experimental_option("useAutomationExtension", False)
chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
chrome_options.add_argument("user-data-dir=C:\\Users\\user\\AppData\\Local\\Google\\Chrome\\User Data")
driver = webdriver.Chrome(executable_path='C:/chromedriver.exe', options=chrome_options)
driver.maximize_window()
for url in urls:
    driver.get(url)
    try:
        n = False
        # Current company, taken from the top card of the profile
        company = driver.find_element_by_xpath('//ul[@class="pv-top-card--experience-list"]')
        if not n:
            res = company.text
            n = True
        # Scroll down so the experience section gets loaded
        html = driver.find_element_by_tag_name('html')
        for i in range(2):
            html.send_keys(Keys.PAGE_DOWN)
            time.sleep(3)
        # The first h4 element holds the duration of the most recent job
        experience = driver.find_elements_by_tag_name('h4')
        duration = experience[0].text
        # Append one CSV row, leaving columns A-C blank and filling D and E
        with open('test.xlxs', 'a', encoding='utf-8-sig') as f:
            w = csv.writer(f)
            w.writerow(['', '', '', res, duration])
    except:
        print("User has an older job")
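Columns D and E only hold data for 4-5 rows, so that data can be overwritten. How can I update these columns with the newly scraped data? I have seen many answers, but I am not sure how to build a DataFrame containing the scraped data. I need to read the URLs from the same file, scrape the data, and then add it to the existing columns. The URL list in the code is only there for testing.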
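Edit: I looked around and decided it would be more convenient to simply convert the xlsx file to CSV, so I installed xlrd version 1.2 and that did the job.
Input: the URL string and the new data for columns D and E. Output: the matching row is updated.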
from openpyxl import load_workbook

def update(url, dataDcol, dataEcol):
    try:
        # Load the workbook and take its active sheet
        wk = load_workbook('oldfile.xlsx')
        wh = wk.active
        # Walk down column C until the cell holding this URL is found
        for cell in wh['C']:
            if cell.value == url:
                # Overwrite columns D and E in the same row
                wh['D{}'.format(cell.row)] = dataDcol
                wh['E{}'.format(cell.row)] = dataEcol
                break
        wk.save('oldfile.xlsx')
        wk.close()
    except Exception as e:
        print('error :' + str(e))
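The function above reopens and saves the workbook on every call, which can get slow when there are many URLs. A possible variation, sketched here as an assumption rather than as part of the original answer, reads the URLs straight from column C, updates D and E in a single pass, and saves once. The name `update_all` and the `scrape` callback (which takes a URL and returns a `(company, duration)` pair, or `None` when nothing was found) are hypothetical.

```python
from openpyxl import load_workbook


def update_all(path, scrape):
    """Fill columns D and E for every row that has a URL in column C.

    `scrape` is a hypothetical callable: url -> (company, duration) or None.
    Returns the number of rows that were updated.
    """
    wb = load_workbook(path)
    ws = wb.active
    updated = 0
    # Columns 3..5 are C, D, E; each iteration yields one row's three cells.
    for url_cell, d_cell, e_cell in ws.iter_rows(min_col=3, max_col=5):
        url = url_cell.value
        # Skip the header row and any row without a URL
        if not isinstance(url, str) or not url.startswith('http'):
            continue
        result = scrape(url)
        if result is None:
            continue
        d_cell.value, e_cell.value = result
        updated += 1
    wb.save(path)
    return updated
```

In the scraper from the question, `scrape` would wrap the Selenium steps and return `(res, duration)`, so the hard-coded `urls` list would no longer be needed.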
I think Excel is a big pain to handle in Python. I converted the Excel file to CSV, and I think I will work with CSV from now on. Thanks for your answer.
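For the CSV route, a minimal sketch using only the standard library might look like the following. It assumes the same layout as the spreadsheet (URL in column C, results in columns D and E). The function name `update_csv` and the `scrape` callback are hypothetical and not part of the original post.

```python
import csv

# Zero-based indexes of columns C, D and E (assumed layout)
URL_COL, D_COL, E_COL = 2, 3, 4


def update_csv(path, scrape):
    """Rewrite columns D and E of each row whose column C holds a URL.

    `scrape` is a hypothetical callable: url -> (company, duration) or None.
    Returns the list of rows as written back to the file.
    """
    with open(path, newline='', encoding='utf-8-sig') as f:
        rows = list(csv.reader(f))

    for row in rows:
        # Skip the header and any row without a URL in column C
        if len(row) <= URL_COL or not row[URL_COL].startswith('http'):
            continue
        result = scrape(row[URL_COL])
        if result is None:
            continue
        # Pad short rows so that columns D and E exist
        row.extend([''] * (E_COL + 1 - len(row)))
        row[D_COL], row[E_COL] = result

    # Write everything back in one go, overwriting the old file
    with open(path, 'w', newline='', encoding='utf-8-sig') as f:
        csv.writer(f).writerows(rows)
    return rows
```

Reading the whole file into memory and rewriting it keeps the code short. For a sheet with only a handful of rows, as described in the question, that should be acceptable.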