Python 3.x: Extracting all elements of a class in Python
To extract the first element of the class, I did the following:
if var_source == "Image":
    outcsvfile = 'Image_Ids' + file + '_' + timestamp +'.csv'
    with open(outcsvfile, 'w', encoding='utf-8', newline='') as csvfile:
        csv_writer = csv.writer(csvfile)
        csv_writer.writerow(['ax','physical_id'])
    for i in range(len(var_ax)):
        browser.get('https://test.com' + str(mpid) + '&ax=' + var_ax[i])
        self.master.update()
        self.status.config(text = str(i+1) + "/" + str(len(var_ax)) + " Extracting AX: " + var_ax[i])
        try:
            ph_id = browser.find_element_by_xpath("//div[contains(@class, 'a-image-wrapper')]").get_attribute("alt")
            print(i+1,': extract AX:',var_ax[i])
            with open(outcsvfile, 'a+', encoding='utf-8', newline='') as csvfile:
                csv_writer = csv.writer(csvfile)
                csv_writer.writerow([var_ax[i],ph_id])
        except:
            print(i+1,': extract AX:',var_ax[i])
            with open(outcsvfile, 'a+', encoding='utf-8', newline='') as csvfile:
                csv_writer = csv.writer(csvfile)
                csv_writer.writerow([var_ax[i],'[missing AX]'])
I have two problems:
<div alt="51d5gBEzhjL" style="width:220px;float:left;margin-left:34px;margin-bottom:10px;border:1px solid #D0D0D0" class="a-image-wrapper a-lazy-loaded MAIN GLOBAL 51d5gBEzhjL">
    <h1 class="a-size-medium a-spacing-mini a-spacing-top-mini a-color-information a-text-center a-text-bold">MAIN</h1>
    <h1 class="a-size-base a-spacing-mini a-spacing-top-mini a-color-information a-text-center a-text-bold"> ou GLOBAL / Merch 1</h1>
    <span class="a-declarative" data-action="a-modal">
        <center>
            <img class="ecx" id="51S+wTs36zL" src="https://test.com/images/I/51S+wTs36zL._AA200_.jpg" alt="51S+wTs36zL">
        </center>
    </span>
    <h5 class="physical-id">51S+wTs36zL</h5>
    <h1 class="a-size-medium a-spacing-mini a-spacing-top-mini a-color-information a-text-center a-text-bold" style="background:#D0D0D0">UPLOADED</h1>
    <h1 class="a-size-base a-spacing-mini a-spacing-top-mini a-color-information a-text-center a-text-bold">19/Apr/2016:17:45:40</h1>
</div>
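The difference between matching the first element and matching every element with a given class can be checked offline against a sample like the one above. The following is only a rough sketch that uses the standard-library `html.parser` instead of Selenium, so it does not drive a browser. The names `ClassTextCollector` and `texts_by_class`, the shortened sample markup, and the second ID `41abcEXAMPLE` are hypothetical and exist purely for illustration.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassTextCollector(HTMLParser):
    """Collect the text of every element whose class list contains target_class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.texts = []   # one entry per matching element
        self._depth = 0   # > 0 while we are inside a matching element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1          # nested tag inside a match
        elif self.target_class in classes:
            self._depth = 1           # a new match starts here
            self.texts.append("")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.texts[-1] += data


def texts_by_class(html, target_class):
    """Return the stripped text of all elements carrying target_class."""
    parser = ClassTextCollector(target_class)
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.texts]


# Hypothetical, shortened markup with two image wrappers.
sample = """
<div class="a-image-wrapper"><h5 class="physical-id">51S+wTs36zL</h5></div>
<div class="a-image-wrapper"><h5 class="physical-id">41abcEXAMPLE</h5></div>
"""

ids = texts_by_class(sample, "physical-id")
print(ids, len(ids))
```

Taking `ids[0]` corresponds to reading only the first match, while the full list and its length correspond to collecting every ID and counting the images.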
This worked for me and solved both of my problems:
if var_source == "Image":
    outcsvfile = 'Image_Ids-' + file + '_' + timestamp +'.csv'
    # Create the output file and write the header row once.
    with open(outcsvfile, 'w', encoding='utf-8', newline='') as csvfile:
        csv_writer = csv.writer(csvfile)
        csv_writer.writerow(['ax','physical_id','image_count'])
    for i in range(len(var_ax)):
        browser.get('https://test.com' + str(mpid) + '&ax=' + var_ax[i])
        self.master.update()
        self.status.config(text = str(i+1) + "/" + str(len(var_ax)) + " Extracting AX: " + var_ax[i])
        try:
            # Raises if the page has no image wrapper, which triggers the except branch.
            ph_id = browser.find_element_by_xpath("//div[contains(@class, 'a-image-wrapper')]").get_attribute("alt")
            # find_elements (plural) returns every match, not just the first one.
            ids1 = browser.find_elements_by_class_name("physical-id")
            ids1Text = []
            for a in ids1:
                ids1Text.append(a.text)
            nr = str(len(ids1))         # number of images found
            ax = ', '.join(ids1Text)    # all physical IDs in a single cell
            print(i+1,': extract AX:',var_ax[i])
            with open(outcsvfile, 'a+', encoding='utf-8', newline='') as csvfile:
                csv_writer = csv.writer(csvfile)
                csv_writer.writerow([var_ax[i], ax, nr])
        except:
            print(i+1,': extract AX:',var_ax[i])
            with open(outcsvfile, 'a+', encoding='utf-8', newline='') as csvfile:
                csv_writer = csv.writer(csvfile)
                csv_writer.writerow([var_ax[i],'[missing AX]'])
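The row-building part of this approach (joining every ID into one cell and storing the count next to it) can be tried without a browser. This is a minimal sketch, not the code from the answer: `build_row`, `write_rows`, the AX values and the second ID are hypothetical. It also assumes that an empty list of IDs should be written as a `[missing AX]` row with a count of 0, so that every row has the same three columns as the header.

```python
import csv
import io

HEADER = ["ax", "physical_id", "image_count"]
MISSING = "[missing AX]"


def build_row(ax, physical_ids):
    """Return one CSV row: the AX, the comma-joined IDs, and how many were found."""
    if not physical_ids:
        # Assumption: no IDs means the AX is reported as missing.
        return [ax, MISSING, 0]
    return [ax, ", ".join(physical_ids), len(physical_ids)]


def write_rows(csvfile, results):
    """Write the header and one row per (ax, physical_ids) pair to an open file."""
    writer = csv.writer(csvfile)
    writer.writerow(HEADER)
    for ax, physical_ids in results:
        writer.writerow(build_row(ax, physical_ids))


# Hypothetical results, as they might come back from the scraping loop.
results = [
    ("AX1", ["51S+wTs36zL", "41abcEXAMPLE"]),
    ("AX2", []),
]

buffer = io.StringIO(newline="")
write_rows(buffer, results)
print(buffer.getvalue())
```

Writing through a single open file object, instead of reopening the file in append mode for each row, is a design choice of this sketch. Passing a real file opened with `newline=''` in place of the `io.StringIO` buffer should work the same way.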
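How about sharing the URL or a representative HTML sample?

Added. I don't know what the -1 is for.

Anyway, here is a pro tip: never share images of code (here or anywhere else, including with colleagues), because they are absolutely useless and hugely irritating. That said, remove the image and paste the HTML as plain text. P.S.: the downvote was most likely for the image of code.

I had already uploaded the image before I saw the -1. I have edited the post and uploaded the source to pastebin (it is too big for my post).

If your code is too big to post in the question, then it is not a minimal representative example. Please take a look at that so that we can help you better.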