Python 3.x: Extracting all elements of a class in Python
Tags: python-3.x, selenium, csv, web-scraping

To extract the first element of the class, I did the following:

if var_source == "Image":
    outcsvfile = 'Image_Ids' + file + '_' + timestamp +'.csv'
    with open(outcsvfile, 'w', encoding='utf-8', newline='') as csvfile:
        csv_writer = csv.writer(csvfile)
        csv_writer.writerow(['ax','physical_id'])
    for i in range(len(var_ax)):    
        browser.get('https://test.com' + str(mpid) + '&ax=' + var_ax[i])
        self.master.update()
        self.status.config(text = str(i+1) + "/" + str(len(var_ax)) + " Extracting AX: " + var_ax[i])
        try:
            ph_id = browser.find_element_by_xpath("//div[contains(@class, 'a-image-wrapper')]").get_attribute("alt")
            print(i+1,': extract AX:',var_ax[i])
            with open(outcsvfile, 'a+', encoding='utf-8', newline='') as csvfile:
                csv_writer = csv.writer(csvfile) 
                csv_writer.writerow([var_ax[i],ph_id])
        except:
            print(i+1,': extract AX:',var_ax[i])
            with open(outcsvfile, 'a+', encoding='utf-8', newline='') as csvfile:
                csv_writer = csv.writer(csvfile) 
                csv_writer.writerow([var_ax[i],'[missing AX]'])
I have two questions:

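  • How can I extract all the physical IDs into the same cell, separated by commas (cell B2 = "physical_id1, physical_id2, physical_id3")?
  • How can I put the total number of exported physical IDs in column C (for example, C2 would hold 3, because three physical IDs were exported into B2)?

Source code: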

    <div alt="51d5gBEzhjL" style="width:220px;float:left;margin-left:34px;margin-bottom:10px;border:1px solid #D0D0D0" class="a-image-wrapper a-lazy-loaded MAIN GLOBAL 51d5gBEzhjL">
    <h1 class="a-size-medium a-spacing-mini a-spacing-top-mini a-color-information a-text-center a-text-bold">MAIN</h1>
    <h1 class="a-size-base a-spacing-mini a-spacing-top-mini a-color-information a-text-center a-text-bold"> ou GLOBAL / Merch 1</h1>
    </div>
    <h1 class="a-size-medium a-spacing-mini a-spacing-top-mini a-color-information a-text-center a-text-bold">FACT</h1>
    <h1 class="a-size-base a-spacing-mini a-spacing-top-mini a-color-information a-text-center a-text-bold"> ou GLOBAL / Merch 1</h1>
    <span class="a-declarative" data-action="a-modal">
    <center>
    <img class="ecx" id="51S+wTs36zL" src="https://test.com/images/I/51S+wTs36zL._AA200_.jpg" alt="51S+wTs36zL">
    </center>
    </span>
    <h5 class="physical-id">51S+wTs36zL</h5>
    <h1 class="a-size-medium a-spacing-mini a-spacing-top-mini a-color-information a-text-center a-text-bold" style="background:#D0D0D0">UPLOADED</h1>
    <h1 class="a-size-base a-spacing-mini a-spacing-top-mini a-color-information a-text-center a-text-bold">19/Apr/2016:17:45:40</h1>
    </div>
    
    
    This worked for me and solved both of my problems:

        if var_source == "Image":
            outcsvfile = 'Image_Ids-' + file + '_' + timestamp + '.csv'
            # Create the CSV file and write the header row once.
            with open(outcsvfile, 'w', encoding='utf-8', newline='') as csvfile:
                csv_writer = csv.writer(csvfile)
                csv_writer.writerow(['ax', 'physical_id', 'image_count'])
            for i in range(len(var_ax)):
                browser.get('https://test.com' + str(mpid) + '&ax=' + var_ax[i])
                self.master.update()
                self.status.config(text = str(i+1) + "/" + str(len(var_ax)) + " Extracting AX: " + var_ax[i])
                try:
                    # Raises if the page has no image wrapper, which routes this AX to the except branch.
                    ph_id = browser.find_element_by_xpath("//div[contains(@class, 'a-image-wrapper')]").get_attribute("alt")
                    # Collect the text of every element with class "physical-id".
                    ids1 = browser.find_elements_by_class_name("physical-id")
                    ids1Text = []
                    for a in ids1:
                        ids1Text.append(a.text)
                    nr = str(len(ids1))        # column C: number of physical IDs found
                    ax = ', '.join(ids1Text)   # column B: all physical IDs in one cell
                    print(i+1, ': extract AX:', var_ax[i])
                    with open(outcsvfile, 'a+', encoding='utf-8', newline='') as csvfile:
                        csv_writer = csv.writer(csvfile)
                        csv_writer.writerow([var_ax[i], ax, nr])
                except Exception:
                    print(i+1, ': extract AX:', var_ax[i])
                    with open(outcsvfile, 'a+', encoding='utf-8', newline='') as csvfile:
                        csv_writer = csv.writer(csvfile)
                        csv_writer.writerow([var_ax[i], '[missing AX]'])
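
The join-and-count step can be tried without a browser. The following is only a minimal sketch, not the poster's code: `summarize_ids`, `write_row`, and the AX value `AX123` are hypothetical names introduced for illustration, and the sample list stands in for the `.text` values of the elements with class `physical-id`. Unlike the answer above, it strips whitespace and skips empty strings, so its count can differ from `len(ids1)` when an element has no text. Note also that recent Selenium releases removed the `find_elements_by_*` helpers; there the lookup would be `browser.find_elements(By.CLASS_NAME, "physical-id")`.

```python
import csv
import io


def summarize_ids(texts):
    """Return (comma-joined IDs, number of IDs) for a list of ID strings."""
    # Strip surrounding whitespace and drop empty entries (an assumption of
    # this sketch; the original answer keeps every element as-is).
    cleaned = [t.strip() for t in texts if t and t.strip()]
    return ", ".join(cleaned), len(cleaned)


def write_row(csv_writer, ax, texts):
    """Write one row: AX, all IDs in one cell (column B), ID count (column C)."""
    joined, count = summarize_ids(texts)
    csv_writer.writerow([ax, joined, count])


# Stand-in for the texts scraped from the "physical-id" elements.
sample_texts = ["51S+wTs36zL", " 51d5gBEzhjL ", ""]

# An in-memory buffer replaces the real CSV file for this demonstration.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(["ax", "physical_id", "image_count"])
write_row(writer, "AX123", sample_texts)  # "AX123" is a made-up AX value
print(buffer.getvalue())
```

Because the joined cell contains commas, `csv.writer` wraps it in double quotes, so a spreadsheet still reads it as a single cell in column B.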
    

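    How about sharing the URL or a representative HTML sample?
    Added.
    I don't know what the -1 is for, but in any case, here is a pro tip: never share images of code (here or anywhere else you have colleagues), because they are absolutely useless and epically annoying. That said, remove the image and paste the HTML as plain text. P.S.: the downvote was most likely for the image of code.
    I had uploaded the image before I saw the -1. I have edited the post and uploaded the source code to pastebin (it was too big for my post).
    If your code is too big to post in the question, then it is not a minimal representative example. Please take a look so that we can help you better.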