Python: How do I assign favorites to their workbooks?

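I have written my first Selenium script to practice web scraping with Python. The goal is to scrape all workbooks, views, and favorites from a Tableau Public profile. I managed to extract these three key variables, but I don't know how to assign the favorites to their respective workbooks, because not every workbook has at least one favorite.

For example, "Skyler on Broadway" has no favorites, but if I match workbooks and favorites in a dictionary, it gets the next available value, which is 4.

The f.text != "" filter only removes the empty values at the end of the list.

What is the best way to solve this?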

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import time

driver = webdriver.Chrome(executable_path=r',mypath')

driver.get("https://public.tableau.com/profile/skybjohnson#!/")

#load entire website:

while True:

   try:
       show_more = WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.ID, "load-more-vizzes")))
       driver.find_element_by_id("load-more-vizzes")
       driver.execute_script("window.scrollTo(0,document.body.scrollHeight)")
       WebDriverWait(driver, 5).until(EC.visibility_of_element_located((By.ID, "load-more-vizzes")))

   except Exception as e:
       print(e)
       break

#get workbook titles
titles = driver.find_elements_by_class_name("workbook-title")

workbook_titles = [i.text for i in titles if i.text != ""]
print(workbook_titles)

#get number of views per workbook
views = driver.find_elements_by_class_name('workbook-view-count')

workbook_views = [int(v.text.split()[0]) for v in views if v.text != ""]
print(workbook_views)

#get number of favourites per workbook
favs = driver.find_elements_by_xpath('//SPAN[@ng-bind="controller.workbook.numberOfFavorites"]')

workbook_favs = [f.text for f in favs if f.text != ""]
print(workbook_favs)
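
The misalignment described in the question can be reproduced without a browser. The following is a minimal sketch with made-up data: the workbook names, the counts, and the `records` structure are all hypothetical and only stand in for what the page might return. It shows why pairing two flat lists by position goes wrong when one workbook has no favorites element, and why keeping one record per workbook avoids the problem.

```python
# Hypothetical data: three workbooks, the middle one has no favorites element.
titles = ["Workbook A", "Workbook B", "Workbook C"]
favs_flat = ["7", "4"]  # the page simply omits the element for "Workbook B"

# Pairing flat lists by position shifts every value after the gap,
# and zip() silently drops the last title.
misaligned = dict(zip(titles, favs_flat))

# One record per workbook keeps the gap visible, so a default can be applied.
records = [
    {"title": "Workbook A", "favs": "7"},
    {"title": "Workbook B"},
    {"title": "Workbook C", "favs": "4"},
]
aligned = {r["title"]: int(r.get("favs", 0)) for r in records}

print(misaligned)
print(aligned)
```

In the flat-list version "Workbook B" inherits the value that belongs to "Workbook C", which matches the symptom in the question.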

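First, get all the vizzes, then read the title, views, and favorites from within each one. You also have to check whether the view count and the favorites element exist. The code below uses improved scrolling and gets the view count (0 if there are no views) and the favorites (0 if there are none) correctly: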

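# Assumes driver is the Chrome WebDriver created in the question's code.
wait = WebDriverWait(driver, 10)
with driver:  # the browser is closed when this block exits
    driver.get("https://public.tableau.com/profile/skybjohnson#!/")

    # Keep scrolling to the "load more" button until it is no longer displayed.
    wait.until(EC.presence_of_element_located((By.ID, "load-more-vizzes")))
    while driver.find_element_by_id("load-more-vizzes").is_displayed():
        driver.execute_script("arguments[0].scrollIntoView()", driver.find_element_by_id("load-more-vizzes"))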

    vizzes = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".viz-container li.media-viz")))
    for viz in vizzes:
        if not viz.is_displayed():
            continue

        title = viz.find_element_by_css_selector('[ng-bind="controller.workbook.title"]').text

        views_count_list = viz.find_elements_by_css_selector('[ng-bind="controller.workbook.viewCount"]')
        views_count = views_count_list[0].text if len(views_count_list) > 0 else 0

        number_of_favorites_list = viz.find_elements_by_css_selector('[ng-bind="controller.workbook.numberOfFavorites"]')
        number_of_favorites = number_of_favorites_list[0].text if len(number_of_favorites_list) > 0 else 0

        print(title, views_count, number_of_favorites)
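
Note that the loop above yields a string when a count is present and the integer 0 when it is missing. If uniform integers are wanted, the values could be normalized with a small helper. This is only a sketch: `parse_count` and `to_row` are hypothetical names, and the assumption that the text may look like "1,234 views" (a number, optional thousands separators, optional trailing word) is mine, not something confirmed by the page.

```python
def parse_count(text):
    """Return the leading integer in text such as '1,234 views', or 0 if there is none."""
    parts = text.split()
    if not parts:
        return 0
    digits = parts[0].replace(",", "")  # assumed thousands separator
    return int(digits) if digits.isdigit() else 0


def to_row(title, views_text="", favs_text=""):
    """Build one record per workbook, with missing counts defaulting to 0."""
    return {
        "title": title,
        "views": parse_count(views_text),
        "favorites": parse_count(favs_text),
    }


# Usage with made-up values in place of the element texts:
rows = [
    to_row("Workbook A", "1,234 views", "7"),
    to_row("Workbook B", "12 views"),  # no favorites element found
]
print(rows)
```

Inside the loop, `to_row(title, str(views_count), str(number_of_favorites))` would produce the same kind of record from the scraped values.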