Python: scraping a website that has some problems

python python-3.x selenium web-scraping automation

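I want to use Python (the requests or Selenium library) to scrape all of this author's articles and save them to PDF files.
However, when I click the "Show More" button at the bottom, it stops showing more articles after 8 clicks, so I cannot reach all of them (the idea was to automate Selenium to click the button until every article is displayed, and then scrape them all). Is there a workaround? Is there another way to access all the articles in chronological order and scrape them?
My idea was to check whether the links are loaded from some other source, but I don't know how to do that. I did manage to scrape the articles that are displayed.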

Thanks in advance.

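Use findElements and search for … ; this gives you a list of all the titles. Each time you click the "Show More" button, the list grows by about 10. So the idea is simply to keep clicking the button until the size of the list stops changing. Use a while loop. Something like: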

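import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;

// `driver` is an initialized WebDriver with the author's page already loaded.
By titleLocator = By.cssSelector("h2.css-1j9dxys.e1xfvim30");
By buttonLocator = By.xpath("//button[text()='Show More']");

List<WebElement> oldList;
List<WebElement> newList = driver.findElements(titleLocator);

// Click "Show More" until a click no longer adds titles to the list.
do {
    oldList = newList;
    driver.findElement(buttonLocator).click();
    // A wait for the new articles to load may be needed here.
    newList = driver.findElements(titleLocator);
} while (newList.size() != oldList.size());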

I may have some mistakes in the code, but the idea is there. Good luck!
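
Since the question asks for Python, here is a rough sketch of the same "click until the list stops growing" idea. The helper names `click_until_stable` and `load_all_titles`, the `max_clicks` safety cap, and the fixed `pause` are assumptions added for illustration; the selectors are the ones from the answer above and may not match the live page.

```python
import time


def click_until_stable(count_items, click_more, max_clicks=200, pause=0.0):
    """Call click_more() until count_items() stops growing.

    Returns the final item count. max_clicks is a safety cap (an assumption,
    not part of the original answer) so the loop cannot run forever.
    """
    old_count = count_items()
    for _ in range(max_clicks):
        click_more()
        if pause:
            time.sleep(pause)  # crude wait for the new items to load
        new_count = count_items()
        if new_count == old_count:
            break  # the last click added nothing, so stop
        old_count = new_count
    return old_count


def load_all_titles(driver, pause=2.0):
    """Wire the loop to Selenium; `driver` is an already-open WebDriver."""
    from selenium.webdriver.common.by import By  # imported only when used

    title_css = "h2.css-1j9dxys.e1xfvim30"         # selector from the answer
    button_xpath = "//button[text()='Show More']"  # locator from the answer

    click_until_stable(
        count_items=lambda: len(driver.find_elements(By.CSS_SELECTOR, title_css)),
        click_more=lambda: driver.find_element(By.XPATH, button_xpath).click(),
        pause=pause,
    )
    return [el.text for el in driver.find_elements(By.CSS_SELECTOR, title_css)]
```

Keeping the loop separate from the Selenium calls makes it easy to try out against a fake page first. If the button disappears once everything is loaded, `find_element` raises `NoSuchElementException`, which would need to be caught as another stop condition.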

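Try logging in. If you are logged in to NYTimes, you can click "Show More" more than 8 times.

Code trials, please? Can you post the code you have tried?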