Python BeautifulSoup is returning duplicate links

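I'm trying to learn how to scrape websites, so I'm deliberately not using the API. I'm scraping eBay search results, and my script prints every item URL twice. I did my due diligence and searched Google and Stack Overflow for help, but couldn't find a solution. Thanks in advance.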

driver.get('https://www.ebay.com/sch/i.html?_from=R40&_nkw=watches&_sacat=0&_pgn=' + str(i))
soup = BeautifulSoup(driver.page_source, 'lxml')
driver.maximize_window()

tempList = []

for link in soup.find_all('a', href=True):
    if 'itm' in link['href']:
        print(link['href'])
        tempList.append(link['href'])

Whole code:

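Just add the class name when you search for the links, so that only the anchors with the class s-item__link are matched. Hope this helps.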

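from bs4 import BeautifulSoup
from selenium import webdriver

# Assumption: the original post does not show how the driver is created
driver = webdriver.Chrome()

i = 1  # results page number
driver.get('https://www.ebay.com/sch/i.html?_from=R40&_nkw=watches&_sacat=0&_pgn=' + str(i))
soup = BeautifulSoup(driver.page_source, 'lxml')
driver.maximize_window()

tempList = []

# Match only anchors with the s-item__link class, so each listing is found once
for link in soup.find_all('a', class_='s-item__link', href=True):
    if 'itm' in link['href']:
        print(link['href'])
        tempList.append(link['href'])

print(len(tempList))
driver.quit()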
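
To see why the class filter helps, here is a minimal offline sketch. The HTML below is simplified, made-up markup (not eBay's real page), assuming each listing has an image link and a title link pointing to the same item, and that only the title link carries the s-item__link class. It uses the built-in html.parser so it runs without lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup: two listings, each with an image link and
# a title link to the same item URL. Only the title link has s-item__link.
html = """
<ul>
  <li>
    <div class="s-item__image"><a href="https://www.ebay.com/itm/111"><img src="a.jpg"></a></div>
    <a class="s-item__link" href="https://www.ebay.com/itm/111">Watch A</a>
  </li>
  <li>
    <div class="s-item__image"><a href="https://www.ebay.com/itm/222"><img src="b.jpg"></a></div>
    <a class="s-item__link" href="https://www.ebay.com/itm/222">Watch B</a>
  </li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# Unfiltered search: every anchor whose href contains 'itm' (image + title links)
all_links = [a["href"] for a in soup.find_all("a", href=True) if "itm" in a["href"]]

# Filtered search: only anchors with the s-item__link class
item_links = [a["href"] for a in soup.find_all("a", class_="s-item__link", href=True)]

print(len(all_links))   # 4
print(len(item_links))  # 2
```

The unfiltered search sees each item twice, while the filtered one sees it once. If the site changes its class names, the filter will silently match nothing, so it is worth checking the count after each run.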

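I assume there are several "identical" links on the page (the image and the text of a listing both link to the same item). Use a deduplicating collection, for example a set, to eliminate the duplicates.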
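
As a rough sketch of that idea: the helper name dedupe_keep_order and the sample URLs below are made up for illustration. A plain set removes duplicates but loses the page order; dict.fromkeys keeps the first occurrence of each URL in order.

```python
def dedupe_keep_order(urls):
    """Return the URLs with repeats removed, keeping first-seen order."""
    # dict keys are unique and preserve insertion order (Python 3.7+)
    return list(dict.fromkeys(urls))


# Sample data standing in for tempList from the question
temp_list = [
    "https://www.ebay.com/itm/111",
    "https://www.ebay.com/itm/111",
    "https://www.ebay.com/itm/222",
    "https://www.ebay.com/itm/222",
]

unique_links = dedupe_keep_order(temp_list)
print(unique_links)
print(len(set(temp_list)))  # 2
```

This only removes exact string matches. If two links to the same item differ in their query parameters, they would need to be normalized first.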