Python BeautifulSoup Returning Double Links - Fatal编程技术网


I am trying to learn how to scrape websites, so I am not using an API. I am trying to scrape eBay, and my script prints each URL twice. I did my due diligence searching Google/StackOverflow for help but could not find any solution. Thanks in advance.

from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome()  # assumes a chromedriver on PATH
driver.maximize_window()

i = 1  # page number (incremented in a loop in the full script)
driver.get('https://www.ebay.com/sch/i.html?_from=R40&_nkw=watches&_sacat=0&_pgn=' + str(i))
soup = BeautifulSoup(driver.page_source, 'lxml')

tempList = []

# Matches every anchor on the page, so each listing's image link
# and title link are both collected (hence the doubled URLs).
for link in soup.find_all('a', href=True):
    if 'itm' in link['href']:
        print(link['href'])
        tempList.append(link['href'])

The entire code:

Just add the class name when searching for the links. Hope this helps.

from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome()  # assumes a chromedriver on PATH
driver.maximize_window()

i = 1
driver.get('https://www.ebay.com/sch/i.html?_from=R40&_nkw=watches&_sacat=0&_pgn=' + str(i))
soup = BeautifulSoup(driver.page_source, 'lxml')

tempList = []

# Restricting the search to the listing-title anchors (class 's-item__link')
# skips the duplicate image anchors, so each item URL is collected once.
for link in soup.find_all('a', class_='s-item__link', href=True):
    if 'itm' in link['href']:
        print(link['href'])
        tempList.append(link['href'])

print(len(tempList))

I assume there are multiple "identical" links: each listing has both an image link and a text link pointing to the same item. Use a set (or similar) to eliminate the duplicates.
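A minimal sketch of that deduplication step, assuming the scrape has already produced a list of hrefs (the URLs below are placeholders): `dict.fromkeys` keeps only the first occurrence of each URL while preserving the page order, which a plain `set` would not.

```python
# Each eBay listing card yields the same href twice (image anchor +
# title anchor), so the scraped list contains back-to-back duplicates.
scraped = [
    "https://www.ebay.com/itm/111",  # image link
    "https://www.ebay.com/itm/111",  # title link, same item
    "https://www.ebay.com/itm/222",
    "https://www.ebay.com/itm/222",
]

# dict keys are unique and (since Python 3.7) insertion-ordered,
# so this drops repeats without shuffling the results.
unique_links = list(dict.fromkeys(scraped))
print(unique_links)
```

Running this prints the two distinct item URLs once each; the same one-liner can be applied to `tempList` after the scraping loop finishes.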