Python: scraping the top 10 links from Google search for a large number of records

python web-scraping


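I have 63,000 strings. I want to loop over all of them, search Google for each one, and store the top ten links in a new column, separated by ",". I tried the code below, but it is far too slow. Please suggest another way to scrape the links.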

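    try:
        from googlesearch import search
    except ImportError:
        # Stop here; continuing would fail with a NameError on `search`.
        raise SystemExit("No module named 'google' found")

    query = "xyz"
    # num=10: results per page; stop=10: end after ten links;
    # pause=5: seconds to wait between HTTP requests.
    for j in search(query, tld="co.in", num=10, stop=10, pause=5):
        print(j)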

Try this:

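    try:
        from googlesearch import search
    except ImportError:
        # Stop here; continuing would fail with a NameError on `search`.
        raise SystemExit("No module named 'google' found")

    query = "xyz"
    # Same search as in the question, with pause lowered from 5 to 1 second.
    for j in search(query, tld="co.in", num=10, stop=10, pause=1):
        print(j)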

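"pause (float) - Lapse to wait between HTTP requests. A lapse too long will make the search slow, but a lapse too short may cause Google to block your IP. Your mileage may vary!"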
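For the "new column" part of the question, here is a minimal sketch of collecting the first ten links per string and joining them with ",". The helper name `collect_top_links` and its `search_fn` parameter are hypothetical, introduced only for illustration; `search_fn` stands in for a call such as `search(q, tld="co.in", num=10, stop=10, pause=1)`. Skipping repeated strings is an assumption about the data: it only saves time if the 63,000 strings contain duplicates.

```python
from typing import Callable, Dict, Iterable, List


def collect_top_links(
    queries: Iterable[str],
    search_fn: Callable[[str], Iterable[str]],
    top_n: int = 10,
) -> Dict[str, str]:
    """Map each distinct query to its first `top_n` links joined by ','.

    `search_fn` is a hypothetical stand-in for the real search call; it is
    expected to do its own pausing between HTTP requests.
    """
    results: Dict[str, str] = {}
    for query in queries:
        if query in results:
            continue  # already searched: do not spend another request on it
        links: List[str] = []
        for url in search_fn(query):
            links.append(url)
            if len(links) >= top_n:
                break  # stop consuming results once we have enough
        results[query] = ",".join(links)
    return results


# Possible usage with the library from the question (not executed here):
#   from googlesearch import search
#   column = collect_top_links(
#       strings, lambda q: search(q, tld="co.in", num=10, stop=10, pause=1)
#   )
```

The returned dictionary can then be mapped back onto the original rows, so each row gets the joined links for its string.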

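What does "pause=5" do? To speed up the code, reduce this value.

Thank you very much for your answer, but because of the size of the data, scraping takes a very long time even with a pause of 1.

Could you try removing the pause, or reducing it further?

I tried setting the pause to 0.5, but after some iterations I got a 503: Service Unavailable error.

How fast does it run when you use 1? Also, how fast do you want it to be?

I think scraping all 63,000 strings would take about 70 hours.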
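The 70-hour figure corresponds to about 4 seconds per string (63,000 × 4 s = 252,000 s = 70 h). The 503 errors at shorter pauses suggest that lowering the pause further mostly trades speed for blocked requests. Below is a rough sketch of both points: a time estimate, and a retry wrapper that waits longer after each 503 instead of aborting the whole run. The names `estimate_hours` and `search_with_backoff`, the 30-second base delay, and the retry count are all hypothetical choices. It also assumes the 503 surfaces as `urllib.error.HTTPError`, which may not hold for every version of the library, and backing off does not guarantee that Google stops blocking the IP.

```python
import time
import urllib.error
from typing import Callable, Iterable, List


def estimate_hours(n_queries: int, seconds_per_query: float) -> float:
    """Back-of-the-envelope total run time in hours."""
    return n_queries * seconds_per_query / 3600.0


def search_with_backoff(
    query: str,
    search_fn: Callable[[str], Iterable[str]],
    top_n: int = 10,
    max_retries: int = 4,
    base_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Return up to `top_n` links, retrying with growing waits on HTTP 503.

    `search_fn` is a hypothetical stand-in for the real search call.
    """
    attempt = 0
    while True:
        try:
            links: List[str] = []
            for url in search_fn(query):
                links.append(url)
                if len(links) >= top_n:
                    break
            return links
        except urllib.error.HTTPError as err:
            # Only retry on 503, and give up after max_retries retries.
            if err.code != 503 or attempt >= max_retries:
                raise
            sleep(base_delay * 2 ** attempt)  # 30 s, 60 s, 120 s, ...
            attempt += 1
```

Saving results to disk as they arrive would additionally let a long run resume after an interruption instead of starting over.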