Making a Web Spider in Python with BeautifulSoup4 and urllib2 - Fatal编程技术网

Making a Web Spider in Python with BeautifulSoup4 and urllib2


Here is the code I have so far:

import urllib2
from bs4 import BeautifulSoup  # this import was missing in the original

c_s_l = ["anchorage", "fairbanks", "kenai", "juneau"]  # Craigslist city list

for city in c_s_l:
    base = "https://%s.craigslist.org" % city  # substitute each city into the URL
    url = base + "/search/cta?query=sprinter"
    response = urllib2.urlopen(url)
    html = response.read()
    soup = BeautifulSoup(html, 'html.parser')
    for a in soup.find_all('a', class_='result-title hdrlnk'):
        print a
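The `find_all('a', class_='result-title hdrlnk')` call above selects result links by their CSS class. The same extraction can be sketched offline with the standard-library `html.parser` (Python 3 module name shown), which is handy for checking the selector logic against a saved page without hitting the network. The markup below is a hypothetical stand-in for one Craigslist results page, not real site output:

```python
from html.parser import HTMLParser

# Hypothetical snippet mirroring the anchor class used in the question.
SAMPLE = '''
<ul>
  <li><a class="result-title hdrlnk" href="/cto/d/123.html">2004 Sprinter 2500</a></li>
  <li><a class="nav" href="/help">help</a></li>
  <li><a class="result-title hdrlnk" href="/cto/d/456.html">Sprinter high roof</a></li>
</ul>
'''

class ResultLinkParser(HTMLParser):
    """Collect (href, text) pairs for anchors carrying the result-title class."""
    def __init__(self):
        HTMLParser.__init__(self)
        self.links = []
        self._in_result = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'a' and 'result-title' in attrs.get('class', '').split():
            self._in_result = True
            self._href = attrs.get('href')
            self._text = []

    def handle_data(self, data):
        if self._in_result:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._in_result:
            self.links.append((self._href, ''.join(self._text).strip()))
            self._in_result = False

parser = ResultLinkParser()
parser.feed(SAMPLE)
# parser.links now holds the two result-title anchors
```

This is only a selector sanity check; BeautifulSoup remains the simpler choice once installed.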

Eventually I want to expand this to all of Craigslist's sites, but right now I'm trying to work out how to filter out the results I don't want, i.e. the non-Sprinter listings. Any advice or help is appreciated.
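One way to drop the non-Sprinter listings is to filter the anchor text before printing it. A minimal sketch, using hypothetical sample titles in place of the scraped anchor text, and a whole-word, case-insensitive regex so near-misses like "Sprint car" are excluded:

```python
import re

# Hypothetical listing titles standing in for the scraped anchor text.
titles = [
    "2006 Dodge Sprinter 2500 cargo van",
    "Ford F-150 pickup",
    "Mercedes Sprinter high roof",
    "Sprint car chassis",
]

# \b anchors match "sprinter" only as a whole word.
sprinter_re = re.compile(r'\bsprinter\b', re.IGNORECASE)

matches = [t for t in titles if sprinter_re.search(t)]
# matches keeps only the two Sprinter listings
```

In the spider, the same test would wrap the `print` inside the `find_all` loop, checking `a.get_text()` before printing.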

Check Craigslist's terms of use. Scraping Craigslist is prohibited; it is stated right at the top, under "USE".