Python Beautiful Soup - find a tag within a tag with a string? Nth child?
Tags: python, web-scraping, beautifulsoup, python-requests


I'm having some trouble scraping the HTML below.

 res =   <div class="gunDetails">
    <h4>Specifications</h4>
    <ul class="features">
        <li><label>Make:</label><span itemprop="brand">Gamo</span></li>
        <li><label>Model:</label><span itemprop="model">Coyote Black Tactical</span></li>
        <li><label>Licence:</label><span>No Licence</span></li>
        <li><label>Orient.:</label><span>Ambidextrous</span></li>
        <li><label>Scope:</label><span>Unknown&nbsp;3-9x32</span></li>
        <li><label>Origin:</label><span>Spanish</span></li>
        <li><label>Cased:</label><span>Other</span></li>
        <li><label>Trigger:</label><span>1</span></li>
        <li><label>Condition:</label><span itemprop="itemCondition">Used</span></li>
    </ul>
  </div>
Output:

Gamo
Coyote Black Tactical
No Licence
Ambidextrous
Unknown 3-9x32
Spanish
Other
1
Used
Is it possible to create a variable for each label's text? Something like:

gun_make = gun_details.findAll('label', String="Make:")
print(gun_make).text
Here is the full code:

from bs4 import BeautifulSoup
import requests
import csv

all_links=[]
labels = []
spans = []
url="https://www.guntrader.uk/dealers/redcar/spencers-sporting-guns/guns?page={}"

for page in range(1,3):
  res=requests.get(url.format(page)).text
  soup=BeautifulSoup(res,'html.parser')
  for link in soup.select('a[href*="/dealers/redcar/spencers-sporting-guns/guns/shotguns"]'):
    all_links.append("https://www.guntrader.uk" + link['href'])


print(len(all_links))
for a_link in all_links:
  res = requests.get(a_link).text
  soup = BeautifulSoup(res, 'html.parser')
  gun_details = soup.select('div.gunDetails')
  for l in gun_details.select('label'):
   labels.append(l.text.replace(':',''))
  for s in gun_details.select('span'):
   spans.append(s.text)

my_dict = dict(zip(labels, spans))
with open('gundealer.csv','w') as csvfile:
 writer = csv.DictWriter(csvfile, fieldnames=None)
 for key in mydict.keys():
   csvfile.write(f"{key},{my_dict[key]}\n")
This section seems to work on its own and gives the correct(ish) output:

For the output:

Make: Gamo
But I don't know what I'm doing to mess up the initial response from the loop so that the snippet above stops working.

Let's try this:

res =  """ <div class="gunDetails">
    <h4>Specifications</h4>
    <ul class="features">
        <li><label>Make:</label><span itemprop="brand">Gamo</span></li>
        <li><label>Model:</label><span itemprop="model">Coyote Black Tactical</span></li>
        <li><label>Licence:</label><span>No Licence</span></li>
        <li><label>Orient.:</label><span>Ambidextrous</span></li>
        <li><label>Scope:</label><span>Unknown&nbsp;3-9x32</span></li>
        <li><label>Origin:</label><span>Spanish</span></li>
        <li><label>Cased:</label><span>Other</span></li>
        <li><label>Trigger:</label><span>1</span></li>
        <li><label>Condition:</label><span itemprop="itemCondition">Used</span></li>
    </ul>
  </div>
"""

from bs4 import BeautifulSoup as bs
import csv

labels = []
spans = []
soup = bs(res, 'html.parser')
gun_details = soup.select_one('div.gunDetails')
for l in gun_details.select('label'):
    labels.append(l.text.replace(':',''))
for s in gun_details.select('span'):
    spans.append(s.text)

my_dict = dict(zip(labels, spans))
# newline='' lets the csv module control line endings itself
with open('mycsvfile.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    # one row per label/value pair
    for key, value in my_dict.items():
        writer.writerow([key, value])
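
The question also asks for one variable per label. A minimal sketch of one way to do that, assuming the markup matches the snippet in the question: look the `<label>` up by its exact text with `find(..., string=...)` and read the `<span>` that follows it. The helper name `value_for` is made up for this example, and the `li:nth-of-type(1)` selector assumes a Beautiful Soup version with CSS pseudo-class support (4.7 or later).

```python
from bs4 import BeautifulSoup

html = """
<div class="gunDetails">
  <ul class="features">
    <li><label>Make:</label><span itemprop="brand">Gamo</span></li>
    <li><label>Model:</label><span itemprop="model">Coyote Black Tactical</span></li>
    <li><label>Licence:</label><span>No Licence</span></li>
  </ul>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
gun_details = soup.select_one("div.gunDetails")


def value_for(details, label_text):
    """Return the text of the <span> that follows the <label> with this exact text."""
    label = details.find("label", string=label_text)
    if label is None:
        return None  # this page has no such label
    span = label.find_next_sibling("span")
    return span.get_text(strip=True) if span else None


gun_make = value_for(gun_details, "Make:")
gun_model = value_for(gun_details, "Model:")

# Positional alternative: the span inside the first <li> of the list.
first_value = gun_details.select_one(
    "ul.features li:nth-of-type(1) span"
).get_text(strip=True)

print(gun_make, gun_model, first_value)
```

A label that is missing on a given page yields `None` instead of raising, which helps when listings do not all carry the same fields.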

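Comments:

What do you want the output to look like?

I'd really like to be able to identify each item separately and print its text. I even wondered how I could get the whole <li> and split it at "Make:" so that I'm left with just the span output.

I'm not sure what exactly you are looking for, but you could create a dictionary of key/value pairs like "Make:": "Gamo", "Model:": "Coyote Black Tactical", and so on.

I've updated the question with my full code to see if anyone can spot where I'm going wrong.

It's still not clear; can you post the exact expected output?

Hi Jack, thanks for this. It works great in this case. One problem I've had is fitting it into my code above, with the loop over range(1,3) pages in the full script... I keep getting the error 'NoneType' object has no attribute 'select'. I think it's the way I'm calling the links in the for loop. I'll edit the full script again to show the current version with your update included.

There are 24 links in there; I tried a random selection of them and didn't get the error. Try to find the exact link where the error occurs.

@AndrewGlass - So, did you find the problematic url?

@AndrewGlass - Sure, but you have to post it as a separate question (SO policy) and I'll take a look. Also, if you're done with this one, you should accept the answer (if it's acceptable).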
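
The `'NoneType' object has no attribute 'select'` error mentioned in the comments is what appears when `select_one('div.gunDetails')` finds nothing on a page and returns `None`. Below is a sketch of guarding against that while recording which link was responsible. It assumes the page structure from the question; `extract_specs`, the `pages` dictionary and the `example.invalid` URLs are hypothetical placeholders standing in for `requests.get(a_link).text`.

```python
from bs4 import BeautifulSoup


def extract_specs(html):
    """Return {label: value} for one page, or None if there is no div.gunDetails."""
    soup = BeautifulSoup(html, "html.parser")
    gun_details = soup.select_one("div.gunDetails")
    if gun_details is None:  # select_one returns None when nothing matches
        return None
    specs = {}
    for li in gun_details.select("li"):
        label, span = li.find("label"), li.find("span")
        if label and span:
            # "Make:" -> "Make"; only the trailing colon is removed
            specs[label.get_text(strip=True).rstrip(":")] = span.get_text(strip=True)
    return specs


# Hypothetical stand-ins for downloaded pages, keyed by made-up URLs.
pages = {
    "https://example.invalid/gun-1": (
        '<div class="gunDetails"><ul>'
        "<li><label>Make:</label><span>Gamo</span></li>"
        "</ul></div>"
    ),
    "https://example.invalid/gun-2": "<html><body><p>Listing removed</p></body></html>",
}

failed_links = []
for a_link, html in pages.items():
    specs = extract_specs(html)
    if specs is None:
        failed_links.append(a_link)  # remember which URL had no details block
        continue
    print(a_link, specs)

print("No gunDetails found on:", failed_links)
```

Printing `failed_links` at the end is one way to find the exact link that triggers the error, as the comment above suggests.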
Output:

Make    Gamo
Model   Coyote Black Tactical
Licence No Licence
Orient. Ambidextrous
Scope   Unknown 3-9x32
Origin  Spanish
Cased   Other
Trigger 1
Condition   Used
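
The scripts above construct a `csv.DictWriter` but then write each line by hand, and zipping the labels from every page into one dictionary keeps only the last value seen for each label. One possible alternative, sketched here under the assumption that each gun has already been turned into its own dictionary: write one CSV row per gun with `DictWriter`. The `guns` list holds made-up sample data, and an in-memory buffer stands in for the output file.

```python
import csv
import io

# Hypothetical per-gun dictionaries, e.g. built by pairing each label with its span.
guns = [
    {"Make": "Gamo", "Model": "Coyote Black Tactical", "Condition": "Used"},
    {"Make": "Example Make", "Model": "Example Model", "Origin": "Spanish"},
]

# Collect every label seen, in first-seen order, to use as the CSV header.
fieldnames = []
for gun in guns:
    for key in gun:
        if key not in fieldnames:
            fieldnames.append(key)

# Swap the buffer for open('gundealer.csv', 'w', newline='') to write a real file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(guns)  # labels missing from a gun are filled with restval

csv_text = buffer.getvalue()
print(csv_text)
```

With this layout each label becomes a column and each gun a row, so listings with different sets of labels still line up under the same header.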