Python: scraping when the text is not in an HTML element

python html web-scraping


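I want to scrape this website. I want to get the product SKU, price, and list price elements. I managed to scrape the price, but I am stuck on the other two, especially the product SKU, because it is not inside a span. It sits directly inside a div. Can it be scraped? If so, can you help me?

As you can see, the product SKU has no span of its own: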

<div class="vm3pr-2"> <div class="product-price" id="productPrice1499">
<div class="product-sku"><span class="bold">Product SKU</span> : 2203-20<br></div>

Here is how you can get each product SKU with its respective price. I used Beautiful Soup for the scraping.

views.py

import requests
from bs4 import BeautifulSoup
from django.shortcuts import render

base_url = 'https://www.hectorjones.co.nz/milwaukee-hand-tools-and-accessories.html'


def home(request):
    response = requests.get(base_url)
    data = response.text
    soup = BeautifulSoup(data, features='html.parser')
    post_listings = soup.find_all('div', {'class': 'product-price'})
    final_postings = []

    for post in post_listings:
        product_sku = post.find('div', {'class': 'product-sku'}).text
        price = post.find('span', {'class': 'PricesalesPrice'}).text
        final_postings.append((product_sku, price))
    context = {
        'final_postings': final_postings,
    }

    return render(request, 'display.html', context)

display.html

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>hectorjones.co.nz/</title>
</head>
<body>
{% for post in final_postings %}
    <ul>
         <li><p>{{ post.0 }}<br> Price : {{ post.1 }}
         </p></li>
    </ul>

{% endfor %}

</body>
</html>
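Note that `.text` on the `product-sku` div returns the label together with the value ("Product SKU : 2203-20"). If only the value is wanted, a minimal sketch along the following lines may help. It runs against the snippet from the question, with the closing `</div>` tags added so the fragment parses cleanly; the function name `extract_sku` is made up for this illustration.

```python
from bs4 import BeautifulSoup

# Markup copied from the question; the two closing </div> tags are added here.
SAMPLE_HTML = """
<div class="vm3pr-2"> <div class="product-price" id="productPrice1499">
<div class="product-sku"><span class="bold">Product SKU</span> : 2203-20<br></div>
</div></div>
"""


def extract_sku(html):
    """Return the SKU value two ways: via the span's sibling, and via text splitting."""
    soup = BeautifulSoup(html, "html.parser")
    sku_div = soup.find("div", class_="product-sku")

    # Option 1: the bare text is the node that directly follows the <span> label.
    raw = sku_div.find("span").next_sibling  # the string " : 2203-20"
    sku_from_sibling = raw.strip().lstrip(":").strip()

    # Option 2: take all of the div's text and drop everything up to the colon.
    sku_from_split = sku_div.get_text().split(":", 1)[1].strip()

    return sku_from_sibling, sku_from_split


if __name__ == "__main__":
    print(extract_sku(SAMPLE_HTML))
```

Both options assume the "label : value" layout shown in the question; if the site changes that layout, the split on the colon would need adjusting.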



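Hi Burak, the code I posted earlier was written for Django. As you requested, here is code you can run from cmd; it prints the list of products available on that website.

First make sure the following two Python packages are installed:

pip install requests

pip install bs4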

scraping.py

import requests
from bs4 import BeautifulSoup

base_url = 'https://www.hectorjones.co.nz/milwaukee-hand-tools-and-accessories.html'


def main():
    # Download the page and parse it with Python's built-in HTML parser
    response = requests.get(base_url)
    soup = BeautifulSoup(response.text, features='html.parser')

    # Each product's SKU and price sit inside a div with class "product-price"
    for post in soup.find_all('div', {'class': 'product-price'}):
        product_sku = post.find('div', {'class': 'product-sku'}).text
        price = post.find('span', {'class': 'PricesalesPrice'}).text
        print(product_sku.strip(), '| Price :', price.strip())


if __name__ == '__main__':
    main()

Let me know if you get confused at any step.

Happy coding!

Thank you very much for your answer, shiva. This is not a format I am used to. Should I put these two files in the same directory, then open cmd and type "python views.py"? I know it has something to do with Django, but is there a simpler way to make it work? Thanks.

@shiva shrestha any help would be very useful.

I am glad I could help you.

By the way, any idea how I can scrape the list price as well (it only appears on some items)? When I try to do it inside the for loop you built, it throws an error because not every item has a list price.

I did not understand what you meant.

Some items also have a list price. Can it be scraped? Thanks.
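On the list price question: `find()` returns `None` when no matching tag exists, so calling `.text` on the result raises an `AttributeError` for items without a list price. One possible way around it is to check for `None` before reading the text, as in the sketch below. The sample markup, the prices, and the class name `list-price` are assumptions made up for illustration; the real class name has to be looked up in the page source.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: the second product has no list price.
# The class name "list-price" and the prices are assumptions, not taken from the real site.
SAMPLE_HTML = """
<div class="product-price">
  <div class="product-sku"><span class="bold">Product SKU</span> : 2203-20<br></div>
  <span class="list-price">$120.00</span>
  <span class="PricesalesPrice">$99.00</span>
</div>
<div class="product-price">
  <div class="product-sku"><span class="bold">Product SKU</span> : 2203-21<br></div>
  <span class="PricesalesPrice">$45.00</span>
</div>
"""


def text_or_none(parent, name, css_class):
    """Return the stripped text of the first matching tag, or None if it is absent."""
    tag = parent.find(name, class_=css_class)
    return tag.get_text(strip=True) if tag is not None else None


def scrape(html):
    """Collect (sku_text, sale_price, list_price) tuples; list_price may be None."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for post in soup.find_all("div", class_="product-price"):
        rows.append((
            text_or_none(post, "div", "product-sku"),
            text_or_none(post, "span", "PricesalesPrice"),
            text_or_none(post, "span", "list-price"),
        ))
    return rows


if __name__ == "__main__":
    for row in scrape(SAMPLE_HTML):
        print(row)
```

Items without a list price then carry `None` in the third position, which the calling code or template can display as blank.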