
Python: extracting every identical tag from the same class


I want to extract every 'data-src' from this page and then save the results to a CSV. There are several 'data-src' attributes on the page, all inside the same class, and I don't know how to handle that.

import requests
from bs4 import BeautifulSoup
import csv
import pandas as pd
from csv import writer


def test_list():
    with open('largeXDDDDDDDDDD.csv','w') as f1:
        writer=csv.writer(f1, delimiter='\t',lineterminator='\n',)
        #df = pd.read_csv("C:\\Users\\Lukasz\\Desktop\\PROJEKTY PYTHON\\W TRAKCIE\\large.csv")
        #url = df['LINKS'][1]
        url='https://paypalshop.x.yupoo.com/albums/81513820?uid=1'
        response = requests.get(url)
        data = response.text
        soup = BeautifulSoup(data, 'lxml')
        szukaj=soup.find_all('div',{'class':"showalbum__children image__main"})
        for XD in szukaj:
            q=(soup.find_all("data-src"))
            print(q)
        #q= soup.find("img", {"class": "autocover image__img image__portrait"})
        #q=(tag.get('data-src'))
test_list()

HTML:
<div class="showalbum__children image__main" data-id="30842210">
     <div class="image__imagewrap" data-type="photo">
      <img alt="" class="autocover image__img image__portrait" data-album-id="83047567" data-frame="1" data-height="1080" data-origin-src="//photo.yupoo.com/ven-way/aac32ed1/2d2ed235.jpg" data-path="/ven-way/aac32ed1/2d2ed235.jpg" data-src="//photo.yupoo.com/ven-way/aac32ed1/big.jpg" data-type="photo" data-videoformats="" data-width="1080" src="//photo.yupoo.com/ven-way/aac32ed1/small.jpg"/>
      <div class="image__clickhandle" data-photoid="30842210" style="width: 1080px; padding-bottom: 100.00%" title="点击查看详情">
      </div>

Use a class selector for one of the child classes you are currently looping over, so that you are at the right level. I use
select
and dict accessor notation to retrieve the attribute; you cannot retrieve it with the find_all syntax as you have written it.

import requests
from bs4 import BeautifulSoup
import csv
import pandas as pd
from csv import writer


def test_list():
    #with open('largeXDDDDDDDDDD.csv','w') as f1:
        #writer=csv.writer(f1, delimiter='\t',lineterminator='\n',)
        url='https://paypalshop.x.yupoo.com/albums/81513820?uid=1'
        response = requests.get(url)
        data = response.content
        soup = BeautifulSoup(data, 'lxml')
        szukaj = soup.select('.image__portrait')
        for x in szukaj:
            q = x['data-src']
            print(q)

test_list()
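An alternative that does not depend on the class name at all is to match any tag carrying a data-src attribute, via find_all with attrs. A minimal, self-contained sketch using a fragment modeled on the album HTML from the question:

```python
from bs4 import BeautifulSoup

# Fragment modeled on the HTML excerpt in the question
html = '''
<div class="showalbum__children image__main" data-id="30842210">
  <img class="autocover image__img image__portrait"
       data-src="//photo.yupoo.com/ven-way/aac32ed1/big.jpg" data-type="photo"/>
</div>
'''

soup = BeautifulSoup(html, 'html.parser')
# attrs={'data-src': True} matches every tag that has a data-src
# attribute, regardless of its class
links = [tag['data-src'] for tag in soup.find_all(attrs={'data-src': True})]
print(links)  # ['//photo.yupoo.com/ven-way/aac32ed1/big.jpg']
```

This is useful if the site ever renames its CSS classes, since the attribute itself is what you want.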


Unrelated, but to write a consistent csv file you must open it with
newline=''

Great, that works. Could you explain how to export the results to the CSV file? I can see the two relevant lines are commented out. After adding 'writer.writerow(q)' it prints strange things in Excel.
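The "strange things in Excel" happen because writer.writerow expects a sequence: passing a bare string makes each character its own column. Wrap each URL in a list, and open the file with newline=''. A minimal sketch (the file name and the example links are illustrative):

```python
import csv

# Illustrative stand-ins for the scraped data-src values
links = ['//photo.yupoo.com/a/big.jpg', '//photo.yupoo.com/b/big.jpg']

# newline='' prevents blank rows between records on Windows;
# [link] makes the whole URL one cell instead of one cell per character
with open('links.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    for link in links:
        writer.writerow([link])
```

In the answer's loop this would be writer.writerow([q]) inside `for x in szukaj:`, with the commented-out with open(...) / csv.writer(...) lines restored.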