Python 3.x: Inserting dynamic data into MySQL with Python

Tags: python-3.x, web-scraping, beautifulsoup

Edited >>>>>

I wrote some code that should return two outputs, but it raises an error.

What is the main problem with my code?

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
import os
import sys
import codecs
from urllib.request import urlopen
import pymysql
import mysql.connector

for i in range(1): #electronic

    my_url = "https://www.xxxxx.com/mobile_phones/?facet_is_mpg_child=0&viewType=gridView&page="

    uClient = uReq(my_url + str(i))

    page_html = uClient.read()

    uClient.close()

    page_soup = soup(page_html, "html.parser")

    containers = page_soup.findAll("div" , {"class" : "sku -gallery" })

    for container in containers:

        name = container.img["alt"]

        title_container = container.findAll("span", {"class" : "brand"})

        Brand = title_container[0].text

        price = container.findAll("span",{"class" : "price"} )

        price_one = price[0].text.strip()

        price_old = container.findAll("span",{"class" : "price -old "})
        price_two = '0'
        if len(price_old) > 0:
            price_two = price_old[0].text.strip()

        rank = container.findAll("span",{"class" : "rating-aggregate"})
        ranking = 'N/A'
        if len(rank) > 0:
            ranking = rank[0].text.strip()

conn = pymysql.connect(host="localhost",user="root",passwd="",db="prod")
x = conn.cursor()
#name1 = name()
#brand1 = Brand()
#price_one1 = price_one1()
#price_two1= price_one1()
#rank1 = rank()

x.execute("INSERT INTO list (productname,brand,price1,price2,rank) VALUES (%s,%s,%s,%s.%s)" , (name,Brand,price_one,price_two,ranking))
conn.commit()
conn.close()
C:\Users\xxxx\AppData\Local\Programs\Python\Python35\python.exe C:/Users/xxxx/.PyCharm2018.2/config/scratches/bd.py
Traceback (most recent call last):
  File "C:/Users/xxxx/.PyCharm2018.2/config/scratches/bd.py", line 54, in <module>
    x.execute("INSERT INTO list (productname,brand,price1,price2,rank) VALUES (%s,%s,%s,%s.%s)" , (name,Brand,price_one,price_two,ranking))
  File "C:\Users\xxxx\AppData\Local\Programs\Python\Python35\lib\site-packages\pymysql\cursors.py", line 170, in execute
    result = self._query(query)
  File "C:\Users\xxxx\AppData\Local\Programs\Python\Python35\lib\site-packages\pymysql\cursors.py", line 328, in _query
    conn.query(q)
  File "C:\Users\xxxx\AppData\Local\Programs\Python\Python35\lib\site-packages\pymysql\connections.py", line 516, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "C:\Users\xxxx\AppData\Local\Programs\Python\Python35\lib\site-packages\pymysql\connections.py", line 727, in _read_query_result
    result.read()
  File "C:\Users\xxxx\AppData\Local\Programs\Python\Python35\lib\site-packages\pymysql\connections.py", line 1066, in read
    first_packet = self.connection._read_packet()
  File "C:\Users\xxxx\AppData\Local\Programs\Python\Python35\lib\site-packages\pymysql\connections.py", line 683, in _read_packet
    packet.check_error()
  File "C:\Users\xxxx\AppData\Local\Programs\Python\Python35\lib\site-packages\pymysql\protocol.py", line 220, in check_error
    err.raise_mysql_exception(self._data)
  File "C:\Users\xxxx\AppData\Local\Programs\Python\Python35\lib\site-packages\pymysql\err.py", line 109, in raise_mysql_exception
    raise errorclass(errno, errval)
pymysql.err.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near ''2')' at line 1")

Process finished with exit code 1


The problem is with the variable rank. You should be passing ranking, but somehow you missed it. According to the code you posted:

rank = container.findAll("span",{"class" : "rating-aggregate"}) # resultset
if len(rank) > 0:
    ranking = rank[0].text.strip() #result
So the change is:

x.execute("INSERT INTO list (productname,brand,price1,price2,rank) VALUES (%s,%s,%s,%s,%s)" , (name,Brand,price_one,price_two,ranking))
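You are good to go! I have one suggestion for you. If you assign a variable inside an if condition, always give it an else branch or a default value. Otherwise, when the condition fails, the variable is undefined and you may get an error. Like: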

rank = container.findAll("span",{"class" : "rating-aggregate"})
ranking = rank[0].text.strip() if len(rank) > 0 else 'N/A'
Or


Cheers
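The default-value advice above can be wrapped in a small helper. This is only a sketch: `first_text` is a hypothetical name that does not appear in the post, and it assumes each element exposes a `.text` attribute the way BeautifulSoup tags do. `SimpleNamespace` stands in for a tag so the example runs without any scraping.

```python
from types import SimpleNamespace


def first_text(results, default="N/A"):
    """Return the stripped text of the first match, or `default` if there is none.

    `results` is any sequence whose items have a `.text` attribute,
    such as the list returned by container.findAll(...).
    """
    return results[0].text.strip() if results else default


# Stand-in objects (assumption): real code would pass findAll() results.
found = [SimpleNamespace(text="  4.5 ")]
missing = []

ranking = first_text(found)                 # first match, whitespace stripped
price_two = first_text(missing, default="0")  # falls back to the default
print(ranking, price_two)
```

With a helper like this, `ranking` and `price_two` are always strings, so later calls such as `.replace(",", "")` do not depend on whether the element was present on the page.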

This code stores the information in a CSV file, but now I need to save it to MySQL:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
import os
import sys
import unicodecsv as csv
import codecs
from urllib.request import urlopen


for i in range(3): #electronic

    my_url = "https://www.xxxx.com/mobile_phones/?facet_is_mpg_child=0&viewType=gridView&page="

    uClient = uReq(my_url + str(i))

    page_html = uClient.read()

    uClient.close()

    page_soup = soup(page_html, "html.parser")

    containers = page_soup.findAll("div" , {"class" : "sku -gallery" })

    filename = "mobile.csv"
    f = codecs.open(filename, "a" , "utf-8-sig")
    headers = "name, Brand, price_one, price_two, ranking\n"
    f.write(headers)


    for container in containers:

        name = container.img["alt"]

        title_container = container.findAll("span", {"class" : "brand"})

        Brand = title_container[0].text

        price = container.findAll("span",{"class" : "price"} )

        price_one = price[0].text.strip()

        price_old = container.findAll("span",{"class" : "price -old "})
        price_two = 0
        if len(price_old) > 0:
            price_two = price_old[0].text.strip()

        rank = container.findAll("span",{"class" : "rating-aggregate"})
        if len(rank) > 0:
            ranking = rank[0].text.strip()

        print("name " + name)
        print("Brand "+ Brand)
        print("price_one " + price_one)
        print("price_two {}".format(price_two))  #----> 
        print("ranking " + ranking)

        f.write(name + "," + Brand.replace(",", "|") + "," + price_one.replace(",", "") + "," + price_two.replace(",", "") + "," + ranking + "\n")

f.close()
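One way to move from the CSV version to MySQL is to collect one tuple per product inside the container loop and insert them all with a single parameterized statement. The sketch below is an assumption-laden illustration, not the poster's code: `insert_products` and `INSERT_SQL` are hypothetical names, the table and column names are taken from the INSERT statement in the post, and `conn` is expected to be a DB-API connection whose driver uses the `%s` parameter style (pymysql does). The backticks around `rank` are a precaution, because RANK is a reserved word in MySQL 8.0; whether the MariaDB server in the post requires them is not confirmed.

```python
# Hypothetical helper, not part of the original post.
INSERT_SQL = (
    "INSERT INTO list (productname, brand, price1, price2, `rank`) "
    "VALUES (%s, %s, %s, %s, %s)"  # five placeholders, all separated by commas
)


def insert_products(conn, rows):
    """Insert every scraped row and commit once; return the number of rows.

    `rows` is an iterable of 5-item sequences:
    (name, brand, price_one, price_two, ranking).
    """
    rows = [tuple(row) for row in rows]
    for row in rows:
        if len(row) != 5:
            raise ValueError("expected 5 values per row, got %d" % len(row))
    cur = conn.cursor()
    try:
        # The driver escapes and substitutes the values itself.
        cur.executemany(INSERT_SQL, rows)
        conn.commit()
    finally:
        cur.close()
    return len(rows)
```

In the scraper, this would mean appending `(name, Brand, price_one, price_two, ranking)` to a list inside `for container in containers:` and calling `insert_products(conn, rows)` once after the loops, with `conn` created by `pymysql.connect(...)` as in the post. An INSERT placed after the loops that uses the loop variables directly only ever sees the last product.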

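I changed it again: x.execute("INSERT INTO list (productname,brand,price1,price2,rank) VALUES (%s,%s,%s,%s.%s)", (name,Brand,price_one,price_two,ranking)) and still get the error: File "C:/Users/xxxx/.PyCharm2018.2/config/scratches/bd.py", line 53; File "C:\Users\xxxx\AppData\Local\Programs\Python\Python35\lib\site-packages\pymysql\cursors.py", line 170, in execute: result = self._query(query)

Can you post the actual query that is generated? Use print() instead of x.execute().

I changed it to rank = container.findAll("span", {"class": "rating-aggregate"}) followed by if len(rank) > 0: ranking = rank[0].text.strip(), and the error from the update in the first post appears again.

The error is in the query, as the message shows: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near ''2')' at line 1"). I can only help you if you provide the actual query that is generated.

print("INSERT INTO list (productname,brand,price1,price2,rank) VALUES (%s,%s,%s,%s.%s)", (name,Brand,price_one,price_two,ranking))
result ==>> INSERT INTO list (productname,brand,price1,price2,rank) VALUES (%s,%s,%s,%s.%s) ('Galaxy A8 (2018) 64GB Gold', 'Samsung\xa0', '53300000', '0', '0', '2') Process finished with exit code 0

The problem is in how the query is formed. Try to form the query correctly.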
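Printing the template and the tuple side by side, as in the comment above, does not show the statement the server actually receives. The sketch below roughly emulates client-side `%s` substitution so the final text becomes visible. It is a simplified, hypothetical stand-in (`preview_query` is not a pymysql function, its quoting is not the driver's real escaping, and the row values are illustrative), so it is meant for reading only and should never be sent to a server. With a live pymysql cursor, `cursor.mogrify(query, args)` returns the driver's own rendering.

```python
def preview_query(sql, params):
    """Render `sql` with naively quoted `params`, for debugging output only.

    Assumes the template contains no literal % other than its %s placeholders.
    """
    def quote(value):
        if value is None:
            return "NULL"
        if isinstance(value, (int, float)):
            return str(value)
        text = str(value).replace("\\", "\\\\").replace("'", "\\'")
        return "'" + text + "'"

    return sql % tuple(quote(p) for p in params)


# Template from the post: note the '.' between the last two placeholders.
bad = "INSERT INTO list (productname,brand,price1,price2,rank) VALUES (%s,%s,%s,%s.%s)"
good = "INSERT INTO list (productname,brand,price1,price2,rank) VALUES (%s,%s,%s,%s,%s)"
row = ("Galaxy A8", "Samsung", "53300000", "0", "2")  # illustrative values

print(preview_query(bad, row))   # ends with ...,'0'.'2')
print(preview_query(good, row))  # ends with ...,'0','2')
```

The first rendering ends in `'0'.'2')`, which matches the fragment `'2')` quoted in the 1064 error: the period between the last two placeholders should be a comma.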