BeautifulSoup scrape to MySQL using Python: "if string contains X do this, else do something else" keeps coming up None

Tags: python, mysql, beautifulsoup

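I'm just wondering whether anyone can help me, as I'm beaten. I'm new to Python, and after four hours of trial and error I've reached the point where I'm lost.

The parts I'm stuck on are:

barrel length inches and barrel length fraction

The snippet is below: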


    barrellengths = soup.find(barrellength_span)
    gun_barrellengths = barrellengths.text if barrellengths else ''
    gun_barrellengths_inches = ''
    gun_barrellengthfraction = ''

#if " in" present split the string to print the inches
    def barrel_length_inches_text(gun_barrellengths_inches):
     if " in" in gun_barrellengths:
      gun_barrellengths_inches = gun_barrellengths.split()[0]
     else:
      gun_barrellengths_inches = '0'

#if " present check to see if there can be a split for the fraction else remove the " and continue
    def barrel_length_inches_symbol(gun_barrellengths_inches):
     if '"' in gun_barrellengths:
      try:
       gun_barrellengths.split()
       gun_barrellengths_inches = gun_barrellengths.split()[0]
      except: 
       IndexError
       gun_barrellengths_inches = gunbarrellengths.replace('"','')

#which method to use
    def barrel_length_inches(gun_barrellengths_inches):
     if(len(gun_barrellengths) == 0):
      gun_barrellength_inches = ''
     elif " in" in gun_barrellengths:
      barrel_length_inches_text(gun_barrellengths_inches)
     elif '"' in gun_barrellengths:
      barrel_length_inches_symbol(gun_barrellengths_inches)


#if there is a decimal point in barrellengths
    def barrel_length_fraction_symbol(gun_barrellength_fraction): 
     if '.' in gun_barrellengths:
      try:
       gun_barrellengths.split()
       gun_barrellengthfraction = gun_barrellengths.split()[1]
       gun_barrellengthfraction = gun_barrelfraction.replace('"','')
       gun_barrellengthfraction = 0+gun_barrelfraction
      except: 
       IndexError
       gun_barrellengthfraction = '0'

#if there is text in barrel length fraction
    def gun_barrel_length_fraction_text(gun_barrellength_fraction):
     if ' in' in gun_barrellengths:
      try:
       gun_barrellengthfraction = gun_barrellengths.split()[1]
       if "1/2" in gun_barrellengthfraction:
        gun_barrellengthfraction = gun_barrellengthfraction.replace("1/2", "0.5")
       elif "1/4" in gun_barrellengthfraction:
        gun_barrellengthfraction = gun_barrellengthfraction.replace("1/4", "0.25")
       elif "3/4" in gun_barrellengthfraction:
        gun_barrellengthfraction = gun_barrellengthfraction.replace("3/4", "0.75")
       elif "in" in gun_barrellengthfraction:
        gun_barrellengthfraction = '0'
      except:
        IndexError
        gun_barrellengthfraction = '0'

#decide which function works for the fraction
    def barrel_length_fraction(gun_barrellengthfraction):
     if(len(gun_barrellengths) == 0):
      gun_barrellengthfraction = ''
     elif "." in gun_barrellengths:
      gun_barrel_length_fraction_symbol(gun_barrellengthfraction)
     elif "in" in gun_barrellengths:
      gun_barrel_length_fraction_text(gun_barrellengthfraction)
When I run it, I get None for both values, and my array output contains None as well, but there is definitely data there.
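A likely cause, sketched here under assumptions: none of the helper functions in the snippet contains a `return` statement, and assigning to a parameter inside a function only rebinds the local name. A call to such a function therefore evaluates to `None`, and the outer variable is left untouched. The two function names below are hypothetical and exist only to show the difference.

```python
# Hypothetical helpers (not from the original script) showing why a
# function call can evaluate to None.

def inches_without_return(value):
    # Rebinding the parameter changes only the local name; with no
    # return statement, the call itself evaluates to None.
    value = value.split()[0]


def inches_with_return(value):
    # Returning the result hands it back to the caller.
    return value.split()[0]


raw = "32 in"
broken = inches_without_return(raw)
fixed = inches_with_return(raw)
print(broken)  # None
print(fixed)   # 32
```

If this is what is happening, putting `barrel_length_inches(...)` directly into the array stores the call's return value, which is `None` unless the function returns something.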

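I'm trying to check the data: if it is `32"`, remove the `"` so that it is just `32` and the fraction is 0. If the data appears as `32.5"`, split the string on the `.`, then put 32 into the gun_barrellength_inches variable and .5 into the gun_barrellength fraction variable.

Alternatively, if the data appears as `32 in`, remove the `in`, make 32 the barrel length inches and make the barrel length fraction 0. If it is `32 1/2 in`, split on the first space, make 32 the barrel length inches, replace `1/2` with 0.5 and make the barrel length fraction 0.5.

The URL changes from dealer to dealer. I have code that works for this particular dealer and I'm happy with it (I've been developing it for a few weeks, with some help). But when I test the script against another dealer, it fails on these fields (the same goes for the stock length, but I will apply the same changes there).

Here is the full code: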

from bs4 import BeautifulSoup
import requests
import shutil
import csv
import pandas
from pandas import DataFrame
import re
import os
import io
import urllib
import locale
import math
os.environ["PYTHONIOENCODING"] = "utf-8"
import mysql.connector
from mysql.connector import errorcode


cnx = mysql.connector.connect(user='user', password='password', host='127.0.0.1', port='3306', database='DatabaseName')

cursor = cnx.cursor()

page = 1
all_links = []
url="https://www.gunstar.co.uk/view-trader/global-rifle-snipersystems/58782?page={}"

with requests.Session() as session:
  while True:
    print(url.format(page))
    res=session.get(url.format(page))
    soup=BeautifulSoup(res.content,'html.parser')
    gun_details = soup.select('div.details')
    for link in gun_details:
     all_links.append("https://www.gunstar.co.uk" + link.select_one('a')['href'])

    if len(soup.select('a.al-pagination-item'))==0:
        break
    page += 1

print(len(all_links))

gunstar_id = 0

for a_link in all_links:

    gunstar_id += 1

    def category_span(category):
       return category.name=='span' and 'Category' in category.parent.contents[0] 

    def subCategory_span(subCategory):
       return subCategory.name=='span' and 'Subcategory' in subCategory.parent.contents[0] 

    def make_span(make):
       return make.name=='span' and 'Make' in make.parent.contents[0] 

    def model_span(model):
       return model.name=='span' and 'Model' in model.parent.contents[0] 

    def mechanism_span(mechanism):
       return mechanism.name=='span' and 'Mechanism' in mechanism.parent.contents[0] 

    def calibre_span(calibre):
       return calibre.name=='span' and 'Calibre' in calibre.parent.contents[0] 

    def licence_span(licence):
       return licence.name=='span' and 'Certificate' in licence.parent.contents[0] 

    def orientation_span(orientation):
       return orientation.name=='span' and 'Orientation' in orientation.parent.contents[0] 

    def barrellength_span(barrellength):
       return barrellength.name=='span' and 'Barrel length' in barrellength.parent.contents[0] 

    def stocklength_span(stocklength):
       return stocklength.name=='span' and 'Stock length' in stocklength.parent.contents[0] 

    def gunlength_span(gunlength):
       return gunlength.name=='span' and 'Gun length' in gunlength.parent.contents[0] 

    def weight_span(weight):
       return weight.name=='span' and 'Weight' in weight.parent.contents[0] 

    def chamber_span(chamber):
       return chamber.name=='span' and 'Chamber length' in chamber.parent.contents[0] 

    def chokes_span(chokes):
       return chokes.name=='span' and 'Chokes' in chokes.parent.contents[0] 

    def ejection_span(ejection):
       return ejection.name=='span' and 'Ejection' in ejection.parent.contents[0] 

    def trigger_span(trigger):
       return trigger.name=='span' and 'Trigger' in trigger.parent.contents[0] 

    def condition_span(condition):
       return condition.name=='span' and 'Condition' in condition.parent.contents[0] 

    def price_span(price):
       return price.name=='span' and 'Price' in price.parent.contents[0] 

    res = requests.get(a_link)
    soup = BeautifulSoup(res.content, 'html.parser')

    gun_details = soup.findAll('div', {"class":"mb al-spec flex"})

    categorys = soup.find(category_span)
    gun_categorys = categorys.text if categorys else ''

    subCategorys = soup.find(subCategory_span)
    gun_subCategorys = subCategorys.text if subCategorys else ''

    makes = soup.find(make_span)
    gun_makes = makes.text if makes else ''

    models = soup.find(model_span)
    gun_models = models.text if models else ''

    mechanisms = soup.find(mechanism_span)
    gun_mechanisms = mechanisms.text if mechanisms else ''

    calibres = soup.find(calibre_span)
    gun_calibres = calibres.text if calibres else ''
    if "12 Bore/gauge" in gun_calibres:
        gun_calibres = gun_calibres.replace("12 Bore/gauge", "12 Gauge")
    else:
        gun_calibres


    #licences = soup.find(licence_span)
    #gun_licences = licences.text if licences else ''
    if "Rifles" in gun_categorys:
      gun_licences = "FAC"
    elif "Shotguns" in gun_categorys:
      gun_licences = "FAC/SGC"
    else:
      gun_licences = ''

    orientations = soup.find(orientation_span)
    gun_orientations = orientations.text if orientations else ''

    barrellengths = soup.find(barrellength_span)
    gun_barrellengths = barrellengths.text if barrellengths else ''
    gun_barrellengths_inches = ''
    gun_barrellengthfraction = ''

#if " in" present split the string to print the inches
    def barrel_length_inches_text(gun_barrellengths_inches):
     if " in" in gun_barrellengths:
      gun_barrellengths_inches = gun_barrellengths.split()[0]
     else:
      gun_barrellengths_inches = '0'

#if " present check to see if there can be a split for the fraction else remove the " and continue
    def barrel_length_inches_symbol(gun_barrellengths_inches):
     if '"' in gun_barrellengths:
      try:
       gun_barrellengths.split()
       gun_barrellengths_inches = gun_barrellengths.split()[0]
      except: 
       IndexError
       gun_barrellengths_inches = gunbarrellengths.replace('"','')

#which method to use
    def barrel_length_inches(gun_barrellengths_inches):
     if(len(gun_barrellengths) == 0):
      gun_barrellength_inches = ''
     elif " in" in gun_barrellengths:
      barrel_length_inches_text(gun_barrellengths_inches)
     elif '"' in gun_barrellengths:
      barrel_length_inches_symbol(gun_barrellengths_inches)


#if there is a decimal point in barrellengths
    def barrel_length_fraction_symbol(gun_barrellength_fraction): 
     if '.' in gun_barrellengths:
      try:
       gun_barrellengths.split()
       gun_barrellengthfraction = gun_barrellengths.split()[1]
       gun_barrellengthfraction = gun_barrelfraction.replace('"','')
       gun_barrellengthfraction = 0+gun_barrelfraction
      except: 
       IndexError
       gun_barrellengthfraction = '0'

#if there is text in barrel length fraction
    def gun_barrel_length_fraction_text(gun_barrellength_fraction):
     if ' in' in gun_barrellengths:
      try:
       gun_barrellengthfraction = gun_barrellengths.split()[1]
       if "1/2" in gun_barrellengthfraction:
        gun_barrellengthfraction = gun_barrellengthfraction.replace("1/2", "0.5")
       elif "1/4" in gun_barrellengthfraction:
        gun_barrellengthfraction = gun_barrellengthfraction.replace("1/4", "0.25")
       elif "3/4" in gun_barrellengthfraction:
        gun_barrellengthfraction = gun_barrellengthfraction.replace("3/4", "0.75")
       elif "in" in gun_barrellengthfraction:
        gun_barrellengthfraction = '0'
      except:
        IndexError
        gun_barrellengthfraction = '0'

#decide which function works for the fraction
    def barrel_length_fraction(gun_barrellengthfraction):
     if(len(gun_barrellengths) == 0):
      gun_barrellengthfraction = ''
     elif "." in gun_barrellengths:
      gun_barrel_length_fraction_symbol(gun_barrellengthfraction)
     elif "in" in gun_barrellengths:
      gun_barrel_length_fraction_text(gun_barrellengthfraction)



    stocklengths = soup.find(stocklength_span)
    gun_stocklengths = stocklengths.text if stocklengths else ''
    if " in" in gun_stocklengths:
      gun_stocklength_inches = gun_stocklengths.split()[0]
    else:
      gun_stocklength_inches = '' 

    if(len(gun_stocklengths) == 0):
     gun_stocklength_fraction = ''
    else:
     gun_stocklength_fraction = gun_stocklengths.split()[1]
     if "1/2" in gun_stocklength_fraction:
       gun_stocklength_fraction = gun_stocklength_fraction.replace("1/2", "0.5")
     elif "1/4" in gun_stocklength_fraction:
       gun_stocklength_fraction = gun_stocklength_fraction.replace("1/4", "0.25")
     elif "3/4" in gun_stocklength_fraction:
       gun_stocklength_fraction = gun_stocklength_fraction.replace("3/4", "0.75")
     elif "in" in gun_stocklength_fraction:
       gun_stocklength_fraction = ''


    gunlengths = soup.find(gunlength_span)
    gun_gunlengths = gunlengths.text if gunlengths else ''



    weights = soup.find(weight_span)
    gun_weights = weights.text if weights else ''
    if " kilo" in gun_weights:
     gun_weight_lb = gun_weights.split()[0]
     gun_weight_lb = float(gun_weight_lb)
     gun_weight_lb = gun_weight_lb * 2.2046226218
     gun_weight_lb = float(gun_weight_lb)
     gun_weight_lb_oz = str(gun_weight_lb)
     gun_weight_lb_round = math.ceil(gun_weight_lb) #converting to whole int
     gun_weight_lb_round = str(gun_weight_lb_round)
    else:
     gun_weight_lb_round = ''

    if(len(gun_weight_lb_round) == 0):
     gun_weight_oz = ''
    else:
     gun_weight_oz = "0."+gun_weight_lb_oz.split('.')[1] #cant split cause not float
     gun_weight_oz = float(gun_weight_oz)
     gun_weight_oz = gun_weight_oz * 16
     gun_weight_oz = math.ceil(gun_weight_oz)


    chambers = soup.find(chamber_span)
    gun_chambers = chambers.text if chambers else ''
    if " in" in gun_chambers:
     gun_chambers = gun_chambers.split()[0]
    else:
     gun_chambers = ''

    chokess = soup.find(chokes_span)
    gun_chokess = chokess.text if chokess else ''

    ejections = soup.find(ejection_span)
    gun_ejections = ejections.text if ejections else ''

    triggers = soup.find(trigger_span)
    gun_triggers = triggers.text if triggers else ''

    conditions = soup.find(condition_span)
    gun_conditions = conditions.text if conditions else ''

    prices = soup.find(price_span)
    gun_prices = prices.text if prices else ''
    if "£ " in gun_prices:
     gun_prices = gun_prices.split()[1]
     if "," in gun_prices:
      gun_prices = gun_prices.replace(',', '')
     if " each" in gun_prices:
      gun_prices = gun_prices.replace(' each', '')
    else:
     gun_prices = ''

    gun_description = soup.find('div', {'class':'al-addet-desc-text t-bd-14 mb-4'})
    if (gun_description is not None):
     gun_description_text = gun_description.text
    else:
     gun_description_text = ''
    if "...Read full description" in gun_description_text:
     gun_description_text = gun_description_text.replace("...Read full description", "")
    else:
     gun_description_text

    imgs = soup.findAll("img", {"class":"al-mediabar-item-img js-trigger-slideshow"}) #Keep - Photos save
#    gundir = soup.find("title").text #keep - folder creation for each advert using title
#    gun_folders = os.makedirs(gundir)


#    for img in imgs:
#      clean = re.compile('src=".*?"')
#      strings = clean.findall(str(img))    
#      for string in strings:
#         imgUrl = string.split('"')[1]
#         filename = imgUrl.split('/')[-1]
#         resp = requests.get(imgUrl, stream=True)
#         local_file = open('{}/{}'.format(gundir ,filename), 'wb')
#         resp.raw.decode_content = True
#         shutil.copyfileobj(resp.raw, local_file)
#         del resp

    array = [gunstar_id, gun_categorys, gun_subCategorys, gun_makes, gun_models, gun_mechanisms, gun_calibres, gun_licences, gun_orientations, barrel_length_inches(gun_barrellengths_inches), barrel_length_fraction(gun_barrellengthfraction), gun_stocklength_inches, gun_stocklength_fraction, gun_gunlengths, gun_weight_lb_round, gun_weight_oz, gun_chambers, gun_chokess, gun_ejections, gun_triggers, gun_conditions, gun_prices, gun_description_text]

    print(array)

    cursor.execute("INSERT INTO tbl_gunstar_test VALUES(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)", array)

cursor.close()
cnx.close()

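I haven't read everything, but just looking at your code I can see an error. When you use `try: except:` and want to catch a specific exception, write `except MyException: mycode` rather than `except: MyException mycode`. If you want the except to catch every type of exception, just write `except: mycode`.

@BenoitDrogou - thanks for your input. I've just pieced it together. My head is melted. Going to post the answer now. I was over-engineering it and making a mess of it.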
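To illustrate the comment about `except` syntax, here is a minimal sketch. The helper name `second_token` and its `default` parameter are hypothetical, not part of the original script. Note that a bare `IndexError` written on its own line inside an `except:` block is just an expression that does nothing, so the original code was catching every exception rather than only `IndexError`.

```python
def second_token(text, default='0'):
    """Return the second whitespace-separated token, or `default` if there is none."""
    try:
        return text.split()[1]
    except IndexError:
        # Only IndexError is handled here; any other exception still propagates.
        return default


print(second_token('32 1/2 in'))
print(second_token('32"'))
```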
    barrellengths = soup.find(barrellength_span)
    gun_barrellengths = barrellengths.text if barrellengths else ''
    gun_barrellengths_inches = ''
    gun_barrellengthfraction = ''

    # return the whole-inch part of gun_barrellengths as a string
    def barrel_length_inches(gun_barrellengths_inches):
     if not gun_barrellengths.strip():
      # nothing scraped; without this guard split()[0] raises IndexError
      return ''
     elif " in" in gun_barrellengths:
      # e.g. '32 in' or '32 1/2 in': the first token is the inches
      return gun_barrellengths.split()[0]
     elif '.' in gun_barrellengths:
      # e.g. '32.5"': the inches are everything before the decimal point
      return gun_barrellengths.split('.')[0]
     else:
      # e.g. '32"': take the first token and drop the inch mark
      return gun_barrellengths.split()[0].replace('"','')

    print(barrel_length_inches(gun_barrellengths_inches))

    # return the fractional part of gun_barrellengths as a decimal string ('' if none)
    def barrel_length_fraction(gun_barrellengthfraction): 
     if '.' in gun_barrellengths:
       # e.g. '32.5"' -> '0.5'
       gun_barrellengthfraction = gun_barrellengths.split('.')[1]
       gun_barrellengthfraction = gun_barrellengthfraction.replace('"','')
       return '0.'+gun_barrellengthfraction
     elif ' in' in gun_barrellengths:
       # e.g. '32 1/2 in' -> second token is '1/2'; '32 in' -> second token is 'in'
       gun_barrellengthfraction = gun_barrellengths.split()[1]
       if "1/2" in gun_barrellengthfraction:
        return gun_barrellengthfraction.replace("1/2", "0.5")
       elif "1/4" in gun_barrellengthfraction:
        return gun_barrellengthfraction.replace("1/4", "0.25")
       elif "3/4" in gun_barrellengthfraction:
        return gun_barrellengthfraction.replace("3/4", "0.75")
     # no recognised fraction (including a bare 'in'): return '' rather than falling through to None
     return ''

    print(barrel_length_fraction(gun_barrellengthfraction))
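As an alternative to the two closures above, the rules described in the question could be folded into a single helper that returns both parts at once. This is only a sketch under stated assumptions: the name `parse_barrel_length`, the `FRACTIONS` table and the use of a regular expression are not from the original code, and the choice to return `'0'` for a whole-number length follows the question's description rather than the answer code, which returns `''`.

```python
import re

# Textual fractions mapped to decimal strings (assumed set; extend as needed).
FRACTIONS = {"1/4": "0.25", "1/2": "0.5", "3/4": "0.75"}


def parse_barrel_length(raw):
    """Split a scraped length into (inches, fraction) strings.

    Returns ('', '') when nothing was scraped or there is no leading number.
    """
    text = raw.strip().replace('"', '')
    match = re.match(r'(\d+)(?:\.(\d+))?', text)
    if not match:
        return '', ''
    inches = match.group(1)
    if match.group(2):
        # Decimal form, e.g. 32.5" -> ('32', '0.5')
        return inches, '0.' + match.group(2)
    for token in text[match.end():].split():
        # Textual form, e.g. 32 1/2 in -> ('32', '0.5')
        if token in FRACTIONS:
            return inches, FRACTIONS[token]
    # Whole number, e.g. 32" or 32 in
    return inches, '0'


for sample in ['32"', '32.5"', '32 in', '32 1/2 in', '']:
    print(repr(sample), parse_barrel_length(sample))
```

In the scraper, this could be called once per advert, for example `inches, fraction = parse_barrel_length(gun_barrellengths)`, and the two values placed in the array in place of the two function calls.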