Split a string before a string from an array in Python

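Hello, I have another problem with splitting strings.

At first I split on the second capital letter (thanks to Omire):

That worked well until I got another product list, which is harder to split.

Some example products: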

Veltebøyle Toyota Hi Lux Double Cab 2019 -
Frontbøyle EC Godkjent Super Bar Inox AUDI Q5 2008-2015
Frontbøyle Høy Medium Bar Mark Inox CHEVROLET Captiva 2011 >
Stigtrinn Grand Pedana Inox VOLKSWAGEN Amarok Trend Line 2010 >
Stigtrinn Grand Pedana Inox CITROËN C-Crosser 2008 >
Frontbøyle Polert Standard Toyota Hilux 10-15
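The first product is fine, but the others contain more capital letters.

The best approach would be to lowercase all the letters that come before the car model. I can create an array of car makes for that (possibly without the first one).

To:

Or maybe the string could be split just before the car make, in a loop over the array.

Split into:

 Stigtrinn Grand Pedana Inox 
 VOLKSWAGEN Amarok Trend Line 2010 >
UPDATE:

I created a function like this: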

def split_car(string):
    car_array = ['Audi', 'Bmw', 'Chevrolet', 'Citroen', 'Dacia', 'Daihatsu', 'Dodge', 'Fiat',
                 'Ford', 'Honda', 'Hyundai', 'Isuzu', 'Iveco', 'Jeep', 'Kia', 'Land Rover',
                 'Mazda', 'Mercedes', 'Mitsubishi', 'Nissan', 'Opel', 'Peugeot', 'Porsche',
                 'Renault', 'Seat', 'Skoda', 'SsangYong', 'Subaru', 'Suzuki', 'Toyota',
                 'Volkswagen', 'Volvo']
    for car in car_array:
        if car in string:
            a, b = string.split(" " + car, 1)
            b = car + b
            return (a, b)
But now I need to ignore upper and lower case when matching the car string, because sometimes Bmw is written as BMW or bmw. How can I do that?

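With input this varied (a.k.a. "messy"), it is best not to rely on the other party using upper and lower case consistently. Here is an implementation that looks up each word in an approved list (a "whitelist") of car brands, ignoring any differences in case.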

text = ['Veltebøyle Toyota Hi Lux Double Cab 2019 -',
        'Frontbøyle EC Godkjent Super Bar Inox AUDI Q5 2008-2015',
        'Frontbøyle Høy Medium Bar Mark Inox CHEVROLET Captiva 2011 >',
        'Stigtrinn Grand Pedana Inox VOLKSWAGEN Amarok Trend Line 2010 >',
        'Stigtrinn Grand Pedana Inox CITROËN C-Crosser 2008 >',
        'Frontbøyle Polert Standard Toyota Hilux 10-15']

brands = ['Audi', 'Bmw', 'Chevrolet', 'Citroen', 'Citroën', 'Dacia', 'Daihatsu', 'Dodge', 'Fiat',
          'Ford', 'Honda', 'Hyundai', 'Isuzu', 'Iveco', 'Jeep', 'Kia', 'Land Rover', 'Mazda', 'Mercedes',
          'Mitsubishi', 'Nissan', 'Opel', 'Peugeot', 'Porsche', 'Renault', 'Seat', 'Skoda', 'SsangYong',
          'Subaru', 'Suzuki', 'Toyota', 'Volkswagen', 'Volvo']

def split_by_brand(string):
    # Split on whitespace and compare word by word, ignoring case.
    words = string.split()
    brands_cased = [brand.upper() for brand in brands]
    for index, word in enumerate(words):
        if word.upper() in brands_cased:
            # Everything before the brand, then the brand and everything after it.
            return ' '.join(words[:index]), ' '.join(words[index:])
    # Falls through and returns None when no word matches a brand.

for line in text:
    model, brand = split_by_brand(line)
    print('model "{}", brand "{}"'.format(model, brand))
Result:

model "Veltebøyle", brand "Toyota Hi Lux Double Cab 2019 -"
model "Frontbøyle EC Godkjent Super Bar Inox", brand "AUDI Q5 2008-2015"
model "Frontbøyle Høy Medium Bar Mark Inox", brand "CHEVROLET Captiva 2011 >"
model "Stigtrinn Grand Pedana Inox", brand "VOLKSWAGEN Amarok Trend Line 2010 >"
model "Stigtrinn Grand Pedana Inox", brand "CITROËN C-Crosser 2008 >"
model "Frontbøyle Polert Standard", brand "Toyota Hilux 10-15"
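This is not fault-tolerant (and making it so is not easy). If a brand is missing from the list, or is simply misspelled in the source text, the function returns None and unpacking its result raises an error.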
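A minimal sketch of one way to avoid that error, assuming it is acceptable to hand back the whole line unsplit when no brand is recognised. The name `split_by_brand_safe` and the shortened `BRANDS` list are hypothetical, introduced only for this illustration; `casefold()` is used in place of `upper()` as a slightly more thorough case-insensitive comparison.

```python
# Shortened, hypothetical brand list; the full list from the answer works the same way.
BRANDS = ['Audi', 'Toyota', 'Volkswagen']

def split_by_brand_safe(line, brands=BRANDS):
    """Return (product, car). car is None when no known brand is found."""
    known = {brand.casefold() for brand in brands}
    words = line.split()
    for index, word in enumerate(words):
        if word.casefold() in known:
            return ' '.join(words[:index]), ' '.join(words[index:])
    # No brand matched: return the line untouched instead of None.
    return line, None

for line in ['Stigtrinn Grand Pedana Inox VOLKSWAGEN Amarok Trend Line 2010 >',
             'Stigtrinn Inox LADA Niva']:
    product, car = split_by_brand_safe(line)
    if car is None:
        print('no known brand in "{}"'.format(line))
    else:
        print('product "{}", car "{}"'.format(product, car))
```

The caller still has to decide what to do with unmatched lines; this only turns the crash into a value that can be checked.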

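As an illustration of the spelling problem, I had to add the correctly spelled "Citroën" to your original brand list so that it matches the text.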
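An alternative to listing both spellings is to fold accents as well as case before comparing. The sketch below assumes that treating "ë" and "e" as equal is acceptable for this data; `fold` and `split_by_brand_folded` are hypothetical names, and the brand list is shortened. It relies on `unicodedata.normalize` and `unicodedata.combining` from the standard library.

```python
import unicodedata

# Shortened list for illustration; only the unaccented spelling is present.
BRANDS = ['Audi', 'Citroen', 'Toyota']

def fold(text):
    """Casefold and drop combining marks so 'CITROËN' and 'Citroen' compare equal."""
    decomposed = unicodedata.normalize('NFKD', text.casefold())
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))

def split_by_brand_folded(line, brands=BRANDS):
    known = {fold(brand) for brand in brands}
    words = line.split()
    for index, word in enumerate(words):
        if fold(word) in known:
            return ' '.join(words[:index]), ' '.join(words[index:])
    return line, None

print(split_by_brand_folded('Stigtrinn Grand Pedana Inox CITROËN C-Crosser 2008 >'))
```

This only covers accents that decompose into a base letter plus a combining mark. Letters such as "ø" are left unchanged, and genuine misspellings still need separate handling.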

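Yes, you did it better than I did. I was only having trouble with the upper and lower case letters. I just did not know what I was doing wrong: I tried to upper-case everything, but it did not work correctly the way it does in your script. Thx for the help!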
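For the substring-based `split_car` from the question, a case-insensitive variant could be sketched with the standard `re` module. This is an illustration under assumptions, not the code from the answer: the brand list is shortened, the brand must appear as a whole word (the `\b` boundaries), and the whole string is returned with None when nothing matches. Because it searches the string rather than single words, it can also find a multi-word brand such as "Land Rover".

```python
import re

# Shortened list for illustration.
CAR_ARRAY = ['Audi', 'Bmw', 'Land Rover', 'Volkswagen']

# Longest names first, so a longer brand wins over any shorter one it contains.
_BRAND_PATTERN = re.compile(
    r'\b(?:' + '|'.join(re.escape(car) for car in
                        sorted(CAR_ARRAY, key=len, reverse=True)) + r')\b',
    re.IGNORECASE,
)

def split_car(string):
    match = _BRAND_PATTERN.search(string)
    if match is None:
        return string, None
    # Slice the original string so its capitalisation is preserved.
    return string[:match.start()].rstrip(), string[match.start():]

print(split_car('Stigtrinn Inox BMW X5 2010 >'))
print(split_car('Frontbøyle Polert LAND ROVER Defender'))
```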