Convert millions, billions and trillions to a number in Python

I have a column whose values include "5.00 M", "1.00 T" and "1.29 Juta", and I would like a simple way to convert them to numeric values. I tried:

import re
powers = {'M': 10 ** 9, 'T': 10 ** 12, 'Juta': 10 ** 6}
var1 = ['4', '7149', '6184.09', '0.00', '8', '134944', '5187.33', '5.00 M', '17', '74104', '60773.22', '260.00 M', '7', '347334', '451922.68', '1.00 T', '80', '18469', '483386.83', '2.50 M', '12', '4716', '14946.30', '0.00', '18', '7119', '111617.66', '0.00', '31', '23131', '814413.09', '0.00', '21', '16281', '192020.50', '0.00', '20', '98381', '57850.37', '0.00', '31', '12501', '39384.40', '0.00', '31', '2851', '1.29 Juta', '0.00', '34', '9440', '171364.82', '0.00', '26', '25442', '54394.00', '0.00', '24', '2492', '165295.95', '0.00', '12', '675', '51301.40', '0.00', '7', '5', '8057.77', '0.00', '6', '704', '35579.19', '0.00', '5', '2133', '15683.20', '0.00', '3', '1356', '5021.00', '0.00', '3', '966', '5456.32', '0.00', '5', '2636', '4097.42', '0.00', '8', '1878', '4554.50', '0.00', '6', '3518', '13900.00', '0.00', '2', '1', '61000.00', '0.00', '3', '0', '1688.00', '0.00', '4', '10', '1488.33', '0.00', '0', '0', '0.00', '0.00', '0', '0', '0.00', '0.00', '2', '0', '4054.00', '0.00', '0', '0', '0.00', '0.00']

def f(num_str):
    match = re.search(r"([0-9\.]+)\s?(M|T|Juta)", num_str)
    if match is not None:
        quantity = match.group(0)
        magnitude = match.group(1)
        return float(quantity) * powers[magnitude]

for i in var1:
    x = f(i)
    print(x)
But I get an error:

None
None
None
None
None
None
None
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-23-8dd2f89076c3> in <module>
      1 for i in var1:
----> 2     x = f(i)
      3     print(x)

<ipython-input-22-cb419bc71fb8> in f(num_str)
      7         quantity = match.group(0)
      8         magnitude = match.group(1)
----> 9         return float(quantity) * powers[magnitude]

ValueError: could not convert string to float: '5.00 M'
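Just use group(1) and group(2), because group(0) contains the entire match, which here is the string '5.00 M' that float() rejects in the traceback.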
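To make the group numbering concrete, here is a small sketch that applies the pattern from the question to one of the sample values. The variable names whole, quantity and magnitude are only illustrative.

```python
import re

# Same pattern as in the question: group 1 is the number, group 2 the suffix.
pattern = re.compile(r"([0-9\.]+)\s?(M|T|Juta)")

match = pattern.search("5.00 M")
whole = match.group(0)      # the entire matched text
quantity = match.group(1)   # first parenthesised group: the number
magnitude = match.group(2)  # second parenthesised group: the suffix

print(repr(whole), repr(quantity), repr(magnitude))
# prints: '5.00 M' '5.00' 'M'
```

Passing quantity rather than whole to float() avoids the ValueError shown in the question.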


Besides using the wrong group numbers, your regex has a few other problems. You can fix it as follows:

def f(num_str):
    # Numeric part: digits with an optional ".digits" fraction (the dot is escaped).
    # The suffix group is optional so that plain numbers also match.
    match = re.search(r"(\d+(?:\.\d+)?)\s?(M|T|Juta)?", num_str)
    if match is not None:
        quantity = match.group(1)
        if match.group(2):                # a magnitude suffix is present
            magnitude = match.group(2)
            return float(quantity) * powers[magnitude]
        else:                             # no suffix: return the plain number
            return float(quantity)
        
for i in var1:
    x = f(i)
    print(x)

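In fact, the regex [0-9\.]+ for the numeric part is incorrect, because it also matches strings such as "1.2.3" or "...". It is better to replace it with \d+(?:\.\d+)?, which uses \d+ for the integer part and (?:\.\d+)? for the optional decimal part, written as a non-capturing group.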
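The difference between the two numeric patterns can be checked directly. The following is a minimal sketch; the sample strings and the names loose and strict are chosen here for illustration and do not come from the original data.

```python
import re

loose = re.compile(r"[0-9\.]+")        # character class: any mix of digits and dots
strict = re.compile(r"\d+(?:\.\d+)?")  # digits, optionally followed by ".digits"

samples = ["5.00", "134944", "1.2.3", "..."]

# fullmatch() only succeeds if the whole string fits the pattern.
loose_ok = [bool(loose.fullmatch(s)) for s in samples]
strict_ok = [bool(strict.fullmatch(s)) for s in samples]

print(loose_ok)   # [True, True, True, True]
print(strict_ok)  # [True, True, False, False]
```

Because (?:...) is non-capturing, wrapping the strict pattern in one pair of parentheses still yields a single capture group, so the suffix stays in group(2).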

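Another problem is that the original regex allows 0 or 1 space but then requires one of the suffixes M, T or Juta, so it fails to match values that have no suffix. With the suffix group made optional, the suffix is no longer required.

I got None 5000000000.0 None 260000000000.0 None 1000000000000.0 None None. What should I do if, instead of None, I want those values to stay the same as in var1?

Add an else: to handle the case where the regex does not match?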
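One possible way to follow that suggestion is sketched below. It assumes that unparseable values should be returned unchanged. The helper name to_number and the use of fullmatch() are choices made for this sketch, not part of the original answer.

```python
import re

# Multipliers exactly as defined in the question.
powers = {'M': 10 ** 9, 'T': 10 ** 12, 'Juta': 10 ** 6}
pattern = re.compile(r"(\d+(?:\.\d+)?)\s?(M|T|Juta)?")

def to_number(num_str):
    """Hypothetical helper: return a float, or the input unchanged if it does not parse."""
    match = pattern.fullmatch(num_str.strip())
    if match is None:
        return num_str                      # fallback: keep the original value
    quantity, magnitude = match.groups()    # magnitude is None when there is no suffix
    return float(quantity) * powers.get(magnitude, 1)

for value in ['7149', '5.00 M', '1.00 T', '1.29 Juta', 'n/a']:
    print(value, '->', to_number(value))
```

Using fullmatch() rather than search() means a string with trailing junk falls through to the fallback instead of being partially converted. Whether that is desirable depends on the data.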