Is there a faster way to convert a string to a float in Python?


python python-2.7 optimization


I am reading numbers from a file and converting them to floats. The numbers look like this:

1326.617827, 1322.954823, 1320.512821, 1319.291819...

I split each line at the commas and then build a list of floats with a list comprehension:

import time

def listFromLine(line):
    # Time the split at the commas
    t = time.clock()
    temp_line = line.split(',')
    print "line operations: " + str(time.clock() - t)
    # Time the string-to-float conversion
    t = time.clock()
    ret = [float(i) for i in temp_line]
    print "float comprehension: " + str(time.clock() - t)
    return ret

The output looks like this:

line operations: 5.52103727549e-05
float comprehension: 0.00121321255003
line operations: 9.52025017378e-05
float comprehension: 0.000943885026522
line operations: 7.0782529173e-05
float comprehension: 0.000946716327689

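Converting to an integer and then dividing by 1.0 is much faster, but it is useless in my case because I need to keep the digits after the decimal point.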

I have seen and tried using a pandas Series, but that is slower than what I was doing before:

In[38]: timeit("[float(i) for i in line[1:-2].split(',')]", "f=open('pathtofile');line=f.readline()", number=100)
Out[37]: 0.10676022701363763
In[39]: timeit("pandas.Series(line[1:-2].split(',')).apply(lambda x: float(x))", "import pandas;f=open('pathtofile');line=f.readline()", number=100)
Out[38]: 0.14640622942852133

Changing the file format is an option if it would speed things up, but I would prefer to speed up the reading side.

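First, for splitting the lines, you can read the file with the csv module. Given a delimiter, csv.reader returns an iterator that yields each line already split at the commas: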

>>> import csv
>>> with open('filename', newline='') as csvfile:
...     spamreader = csv.reader(csvfile, delimiter=',')
...     for row in spamreader:
...         pass  # do stuff with row

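Then, to convert the numbers to floats: since you are applying the built-in function float to every item, map is the better choice. In this case it tends to outperform a list comprehension.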
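The difference is easy to measure on your own machine. The following is only a rough sketch, written for Python 3 (the question targets 2.7): the sample line is made up, with the row length of 1376 borrowed from the asker's comment, and the helper names `with_comprehension` and `with_map` are hypothetical. Actual timings depend on the interpreter version and the data.

```python
import timeit

# Made-up sample line: 1376 comma-separated values per row.
sample_line = ",".join(["1326.617827"] * 1376)
fields = sample_line.split(",")

def with_comprehension(values):
    # The approach from the question.
    return [float(v) for v in values]

def with_map(values):
    # list() is needed on Python 3, where map returns a lazy iterator.
    return list(map(float, values))

# Run each variant 200 times and report the total elapsed time.
comprehension_time = timeit.timeit(lambda: with_comprehension(fields), number=200)
map_time = timeit.timeit(lambda: with_map(fields), number=200)
print("list comprehension: %.4f s" % comprehension_time)
print("map:                %.4f s" % map_time)
```

Both functions return the same list, so whichever is faster on your data can be swapped in without other changes.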

So, while reading with csv, you can do the following for each row:

...     for row in spamreader:
...         numbers = list(map(float, row))
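Put together, the two steps might look like the sketch below. It assumes Python 3 and uses an in-memory `io.StringIO` as a stand-in for the real file; the generator name `read_float_rows` is hypothetical. The spaces after the commas in the sample data are harmless, because float ignores surrounding whitespace.

```python
import csv
import io

# Stand-in for a real file opened with open('filename', newline='').
sample = io.StringIO("1326.617827, 1322.954823\n1320.512821, 1319.291819\n")

def read_float_rows(lines):
    """Yield one list of floats per non-empty CSV row."""
    for row in csv.reader(lines, delimiter=','):
        if row:  # csv.reader yields an empty list for blank lines
            yield list(map(float, row))

rows = list(read_float_rows(sample))
print(rows)
```

Because `read_float_rows` is a generator, a large file can be processed one row at a time instead of being held in memory all at once.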

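Also, regarding pandas and its performance: as you may know, tools like pandas or NumPy pay off when you are handling large datasets rather than small ones, because for a small dataset the cost of converting Python types to C types outweighs the gain from the faster computation. For more information, read this question and its full answer.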

You will want to use numpy's loadtxt to create an array of floats.

For example:

import numpy
array = numpy.loadtxt('/path/to/data.file', dtype=float, delimiter=',')

If that fails because of whitespace, you may want to try genfromtxt with its "autostrip" option.

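This should be much faster than splitting and converting by hand or with csv.reader, though it is worth timing on your own data.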
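The genfromtxt call mentioned above could look roughly like this. It is a sketch rather than the answerer's original code: it assumes Python 3 with NumPy installed, and it reads made-up sample data from an `io.StringIO` object in place of the real path.

```python
import io

import numpy

# Stand-in for '/path/to/data.file'; note the spaces after the commas.
sample = io.StringIO("1326.617827, 1322.954823\n1320.512821, 1319.291819\n")

# autostrip=True removes leading and trailing whitespace from each field
# before it is converted to a float.
array = numpy.genfromtxt(sample, dtype=float, delimiter=',', autostrip=True)
print(array.shape)
```

With two lines of two values each, the result is a two-dimensional array with one row per line of input.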

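Have you tried numpy's loadtxt or genfromtxt? How long are the lists? Why do you need to read them faster?

@spectras Each line has 1376 numbers, but there can be any number of lines. The test file I'm currently using has about 15000.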