How to convert a Python list to a Pandas Series


I have a Python list l. The first few elements of the list look like this:

[751883787]
[751026090]
[752575831]
[751031278]
[751032392]
[751027358]
[751052118]

I want to convert this list to a pandas.core.series.Series with 2 leading zeros. My final result should look like this:

00751883787
00751026090
00752575831
00751031278
00751032392
00751027358
00751052118

I am using Python 3.x on Windows. Can you suggest how to do this? Also, my list contains about 2,000,000 elements.

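You can try:

import pandas as pd

# Avoid naming the variable "list", which would shadow the built-in type
values = [121, 123, 125, 145]
series = '00' + pd.Series(values).astype(str)
print(series)

Output: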

0    00121
1    00123
2    00125
3    00145
dtype: object
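
One point worth checking before choosing between string concatenation and zfill: prefixing '00' always adds exactly two characters, while str.zfill pads to a fixed total width. The two agree only when every value has the same number of digits, as in the question. A minimal sketch with made-up values of different lengths (the values and the width of 7 are illustrative assumptions, not taken from the question):

```python
import pandas as pd

# Illustrative values with different digit counts (not from the question)
values = [121, 1234, 12345]
as_text = pd.Series(values).astype(str)

prefixed = "00" + as_text        # always adds exactly two zeros
padded = as_text.str.zfill(7)    # pads each value to a total width of 7

print(prefixed.tolist())  # ['00121', '001234', '0012345']
print(padded.tolist())    # ['0000121', '0001234', '0012345']
```

If the goal is a fixed-width identifier, zfill is probably the safer choice; if the goal is literally "two extra zeros", concatenation is enough.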

Here is one way:

from itertools import chain; concat = chain.from_iterable
import pandas as pd

lst = [[751883787],
       [751026090],
       [752575831],
       [751031278]]

pd.DataFrame({'a': pd.Series([str(i).zfill(11) for i in concat(lst)])})

             a
0  00751883787
1  00751026090
2  00752575831
3  00751031278
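
The snippet above returns a DataFrame, while the question asks for a Series. A minimal sketch of the same idea that stops at the Series, assuming the nested-list layout shown in the question:

```python
from itertools import chain

import pandas as pd

lst = [[751883787],
       [751026090],
       [752575831],
       [751031278]]

# Flatten the one-element sublists, then pad each number to 11 characters
flat = chain.from_iterable(lst)
s = pd.Series([str(i).zfill(11) for i in flat])
print(s)
```

The padding happens in plain Python before pandas is involved, so the Series is built directly from the finished strings.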

Some benchmarks, since your data is large:

from itertools import chain; concat = chain.from_iterable
import pandas as pd

lst = [[751883787],
       [751026090],
       [752575831],
       [751031278],
       [751032392],
       [751027358],
       [751052118]]*300000

%timeit pd.DataFrame(lst, columns=['a'])['a'].astype(str).str.zfill(11)
# 1 loop, best of 3: 7.88 s per loop

%timeit pd.DataFrame({'a': pd.Series([str(i).zfill(11) for i in concat(lst)])})
# 1 loop, best of 3: 2.06 s per loop
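
%timeit is an IPython magic, so the lines above only run in IPython or Jupyter. A rough equivalent using the standard timeit module is sketched below; the function names vectorized and comprehension are made up for illustration, and the list is kept smaller than in the benchmark above so the script finishes quickly. Timings will differ by machine and pandas version, so none are shown.

```python
import timeit
from itertools import chain

import pandas as pd

# Smaller than the benchmark above so the script runs quickly
lst = [[751883787], [751026090], [752575831]] * 10_000

def vectorized():
    # DataFrame constructor, then pandas string methods
    return pd.DataFrame(lst, columns=["a"])["a"].astype(str).str.zfill(11)

def comprehension():
    # Flatten and pad in plain Python, then build the Series
    return pd.Series([str(i).zfill(11) for i in chain.from_iterable(lst)])

# Both approaches should produce the same strings
assert vectorized().tolist() == comprehension().tolist()

for fn in (vectorized, comprehension):
    seconds = min(timeit.repeat(fn, number=1, repeat=3))
    print(f"{fn.__name__}: {seconds:.3f} s")
```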

If the list is nested, first use the DataFrame constructor with a column name, then cast to string, and finally add the zeros with str.zfill:

lst = [[751883787],
       [751026090],
       [752575831],
       [751031278],
       [751032392],
       [751027358],
       [751052118]]

s = pd.DataFrame(lst, columns=['a'])['a'].astype(str).str.zfill(11)
print (s)
0    00751883787
1    00751026090
2    00752575831
3    00751031278
4    00751032392
5    00751027358
6    00751052118
Name: a, dtype: object


If it is only a flat list:

lst = [751883787,
       751026090,
       752575831,
       751031278,
       751032392,
       751027358,
       751052118]


s = pd.Series(lst).astype(str).str.zfill(11)
print (s)
0    00751883787
1    00751026090
2    00752575831
3    00751031278
4    00751032392
5    00751027358
6    00751052118
dtype: object

Both answers given are useful... here is a summary:

import pandas as pd
mylist = [751883787,751026090,752575831,751031278]
mysers = pd.Series(mylist).astype(str).str.zfill(11)
print (mysers)

./test
0    00751883787
1    00751026090
2    00752575831
3    00751031278
dtype: object

Another way is to cast the Series dtype to str with astype and pad the zeros with the vectorized str.zfill, although using a lambda is easier to read:

import pandas as pd
mylist = pd.DataFrame([751883787,751026090,752575831,751031278], columns=['coln'])
result = mylist.coln.apply(lambda x: str(int(x)).zfill(11))
print(result)

Here is the result:
./test
0    00751883787
1    00751026090
2    00752575831
3    00751031278
Name: coln, dtype: object

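@Ricardo, thanks a lot, it works. Sorry, but my current reputation does not allow me to mark an answer as useful or correct.

The second solution only works if there are no NaNs, otherwise it fails ;)

@jezrael, I cannot upvote you either, sorry :(

If there are NaN values in the list, the second solution fails: mylist = [751883787, 751026090, 752575831, 751031278, np.NaN]

You beat me to it @jezrael ;), which is why I said in my opening that the given solutions are useful!

I did almost the same thing but with a list comprehension, and the pandas approach seems about 3 times slower. I thought .astype(str).str.zfill(11) would be vectorized? Do you know why? [Benchmarking results in my post]

That is the expected result: the pandas str functions also handle NaNs, so they are slower.

I see, astype should probably have an na=False parameter!

There is no such parameter.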
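
The comments point out that the lambda version fails when the list contains NaN, because int() raises a ValueError for a float NaN. One possible workaround is to check for missing values before converting. The sketch below is an illustration, not a solution proposed in the thread: the helper name pad_or_missing is hypothetical, and keeping missing entries as None is an assumption about the desired behavior.

```python
import numpy as np
import pandas as pd

mylist = [751883787, 751026090, 752575831, 751031278, np.nan]

def pad_or_missing(value, width=11):
    # Hypothetical helper: leave missing values alone, pad everything else
    if pd.isna(value):
        return None
    return str(int(value)).zfill(width)

# dtype=object keeps the padded strings and the missing marker side by side
result = pd.Series([pad_or_missing(v) for v in mylist], dtype=object)
print(result)
```

Whether a missing value should stay missing, become an empty string, or be dropped depends on how the Series is used later, so that choice is left to the caller.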