Python 2.7: How do I split one column into three columns in pandas?
I have a huge dataset with one column in which every row has the following format:
82283343~Electronics~Mobile Cases & Covers
I want to split this column at the tilde into three columns (82283343, Electronics, Mobile Cases & Covers). I wrote the following code:
df= df._id.map(lambda x: x.split('~'))
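But this is not efficient at all, and I ended up closing the terminal. Is there a better way?
I ran some tests to find the best method. The fastest one builds a list from the _id column and splits each item with native Python split('~'); it is the DF function in the code listing further down.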
The printed output, repeated 4 times (once per function):
_id one two \
0 82283344~Electronics~Mobile Cases & Covers 82283344 Electronics
1 82283346~Electronics~Mobile Cases & Covers 82283346 Electronics
2 82283343~Electronics~Mobile Cases & Covers 82283343 Electronics
3 82283344~Electronics~Mobile Cases & Covers 82283344 Electronics
4 82283346~Electronics~Mobile Cases & Covers 82283346 Electronics
three
0 Mobile Cases & Covers
1 Mobile Cases & Covers
2 Mobile Cases & Covers
3 Mobile Cases & Covers
4 Mobile Cases & Covers
Timings:
In [125]: %timeit DF(df)
...: %timeit AP(df)
...: %timeit EX(df)
...: %timeit SP(df)
...:
1 loops, best of 3: 332 ms per loop
1 loops, best of 3: 564 ms per loop
1 loops, best of 3: 668 ms per loop
1 loops, best of 3: 1.09 s per loop
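The %timeit lines above only run inside IPython. As a rough sketch of how such a comparison could be reproduced in a plain script, the block below uses the standard timeit module on a synthetic frame. The frame size and the helper names split_via_list and split_via_str are assumptions made for illustration, and the numbers it prints will differ from the timings quoted above.

```python
import timeit

import pandas as pd

# Synthetic data (an assumption): the three sample rows repeated 1000 times.
rows = ['82283344~Electronics~Mobile Cases & Covers',
        '82283346~Electronics~Mobile Cases & Covers',
        '82283343~Electronics~Mobile Cases & Covers']
df = pd.DataFrame({'_id': rows * 1000})


def split_via_list(frame):
    # Native Python split on a plain list, then a single DataFrame construction.
    return pd.DataFrame([x.split('~') for x in frame['_id'].tolist()],
                        index=frame.index, columns=['one', 'two', 'three'])


def split_via_str(frame):
    # pandas string accessor with expand=True.
    out = frame['_id'].str.split('~', expand=True)
    out.columns = ['one', 'two', 'three']
    return out


for func in (split_via_list, split_via_str):
    seconds = timeit.timeit(lambda: func(df), number=3)
    print('%s: %.4f s for 3 runs' % (func.__name__, seconds))
```

Both helpers return a new frame instead of modifying df in place, so repeated timing runs do not affect each other.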
I think splitting the column into 3 columns and assigning the result back to the same dataframe should work:
df = df['_id'].str.split('~', n=3, expand=True)
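To make the effect of expand=True concrete, here is a minimal sketch on two sample rows. It assumes n=2, which caps the result at three parts per row, and it joins the new columns back onto the original frame instead of overwriting it. The column names one, two and three are borrowed from the benchmark elsewhere on this page, and the tiny frame is made up for illustration.

```python
import pandas as pd

# Hypothetical sample frame; the real data has many more rows.
df = pd.DataFrame({'_id': ['82283343~Electronics~Mobile Cases & Covers',
                           '82283344~Electronics~Mobile Cases & Covers']})

# n=2 allows at most two splits, so each row yields at most three parts.
parts = df['_id'].str.split('~', n=2, expand=True)
parts.columns = ['one', 'two', 'three']

# Keep the original column and add the three new ones next to it.
result = df.join(parts)
print(result[['one', 'two', 'three']])
```

Using join keeps _id available for later checks; if it is no longer needed, result.drop('_id', axis=1) removes it.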
Try this and let me know if you run into any problems. What you need to do is "lazy load" the file, or more precisely, create a generator method that breaks the file into manageable chunks.
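The answer does not show what such a generator would look like, so the following is only a sketch of the idea under some assumptions: the data is read line by line from a file-like object, each line holds one id~category~subcategory value, and the function name iter_chunks and the chunk size are hypothetical.

```python
import io
from itertools import islice


def iter_chunks(lines, chunk_size=2):
    """Yield lists of [id, category, subcategory] rows, chunk_size rows at a time."""
    lines = iter(lines)
    while True:
        # islice pulls at most chunk_size lines, so only one chunk is in memory.
        chunk = [line.rstrip('\n').split('~', 2) for line in islice(lines, chunk_size)]
        if not chunk:
            return
        yield chunk


# Stand-in for a large file on disk; with a real file, pass the open file object.
fake_file = io.StringIO(
    u'82283343~Electronics~Mobile Cases & Covers\n'
    u'82283344~Electronics~Mobile Cases & Covers\n'
    u'82283346~Electronics~Mobile Cases & Covers\n'
)

chunks = list(iter_chunks(fake_file, chunk_size=2))
print(len(chunks))
```

Each chunk can then be turned into its own DataFrame and processed or written out before the next one is read. pandas.read_csv also accepts a chunksize argument, which may be a simpler route when the file is a regular delimited file.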
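import pandas as pd

# df is the DataFrame that holds the _id column.

# Native Python split on a plain list, then one DataFrame construction.
# index=df.index keeps the new rows aligned with df.
def DF(df):
    df[['one', 'two', 'three']] = pd.DataFrame(
        [x.split('~') for x in df['_id'].tolist()], index=df.index)

# apply with a lambda; the column is split once per new column.
def AP(df):
    df['one'] = df._id.apply(lambda x: x.split('~')[0])
    df['two'] = df._id.apply(lambda x: x.split('~')[1])
    df['three'] = df._id.apply(lambda x: x.split('~')[2])

# str.split with expand=True returns the three columns in one call.
def EX(df):
    df[['one', 'two', 'three']] = df._id.str.split('~', expand=True)

# str.split followed by .str[i]; the column is split three times.
def SP(df):
    df['one'] = df['_id'].str.split('~').str[0]
    df['two'] = df['_id'].str.split('~').str[1]
    df['three'] = df['_id'].str.split('~').str[2]

# print(...) with a single argument works in both Python 2.7 and Python 3.
DF(df)
print(df.head())
AP(df)
print(df.head())
EX(df)
print(df.head())
SP(df)
print(df.head())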