
Python: How to split a column containing strings


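I have a DataFrame and need to split it on its Name column: rows whose value contains an underscore (_) should go into one DataFrame, and all other rows into another.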

import pandas as pd

Name = ['Hello',
        'Spider',
        'Captain',
        'Superman',
        'Hello_1',
        'Superman_1']
dfName = pd.DataFrame(Name, columns=['Name'])
My DataFrame:

    Name
0   Hello
1   Spider
2   Captain
3   Superman
4   Hello_1
5   Superman_1
Expected output:

df1

    Name
0   Hello
1   Spider
2   Captain
3   Superman

df2

    Name
0   Hello_1
1   Superman_1

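Build a boolean mask with str.contains and filter with boolean indexing. For df1, invert the mask with ~ to keep the rows that do not contain an underscore; for df2, use the mask as it is. Finally, add reset_index(drop=True) to get a default RangeIndex.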

m = dfName['Name'].str.contains('_')

# .reset_index(drop=True) is not needed for df1 with this sample data; it is added for a general solution
df1 = dfName[~m].reset_index(drop=True)
print(df1)
       Name
0     Hello
1    Spider
2   Captain
3  Superman

df2 = dfName[m].reset_index(drop=True)
print(df2)
         Name
0     Hello_1
1  Superman_1   
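
The mask approach above can be written as a self-contained script. This is a minimal sketch that rebuilds the sample data from the question. The regex=False and na=False arguments are additions not present in the answer: the first treats the underscore as a literal substring, and the second counts missing values as "no match".

```python
import pandas as pd

# Same sample data as in the question.
dfName = pd.DataFrame(
    {"Name": ["Hello", "Spider", "Captain", "Superman", "Hello_1", "Superman_1"]}
)

# True for rows whose Name contains a literal underscore.
# regex=False and na=False are assumptions added for robustness.
m = dfName["Name"].str.contains("_", regex=False, na=False)

# ~m keeps the rows without an underscore, m keeps the rows with one.
df1 = dfName[~m].reset_index(drop=True)
df2 = dfName[m].reset_index(drop=True)

print(df1)
print(df2)
```

With data that may contain NaN or non-string values in the column, na=False avoids an error when the mask is used for indexing.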

You may want to split the original list into two sublists first:

>>> name = 'Hello Spider Captain Superman Hello_1 Superman_1'.split()
>>> name
['Hello', 'Spider', 'Captain', 'Superman', 'Hello_1', 'Superman_1']
>>> col1 = [n for n in name if '_' not in n]
>>> col2 = [n for n in name if '_' in n]
>>> col1
['Hello', 'Spider', 'Captain', 'Superman']
>>> col2
['Hello_1', 'Superman_1']
>>> 
Note: by convention, variable names should be lowercase to distinguish them from classes.
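
If the goal is still two DataFrames, the two sublists can be passed to pd.DataFrame directly. The following is a sketch only; the variable names without_underscore and with_underscore are illustrative and do not come from the answer.

```python
import pandas as pd

names = ["Hello", "Spider", "Captain", "Superman", "Hello_1", "Superman_1"]

# Split the plain Python list before building any DataFrame.
without_underscore = [n for n in names if "_" not in n]
with_underscore = [n for n in names if "_" in n]

# Each new DataFrame gets a fresh default index starting at 0.
df1 = pd.DataFrame(without_underscore, columns=["Name"])
df2 = pd.DataFrame(with_underscore, columns=["Name"])

print(df1)
print(df2)
```

Because each DataFrame is built from scratch, no reset_index call is needed here.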

You can split the DataFrame with this code:

df1 = dfName[~dfName["Name"].str.contains('_1', na=False)].reset_index(drop=True)
df2 = dfName[dfName["Name"].str.contains('_1', na=False)].reset_index(drop=True)
Output of df1:

    Name
0   Hello
1   Spider
2   Captain
3   Superman
Output of df2:

    Name
0   Hello_1
1   Superman_1
To discard the original index labels and renumber the rows from 0, add .reset_index(drop=True).
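
The effect of .reset_index(drop=True) is easiest to see by comparing the index before and after the call. This is a small sketch on the question's sample data; the names kept_labels and renumbered are illustrative.

```python
import pandas as pd

dfName = pd.DataFrame(
    {"Name": ["Hello", "Spider", "Captain", "Superman", "Hello_1", "Superman_1"]}
)
mask = dfName["Name"].str.contains("_", na=False)

# Filtering alone keeps the original row labels.
kept_labels = dfName[mask]

# reset_index(drop=True) discards those labels and renumbers from 0.
renumbered = dfName[mask].reset_index(drop=True)

print(kept_labels.index.tolist())  # [4, 5]
print(renumbered.index.tolist())   # [0, 1]
```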

dfnamewithout_regex = dfName[~dfName['Name'].str.contains('_')]
dfnamewithout_regex
    Name
0   Hello
1   Spider
2   Captain
3   Superman

dfnamewith_regex = dfName[dfName['Name'].str.contains('_')]
dfnamewith_regex
    Name
4   Hello_1
5   Superman_1