Python: Filtering the contents of a dataframe using strings

python pandas dataframe


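I have a dataframe of the following type. The values are read from Excel.

              Input         Output         Output SE
  0           Rat           Cat               Mat
  1           rat           cat               mat
  2           0             4.8               0.255
  3           3             7.2               0.32
  4           Bat           Cat               Sat
  5           bat           cat               sat
  6           0             1.8               0.275
  7           3             1.7               0.745

I want to split the contents as follows:

df1=
0            Rat            Cat               Mat
1            rat            cat               mat
2            0              4.8               0.255
3            3              7.2               0.32


df2=

   0         Bat              Cat               Sat
   1         bat              cat               sat
   2         0                1.8               0.275
   3         3                1.7               0.745

I am currently doing this with iloc:

df1 = df.iloc[0:4]  # rows 0 to 3 (the end of the slice is exclusive)
df2 = df.iloc[4:8]  # rows 4 to 7

Is there another way? I have a very large dataframe with the same pattern, and I want to split it wherever two consecutive rows of strings occur.

Edit: input dataframe reset.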

If you want to split as soon as at least two consecutive rows contain non-numeric values, test each row for that and start a new group at each new group leader:

import pandas as pd

def isnum(ser):
    # True if every value in the row can be parsed as a number
    try:
        pd.to_numeric(ser)
        return True
    except (ValueError, TypeError):
        return False

# axis=1 tests each row; the default axis=0 would test each column instead
num = df.apply(isnum, axis=1)

# first is True if and only if the row is the first of a group of at least 2 lines
#  containing non numeric values
first = ~num & num.shift(fill_value=True) & ~num.shift(-1, fill_value=True)

# give a different value for each group
# (rows before the first group leader, if any, end up in group 0):
grp = first.cumsum()

Now you can use groupby to get a list of sub-dataframes:

dfs = [sub for _, sub in df.groupby(grp)]

This solution works well. However, the number of rows containing numeric values is not always 2. Sometimes it is more or fewer than 2.

@Natasha Is Output always Cat?

Could you please explain what you mean by Cat?

I mean: is the first row of Output always Cat?

No, not always. There may also be rows without Cat.

raise IndexingError(key) pandas.core.indexing.IndexingError: (Input True Output False Output SE False dtype: bool, 'grp')
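As a quick end-to-end check, here is a minimal, self-contained sketch that rebuilds the sample frame from the question and splits it with a row-wise numeric test. It assumes the Excel values arrive as strings. The ninth row is a made-up addition, included only to show that the number of numeric rows per block may vary, and `is_numeric_row` is an illustrative name rather than anything from the thread.

```python
import pandas as pd

# Sample data modelled on the question, kept as strings (an assumption about
# how the mixed Excel columns are read). The last row is hypothetical.
df = pd.DataFrame(
    {
        "Input": ["Rat", "rat", "0", "3", "Bat", "bat", "0", "3", "5"],
        "Output": ["Cat", "cat", "4.8", "7.2", "Cat", "cat", "1.8", "1.7", "2.1"],
        "Output SE": ["Mat", "mat", "0.255", "0.32", "Sat", "sat", "0.275", "0.745", "0.5"],
    }
)


def is_numeric_row(row):
    """Return True if every value in the row can be parsed as a number."""
    try:
        pd.to_numeric(row)
        return True
    except (ValueError, TypeError):
        return False


# One boolean per row: True for numeric rows, False for text rows.
num = df.apply(is_numeric_row, axis=1).astype(bool)

# A group leader is a text row that follows a numeric row (or starts the frame)
# and is itself followed by another text row.
first = ~num & num.shift(fill_value=True) & ~num.shift(-1, fill_value=True)

# Running count of leaders gives each block its own label.
grp = first.cumsum()

# One sub-dataframe per block, each re-indexed from 0.
dfs = [sub.reset_index(drop=True) for _, sub in df.groupby(grp)]

for sub in dfs:
    print(sub, end="\n\n")
```

Because the grouping key only depends on where a pair of text rows begins, blocks with more or fewer numeric rows are handled without any change.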