Python: shift rows to the right when a value is None, and fill the missing rows by back-filling

Tags: python, pandas, dataframe

I want to detect the rows where data is None, shift those rows one column to the right so that the first column is left empty, and then fill that column by back-filling.

Data:
         year      all deceased living   data
0        2018    7,107    4,394  2,713   None
1        2017   16,478   10,286  6,192   None
2        2016   15,944    9,971  5,973   None
3     Alabama  To Date    5,926  3,471  2,455
124      1990       85       49     36   None
125      1989       80       57     23   None
126      1988       86       68     18   None
127  Arkansas  To Date    2,963  1,931  1,032
128      1989       16       12      4   None
129      1988       16       11      5   None
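Desired result:

        state     year     all deceased living
0        None     2018   7,107    4,394  2,713
1        None     2017  16,478   10,286  6,192
2        None     2016  15,944    9,971  5,973
3     Alabama  To Date   5,926    3,471  2,455
124   Alabama     1990      85       49     36
125   Alabama     1989      80       57     23
126   Alabama     1988      86       68     18
127  Arkansas  To Date   2,963    1,931  1,032
128  Arkansas     1989      16       12      4
129  Arkansas     1988      16       11      5

Finally, I will drop the rows where year is "To Date" so that the result is a proper dataset.

Thanks.

Answer:

There are a few steps involved here. In my opinion, the easiest approach is to first define a state series, then drop the sub-header rows, and finally convert the appropriate columns to numeric.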
import locale

import numpy as np
import pandas as pd

# Set the locale so that strings with thousands separators can be parsed.
# '' selects the user's default locale, which must use ',' as the separator.
locale.setlocale(locale.LC_NUMERIC, '')

# Define state: keep the 'year' value where it is not numeric, then forward-fill.
df['state'] = np.where(pd.to_numeric(df['year'], errors='coerce').isnull(),
                       df['year'], np.nan)
df['state'] = df['state'].ffill()

# Drop the "To Date" sub-header rows and the 'data' column.
df = df[df['all'] != 'To Date'].drop(columns='data').copy()

# Convert the remaining columns to integers.
num_cols = ['year', 'all', 'deceased', 'living']
df[num_cols] = df[num_cols].apply(lambda col: col.map(locale.atoi))
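The conversion step above depends on the machine's locale: `locale.atoi` only strips the thousands separator defined by the active `LC_NUMERIC` locale, so it can fail on '7,107' where that separator is not a comma. As a hedged alternative, here is a minimal sketch that removes the commas explicitly. The sample frame `demo` and the helper name `to_int_columns` are made up for illustration and are not part of the original answer.

```python
import pandas as pd


def to_int_columns(frame, columns):
    """Return a copy of `frame` with comma-grouped strings in `columns` parsed as ints."""
    out = frame.copy()
    for col in columns:
        # Remove thousands separators literally (regex=False), then cast to int.
        out[col] = out[col].astype(str).str.replace(',', '', regex=False).astype(int)
    return out


# Hypothetical sample mirroring two rows of the question's data.
demo = pd.DataFrame({
    'year': ['2018', '2017'],
    'all': ['7,107', '16,478'],
    'deceased': ['4,394', '10,286'],
    'living': ['2,713', '6,192'],
})
converted = to_int_columns(demo, ['year', 'all', 'deceased', 'living'])
print(converted)
```

This swaps the locale-based parsing for plain string replacement, so it assumes the data always uses a comma as the thousands separator.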
Result of the answer's code:

print(df)
     year    all  deceased  living     state
0    2018   7107      4394    2713       NaN
1    2017  16478     10286    6192       NaN
2    2016  15944      9971    5973       NaN
124  1990     85        49      36   Alabama
125  1989     80        57      23   Alabama
126  1988     86        68      18   Alabama
128  1989     16        12       4  Arkansas
129  1988     16        11       5  Arkansas
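For comparison, the shift the question literally asks for can also be sketched end to end. This is only one possible approach, under stated assumptions: the frame `raw` is a hand-built subset of the question's rows, the shift is done with NumPy slicing rather than a pandas method, and the fill uses `ffill` because in the desired result each header row labels the rows that follow it.

```python
import pandas as pd

# Hypothetical subset of the question's data, all values as strings.
raw = pd.DataFrame(
    [
        ['2018', '7,107', '4,394', '2,713', None],
        ['Alabama', 'To Date', '5,926', '3,471', '2,455'],
        ['1990', '85', '49', '36', None],
        ['Arkansas', 'To Date', '2,963', '1,931', '1,032'],
        ['1989', '16', '12', '4', None],
    ],
    columns=['year', 'all', 'deceased', 'living', 'data'],
)

# Rows whose last cell is empty are ordinary data rows that need shifting.
is_data_row = raw['data'].isna().to_numpy()

# Work on an independent object array so `raw` is left untouched.
values = raw.to_numpy(dtype=object, copy=True)

# Move every cell of a data row one position to the right, then blank the first cell.
values[is_data_row, 1:] = values[is_data_row, :-1]
values[is_data_row, 0] = None

tidy = pd.DataFrame(
    values,
    columns=['state', 'year', 'all', 'deceased', 'living'],
    index=raw.index,
)

# Header rows carry the state name; propagate it to the rows below them.
tidy['state'] = tidy['state'].ffill()

# Drop the "To Date" header rows, which are now identifiable through 'year'.
tidy = tidy[tidy['year'] != 'To Date'].copy()

# Parse the comma-grouped strings as integers.
for col in ['year', 'all', 'deceased', 'living']:
    tidy[col] = tidy[col].astype(str).str.replace(',', '', regex=False).astype(int)

print(tidy)
```

Rows that come before the first header row have no state to inherit, so their `state` stays missing, as in the result printed by the answer's code.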