Python: Rename several values in a DataFrame column to a single value
I have a data frame about 1 GB in size. Below is a dummy frame:
df <- data.frame(group = rep(c("A", "B", "C", "D", "E", "F", "G", "H"), each = 2), height = sample(100:150, 16))
df
group height
1 A 105
2 A 119
3 B 108
4 B 114
5 C 109
6 C 111
7 D 148
8 D 121
9 E 133
10 E 101
11 F 143
12 F 135
13 G 147
14 G 141
15 H 150
16 H 145
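Since the dummy frame is built in R, a pandas equivalent may be useful for trying Python solutions. The following is a minimal sketch, assuming the heights are copied from the printed frame rather than drawn at random, and that a 1-based index is wanted so the row labels match the R printout.

```python
import pandas as pd

# Heights copied from the printed frame (the R code draws them at random).
heights = [105, 119, 108, 114, 109, 111, 148, 121,
           133, 101, 143, 135, 147, 141, 150, 145]

# Two rows per group, matching the printed output.
groups = [g for g in "ABCDEFGH" for _ in range(2)]

# 1-based index so the row labels line up with the R printout.
df = pd.DataFrame({"group": groups, "height": heights},
                  index=range(1, len(heights) + 1))
print(df.head())
```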
Any suggestions in R or pandas would be great.
Thank you.
Pandas/NumPy solution with np.where and a boolean mask:
print (df['group'] =='B')
1 False
2 False
3 False
4 False
5 True
6 True
7 True
8 True
9 False
10 False
11 False
12 False
Name: group, dtype: bool
df['group'] = np.where(df['group'] == 'B','NC','PC')
print (df)
group height
1 PC 113
2 PC 118
3 PC 128
4 PC 143
5 NC 109
6 NC 141
7 NC 142
8 NC 129
9 PC 127
10 PC 102
11 PC 108
12 PC 107
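Note that np.where assigns a value to every row, so everything that is not 'B' becomes 'PC'. If only the matching rows should be renamed and all others left untouched, the same kind of boolean mask can be combined with .loc. This is a sketch on a small hypothetical frame, not part of the original answer.

```python
import pandas as pd

# Hypothetical small frame, for illustration only.
df = pd.DataFrame({"group": list("AABBCCGH"),
                   "height": [105, 119, 108, 114, 109, 111, 147, 150]})

# Boolean mask: True where the group is one of the values to rename.
mask = df["group"].isin(["B", "G", "H"])

# Overwrite only the masked rows; every other row keeps its value.
df.loc[mask, "group"] = "NC"
print(df)
```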
Solution with a double np.where:
df['group'] = np.where(df['group'].isin(['B','G','H']), 'NC',
np.where(df['group'] == 'A', 'PC', 'NON'))
print (df)
group height
1 PC 105
2 PC 119
3 NC 108
4 NC 114
5 NON 109
6 NON 111
7 NON 148
8 NON 121
9 NON 133
10 NON 101
11 NON 143
12 NON 135
13 NC 147
14 NC 141
15 NC 150
16 NC 145
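When the number of source values grows, nesting np.where calls gets hard to read. One alternative, not from the original answer, is a dictionary lookup with Series.map followed by fillna for everything that is not listed. The sketch below assumes a small hypothetical frame and a hypothetical `mapping` dictionary.

```python
import pandas as pd

# Hypothetical small frame, for illustration only.
df = pd.DataFrame({"group": list("AABBCCGGHH"),
                   "height": [105, 119, 108, 114, 109, 111, 147, 141, 150, 145]})

# Old value -> new value; anything not listed here falls through to 'NON'.
mapping = {"A": "PC", "B": "NC", "G": "NC", "H": "NC"}

# map() yields NaN for keys missing from the dict; fillna() supplies the default.
df["group"] = df["group"].map(mapping).fillna("NON")
print(df)
```

The dictionary keeps all the renames in one place, which is easier to maintain than a chain of conditions.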
In R, you can try:
First convert the column to character, then replace the values directly
df$group <- as.character(df$group);
df$group[df$group %in% c("B")] <- "NC"
You can also replace the group names by row position, as follows
df$group=as.character(df$group)
df$group[c(3:4,13:16)]='NC'
df$group[c(1:2)]='PC'
df$group[c(5:12)]='NON'
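Actually, my data frame is quite large and has more values than just A, B and C. I can edit my question.
Sorry, could you take a look at my question? I have edited it now.
Yes. Does NON mean NaN, i.e. not a number?
No, not NaN. It is a string.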
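The distinction matters in pandas: the string 'NON' is an ordinary label, while NaN marks a missing value. A minimal sketch with a hypothetical Series (object dtype is requested explicitly so the comparison behaves the same across pandas versions):

```python
import numpy as np
import pandas as pd

# Hypothetical Series holding a label, the string 'NON', and a real missing value.
s = pd.Series(["PC", "NON", np.nan], dtype=object)

# isna() flags only the real missing value, not the string 'NON'.
print(s.isna().tolist())

# An equality test finds the string label and never matches NaN.
print((s == "NON").tolist())
```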
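To handle several target values at once in R, nested ifelse calls can write the result to a new column (the output below comes from a frame with four rows per group):
df$group2 <- ifelse(df$group %in% c("B", "H", "G"), "NC", ifelse(df$group %in% c("A"), "PC", "NON"))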
head(df, 10)
group height group2
1 A 139 PC
2 A 114 PC
3 A 132 PC
4 A 141 PC
5 B 107 NC
6 B 101 NC
7 B 122 NC
8 B 129 NC
9 C 100 NON
10 C 108 NON
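A rough pandas/NumPy counterpart to the nested ifelse is np.select, which takes a list of conditions, a matching list of results, and a default. The sketch below is an assumption-laden illustration, not part of the original answers: the frame is hypothetical, and the result goes to a new column `group2` so the original `group` is kept.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame, for illustration only.
df = pd.DataFrame({"group": list("AABBCCGH"),
                   "height": [139, 114, 107, 101, 100, 108, 147, 150]})

# Conditions are checked in order; the first one that is True wins.
conditions = [df["group"].isin(["B", "G", "H"]), df["group"] == "A"]
choices = ["NC", "PC"]

# Rows matching no condition get the default, mirroring the final ifelse branch.
df["group2"] = np.select(conditions, choices, default="NON")
print(df)
```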