R: Determine the percentage of values in each column for each cluster


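I need to determine, for each cluster, the percentage of values in each column that satisfy a condition. A reproducible example is shown below. I have a table like this: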

> tab
            GI     RT     TR    VR Cluster_number
1   1000086986 0.5814 0.5814 0.628              1
10  1000728257 0.5814 0.5814 0.628              1
13  1000074769 0.7879 0.7879 0.443              2
14  1000498642 0.7879 0.7879 0.443              2
22  1000074765 0.7941 0.3600 0.533              3
26  1000597385 0.7941 0.3600 0.533              3
31  1000502373 0.5000 0.5000 0.607              4
32  1000532631 0.6875 0.7059 0.607              4
33  1000597694 0.5000 0.5000 0.607              4
34  1000598724 0.5000 0.5000 0.607              4

I need a table like this:

> tab1
   Cluster_number RT_cond TR_cond VR_cond
1               1 0        0        100
2               2 100      100      0  
3               3 100      0        0
4               4 25       25       100

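Here the value in each column is the percentage of GI entries in that cluster for which RT >= 0.6, TR >= 0.6 and VR >= 0.6, respectively. That is, in the first cluster every RT is below 0.6, so the corresponding value in the final table is 0; in the fourth cluster one of the four RT values is >= 0.6, so the corresponding value is 25. How can I do this?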

You can group by Cluster_number and use across() to calculate the percentages:

library(dplyr)
df %>%
  group_by(Cluster_number) %>%
  summarise(across(RT:VR, ~mean(. >= 0.6) * 100, .names = '{col}_cond'))
  # In older versions of dplyr (before 1.0.0, which introduced across()), use summarise_at:
  # summarise_at(vars(RT:VR), ~mean(. >= 0.6) * 100)


#  Cluster_number RT_cond TR_cond VR_cond
#           <int>   <dbl>   <dbl>   <dbl>
#1              1       0       0     100
#2              2     100     100       0
#3              3     100       0       0
#4              4      25      25     100
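
The answer refers to a data frame called df, but no listing for it appears here. Below is a minimal sketch for rebuilding it from the table in the question. It assumes default row names are fine (the original row labels 1, 10, 13, ... are not reproduced) and that Cluster_number is an integer column, as the printed output suggests. The last lines check one cell by hand, to show why mean() of a logical vector gives the share of rows that meet the condition; pct_rt_4 is just an illustrative name.

```r
# Rebuild the example table from the question (values copied from the printout).
df <- data.frame(
  GI = c(1000086986, 1000728257, 1000074769, 1000498642, 1000074765,
         1000597385, 1000502373, 1000532631, 1000597694, 1000598724),
  RT = c(0.5814, 0.5814, 0.7879, 0.7879, 0.7941,
         0.7941, 0.5000, 0.6875, 0.5000, 0.5000),
  TR = c(0.5814, 0.5814, 0.7879, 0.7879, 0.3600,
         0.3600, 0.5000, 0.7059, 0.5000, 0.5000),
  VR = c(0.628, 0.628, 0.443, 0.443, 0.533,
         0.533, 0.607, 0.607, 0.607, 0.607),
  Cluster_number = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 4L, 4L)
)

# Hand check for one cell: RT in cluster 4.
# The comparison yields a logical vector (FALSE, TRUE, FALSE, FALSE);
# mean() treats TRUE as 1 and FALSE as 0, so it returns the proportion of TRUEs.
rt_cluster_4 <- df$RT[df$Cluster_number == 4L]
pct_rt_4 <- mean(rt_cluster_4 >= 0.6) * 100
print(pct_rt_4)  # 25
```

One of the four RT values in cluster 4 (0.6875) passes the threshold, which is where the 25 in the result table comes from.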

Using the dplyr package, you can use group_by followed by summarise, and then rename the columns of interest with the new rename_with function.
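
The code for this answer is not shown, so the following is only a sketch of what the described approach might look like, not the answerer's actual code. It assumes dplyr 1.0.0 or later (for across() and rename_with()). It rebuilds only the columns the computation needs; GI is left out, and the names df and res are placeholders.

```r
library(dplyr)

# Example data from the question (GI omitted because it is not used below).
df <- data.frame(
  RT = c(0.5814, 0.5814, 0.7879, 0.7879, 0.7941,
         0.7941, 0.5000, 0.6875, 0.5000, 0.5000),
  TR = c(0.5814, 0.5814, 0.7879, 0.7879, 0.3600,
         0.3600, 0.5000, 0.7059, 0.5000, 0.5000),
  VR = c(0.628, 0.628, 0.443, 0.443, 0.533,
         0.533, 0.607, 0.607, 0.607, 0.607),
  Cluster_number = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 4L, 4L)
)

res <- df %>%
  group_by(Cluster_number) %>%
  # Percentage of rows per cluster where the value is at least 0.6.
  summarise(across(RT:VR, ~ mean(.x >= 0.6) * 100)) %>%
  # Append "_cond" to every column except the grouping column.
  rename_with(~ paste0(.x, "_cond"), .cols = -Cluster_number)

print(res)
```

Compared with passing .names to across(), this keeps the computation and the renaming as separate steps, which may be easier to read when the renaming rule is more involved.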

Damn, I need to delete my answer within those 20 seconds.. you are always as fast as lightning :D