R 使用“剪切”为时间变量创建24个类别

R 使用“剪切”为时间变量创建24个类别,r,dataframe,cut,categorical-data,R,Dataframe,Cut,Categorical Data,在这里,我导入数据,对其进行一些操作这可能不是问题/修复所在 前两行设置了我的切割参数 lab_var_num <- (0:24) times_var <-c(0,100,200,300,400,500,600,700,800,900,1000,1100,1200,1300,1400,1500,1600,1700,1800,1900,2000,2100,2200,2300,2400,2500) all_files_ls <- read_csv("~/Desktop/bio

在这里,我导入数据,对其进行一些操作这可能不是问题/修复所在

前两行设置了我的切割参数

lab_var_num <- (0:24) 
times_var <-c(0,100,200,300,400,500,600,700,800,900,1000,1100,1200,1300,1400,1500,1600,1700,1800,1900,2000,2100,2200,2300,2400,2500)


all_files_ls <- read_csv("~/Desktop/bioinformatic_work/log_parse_files/sorted_by_habitat/all_trap/all_files_la_selva_log.csv")
#Eliminate bad data and capture in separate dataframe- "bad" data contained within all_files_ls_bad
all_files_ls_bad<-subset(all_files_ls,all_files_ls$temp<10|all_files_ls$temp>50)
all_files_ls <-subset(all_files_ls,all_files_ls$temp>10&all_files_ls$temp<50)

# convert our character data to date data- then change to POSIXct data type.
# all_dates <- strptime(all_files_ls$date,format="%m/%d/%Y")
# Data needs to be put into a cosnistant format of %m/%d/%Y before you can coerce it
# into POSIXct, or any other, data otherwise it will spit out errors.

all_files_ls$date <- strptime(all_files_ls$date,format="%m/%d/%Y")
all_files_ls$date <- as.POSIXct(all_files_ls$date)
# Create wet and dry season data sets.
all_files_ls_w <- subset(all_files_ls,date>="2015-05-01"&date<="2015-12-31"|date>="2016-05-01"&date<="2016-12-31")
all_files_ls_s <- subset(all_files_ls,date>="2015-01-01"&date<="2015-4-30"|date>="2016-01-01"&date<="2016-04-30")


# Subset into canopy and understory dataframes.

all_files_ls_s_c <- subset(all_files_ls_s,canopy_understory=="c"|canopy_understory=="C")
all_files_ls_s_u <- subset(all_files_ls_s,canopy_understory=="u"|canopy_understory=="U")

all_files_ls_w_c <- subset(all_files_ls_w,canopy_understory=="c"|canopy_understory=="C")
all_files_ls_w_u <- subset(all_files_ls_w,canopy_understory=="u"|canopy_understory=="U")

all_files_ls_s_c_summ <- all_files_ls_s_c %>% group_by(time)%>% summarise(standard_deviation = sd(temp,na.rm=TRUE),mean = mean(temp,na.rm=TRUE))
all_files_ls_s_u_summ <- all_files_ls_s_u %>% group_by(time)%>% summarise(standard_deviation = sd(temp,na.rm=TRUE),mean = mean(temp,na.rm=TRUE))
all_files_ls_w_c_summ <- all_files_ls_w_c %>% group_by(time)%>% summarise(standard_deviation = sd(temp,na.rm=TRUE),mean = mean(temp,na.rm=TRUE))
all_files_ls_w_u_summ <- all_files_ls_w_u %>% group_by(time)%>% summarise(standard_deviation = sd(temp,na.rm=TRUE),mean = mean(temp,na.rm=TRUE))

我使用group_by来获取每个时间点的汇总数据。然后尝试使用cut使其成为特定时间附近的每个数据点都指定给该时间。因此,如果时间为1801,则将其与1800一起分组。group_by函数仅将具有相同时间的每个数据点放在一起。我想要完成的是将附近的每个时间点分组在一起


我不明白为什么我得到了58个类别,而我期望得到24个。

您可以通过多个变量进行分组,而不是将data.frame的部分保存为单独的文件并对其执行相同的操作。您可以使用lubridate::month从基R中的每个日期中提取月份作为数字。您可以使用strptimedf$date、%Y-%m-%d'$mon+1,这使您只需使用ifelse创建一个新的分组变量,而不是使用重复的标签剪切,这将导致R>=3.4.0中的错误。一旦设置了所有分组变量,汇总就变得简单而简单

图书馆弹琴 df%>%按林冠/林下系数分组 从日期中提取数字月份。如果小于5,则将'season'设为else w,并按其分组。 季节=ifelselubridate::月日<5,'s','w', 按0100200、…、2400削减时间,并按返回的系数分组。 小时=切割时间,序号0,2400,100,包括。最低=真%>% SummarseTemp_mean=平均温度,每组的计算平均值和温度标准差。 温度(sd=sdtemp) >一个tibble:20x5 >组别:树冠林下,季节[?] >林冠下季节小时温度平均温度sd > >1 c w[0100]21.5 NA >2 c w 500600]20.1 NA >3 c w 700800]25.5 NA >4 c w 900,1e+03]29.0 NA >5 c w 1.1e+03,1.2e+03]28.0 NA >6CW1.3e+03,1.4e+03]28.5NA >7 c w 1.6e+03,1.7e+03]27.5 NA >8 c w 1.8e+03,1.9e+03]25.5 NA >9 c w 2e+03,2.1e+03]23.5 NA >10 c w 2.1e+03,2.2e+03]22.5 NA >11美国标准100200]23.6不适用 >12美国300400]24.1不适用 >13美国500600]24.1不适用 >14美国700800]24.6不适用 >15美国标准900,1e+03]24.6北美 >16美国1.1e+03,1.2e+03]26.1北美 >17美国1.3e+03,1.4e+03]26.6北美 >18美国1.5e+03,1.6e+03]25.6NA >19美国1.7e+03,1.8e+03]24.1NA >20美国1.9e+03,2e+03]24.1NA 样本数据的标准差为NA,因为每组只有一个观察值,但在较大的数据上应该可以很好地工作

资料


df在这里潜水不太深,看起来你是分组的,然后执行时间切割。因此,您应该为每个组获得时间缩短。我使用group_by来获取每个时间点的汇总数据。然后尝试使用cut使其成为特定时间附近的每个数据点都指定给该时间。所以,如果一个时间是101,它将与100一起分组。我将用这些信息更新这个问题。为什么不在日期上使用cut呢?此外,您可以将_按多个内容分组,而不是保存多个data.frames,这将为您节省大量重复的行。@alistaire我当然愿意接受这些想法,但我不确定如何进行。
all_files_ls_s_c_summ$time <- cut(as.numeric(all_files_ls_s_c_summ$time),breaks=c(times_var),labels = lab_var_num,include.lowest = TRUE)
all_files_ls_s_u_summ$time <- cut(as.numeric(all_files_ls_s_u_summ$time),breaks=c(times_var),labels = lab_var_num,include.lowest = TRUE)
all_files_ls_w_c_summ$time <- cut(as.numeric(all_files_ls_w_c_summ$time),breaks=c(times_var),labels = lab_var_num,include.lowest = TRUE)
all_files_ls_w_u_summ$time <- cut(as.numeric(all_files_ls_w_u_summ$time),breaks=c(times_var),labels = lab_var_num,include.lowest = TRUE)
  trap        serial_no                           file_name canopy_understory       date  time  temp humidity
1  LS_trap_10c 7C000000395C1641 trap10c_7C000000395C1641_150809.csv                 c 2015-05-28   600  20.1     <NA>
2  LS_trap_10c 7C000000395C1641 trap10c_7C000000395C1641_150809.csv                 c 2015-05-28   800  25.5     <NA>
3  LS_trap_10c 7C000000395C1641 trap10c_7C000000395C1641_150809.csv                 c 2015-05-28  1000  29.0     <NA>
4  LS_trap_10c 7C000000395C1641 trap10c_7C000000395C1641_150809.csv                 c 2015-05-28  1200  28.0     <NA>
5  LS_trap_10c 7C000000395C1641 trap10c_7C000000395C1641_150809.csv                 c 2015-05-28  1400  28.5     <NA>
6  LS_trap_10c 7C000000395C1641 trap10c_7C000000395C1641_150809.csv                 c 2015-05-28  1601  27.5     <NA>
7  LS_trap_10c 7C000000395C1641 trap10c_7C000000395C1641_150809.csv                 c 2015-05-28  1803  25.5     <NA>
8  LS_trap_10c 7C000000395C1641 trap10c_7C000000395C1641_150809.csv                 c 2015-05-28  2001  23.5     <NA>
9  LS_trap_10c 7C000000395C1641 trap10c_7C000000395C1641_150809.csv                 c 2015-05-28  2200  22.5     <NA>
10 LS_trap_10c 7C000000395C1641 trap10c_7C000000395C1641_150809.csv                 c 2015-05-29   000  21.5     <NA>
11  LS_trap_10u 9F00000039641541 trap10u_9F00000039641541_160110.csv                 u 2016-01-01  0159  23.6     <NA>
12  LS_trap_10u 9F00000039641541 trap10u_9F00000039641541_160110.csv                 u 2016-01-01  0359  24.1     <NA>
13  LS_trap_10u 9F00000039641541 trap10u_9F00000039641541_160110.csv                 u 2016-01-01  0559  24.1     <NA>
14  LS_trap_10u 9F00000039641541 trap10u_9F00000039641541_160110.csv                 u 2016-01-01  0759  24.6     <NA>
15  LS_trap_10u 9F00000039641541 trap10u_9F00000039641541_160110.csv                 u 2016-01-01  0959  24.6     <NA>
16  LS_trap_10u 9F00000039641541 trap10u_9F00000039641541_160110.csv                 u 2016-01-01  1159  26.1     <NA>
17  LS_trap_10u 9F00000039641541 trap10u_9F00000039641541_160110.csv                 u 2016-01-01  1359  26.6     <NA>
18  LS_trap_10u 9F00000039641541 trap10u_9F00000039641541_160110.csv                 u 2016-01-01  1559  25.6     <NA>
19  LS_trap_10u 9F00000039641541 trap10u_9F00000039641541_160110.csv                 u 2016-01-01  1759  24.1     <NA>
20 LS_trap_10u 9F00000039641541 trap10u_9F00000039641541_160110.csv                 u 2016-01-01  1959  24.1     <NA>
"","time","standard_deviation","mean"
"1","0",0.864956566100052,23.5574468085106
"2","0",1.14440510857225,22.81103515625
"3","0",0.984904980117555,22.2286831812256
"4","0",1.08678357585325,22.3990654205607
"5","1",1.05145037946718,22.0769704433498
"6","1",1.12960402993109,22.3836754643206
"7","2",1.03725039998279,21.7559322033898
"8","2",1.1068790873174,21.9357894736842
"9","3",1.12097157902533,21.6717980295567
"10","3",1.19621923944834,22.00751953125
"11","4",1.07458677721861,21.4380704041721
"12","4",1.13677253853809,21.6116959064328
"13","5",1.17900504899409,21.4315270935961
"14","5",1.28653071505367,21.79990234375
"15","6",1.20354620166699,21.9286831812256
"16","6",1.31676108631382,22.2322429906542
"17","7",1.86260704732764,23.7655596555966
"18","7",1.77861521566506,24.20419921875
"19","8",2.46883855937697,25.7301298701299
"20","8",2.46920498327612,26.1562427071179
"21","9",2.68395795782085,27.1479115479115
"22","0",0.949097628789142,23.3553191489362
"23","9",2.79945910162021,27.6413533834586
"24","10",2.79930128034239,27.7833981841764
"25","10",2.90435941493285,28.4350606394708
"26","11",2.79704441144441,28.2748466257669
"27","11",2.84178392019108,28.8
"28","12",2.88487423989003,28.5626131953428
"29","12",3.09977843678832,29.2737596471885
"30","13",2.78609514613334,28.6300613496933
"31","13",2.9274394403559,29.0124410933082
"32","14",2.46471466241151,28.0413748378729
"33","14",2.64014509330527,28.5502750275027
"34","15",2.24926437332819,27.1096296296296
"35","15",2.3886068967475,27.4907634307257
"36","16",1.9467999768684,26.0171875
"37","16",1.96854340222531,26.4749174917492
"38","17",1.43673026552318,24.7727385377943
"39","17",1.49178257598373,25.1431279620853
"40","18",1.23662593572858,24.0101694915254
"41","18",1.36276616154878,24.3736434108527
"42","19",1.07197213445298,23.5255266418835
"43","1",0.99431780638411,23.0787234042553
"44","19",1.13453791853054,23.854174573055
"45","20",1.01855291267246,23.1731421121252
"46","20",1.10799364301127,23.4543743078627
"47","21",0.998989468534969,22.9889714993804
"48","21",1.0452391633029,23.2751423149905
"49","22",0.993841145023006,22.6971316818774
"50","22",1.08423014353774,22.9405524861878
"51","23",1.01856406998964,22.517843866171
"52","2",1.03074836073784,22.8872340425532
"53","3",1.10188636506543,22.7382978723404
"54","4",1.11782711780932,22.5787234042553
"55","5",1.06571756649915,22.6106382978723
"56","6",1.16909794681656,23.8127659574468
"57","7",1.28653814110936,26.2702127659574
"58","8",1.39470055539637,28.0787234042553