Combining specific rows from multiple data frames into one data frame in R

r dataframe

I have 20 data frames, dat.table1 through dat.table20, that look like this:

> dat.table1

           Mean         SD          LB         UB
1  -3.251915678 0.09831336 -3.44979982 -3.0579865
2   0.529393596 0.09403571  0.34492156  0.7138352
3   0.437666296 0.09555116  0.25218768  0.6230282
4   0.386773612 0.09338021  0.20630132  0.5708987
5   0.259218892 0.10023005  0.06538325  0.4610775
6  -0.048387041 0.07875680 -0.20517662  0.1020621
7   0.086933460 0.08688864 -0.08462830  0.2565562
8   0.206235709 0.08200178  0.04710170  0.3658142
9   0.343474976 0.08204759  0.18539931  0.5062159
10 -0.354694572 0.08556581 -0.52609169 -0.1916891
11 -0.270542304 0.07349095 -0.41319234 -0.1291315
12  0.124547080 0.08323933 -0.04331230  0.2836064
13  0.005354652 0.10487004 -0.20677503  0.2061523
14  0.296131787 0.08235691  0.13605602  0.4593168
15  0.246056104 0.07536908  0.09803849  0.3959664
16  0.271052276 0.08347047  0.10437983  0.4354910
17 -0.005474416 0.09352408 -0.19415321  0.1736560

> dat.table2
          Mean         SD          LB         UB
1  -3.32373198 0.10477638 -3.53563786 -3.1241599
2   0.58316739 0.09466424  0.39814125  0.7690037
3   0.47869295 0.09768017  0.28395734  0.6701996
4   0.44479756 0.09489120  0.26172536  0.6336547
5   0.30072454 0.09964341  0.10674064  0.4980277
6  -0.05397720 0.07987092 -0.20952979  0.1038290
7   0.06624190 0.08466350 -0.10406855  0.2297836
8   0.18411601 0.07997405  0.02953943  0.3433614
9   0.35256600 0.07871029  0.20079165  0.5111548
10 -0.39566218 0.08567173 -0.56842809 -0.2281193
11 -0.29250153 0.07652253 -0.44428227 -0.1435696
12  0.07428006 0.08742497 -0.09829608  0.2419713
13 -0.03926006 0.11335154 -0.26894891  0.1716172
14  0.30625276 0.08212213  0.14760732  0.4674057
15  0.26511644 0.07824379  0.11330060  0.4216398
16  0.25476552 0.08699879  0.08646282  0.4240095
17 -0.05081449 0.10151042 -0.25162773  0.1451824

My question is: how can I select a specific row (for example, row 1) from every data frame and combine those rows into a new data frame?

Thanks.

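It would be better to read the datasets into a list, rather than creating or reading 20 separate datasets in the global environment and then operating on them. Since the datasets already exist, you can do the following: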

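# Collect every object named dat.table<number> into a named list
lst <- mget(ls(pattern = '^dat\\.table\\d+$'))
# Take row 1 of each data frame and stack the rows
res <- do.call(rbind, lapply(lst, function(x) x[1, ]))

# Drop the row names that rbind derives from the list names
row.names(res) <- NULL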
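As a self-contained sketch of the same idea, the snippet below wraps the row selection in a small helper so that the row index becomes a parameter. The helper name pick_row and the three toy data frames are assumptions made for illustration (the first two use rounded values from the question; dat.table10 is made up). It also reorders the names numerically, because ls() returns them alphabetically and would place dat.table10 before dat.table2.

```r
# Toy stand-ins for the real data frames
dat.table1  <- data.frame(Mean = c(-3.2519, 0.5294), SD = c(0.0983, 0.0940))
dat.table2  <- data.frame(Mean = c(-3.3237, 0.5832), SD = c(0.1048, 0.0947))
dat.table10 <- data.frame(Mean = c(-3.1000, 0.4900), SD = c(0.1010, 0.0900))  # made-up values

# ls() sorts alphabetically, so sort by the numeric suffix instead
nms <- ls(pattern = "^dat\\.table[0-9]+$")
nms <- nms[order(as.integer(sub("^dat\\.table", "", nms)))]
lst <- mget(nms)

# Hypothetical helper: take row i from every data frame and stack the rows
pick_row <- function(dfs, i) {
  out <- do.call(rbind, lapply(dfs, function(x) x[i, , drop = FALSE]))
  row.names(out) <- NULL
  out
}

res <- pick_row(lst, 1)
```

Calling pick_row(lst, 2) would collect the second row of each data frame in the same way.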

Another option is to use slice from dplyr.
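The answer does not show the dplyr code, so the following is only one way it might look, assuming dplyr is installed. The toy list uses rounded values from the question, and the "source" column name is an arbitrary choice.

```r
library(dplyr)

# Toy stand-ins for the list of data frames (rounded values from the question)
lst <- list(
  dat.table1 = data.frame(Mean = c(-3.2519, 0.5294), SD = c(0.0983, 0.0940)),
  dat.table2 = data.frame(Mean = c(-3.3237, 0.5832), SD = c(0.1048, 0.0947))
)

# slice(x, 1) keeps the first row of each data frame;
# bind_rows() stacks the results and .id stores the list names in a new column
res <- bind_rows(lapply(lst, slice, 1), .id = "source")
```

Unlike rbind, bind_rows matches columns by name and fills missing columns with NA, so a mismatched column name produces extra columns rather than an error.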

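Update: given the error message, I suspect that the column names differ in at least one element of lst. For example, if I change a column name: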

 colnames(lst[[1]])[1] <- "Mean1"
 do.call(`rbind`,lapply(lst,function(x) x[1,]))
 #Error in match.names(clabs, names(xi)) : 
 #names do not match previous names

One option, if the columns are in the same order in every dataset, is to change the column names so that they are identical:

# Take the names from an unmodified element (the 2nd, since the 1st was changed above)
nm1 <- colnames(lst[[2]])
# Apply the same names to every data frame; assumes identical column order
lst1 <- lapply(lst, function(x) {colnames(x) <- nm1; x})
res <- do.call(rbind, lapply(lst1, function(x) x[1, ]))
row.names(res) <- NULL
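Before renaming anything, it can help to find out which data frames actually deviate. The sketch below is an illustration rather than part of the original answer: the toy list, the expected vector, and the Estimate column are assumptions chosen to mirror the mismatch reported in the comments.

```r
# Toy list in which one element uses "Estimate" instead of "Mean"
lst <- list(
  dat.table1 = data.frame(Estimate = -3.2519, SD = 0.0983),  # hypothetical mismatch
  dat.table2 = data.frame(Mean = -3.3237, SD = 0.1048),
  dat.table3 = data.frame(Mean = -3.1000, SD = 0.1010)       # made-up values
)

# Column names every data frame is expected to have
expected <- c("Mean", "SD")

# Flag the data frames whose column names differ from the expected ones
mismatch <- !vapply(lst, function(x) identical(colnames(x), expected), logical(1))
names(lst)[mismatch]

# Rename only the offending elements; safe only if their column order matches
lst[mismatch] <- lapply(lst[mismatch], setNames, expected)

res <- do.call(rbind, lapply(lst, function(x) x[1, ]))
row.names(res) <- NULL
```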

If you want to avoid ending up with 20 similarly named data frames in the first place, you could do something like this:

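# List the CSV files in the working directory
file_names <- list.files(pattern = "\\.csv$")
# Read one file and record which file its rows came from
read_file  <- function(x) {df <- read.csv(x, stringsAsFactors = FALSE); df$file <- x; df}
# Apply read_file to every file name
file_list  <- lapply(file_names, read_file)

# Stack all data frames into one
combined   <- do.call(rbind, file_list)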

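By default, list.files searches the working directory; the pattern keeps only files ending in .csv.

The read_file function reads the file at the given path and adds a column recording which file the rows came from.

lapply then calls read_file on each element of file_names.

do.call combines the resulting list of data frames into a single data frame.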
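To connect this back to the original question, the same pattern can keep only one row per file. The sketch below is an assumption-laden illustration: it writes two small CSV files to a temporary directory (the file names and the read_first_row helper are hypothetical) so that it can run on its own.

```r
# Write two small CSV files to a temporary directory (hypothetical file names)
tmp <- file.path(tempdir(), "dat_tables_demo")
dir.create(tmp, showWarnings = FALSE)
write.csv(data.frame(Mean = c(-3.2519, 0.5294), SD = c(0.0983, 0.0940)),
          file.path(tmp, "table1.csv"), row.names = FALSE)
write.csv(data.frame(Mean = c(-3.3237, 0.5832), SD = c(0.1048, 0.0947)),
          file.path(tmp, "table2.csv"), row.names = FALSE)

# full.names = TRUE returns paths that read.csv can open from any working directory
file_names <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)

# Hypothetical helper: read a file, keep row 1, and record the source file
read_first_row <- function(path) {
  df <- read.csv(path, stringsAsFactors = FALSE)
  df <- df[1, , drop = FALSE]
  df$file <- basename(path)
  df
}

first_rows <- do.call(rbind, lapply(file_names, read_first_row))
row.names(first_rows) <- NULL
```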

Thanks for the quick response. Computing res returns an error: Error in match.names(clabs, names(xi)) : names do not match previous names

Good catch on the wrong names. Apparently, one of the data frames had Estimate instead of Mean. But now, when I run res, I get an error: Error in x[1, ] : incorrect number of dimensions

@FadzliFuzi Perhaps the number of dimensions differs among the data.frames, as the error suggests. Check sapply(lst1, dim) to see whether the numbers differ.

One of the datasets had a dim of NULL. I have coerced it to a data frame, and now res runs fine. Thank you for your effort, I learned a lot today. People like you make stackoverflow a great reference! Thanks again!
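The dimension problem from the comments can be reproduced and repaired along these lines. This is only a sketch: the thread does not say what the offending object was or how it was coerced, so the named vector and the as.data.frame(as.list(x)) conversion below are assumptions.

```r
# Toy list in which one element is a named vector rather than a data frame
lst1 <- list(
  dat.table1 = data.frame(Mean = c(-3.2519, 0.5294), SD = c(0.0983, 0.0940)),
  dat.table2 = c(Mean = -3.3237, SD = 0.1048)  # hypothetical: dim() is NULL here
)

# Elements without a dim attribute cannot be indexed as x[1, ]
no_dim <- vapply(lst1, function(x) is.null(dim(x)), logical(1))
names(lst1)[no_dim]

# Coerce the offending elements to one-row data frames
lst1[no_dim] <- lapply(lst1[no_dim], function(x) as.data.frame(as.list(x)))

res <- do.call(rbind, lapply(lst1, function(x) x[1, ]))
row.names(res) <- NULL
```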