Adding and merging two data frames in R

I have two data frames:

> df1
       Long Short
EURUSD 47295 16057
GBPUSD 17385  6861
USDJPY  7146  9369
USDCHF  2704  5162
USDCAD  4705 11947
AUDUSD 13041  6654
NZDUSD  7184  4000

> df2
       Long Short
EURUSD  318   408
GBPUSD  181   276
USDJPY  217   203
USDCHF   97    57
USDCAD  178   121
AUDUSD  142   202
NZDUSD   95   138
I need the final data frame to look like this:

> Final
       Long   Short
EURUSD 47613   16465

...    ...     ...

NZDUSD 7279    4138
The merge/join methods did not work. Thanks for your help.

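Here are three approaches, all assuming the data does not use row names (my personal preference, though not always something you control).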

Your data:

df1 <- read.table(text = "Symbol Long Short
EURUSD 47295 16057
GBPUSD 17385  6861
USDJPY  7146  9369
USDCHF  2704  5162
USDCAD  4705 11947
AUDUSD 13041  6654
NZDUSD  7184  4000", header = TRUE, stringsAsFactors = FALSE)

df2 <- read.table(text = "Symbol Long Short
EURUSD  318    408
GBPUSD  181    276
USDJPY  217    203
USDCHF   97     57
USDCAD  178    121
AUDUSD  142    202
NZDUSD   95    138", header = TRUE, stringsAsFactors = FALSE)
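
A direct approach, for when both frames contain the same symbols, can be sketched as follows. This is an illustrative sketch rather than the answerer's own code: the rows of the second frame are aligned to the first with match() on Symbol, and the numeric columns are then added. The names a, b, idx, and total are assumptions made for the example.

```r
# Illustrative copies of the first rows of the data above
a <- data.frame(Symbol = c("EURUSD", "GBPUSD", "USDJPY"),
                Long = c(47295, 17385, 7146),
                Short = c(16057, 6861, 9369),
                stringsAsFactors = FALSE)
b <- data.frame(Symbol = c("USDJPY", "EURUSD", "GBPUSD"),  # deliberately shuffled
                Long = c(217, 318, 181),
                Short = c(203, 408, 276),
                stringsAsFactors = FALSE)

# Position of each symbol of `a` within `b`
idx <- match(a$Symbol, b$Symbol)

# Add the numeric columns row by row, keeping the Symbol column from `a`
total <- cbind(a["Symbol"], a[c("Long", "Short")] + b[idx, c("Long", "Short")])
total
```

If a symbol is missing from the second frame, match() returns NA for it and the sums for that row become NA, which is why a join is the safer choice when the rows may differ.
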
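Method 2: base R merge. This method does not depend on the rows being in the same order in both frames, or even on every row being present in both. To demonstrate this, I will remove one row from one of the data frames: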

df2 <- df2[-3,]
And the work, here using a full join from dplyr:

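library(dplyr)
# psum() is not provided by base R or dplyr; define a parallel-sum helper here
psum <- function(..., na.rm = FALSE) rowSums(cbind(...), na.rm = na.rm)
full_join(df1, rename(df2, Long2 = Long, Short2 = Short), by = "Symbol") %>%
  mutate(
    Long = psum(Long, Long2, na.rm = TRUE),
    Short = psum(Short, Short2, na.rm = TRUE)
  ) %>%
  select(-Long2, -Short2)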
#   Symbol  Long Short
# 1 EURUSD 47613 16465
# 2 GBPUSD 17566  7137
# 3 USDJPY  7146  9369
# 4 USDCHF  2801  5219
# 5 USDCAD  4883 12068
# 6 AUDUSD 13183  6856
# 7 NZDUSD  7279  4138
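
The same idea can be expressed without dplyr. The following is a minimal base R sketch using merge(), assuming both frames carry a Symbol key column as constructed earlier; the small frames x and y, the suffixes, and the name res are illustrative choices, not part of the original answer.

```r
# Small illustrative frames; `y` lacks USDJPY to mimic the removed row
x <- data.frame(Symbol = c("EURUSD", "GBPUSD", "USDJPY"),
                Long = c(47295, 17385, 7146),
                Short = c(16057, 6861, 9369),
                stringsAsFactors = FALSE)
y <- data.frame(Symbol = c("EURUSD", "GBPUSD"),
                Long = c(318, 181),
                Short = c(408, 276),
                stringsAsFactors = FALSE)

# Full outer join on Symbol; clashing columns get the suffixes .x and .y
m <- merge(x, y, by = "Symbol", all = TRUE, suffixes = c(".x", ".y"))

# Sum each pair of columns, treating a missing partner as zero
res <- data.frame(
  Symbol = m$Symbol,
  Long   = rowSums(m[c("Long.x", "Long.y")], na.rm = TRUE),
  Short  = rowSums(m[c("Short.x", "Short.y")], na.rm = TRUE),
  stringsAsFactors = FALSE
)
res
```

Note that merge() sorts the result by the key column by default, so the row order may differ from the input.
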
Edit: the data in your question is not representative. Based on your comments, it seems that what you really have is:

str(df1)
# 'data.frame': 7 obs. of  2 variables:
#  $ Long : Factor w/ 7 levels "2704","4705",..: 7 6 3 1 2 5 4
#  $ Short: Factor w/ 7 levels "4000","5162",..: 7 4 5 2 6 3 1
For future reference, it helps if you provide data in an unambiguous, directly usable form, such as:

# dput(df1) ... possibly with options(deparse.max.lines=NULL) beforehand
structure(list(
  Long = structure(c(7L, 6L, 3L, 1L, 2L, 5L, 4L), .Label = c("2704", "4705", "7146", "7184", "13041", "17385", "47295"), class = "factor"),
  Short = structure(c(7L, 4L, 5L, 2L, 6L, 3L, 1L), .Label = c("4000", "5162", "6654", "6861", "9369", "11947", "16057"), class = "factor")),
  .Names = c("Long", "Short"),
  row.names = c("EURUSD", "GBPUSD", "USDJPY", "USDCHF", "USDCAD", "AUDUSD", "NZDUSD"),
  class = "data.frame")
To get from your df1 to what I read in above, just do the following:

# convert from nascent factors to numbers
df1[] <- lapply(df1[], function(a) as.numeric(as.character(a)))
# bring the row names into a column
df1$Symbol <- rownames(df1)
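
The detour through as.character() matters because calling as.numeric() directly on a factor returns the internal level codes rather than the printed values. A small illustration with a made-up factor f (the names f, codes, and values are assumptions for the example):

```r
# A factor whose labels look like numbers, as scraped data often arrives
f <- factor(c("47295", "17385", "7146"))

# Direct conversion yields the integer level codes, not the values
codes <- as.numeric(f)

# Converting to character first recovers the printed numbers
values <- as.numeric(as.character(f))

codes   # 2 1 3
values  # 47295 17385 7146
```
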
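
Doesn't df1+df2 work? If your first column is a factor variable, simple addition will output NA, as @Vandenman suggested. In that case, use cbind(df1[,1],df1[,2:3]+df2[,2:3]).

How is it that your first column (the factor) has no column name? It looks like row names, and those should not affect df1+df2. If Leo's suggestion does not do it for you, can you make this more reproducible by including the output of dput(head(x)) and what "doesn't work" means (warnings, errors, etc.)?

Yes @r2evans, they are row names. I set them manually because the data was scraped. Would giving the row names a column name help? Leo's solution gives me an error: "Error in '[.data.frame'(df1, 2:3): undefined columns selected".

While they look nice aesthetically, I don't like using row names in general: they can be fragile, and some utilities do not preserve them (so you have to work to keep them in order, and that is not always obvious).

How about it, Andrew.G, did this solve your problem?

I'm trying to get the options to work. I +1'd because of all the effort you put in, but I can't tick the answer yet because I can't get it to work. Specifically, my data is scraped from a dynamic web page, so I can't do the first step, as in "typing the data in". I tried stripping the row names usi… The problem is that the numbers are treated as factors. When I try to convert them using df1[,c(1,2)]…

Read your comments. (The reason it wasn't mentioned in my answer is that nothing in your question originally suggested they weren't numbers. It would have been clearer if your sample data had been given with something like dput.)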
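
The point made in the comments, that row names do not get in the way of df1+df2 once the columns are numeric, can be checked with a small sketch. The frames p and q are made-up two-row examples, not the asker's data. With only two columns present, indexing 2:3 as in the cbind() suggestion fails, which is consistent with the reported "undefined columns selected" error.

```r
# Two numeric frames that carry the symbols as row names
p <- data.frame(Long = c(47295, 17385), Short = c(16057, 6861),
                row.names = c("EURUSD", "GBPUSD"))
q <- data.frame(Long = c(318, 181), Short = c(408, 276),
                row.names = c("EURUSD", "GBPUSD"))

# Element-wise addition; rows are paired by position, not by name
s <- p + q
s

# Only two columns exist, so asking for columns 2:3 raises an error
err <- tryCatch(p[, 2:3], error = function(e) conditionMessage(e))
err
```

Because the addition pairs rows by position, it is only safe when both frames list the symbols in the same order.
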