Assigning values to specific columns and rows in R
I have a data frame as shown below:
Name Amount Subscriptionperiod(Months) Subscriptionstart (Month)
Tom 300 3 0
Tom 100 3 2
Jim 500 5 0
Jim 600 3 1
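I want to arrange the data as follows. For example, Tom paid $300 in one transaction for a period of 3 months. In a second transaction 2 months later, he paid an additional $100 for 3 months.
The same goes for Jim.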
Name M0 M1 M2 M3 M4 M5 M6
Tom 300 300 300 0 0 0 0
Tom 0 0 100 100 100 0 0
Jim 500 500 500 500 500 0 0
Jim 0 600 600 600 0 0 0
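I am unable to shift the values. I used the code below to do the first part. But for Tom, how do I create the second row, where the $100 starts at M2 and runs through M4?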
for(i in 0:6) df <- within(df,assign(paste0("M",i),ifelse((Subscriptionperiod>i),amount,0)))
First, let's start with a minimal data frame:
df1 <- data.frame(name=c("Tom", "Tom", "Jim", "Jim"), amount=c(300, 100, 500, 600),
Subperiod=c(3, 3, 5, 3), SubStart = c(0, 2, 0, 1))
> df1
name amount Subperiod SubStart
1 Tom 300 3 0
2 Tom 100 3 2
3 Jim 500 5 0
4 Jim 600 3 1
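Now, the clever part is to create a function that builds one big vector according to your rules and uses it to fill a matrix:
# Template matrix; only its column count (7 months, M0 to M6) is used below
m <- matrix(0, nrow=4, ncol=7)
special_spread <- function(df1){
  bigrow <- c()
  for(i in seq_len(nrow(df1))){
    # zeros before the subscription starts
    pt1 <- rep(0, df1$SubStart[i])
    # the amount, repeated once per subscription month
    pt2 <- rep(df1$amount[i], df1$Subperiod[i])
    # zeros for the remaining months
    pt3 <- rep(0, ncol(m) - (length(pt2)+length(pt1)) )
    bigrow <- c(bigrow, pt1, pt2, pt3)
  }
  # fill row by row so that each subscription becomes one row
  m1 <- as.data.frame(matrix(bigrow, nrow=nrow(df1), ncol=ncol(m), byrow = TRUE))
  m1 <- cbind(df1$name, m1)
  colnames(m1) <- c("name", paste0("M", 0:6))
  return(m1)
}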
> special_spread(df1)
name M0 M1 M2 M3 M4 M5 M6
1 Tom 300 300 300 0 0 0 0
2 Tom 0 0 100 100 100 0 0
3 Jim 500 500 500 500 500 0 0
4 Jim 0 600 600 600 0 0 0
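The same fill rule (zeros, then the amount for Subperiod months starting at SubStart, then zeros) can also be sketched without a loop. This is an alternative sketch, not the method used in the answer above. It assumes the column names of df1 from this answer, and the names months, active and wide are illustrative only.

```r
# Same minimal data frame as above
df1 <- data.frame(name = c("Tom", "Tom", "Jim", "Jim"),
                  amount = c(300, 100, 500, 600),
                  Subperiod = c(3, 3, 5, 3),
                  SubStart = c(0, 2, 0, 1))

months <- 0:6  # assumed month range M0 to M6

# active[i, j] is TRUE when month j lies in [SubStart, SubStart + Subperiod)
active <- outer(df1$SubStart, months, "<=") &
  outer(df1$SubStart + df1$Subperiod, months, ">")

# Multiplying recycles amount down the columns, so row i is scaled by amount[i]
wide <- active * df1$amount
colnames(wide) <- paste0("M", months)

result <- data.frame(name = df1$name, wide)
print(result)
```

Because the month range lives in a single vector, changing the horizon only means changing months.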
Let me know if this needs more explanation, and whether it more or less answers your question.
I used the data.table and plyr packages to do this:
library(data.table)
library(plyr)
df <- data.table(read.table(text='Name Amount Period Start
Tom 300 3 0
Tom 100 3 2
Jim 500 5 0
Jim 600 3 1', header=T, row.names = NULL))
#Create a row by repeating df$Amount, df$Period times and padding with 0
create_rows <- function(x, y){
c(rep(0, x$Start), rep(x$Amount, x$Period), rep(0, y - x$Period - x$Start))
}
#Create a new data.table and add Name column
df2 <- data.table(Name = df$Name)
#Create an array of month names
months <- paste('M', 0:6, sep = '')
#Use adply (from plyr) to apply create_rows() across all rows of df
#.expand = FALSE ensures the size of the returned data.frame is the right size
#.id = NULL stops adply() from creating an index column
#Wrapping months in parentheses makes := use its values as the column names
#(older data.table versions used with = FALSE here, which is now deprecated)
df2[, (months) := adply(df,
                        1, create_rows,
                        length(months),
                        .expand = FALSE,
                        .id = NULL)]
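create_rows() relies on Start + Period never exceeding the number of month columns and on the column names matching exactly; otherwise rep() receives a negative or empty count. A defensive variant might look like the sketch below. The function name create_row_safe and the choice to truncate a subscription that runs past the last month are assumptions made for illustration, not part of the answer.

```r
# Hypothetical helper: build one row of n_months values for a single subscription
create_row_safe <- function(start, amount, period, n_months) {
  # A misspelled column name yields NULL, which fails here with a clear message
  stopifnot(length(start) == 1, length(period) == 1,
            start >= 0, period >= 0)
  row <- numeric(n_months)          # start with all zeros
  idx <- start + seq_len(period)    # 1-based positions of the paid months
  idx <- idx[idx <= n_months]       # assumption: drop months beyond the horizon
  row[idx] <- amount
  row
}

create_row_safe(2, 100, 3, 7)  # Tom's second transaction
create_row_safe(5, 50, 4, 7)   # runs past M6, so it is truncated
```

Filling by index avoids computing a padding length at all, so there is no count that can turn negative.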
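Comments:
You can use df3[is.na(df3)] <- 0 to change all NAs to 0.
Instead of paste(..., sep = ''), you can also use paste0(...).
Does this work on a data frame? My source is a data frame.
You need to convert the data to a data.table first using setDT(df). data.table is an extension of data.frame. With a few exceptions, data frame operations still work, and if necessary you can convert back easily with setDF().
@NGaffney Hey, I am getting this error: Error in rep(0, y - x$period - x$periodstart) : invalid 'times' argument
I changed the column names when importing the data, so that might be it. What happens when you paste and run the code (including the data import)?
Getting an error: Error in matrix(special(df1), nrow = 4, ncol = 7, byrow = TRUE) : could not find function "special"
@Vaibhav Try it now, I replaced special(df1) with bigrow. My typo.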
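Note that special_spread() reads the number of month columns from a matrix m, so m has to exist before the function is called:
m <- matrix(0, nrow=4, ncol=7)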
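Two of the suggestions from the comments can be checked in base R without any extra packages. The sketch below is only an illustration; the contents of df3 are made up for the example.

```r
# paste0() is shorthand for paste(..., sep = "")
months_a <- paste("M", 0:6, sep = "")
months_b <- paste0("M", 0:6)
identical(months_a, months_b)

# Hypothetical data frame with missing values
df3 <- data.frame(M0 = c(300, NA), M1 = c(NA, 100))

# Logical-matrix indexing replaces every NA with 0 in one step
df3[is.na(df3)] <- 0
print(df3)
```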