R sort() data.frame
I have the following data frame:
head(stockdatareturnpercent)
SPY DIA IWM SMH OIH
2001-04-02 8.1985485 7.8349806 7.935566 21.223832 13.975655
2001-05-01 -0.5621328 1.7198760 2.141846 -10.904936 -4.565291
2001-06-01 -2.6957979 -3.5838102 2.786250 4.671762 -23.241009
2001-07-02 -1.0248091 -0.1997433 -5.725078 -3.354391 -9.161594
2001-08-01 -6.1165559 -5.0276558 -2.461728 -6.218129 -13.956695
2001-09-04 -8.8900629 -12.2663267 -15.760037 -39.321172 -16.902913
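There are actually many more stocks, but I had to cut the list down for illustration. For each month I want to know the performance from best to worst (or from worst to best). I used the sort() function, and this is what I came up with: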
N <- dim(stockdatareturnpercent)[1]
for (i in 1:N) {
  s <- sort(stockdatareturnpercent[i,])
  print(s)
}
UPS FDX XLP XLU XLV DIA IWM SPY XLE XLB XLI OIH XLK SMH MSFT
2001-04-02 0.6481585 0.93135 1.923136 4.712996 7.122751 7.83498 7.935566 8.198549 9.826701 10.13465 10.82522 13.97566 14.98789 21.22383 21.41436
SMH FDX OIH XLK XLE SPY XLU XLP DIA MSFT IWM UPS XLV XLB XLI
2001-05-01 -10.90494 -5.045544 -4.565291 -4.182041 -0.9492803 -0.5621328 0.6987724 1.457579 1.719876 2.088734 2.141846 3.73587 3.748309 3.774033 4.099748
OIH XLE XLI XLU XLP XLB DIA UPS SPY XLV FDX XLK IWM SMH MSFT
2001-06-01 -23.24101 -10.02403 -6.594324 -5.8602 -5.0532 -3.955192 -3.58381 -2.814685 -2.695798 -1.177474 0.4987542 1.935544 2.78625 4.671762 5.374764
MSFT OIH XLK IWM SMH XLV UPS XLE SPY XLU XLB XLI DIA FDX
2001-07-02 -9.793005 -9.161594 -7.17351 -5.725078 -3.354391 -2.016818 -1.692442 -1.159914 -1.024809 -0.9029407 -0.2723560 -0.2078283 -0.1997433 2.868898
XLP
2001-07-02 2.998604
Using your original code, save each sorted row in a list:
stockdatareturnpercent <- read.table(textConnection(" SPY DIA IWM SMH OIH
2001-04-02 8.1985485 7.8349806 7.935566 21.223832 13.975655
2001-05-01 -0.5621328 1.7198760 2.141846 -10.904936 -4.565291
2001-06-01 -2.6957979 -3.5838102 2.786250 4.671762 -23.241009
2001-07-02 -1.0248091 -0.1997433 -5.725078 -3.354391 -9.161594
2001-08-01 -6.1165559 -5.0276558 -2.461728 -6.218129 -13.956695
2001-09-04 -8.8900629 -12.2663267 -15.760037 -39.321172 -16.902913"))
x <- vector("list", nrow(stockdatareturnpercent))
## use unlist to drop the data.frame structure
for (i in 1:nrow(stockdatareturnpercent)) {
  x[[i]] <- sort(unlist(stockdatareturnpercent[i,]))
}
## use the row names to name each list element
names(x) <- rownames(stockdatareturnpercent)
x
$`2001-04-02`
DIA IWM SPY OIH SMH
7.834981 7.935566 8.198548 13.975655 21.223832
$`2001-05-01`
SMH OIH SPY DIA IWM
-10.9049360 -4.5652910 -0.5621328 1.7198760 2.1418460
$`2001-06-01`
OIH DIA SPY IWM SMH
-23.241009 -3.583810 -2.695798 2.786250 4.671762
$`2001-07-02`
OIH IWM SMH SPY DIA
-9.1615940 -5.7250780 -3.3543910 -1.0248091 -0.1997433
$`2001-08-01`
OIH SMH SPY DIA IWM
-13.956695 -6.218129 -6.116556 -5.027656 -2.461728
$`2001-09-04`
SMH OIH IWM DIA SPY
-39.321172 -16.902913 -15.760037 -12.266327 -8.890063
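Once every month is stored as a named vector, the tickers can be read straight from the names. The following is a minimal sketch under stated assumptions: it uses only the first three months of the sample data, sorts in decreasing order so the best performer comes first, and the names `returns` and `top_n` are illustrative, not part of the original answer.

```r
# Sample returns: the first three months from the question (illustrative subset).
returns <- data.frame(
  SPY = c( 8.1985485, -0.5621328,  -2.6957979),
  DIA = c( 7.8349806,  1.7198760,  -3.5838102),
  IWM = c( 7.935566,   2.141846,    2.786250),
  SMH = c(21.223832, -10.904936,    4.671762),
  OIH = c(13.975655,  -4.565291,  -23.241009),
  row.names = c("2001-04-02", "2001-05-01", "2001-06-01")
)

# One named vector per month, sorted from best to worst.
# unlist() drops the data.frame structure but keeps the ticker names.
x <- lapply(seq_len(nrow(returns)), function(i) {
  sort(unlist(returns[i, ]), decreasing = TRUE)
})
names(x) <- rownames(returns)

# Hypothetical helper: tickers of the n best performers in each month.
top_n <- function(sorted_rows, n = 2) {
  lapply(sorted_rows, function(v) head(names(v), n))
}

top2 <- top_n(x)
print(top2)
```

Keeping the result as a list avoids forcing months with different orderings into one rectangular structure.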
apply() returns a matrix in which each column is a sorted row, so transpose the result:
sortmat <- t(apply(stockdatareturnpercent, 1, sort))
sortmat
Use order() for this, because sort() drops the names when used with *apply:
id <- t(apply(Data,1,order))
lapply(1:nrow(id),function(i)Data[i,id[i,]])
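Since the question asks for best-to-worst as well as worst-to-best, the same order() idea can be run in decreasing order. This is a sketch, assuming a three-month subset of the sample data stored in `Data`; the name `ranked` is illustrative.

```r
# Three-month subset of the sample data (illustrative).
Data <- data.frame(
  SPY = c( 8.1985485, -0.5621328,  -2.6957979),
  DIA = c( 7.8349806,  1.7198760,  -3.5838102),
  IWM = c( 7.935566,   2.141846,    2.786250),
  SMH = c(21.223832, -10.904936,    4.671762),
  OIH = c(13.975655,  -4.565291,  -23.241009),
  row.names = c("2001-04-02", "2001-05-01", "2001-06-01")
)

# Column positions of each row, ordered from highest to lowest return.
# Extra arguments to apply() are passed on to order().
id <- t(apply(Data, 1, order, decreasing = TRUE))

# Reorder the columns of each one-row data.frame; the ticker names come along.
ranked <- lapply(seq_len(nrow(id)), function(i) Data[i, id[i, ]])
names(ranked) <- rownames(Data)

print(ranked[["2001-04-02"]])
```

Each list element is still a one-row data.frame, so both the date (row name) and the tickers (column names) are preserved.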
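To find out which one was best at a given point in time, map the `id` matrix back to the column names with `names(Data)[id]`.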
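If you want to use a loop, you can use a list. As Joshua said, you are overwriting s on every pass through the loop. Initialize a list first to store the results. This loop gives the same result as the lapply() call in the code above, but without the id matrix. Although apply has other advantages, it does not give a speed gain here: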
N <- nrow(Data)
s <- vector("list",N)
for (i in 1:N) {
  s[[i]] <- sort(Data[i,])
}
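`s` only contains the last row, because you assign (overwrite) it in every iteration.
Thanks. Although the columns appear to be sorted, there is now no way to tell which stock each return belongs to. For example, the first column of the new data.frame holds the worst-performing value, but it cannot be tied to a stock. The original data.frame had the stock tickers as its column dimnames, so my approach, naive as it was, did let me see the order of performance for each month.
Thank you very much; your solution has given me additional knowledge and tools. I still have some catching up to do.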
sortdf <- as.data.frame(t(apply(stockdatareturnpercent, 1, sort)))
matrix(names(Data)[id],ncol=ncol(Data))
[,1] [,2] [,3] [,4] [,5]
[1,] "DIA" "IWM" "SPY" "OIH" "SMH"
[2,] "SMH" "OIH" "SPY" "DIA" "IWM"
[3,] "OIH" "DIA" "SPY" "IWM" "SMH"
[4,] "OIH" "IWM" "SMH" "SPY" "DIA"
[5,] "OIH" "SMH" "SPY" "DIA" "IWM"
[6,] "SMH" "OIH" "IWM" "DIA" "SPY"
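The ticker matrix above has no row or column labels, so the month has to be inferred from the row position. A minimal sketch, assuming a three-month subset of the sample data in `Data`, that attaches the dates and reads off the worst and best performer; the labels `rank1` ... `rank5` and the names `tickers`, `worst` and `best` are illustrative.

```r
# Three-month subset of the sample data (illustrative).
Data <- data.frame(
  SPY = c( 8.1985485, -0.5621328,  -2.6957979),
  DIA = c( 7.8349806,  1.7198760,  -3.5838102),
  IWM = c( 7.935566,   2.141846,    2.786250),
  SMH = c(21.223832, -10.904936,    4.671762),
  OIH = c(13.975655,  -4.565291,  -23.241009),
  row.names = c("2001-04-02", "2001-05-01", "2001-06-01")
)

# Ascending order of each row, as column positions.
id <- t(apply(Data, 1, order))

# Ticker names in rank order, labelled by month (rows) and rank (columns).
tickers <- matrix(
  names(Data)[id],
  ncol = ncol(Data),
  dimnames = list(rownames(Data), paste0("rank", seq_len(ncol(Data))))
)

worst <- tickers[, 1]              # lowest return in each month
best  <- tickers[, ncol(tickers)]  # highest return in each month

print(tickers)
print(best)
```

Because the ordering is ascending, the first column is the worst performer and the last column the best.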
zz <- textConnection(" SPY DIA IWM SMH OIH
8.1985485 7.8349806 7.935566 21.223832 13.975655
-0.5621328 1.7198760 2.141846 -10.904936 -4.565291
-2.6957979 -3.5838102 2.786250 4.671762 -23.241009
-1.0248091 -0.1997433 -5.725078 -3.354391 -9.161594
-6.1165559 -5.0276558 -2.461728 -6.218129 -13.956695
-8.8900629 -12.2663267 -15.760037 -39.321172 -16.902913 ")
Data <- read.table(zz, header = TRUE)
close(zz)