R: look up values from a data frame using conditions
I have two data frames, as shown below:
DF_1>
Sr.No. Stage Time Result Result_2
1 updated_date 1516868822411 1516868822361 1516868822350
2 id 1516868822411 ABC -
3 engine_date 1516868822411 1516868822000 -
4 blocked 1516868822411 80000 0
5 updated_date 1516868822398 1516868822350 1516866877815
6 list 1516868822398 BCD -
7 sub_stat_1 1516868779095 AC-12 AC-14
8 status_1 1516868642468 AC-25 AC-38
DF_2>
Sr. No. ID Type_1 Type_2
1 AC-12 X Y
2 AC-14 XX YY
3 AC-25 A B
4 AC-38 CC CD
Now I want to get vlookup-style values from DF_2 using the conditions listed below. The desired output is:
Sr. No. Stage Time Result Result_2 Time_2 Final_1 Final_2
1 updated_date 1516868822411 1516868822361 1516868822350 25/01/2018 08:27:02 25/01/2018 08:27:02 25/01/2018 08:27:02
2 id 1516868822411 ABC - 25/01/2018 08:27:02 ABC -
3 engine_date 1516868822411 1516868822000 - 25/01/2018 08:27:02 25/01/2018 08:27:02 -
4 blocked 1516868822411 80000 0 25/01/2018 08:27:02 80000 0
5 updated_date 1516868822398 1516868822350 1516866877815 25/01/2018 08:27:02 25/01/2018 08:27:02 25/01/2018 07:54:38
6 list 1516868822398 BCD - 25/01/2018 08:27:02 BCD -
7 sub_stat_1 1516868779095 AC-12 AC-14 25/01/2018 08:26:19 Y (Output of AC-12) YY (Output of AC-14)
8 status_1 1516868642468 AC-25 AC-38 25/01/2018 08:24:02 A (Output of AC-25) CC (Output of AC-38)
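I'm going to assume that your data frame columns are as printed, and that the string columns are of class character, not factor. Convert them to character if they aren't already. (See the sample data at the bottom.)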
If Stage is sub_stat_1, then vlookup Result and Result_2 against Type_2 (from DF_2)
If Stage is status_1, then vlookup Result and Result_2 against Type_1 (from DF_2)
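The two lookup rules above can be sketched with base R's match(). This is only a minimal sketch: it assumes the column names shown in the question and character (not factor) columns, it uses a cut-down copy of the sample data, and the helper name lookup_type is made up for illustration.

```r
# Cut-down copies of the sample data (character columns, not factors)
DF_1 <- data.frame(
  Stage    = c("id", "sub_stat_1", "status_1"),
  Result   = c("ABC", "AC-12", "AC-25"),
  Result_2 = c("-", "AC-14", "AC-38"),
  stringsAsFactors = FALSE
)
DF_2 <- data.frame(
  ID     = c("AC-12", "AC-14", "AC-25", "AC-38"),
  Type_1 = c("X", "XX", "A", "CC"),
  Type_2 = c("Y", "YY", "B", "CD"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: value of DF_2[[type_col]] for each id (NA if the id is not in DF_2)
lookup_type <- function(ids, type_col) DF_2[[type_col]][match(ids, DF_2$ID)]

# Start from the original values, then overwrite only the rows that need a lookup
DF_1$Final_1 <- DF_1$Result
DF_1$Final_2 <- DF_1$Result_2

is_sub    <- DF_1$Stage == "sub_stat_1"  # these rows take Type_2
is_status <- DF_1$Stage == "status_1"    # these rows take Type_1

DF_1$Final_1[is_sub]    <- lookup_type(DF_1$Result[is_sub],      "Type_2")
DF_1$Final_2[is_sub]    <- lookup_type(DF_1$Result_2[is_sub],    "Type_2")
DF_1$Final_1[is_status] <- lookup_type(DF_1$Result[is_status],   "Type_1")
DF_1$Final_2[is_status] <- lookup_type(DF_1$Result_2[is_status], "Type_1")

DF_1
```

Because the rows are selected with logical vectors, each assignment simply does nothing when a stage does not occur in the data.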
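If Stage is status_1 or sub_stat_1 but Result or Result_2 has nothing, then give a "-" value in the output data frame
In R we use missing values, NA, for this. I encourage you to do that, but if you really want to, you can do DF_1[is.na(DF_1)] = "-"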
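To illustrate the point about missing values: match() returns NA for anything it cannot find, so a lookup on an unknown or empty id comes back as NA. The following is a small sketch with invented inputs (the id "AC-99" is not in the question's data), ending with the optional "-" replacement.

```r
DF_2 <- data.frame(
  ID     = c("AC-12", "AC-14"),
  Type_2 = c("Y", "YY"),
  stringsAsFactors = FALSE
)

# "AC-99" and "-" are not present in DF_2$ID, so match() returns NA for them
ids   <- c("AC-12", "AC-99", "-")
final <- DF_2$Type_2[match(ids, DF_2$ID)]
is.na(final)

# Optional: show the missing values as "-" in a vector ...
final[is.na(final)] <- "-"

# ... or in every column of a data frame at once
out <- data.frame(Final_1 = c("Y", NA), Final_2 = c(NA, "CC"),
                  stringsAsFactors = FALSE)
out[is.na(out)] <- "-"
out
```

Keeping NA until the very end has the advantage that is.na() can still tell a failed lookup apart from a "-" that was already in the input.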
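Keep the other values the same as DF_1's Result and Result_2 in the desired output columns Final_1 and Final_2, respectively
Wherever there is an epoch time in the Time, Result and Result_2 columns, convert it (if possible) to normal time in the desired output columns Time_2, Final_1 and Final_2, respectively
I'll leave this one to you. If you provide an origin, you can use as.POSIXct() on your epochs, but your integers are too large for me. You will probably want to format them before inserting them into the final columns, so that you control how they look when they are converted to character. If you need more help with that, ask a separate question.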
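One possible way to do the conversion is sketched below. It assumes the values are milliseconds since 1970-01-01 UTC; that is an assumption, not something the question states, although it does reproduce the times shown in the desired output. The helper name format_epoch_ms and the rule for recognising an epoch (a string of exactly 13 digits) are illustrative choices.

```r
# Hypothetical helper: format 13-digit millisecond epochs, leave other values alone.
# Expects a character vector, such as the Result and Result_2 columns.
format_epoch_ms <- function(x, fmt = "%d/%m/%Y %H:%M:%S", tz = "UTC") {
  is_epoch <- grepl("^[0-9]{13}$", x)
  secs <- as.numeric(x[is_epoch]) / 1000  # milliseconds to seconds
  x[is_epoch] <- format(as.POSIXct(secs, origin = "1970-01-01", tz = tz), fmt)
  x
}

converted <- format_epoch_ms(c("1516868822411", "ABC", "80000", "-"))
converted

# The numeric Time column can be converted directly
time_2 <- format(as.POSIXct(1516868779095 / 1000, origin = "1970-01-01", tz = "UTC"),
                 "%d/%m/%Y %H:%M:%S")
time_2
```

Note that format() drops fractional seconds rather than rounding them, so a value ending in 815 ms is shown with the earlier second. Round secs first if rounding is wanted.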
DF_1
# Sr.No. Stage Time Result Result_2 Final_1 Final_2
# 1 1 updated_date 1.516869e+12 1516868822361 1516868822350 1516868822361 1516868822350
# 2 2 id 1.516869e+12 ABC - ABC -
# 3 3 engine_date 1.516869e+12 1516868822000 - 1516868822000 -
# 4 4 blocked 1.516869e+12 80000 0 80000 0
# 5 5 updated_date 1.516869e+12 1516868822350 1516866877815 1516868822350 1516866877815
# 6 6 list 1.516869e+12 BCD - BCD -
# 7 7 sub_stat_1 1.516869e+12 AC-12 AC-14 Y YY
# 8 8 status_1 1.516869e+12 AC-25 AC-38 A CC
Using this data:
# Sample data; stringsAsFactors = FALSE keeps the string columns as character
DF_1 = read.table(text = "Sr.No. Stage Time Result Result_2
1 updated_date 1516868822411 1516868822361 1516868822350
2 id 1516868822411 ABC -
3 engine_date 1516868822411 1516868822000 -
4 blocked 1516868822411 80000 0
5 updated_date 1516868822398 1516868822350 1516866877815
6 list 1516868822398 BCD -
7 sub_stat_1 1516868779095 AC-12 AC-14
8 status_1 1516868642468 AC-25 AC-38", check.names = FALSE, stringsAsFactors = FALSE, header = TRUE)
DF_2 = read.table(text = "Sr.No. ID Type_1 Type_2
1 AC-12 X Y
2 AC-14 XX YY
3 AC-25 A B
4 AC-38 CC CD", check.names = FALSE, stringsAsFactors = FALSE, header = TRUE)
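"than vlookup Result and Result_2 for Type_2": vlookup is an Excel function (right?). In R that is a merge. If Result and Result_2 are in the same data frame, how do you merge them?
@Mislav Yes, it is an Excel function. By vlookup I mean that I want that functionality in R to get the desired output.
You can use ?match
@Gregor Can you explain how to write it with if-else conditions that satisfy all of the conditions mentioned above?
@Mislav I don't need to merge, because for each id there are static values in Type_1 and Type_2. I only want to add three columns to DF_1: Time_2, Final_1 and Final_2. Time_2 gives the epoch time of DF_1 in standard format. Final_1 and Final_2 have the same values as Result and Result_2, except for epoch times and for the status_1 and sub_stat_1 rows, whose values come from DF_2 according to the id.
Why did you use Sr.No.?
Because your question made it look like part of the column names. It should be easy for you to adapt this to your actual data. Use dput() to provide copy/pasteable data if you want answers to use exactly the same input.
Sorry, that was my mistake. I have edited it now, and it should give you an exact idea of the data frames.
It works, although I get this error: Warning message: In[
I get this error when sub_stat_1 and status_1 are not present, and it gives some random values in Final_1 and Final_2 (i.e. Final_1=1516868822361 and Final_2=1516868822350)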