
R: creating a new column by applying a function to another column in a data frame

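Here is a sample data frame of the data I'm working with. For those familiar with genetic data formats, it is basically a modified VCF file. For everyone else: each row holds information about a position in the genome where a variant may exist.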

samp <- structure(list(Chrom = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L), .Label = "chr12", class = "factor"), 
    Pos = c(8613204L, 8613412L, 8614238L, 8614506L, 8614652L, 
    8614669L, 8614768L, 8614951L, 8614986L, 8615225L, 8615809L, 
    8616149L, 8616392L), Ref = structure(c(1L, 1L, 4L, 3L, 3L, 
    3L, 2L, 3L, 2L, 4L, 2L, 4L, 3L), .Label = c("A", "C", "G", 
    "T"), class = "factor"), Alt = structure(c(3L, 2L, 2L, 1L, 
    1L, 1L, 3L, 1L, 1L, 3L, 4L, 2L, 4L), .Label = c("A", "C", 
    "G", "T"), class = "factor"), Info = c("AC=3913;AF=0.78135;AN=5008;NS=2504;DP=-128;EAS_AF=0.9921;AMR_AF=0.8357;AFR_AF=0.5779;EUR_AF=0.7366;SAS_AF=0.8466;AA=G|||;CSQ=G|ENSG00000205846|ENST00000382073|Transcript|intron_variant||||||||1||||||;GENCODE=ENST00000382073;FUNSEQ=0.1881", 
    "AC=4051;AF=0.808906;AN=5008;NS=2504;DP=-128;EAS_AF=0.9921;AMR_AF=0.8444;AFR_AF=0.6725;EUR_AF=0.7366;SAS_AF=0.8538;AA=C|||;CSQ=C|ENSG00000205846|ENST00000382073|Transcript|intron_variant||||||||1||||||;GENCODE=ENST00000382073;FUNSEQ=0.1881", 
    "AC=4021;AF=0.802915;AN=5008;NS=2504;DP=-128;EAS_AF=0.9921;AMR_AF=0.8415;AFR_AF=0.6558;EUR_AF=0.7376;SAS_AF=0.8466;AA=T|||;CSQ=C|ENSG00000205846|ENST00000382073|Transcript|intron_variant||||||||1||||||;GENCODE=ENST00000382073;FUNSEQ=0.7997", 
    "AC=3990;AF=0.796725;AN=5008;NS=2504;DP=-128;EAS_AF=0.9921;AMR_AF=0.8386;AFR_AF=0.6339;EUR_AF=0.7376;SAS_AF=0.8466;AA=A|||;CSQ=A|ENSG00000205846|ENST00000382073|Transcript|intron_variant||||||||1||||||;GENCODE=ENST00000382073;FUNSEQ=0.1881", 
    "AC=4069;AF=0.8125;AN=5008;NS=2504;DP=17188;EAS_AF=0.9921;AMR_AF=0.8487;AFR_AF=0.6528;EUR_AF=0.7714;SAS_AF=0.8599;AA=A|||;CSQ=A|ENSG00000205846|ENST00000382073|Transcript|intron_variant||||||||1||||||;GENCODE=ENST00000382073;FUNSEQ=0.0029", 
    "AC=4044;AF=0.807508;AN=5008;NS=2504;DP=-128;EAS_AF=0.9911;AMR_AF=0.8458;AFR_AF=0.6362;EUR_AF=0.7714;SAS_AF=0.8599;AA=G|||;CSQ=A|ENSG00000205846|ENST00000382073|Transcript|intron_variant||||||||1||||||;GENCODE=ENST00000382073;FUNSEQ=0.0029", 
    NA, NA, "AC=3795;AF=0.757788;AN=5008;NS=2504;DP=-128;EAS_AF=0.9653;AMR_AF=0.7954;AFR_AF=0.5651;EUR_AF=0.7167;SAS_AF=0.82;AA=c|||;CSQ=A|ENSG00000205846|ENST00000382073|Transcript|intron_variant||||||||1||||||;GENCODE=ENST00000382073;FUNSEQ=0.0029", 
    NA, "AC=4053;AF=0.809305;AN=5008;NS=2504;DP=-128;EAS_AF=0.9921;AMR_AF=0.8458;AFR_AF=0.6362;EUR_AF=0.7724;SAS_AF=0.8671;AA=C|||;CSQ=T|ENSG00000205846|ENST00000382073|Transcript|intron_variant||||||||1||||||;GENCODE=ENST00000382073;FUNSEQ=0.0029", 
    "AC=4076;AF=0.813898;AN=5008;NS=2504;DP=-128;EAS_AF=0.9921;AMR_AF=0.8473;AFR_AF=0.6528;EUR_AF=0.7724;SAS_AF=0.8671;AA=C|||;CSQ=C|ENSG00000205846|ENST00000382073|Transcript|intron_variant||||||||1||||||;GENCODE=ENST00000382073;FUNSEQ=0.0029", 
    "AC=4052;AF=0.809105;AN=5008;NS=2504;DP=-128;EAS_AF=0.9921;AMR_AF=0.8473;AFR_AF=0.6346;EUR_AF=0.7724;SAS_AF=0.8671;AA=T|||;CSQ=T|ENSG00000205846|ENST00000382073|Transcript|intron_variant||||||||1||||||;GENCODE=ENST00000382073;FUNSEQ=0.0029"
    ), TG_rs = c("rs10770739", "rs10770740", "rs4883148", "rs4883149", 
    "rs4883150", "rs4883151", NA, NA, "rs7303948", NA, "rs4242889", 
    "rs4883154", "rs4242890")), row.names = c(NA, -13L), .Names = c("Chrom", 
"Pos", "Ref", "Alt", "Info", "TG_rs"), class = "data.frame")
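However, I want to run my extraction function, extractAF, on every row of the data frame and create a new column holding the result. When I use dplyr's mutate function, I get a column in which every value is the same: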

library("dplyr")
mutate(samp, AFR_AF = extractAF("AFR_AF", Info))
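I read a post (which I can't seem to find now, otherwise I would cite it) saying that mutate passes all of the rows to the function at once, rather than one row at a time, which is what I need.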
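One common way this produces a constant column can be sketched as follows. The helper first_only_af and the abbreviated Info strings are assumptions for illustration; the original extractAF is not shown, so this may not be what it does. The point is only that a function written for a single string returns a single value when handed the whole column, and a length-1 result is recycled to every row.

```r
# Hypothetical scalar-style helper (an assumption, not the asker's extractAF).
# `[[1]]` keeps only the fields of the FIRST string, however long `info` is.
first_only_af <- function(key, info) {
  fields <- strsplit(info, ";", fixed = TRUE)[[1]]
  hit <- fields[startsWith(fields, paste0(key, "="))]
  as.numeric(sub("^[^=]*=", "", hit[1]))
}

# Abbreviated versions of the first two Info strings from the sample data.
info <- c("AC=3913;AFR_AF=0.5779;EUR_AF=0.7366",
          "AC=4051;AFR_AF=0.6725;EUR_AF=0.7366")

# Called on the whole column, the way mutate() calls it: one value comes back.
whole_column <- first_only_af("AFR_AF", info)

# A length-1 result is recycled, so every row ends up with the same value.
df <- data.frame(Info = info, stringsAsFactors = FALSE)
df$AFR_AF <- whole_column
```

dplyr's mutate recycles a length-1 result in the same way, which would explain a column of identical values.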

Based on that, I tried a few other approaches:

Error in apply(samp[, "Info"], 1, function(x) extractAF("AMR_AF", x)) : 
  dim(X) must have a positive length

samp[, extractAF("AMR_AF", Info), by = .I]
Error in `[.data.frame`(samp, extractAF("AMR_AF", Info), by = .I) : 
  unused argument (by = .I)
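Both errors have a mechanical cause: selecting a single column with samp[, "Info"] returns a plain vector with no dimensions, which apply() cannot iterate over, and by = .I is data.table syntax that a plain data.frame does not accept. A minimal sketch of an element-by-element call in base R follows. The helper extract_af_one and the abbreviated Info strings are hypothetical and stand in for the extractAF function that is not shown.

```r
# Hypothetical scalar helper (an assumption): handles exactly one Info string.
extract_af_one <- function(key, info) {
  if (is.na(info)) return(NA_real_)                 # missing Info -> NA
  fields <- strsplit(info, ";", fixed = TRUE)[[1]]  # "KEY=value" pieces
  hit <- fields[startsWith(fields, paste0(key, "="))]
  if (length(hit) == 0) return(NA_real_)            # key not present -> NA
  as.numeric(sub("^[^=]*=", "", hit[1]))
}

# Abbreviated Info strings modelled on the sample data, including NA and 0.
df <- data.frame(
  Info = c("AC=3913;AMR_AF=0.8357;AFR_AF=0.5779",
           NA,
           "AC=5;AMR_AF=0;AFR_AF=0"),
  stringsAsFactors = FALSE
)

# A single column selected with [, "name"] has no dim, hence the apply() error.
no_dim <- is.null(dim(df[, "Info"]))

# vapply() calls the scalar helper once per element and checks the result type.
df$AMR_AF <- vapply(df$Info,
                    function(x) extract_af_one("AMR_AF", x),
                    numeric(1),
                    USE.NAMES = FALSE)
```

With dplyr, grouping by row first, as in rowwise(samp) followed by mutate(), should give a comparable one-row-at-a-time call.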

# Update

Below is an additional sample dataset whose INFO column includes NA entries and allele frequencies equal to 0:

structure(list(CHROM = c("chr1", "chr1", "chr1", "chr1", "chr1", "chr1"), 
    POS = c(16090898L, 16091074L, 16091583L, 16092212L, 16093560L, 
    16093639L), ID = c("rs6429774", "rs6429776", NA, "rs74528955", 
    "rs904912", NA), REF = c("G", "A", "T", "C", "T", "C"), 
    ALT = c("A", "G", "A", "T", "A", "T"), QUAL = c(NA, NA, NA, NA, NA, NA), 
    FILTER = c(NA, NA, NA, NA, NA, NA), INFO = c("AC=1606;...;FUNSEQ=0.3335", 
    "AC=1690;...;FUNSEQ=0.3335", NA, "AC=8;AF=0.00159744;...;FUNSEQ=0.3335", 
    "AC=3282;AF=0.655351;...;FUNSEQ=0.1483", 
    "AC=5;AF=0.000998403;AN=5008;NS=2504;DP=14736;EAS_AF=0.003;AMR_AF=0;AFR_AF=0;EUR_AF=0;SAS_AF=0.002;...
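For data like this, where the Info string can be NA and a frequency can be exactly 0, a vectorized helper avoids per-row calls altogether. The sketch below is one possible approach, not the asker's extractAF: the name extract_af_vec, the regular expression, and the abbreviated Info strings are assumptions for illustration.

```r
# Hypothetical vectorized helper (an assumption): one numeric value per element.
extract_af_vec <- function(key, info) {
  # The key must sit at the start of the string or directly after a ';',
  # so "AF" does not accidentally match "EAS_AF" or "AFR_AF".
  pattern <- paste0("^(?:.*;)?", key, "=([^;]*).*$")
  hit <- grepl(pattern, info, perl = TRUE)   # FALSE for NA and for no match
  out <- rep(NA_real_, length(info))
  out[hit] <- as.numeric(sub(pattern, "\\1", info[hit], perl = TRUE))
  out
}

# Abbreviated Info strings modelled on the sample data.
df <- data.frame(
  Info = c("AC=3913;AF=0.78135;AN=5008;EAS_AF=0.9921;AFR_AF=0.5779;EUR_AF=0.7366",
           NA,
           "AC=5;AF=0.000998403;AN=5008;AMR_AF=0;AFR_AF=0;EUR_AF=0"),
  stringsAsFactors = FALSE
)

# The whole column goes in, a vector of the same length comes out.
df$AFR_AF <- extract_af_vec("AFR_AF", df$Info)
df$AF     <- extract_af_vec("AF", df$Info)
```

Because the helper returns one value per input element, it can also be called directly inside mutate(), for example mutate(samp, AFR_AF = extract_af_vec("AFR_AF", Info)). A missing Info stays NA, while a frequency of 0 is kept as 0 rather than being treated as missing.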