Python: empty DataFrame when trying to set thresholds

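I'm trying to threshold a pandas DataFrame that contains gene IDs and statistics. The input to the Python program is a config.yaml file that holds the initial threshold values and the path to a CSV file (which becomes the DataFrame). The problem seems to arise when I pass the threshold variables into the "reduced" DataFrame. I can threshold successfully when using integer values (in a deprecated method), but when I try to threshold using variables that point to the values in the config file, I get an empty DataFrame.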

Here is my current implementation:

    import pandas as pd
    import yaml

    # Load the thresholds and the CSV path (the config.yaml location is assumed here).
    with open('config.yaml') as file:
        config = yaml.full_load(file)
    input_path = config['DESeq_input']['path']
    baseMean = config['baseMean']
    log2FoldChange = config['log2FoldChange']
    lfcSE = config['lfcSE']
    pvalue = config['pvalue']
    padj = config['padj']
    df = pd.read_csv(input_path)
    # Keep only the columns that have a threshold defined in config.yaml.
    df_select = df[['genes', 'baseMean', 'log2FoldChange', 'lfcSE', 'pvalue', 'padj']]
    # Keep the rows where every statistic is below its threshold.
    df_threshold = df_select.loc[(df_select['baseMean'] < baseMean)
                                 & (df_select['log2FoldChange'] < log2FoldChange)
                                 & (df_select['lfcSE'] < lfcSE)
                                 & (df_select['pvalue'] < pvalue)
                                 & (df_select['padj'] < padj)]
    print(df_threshold)

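It turns out my thresholds were too strict (I had added two extra variables that were not in my original implementation). I now get a populated DataFrame.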
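
One way to spot an overly strict combination is to count how many rows pass each condition on its own before combining them. The following is a minimal sketch, not the author's code: the helper names `rows_passing_each` and `rows_passing_all` are hypothetical, the rows are rounded copies of a few lines from the sample data in the question, and the threshold values are the ones quoted in the comments.

```python
import pandas as pd

# A few rows taken from the sample data in the question (values rounded).
df = pd.DataFrame({
    "genes": ["ENSDARG00000000002", "ENSDARG00000000183", "ENSDARG00000000212",
              "ENSDARG00000000349", "ENSDARG00000000369"],
    "baseMean": [731.13, 215.48, 54472.14, 13.66, 9.44],
    "log2FoldChange": [0.666, 0.567, 0.345, 0.870, -0.0424],
    "lfcSE": [0.162, 0.189, 0.130, 0.717, 0.869],
    "pvalue": [3.83e-05, 0.00266, 0.00803, 0.225, 0.961],
    "padj": [0.00236, 0.0615, 0.132, 0.732, float("nan")],  # NA in the CSV
})

# Threshold values as quoted from config.yaml in the comments.
thresholds = {"baseMean": 20.0, "log2FoldChange": 1.0, "lfcSE": 0.5,
              "pvalue": 0.0001, "padj": 0.05}


def rows_passing_each(frame, limits):
    """Return {column: number of rows where frame[column] < limit}."""
    return {col: int((frame[col] < limit).sum()) for col, limit in limits.items()}


def rows_passing_all(frame, limits):
    """Return the rows that satisfy every 'column < limit' condition."""
    mask = pd.Series(True, index=frame.index)
    for col, limit in limits.items():
        # A comparison against NaN is False, so rows with a missing value drop out.
        mask &= frame[col] < limit
    return frame[mask]


counts = rows_passing_each(df, thresholds)
combined = rows_passing_all(df, thresholds)
print(counts)
print(len(combined), "rows pass all conditions")
```

On these rows each condition matches at least one gene, yet no single gene satisfies all five at once, which is the same symptom as the empty DataFrame described above.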

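Have you checked the data types of the variables used as thresholds?

I ran some arithmetic on the variables to check whether they were being passed in as strings rather than ints. Here are the values defined in the config.yaml file: baseMean: 20.0, log2FoldChange: 1.0, lfcSE: 0.5, pvalue: 0.0001, padj: 0.05

Can you provide an example of the thresholded DataFrame?

Sure, I'll add it to the main post.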
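
The type check discussed in the comments can also be made explicit. This is a rough sketch under stated assumptions: `coerce_thresholds` is a hypothetical helper, and the `config` dict stands in for the result of loading config.yaml. The string `"1e-4"` is a made-up case to show the failure mode; as far as I know, PyYAML loads a number written in exponent form without a decimal point as a string, but the values quoted above (for example `0.0001`) load as floats.

```python
from numbers import Real


def coerce_thresholds(config, keys):
    """Return {key: float(config[key])}; raise if a value is not numeric."""
    result = {}
    for key in keys:
        value = config[key]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (Real, str)):
            raise TypeError(f"{key!r} has unsupported type {type(value).__name__}")
        # float() accepts numeric strings such as "1e-4" and raises
        # ValueError for text that is not a number.
        result[key] = float(value)
    return result


# Stand-in for the loaded config; "pvalue" is deliberately a string here.
config = {"baseMean": 20.0, "log2FoldChange": 1.0, "lfcSE": 0.5,
          "pvalue": "1e-4", "padj": 0.05}
keys = ["baseMean", "log2FoldChange", "lfcSE", "pvalue", "padj"]

thresholds = coerce_thresholds(config, keys)
print(thresholds)
```

Converting once, right after loading, means every later comparison against a DataFrame column uses a float, so a stray string fails loudly at load time instead of during filtering.
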

Version with hard-coded threshold values:

    df = pd.read_csv('/Users/nmaki/Documents/GitHub/IDEA/tests/eDESeq2.csv')
    df_select = df[['genes', 'pvalue', 'padj', 'log2FoldChange']]
    df_threshold = df_select.loc[(df_select['pvalue'] < 0.05)
                                 & (df_select['padj'] < 0.1)
                                 & (df_select['log2FoldChange'] < 0.5)]
    print(df_threshold)

Sample data (CSV):

"genes","baseMean","log2FoldChange","lfcSE","stat","pvalue","padj"
"ENSDARG00000000001",98.1095154977918,-0.134947665995593,0.306793322887575,-0.439865068527078,0.660034837008121,0.93904992415549
"ENSDARG00000000002",731.125841719954,0.666095249996351,0.161764851506172,4.11767602043598,3.82712199388831e-05,0.00235539468663284
"ENSDARG00000000018",367.699187187462,-0.170546910862128,0.147128047078344,-1.1591733476304,0.246385533026112,0.756573630543937
"ENSDARG00000000019",1133.08821430092,-0.131148919306121,0.104742185100469,-1.25211173683576,0.210529151546469,0.718240791187956
"ENSDARG00000000068",397.13408030651,-0.111332941901299,0.161417383863387,-0.689720891496564,0.49036972534723,0.8864754582597
"ENSDARG00000000069",1886.21783387126,-0.107901197025113,0.113522109960702,-0.950486183374019,0.341865271089735,0.82295928359482
"ENSDARG00000000086",246.197553048504,0.390421091410488,0.215725761369183,1.80980282063921,0.0703263703690051,0.466064880589034
"ENSDARG00000000103",797.782152145232,0.236382332789599,0.145111727277908,1.62896781138092,0.103319833277229,0.550658656731341
"ENSDARG00000000142",26.1411622212853,0.248419645848534,0.495298350652519,0.501555568519983,0.615980180267141,0.927327861190167
"ENSDARG00000000151",121.397701922367,0.276123125224845,0.244276041791451,1.13037333993066,0.25831894300396,0.766841249972654
"ENSDARG00000000161",22.2863001989718,0.837640942615127,0.542200061816621,1.54489274643135,0.122372208261173,0.587106227452529
"ENSDARG00000000183",215.47910609869,0.567221763062732,0.188807351259458,3.00423558340829,0.00266249076445763,0.0615311290935424
"ENSDARG00000000189",620.819069705942,0.0525797819665496,0.142171888686286,0.369832478504743,0.711507313969775,0.950479626809728
"ENSDARG00000000212",54472.1417532637,0.344813324409911,0.130070467015575,2.65097321722249,0.00802602056136946,0.132041563800088
"ENSDARG00000000229",172.985864037855,-0.0814838221355631,0.22200915791162,-0.367029103222856,0.713597309421024,0.95157821096128
"ENSDARG00000000241",511.449190233542,-0.431854805500191,0.157764756166574,-2.73733383801019,0.0061939401710654,0.114238610824236
"ENSDARG00000000324",179.189751392247,0.0141623609187069,0.206197755704643,0.0686833902256096,0.945241639658214,0.992706066946251
"ENSDARG00000000349",13.6578995386995,0.86981405362392,0.716688718472183,1.21365668414338,0.224878851627296,0.731932542953245
"ENSDARG00000000369",9.43959070533812,-0.042383076946964,0.868977019485631,-0.0487735302506061,0.961099776861288,NA
"ENSDARG00000000370",129.006520833067,0.619490133053518,0.250960632807829,2.46847533863165,0.0135690001510168,0.184768676917612
"ENSDARG00000000380",17.695581482726,-0.638493654324115,0.597289695632778,-1.06898488119351,0.285076482019819,0.786103920659844
"ENSDARG00000000394",2200.41651475378,-0.00605761754099435,0.0915611724486909,-0.0661592395443486,0.947251047773153,0.992978480118812
"ENSDARG00000000423",195.477813443242,-0.18634265895713,0.188820984694016,-0.986874733542448,0.323704052061987,0.810439992736898
"ENSDARG00000000442",1102.47980192551,0.0589654622770368,0.112333519273845,0.524914225586502,0.599642819781172,0.920807266898811
"ENSDARG00000000460",8.52822266110357,0.229130838495461,0.957763036484278,0.239235416034165,0.810923041830713,NA
"ENSDARG00000000472",0.840917787550721,-0.4234502342491,3.1634759582284,-0.133855998857105,0.893516444899853,NA
"ENSDARG00000000474",5.12612778660879,0.394871266508097,1.07671345623418,0.366737560696199,0.713814786364707,NA
"ENSDARG00000000476",75.8417047936895,0.242006157627571,0.349451220882324,0.692532013528336,0.488603288756242,0.885874315527816
"ENSDARG00000000489",1233.33364888202,0.0676458807753533,0.131846296650645,0.513066217965876,0.607905001380741,0.924392802283811