
Python: Reading serialized JSON into a DataFrame


I want to read doubly serialized JSON objects from a file into a pandas DataFrame. An example of the JSON looks like this:

{"input":"8\t140630920\t.\tC\tT\t840.948\t.","assembly_name":"GRCh37","end":140630920,"seq_region_name":"8","transcript_consequences":[{"source":"Ensembl","variant_allele":"T","cdna_end":770,"phenotypes":[{"source":"MIM_disease","end":140715299,"seq_region_name":"8","attrib_type":"Gene","external_id":612292,"strand":"-","phenotype":"BIRK-BAREL MENTAL RETARDATION DYSMORPHISM SYNDROME","type":"Gene","id":"ENSG00000169427","start":140613081},{"source":"OMIM","risk_allele":1,"end":140630920,"seq_region_name":"8","strand":"+","phenotype":"BIRK-BAREL SYNDROME","associated_gene":"KCNK9","variation_names":"rs121908332","type":"Variation","id":"rs121908332","start":140630920},{"source":"ClinVar","clinvar_clin_sig":"pathogenic","review_status":"no assertion criteria provided","risk_allele":"T","end":140630920,"seq_region_name":"8","external_id":"RCV000005007.1","associated_gene":"KCNK9","phenotype":"Birk Barel mental retardation dysmorphism syndrome","strand":"+","type":"Variation","id":"rs121908332","start":140630920}],"codons":"Ggg/Agg","protein_end":236,"strand":-1,"amino_acids":"G/R","cdna_start":770,"transcript_id":"ENST00000520439","cds_start":706,"gene_id":"ENSG00000169427","protein_start":236,"cds_end":706,"consequence_terms":["missense_variant"],"impact":"MODERATE"},{"source":"RefSeq","variant_allele":"T","cdna_end":770,"phenotypes":[{"source":"MIM_disease","end":140715299,"seq_region_name":"8","attrib_type":"Gene","external_id":612292,"strand":"-","phenotype":"BIRK-BAREL MENTAL RETARDATION DYSMORPHISM SYNDROME","type":"Gene","id":"ENSG00000169427","start":140613081},{"source":"OMIM","risk_allele":1,"end":140630920,"seq_region_name":"8","strand":"+","phenotype":"BIRK-BAREL SYNDROME","associated_gene":"KCNK9","variation_names":"rs121908332","type":"Variation","id":"rs121908332","start":140630920},{"source":"ClinVar","clinvar_clin_sig":"pathogenic","review_status":"no assertion criteria 
provided","risk_allele":"T","end":140630920,"seq_region_name":"8","external_id":"RCV000005007.1","associated_gene":"KCNK9","phenotype":"Birk Barel mental retardation dysmorphism syndrome","strand":"+","type":"Variation","id":"rs121908332","start":140630920}],"codons":"Ggg/Agg","protein_end":236,"strand":-1,"amino_acids":"G/R","cdna_start":770,"transcript_id":"NM_016601.2","cds_start":706,"gene_id":51305,"protein_start":236,"cds_end":706,"consequence_terms":["missense_variant"],"impact":"MODERATE"},{"source":"RefSeq","variant_allele":"T","cdna_end":764,"phenotypes":[{"source":"MIM_disease","end":140715299,"seq_region_name":"8","attrib_type":"Gene","external_id":612292,"strand":"-","phenotype":"BIRK-BAREL MENTAL RETARDATION DYSMORPHISM SYNDROME","type":"Gene","id":"ENSG00000169427","start":140613081},{"source":"OMIM","risk_allele":1,"end":140630920,"seq_region_name":"8","strand":"+","phenotype":"BIRK-BAREL SYNDROME","associated_gene":"KCNK9","variation_names":"rs121908332","type":"Variation","id":"rs121908332","start":140630920},{"source":"ClinVar","clinvar_clin_sig":"pathogenic","review_status":"no assertion criteria provided","risk_allele":"T","end":140630920,"seq_region_name":"8","external_id":"RCV000005007.1","associated_gene":"KCNK9","phenotype":"Birk Barel mental retardation dysmorphism syndrome","strand":"+","type":"Variation","id":"rs121908332","start":140630920}],"codons":"Ggg/Agg","protein_end":236,"strand":-1,"amino_acids":"G/R","cdna_start":764,"transcript_id":"XM_005250954.1","cds_start":706,"gene_id":51305,"protein_start":236,"cds_end":706,"consequence_terms":["missense_variant"],"impact":"MODERATE"},{"source":"Ensembl","variant_allele":"T","cdna_end":755,"phenotypes":[{"source":"MIM_disease","end":140715299,"seq_region_name":"8","attrib_type":"Gene","external_id":612292,"strand":"-","phenotype":"BIRK-BAREL MENTAL RETARDATION DYSMORPHISM 
SYNDROME","type":"Gene","id":"ENSG00000169427","start":140613081},{"source":"OMIM","risk_allele":1,"end":140630920,"seq_region_name":"8","strand":"+","phenotype":"BIRK-BAREL SYNDROME","associated_gene":"KCNK9","variation_names":"rs121908332","type":"Variation","id":"rs121908332","start":140630920},{"source":"ClinVar","clinvar_clin_sig":"pathogenic","review_status":"no assertion criteria provided","risk_allele":"T","end":140630920,"seq_region_name":"8","external_id":"RCV000005007.1","associated_gene":"KCNK9","phenotype":"Birk Barel mental retardation dysmorphism syndrome","strand":"+","type":"Variation","id":"rs121908332","start":140630920}],"codons":"Ggg/Agg","protein_end":236,"strand":-1,"amino_acids":"G/R","cdna_start":755,"transcript_id":"ENST00000522317","cds_start":706,"gene_id":"ENSG00000169427","protein_start":236,"cds_end":706,"consequence_terms":["missense_variant","NMD_transcript_variant"],"impact":"MODERATE"},{"source":"Ensembl","variant_allele":"T","cdna_end":770,"phenotypes":[{"source":"MIM_disease","end":140715299,"seq_region_name":"8","attrib_type":"Gene","external_id":612292,"strand":"-","phenotype":"BIRK-BAREL MENTAL RETARDATION DYSMORPHISM SYNDROME","type":"Gene","id":"ENSG00000169427","start":140613081},{"source":"OMIM","risk_allele":1,"end":140630920,"seq_region_name":"8","strand":"+","phenotype":"BIRK-BAREL SYNDROME","associated_gene":"KCNK9","variation_names":"rs121908332","type":"Variation","id":"rs121908332","start":140630920},{"source":"ClinVar","clinvar_clin_sig":"pathogenic","review_status":"no assertion criteria provided","risk_allele":"T","end":140630920,"seq_region_name":"8","external_id":"RCV000005007.1","associated_gene":"KCNK9","phenotype":"Birk Barel mental retardation dysmorphism 
syndrome","strand":"+","type":"Variation","id":"rs121908332","start":140630920}],"codons":"Ggg/Agg","protein_end":236,"strand":-1,"amino_acids":"G/R","cdna_start":770,"transcript_id":"ENST00000303015","cds_start":706,"gene_id":"ENSG00000169427","protein_start":236,"cds_end":706,"consequence_terms":["missense_variant"],"impact":"MODERATE"},{"gene_id":"ENSG00000169427","source":"Ensembl","distance":1672,"variant_allele":"T","phenotypes":[{"source":"MIM_disease","end":140715299,"seq_region_name":"8","attrib_type":"Gene","external_id":612292,"strand":"-","phenotype":"BIRK-BAREL MENTAL RETARDATION DYSMORPHISM SYNDROME","type":"Gene","id":"ENSG00000169427","start":140613081},{"source":"OMIM","risk_allele":1,"end":140630920,"seq_region_name":"8","strand":"+","phenotype":"BIRK-BAREL SYNDROME","associated_gene":"KCNK9","variation_names":"rs121908332","type":"Variation","id":"rs121908332","start":140630920},{"source":"ClinVar","clinvar_clin_sig":"pathogenic","review_status":"no assertion criteria provided","risk_allele":"T","end":140630920,"seq_region_name":"8","external_id":"RCV000005007.1","associated_gene":"KCNK9","phenotype":"Birk Barel mental retardation dysmorphism syndrome","strand":"+","type":"Variation","id":"rs121908332","start":140630920}],"consequence_terms":["upstream_gene_variant"],"strand":-1,"transcript_id":"ENST00000523477","impact":"MODIFIER"},{"gene_id":"ENSG00000169427","source":"Ensembl","distance":2630,"variant_allele":"T","phenotypes":[{"source":"MIM_disease","end":140715299,"seq_region_name":"8","attrib_type":"Gene","external_id":612292,"strand":"-","phenotype":"BIRK-BAREL MENTAL RETARDATION DYSMORPHISM SYNDROME","type":"Gene","id":"ENSG00000169427","start":140613081},{"source":"OMIM","risk_allele":1,"end":140630920,"seq_region_name":"8","strand":"+","phenotype":"BIRK-BAREL 
SYNDROME","associated_gene":"KCNK9","variation_names":"rs121908332","type":"Variation","id":"rs121908332","start":140630920},{"source":"ClinVar","clinvar_clin_sig":"pathogenic","review_status":"no assertion criteria provided","risk_allele":"T","end":140630920,"seq_region_name":"8","external_id":"RCV000005007.1","associated_gene":"KCNK9","phenotype":"Birk Barel mental retardation dysmorphism syndrome","strand":"+","type":"Variation","id":"rs121908332","start":140630920}],"consequence_terms":["upstream_gene_variant"],"strand":-1,"transcript_id":"ENST00000519923","impact":"MODIFIER"}],"strand":1,"id":"8_140630920_C/T","allele_string":"C/T","most_severe_consequence":"missense_variant","start":140630920}
Reading it into a pandas DataFrame with

dft = pd.read_json(filename, lines = True)

gives a table with one column per top-level JSON key.

However, I want to extract the information in the ['transcript_consequences'] column, and also the information under ['phenotypes'] inside the ['transcript_consequences'] column.


How can I do this with a pandas DataFrame?

One option could be:

>>> import pandas as pd
>>> jsona = pd.read_json('jsona.json') #here the file is named 'jsona.json'
>>> transcript_consequences = jsona['transcript_consequences'].apply(pd.Series)
>>> phenotypes0 = pd.DataFrame(transcript_consequences.phenotypes[0]) #and so on

>>> isinstance(phenotypes0,pd.DataFrame)
True
>>> isinstance(transcript_consequences,pd.DataFrame)
True
>>> isinstance(jsona,pd.DataFrame)
True 

#in order to get one dataframe, concatenate and pass the dataframes in a list, like so:
>>> pd.concat([transcript_consequences, phenotypes0], axis=1) #with more elements (more phenotypes) add them to the list
I am not sure whether this is intentional, but it seems that transcript_consequences[0] and transcript_consequences[6] are the same.
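As an alternative to expanding the column with apply(pd.Series), the nested lists can also be flattened with pd.json_normalize. The following is only a minimal sketch, not the answerer's method: it uses a small hand-made record that stands in for one line of the file (only a few of the real keys are kept), and the names `line`, `records`, `transcripts`, and `phenotypes` are chosen for illustration.

```python
import json
import pandas as pd

# Hypothetical, heavily shortened stand-in for one line of the input file.
line = json.dumps({
    "id": "8_140630920_C/T",
    "most_severe_consequence": "missense_variant",
    "transcript_consequences": [
        {"transcript_id": "ENST00000520439", "impact": "MODERATE",
         "phenotypes": [
             {"source": "OMIM", "id": "rs121908332"},
             {"source": "ClinVar", "id": "rs121908332"},
         ]},
        {"transcript_id": "ENST00000523477", "impact": "MODIFIER",
         "phenotypes": [
             {"source": "OMIM", "id": "rs121908332"},
         ]},
    ],
})

# One dict per line; for a real file, call json.loads on each non-empty line.
records = [json.loads(line)]

# One row per entry of transcript_consequences, tagged with the variant id.
transcripts = pd.json_normalize(
    records,
    record_path="transcript_consequences",
    meta=["id"],
)

# One row per phenotype, tagged with the variant id and the transcript it
# belongs to. record_prefix avoids a clash between the phenotype's own "id"
# key and the top-level "id" used as metadata.
phenotypes = pd.json_normalize(
    records,
    record_path=["transcript_consequences", "phenotypes"],
    meta=["id", ["transcript_consequences", "transcript_id"]],
    record_prefix="phenotype.",
)

print(transcripts[["id", "transcript_id", "impact"]])
print(phenotypes)
```

Because every phenotype row carries its transcript_id, the two tables can later be joined on that key instead of being concatenated side by side by row position.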

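Comments:

What do you mean by "doubly serialized"?

@John Zwinck I have given an example of the JSON I want to process. Hope this helps. Thanks

@nilesh Did my suggestion work? Does your actual data differ in any particular way?

@erasmortg The main question is how to concatenate the two dataframes (transcript_consequences and phenotypes0...6) as they are generated. Any ideas? Thanks

@nilesh Please see the edit; this may be closer to the expected output?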