Tags: python, python-3.x, pandas, python-2.7, data-science
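Use j = r.json()['raw_data']. The response wraps the list of records in a single 'raw_data' key, so passing the whole dictionary to the DataFrame constructor, as in the code below, produces one column of dictionaries.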
import pandas as pd
import requests
r = requests.get('https://api.covid19india.org/raw_data5.json')
j = r.json()
df = pd.DataFrame.from_dict(j)
df.head()
raw_data
0 {'agebracket': '', 'contractedfromwhichpatient...
1 {'agebracket': '', 'contractedfromwhichpatient...
2 {'agebracket': '', 'contractedfromwhichpatient...
3 {'agebracket': '', 'contractedfromwhichpatient...
4 {'agebracket': '', 'contractedfromwhichpatient...
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20409 entries, 0 to 20408
Data columns (total 1 columns):
raw_data 20409 non-null object
dtypes: object(1)
memory usage: 159.5+ KB
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20409 entries, 0 to 20408
Data columns (total 20 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 agebracket 20409 non-null object
1 contractedfromwhichpatientsuspected 20409 non-null object
2 currentstatus 20409 non-null object
3 dateannounced 20409 non-null object
4 detectedcity 20409 non-null object
5 detecteddistrict 20409 non-null object
6 detectedstate 20409 non-null object
7 entryid 20409 non-null object
8 gender 20409 non-null object
9 nationality 20409 non-null object
10 notes 20409 non-null object
11 numcases 20409 non-null object
12 patientnumber 20409 non-null object
13 source1 20409 non-null object
14 source2 20409 non-null object
15 source3 20409 non-null object
16 statecode 20409 non-null object
17 statepatientnumber 20409 non-null object
18 statuschangedate 20409 non-null object
19 typeoftransmission 20409 non-null object
dtypes: object(20)
memory usage: 3.1+ MB
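The difference between the one-column result and the 20-column result can be reproduced offline. The following is a minimal sketch, assuming the response body has the shape {'raw_data': [ {...}, ... ]}; the name payload and the sample records are made up for illustration and are not taken from the API.

```python
import pandas as pd

# Hypothetical stand-in for r.json(): a dict with a single 'raw_data' key
# holding a list of flat records (values invented for illustration).
payload = {
    "raw_data": [
        {"agebracket": "", "currentstatus": "Hospitalized", "detectedstate": "Kerala"},
        {"agebracket": "45", "currentstatus": "Recovered", "detectedstate": "Delhi"},
    ]
}

# Passing the whole dict gives one column named 'raw_data' whose cells are dicts.
wrapped = pd.DataFrame.from_dict(payload)

# Indexing into 'raw_data' first gives one column per record key.
records = pd.DataFrame(payload["raw_data"])

print(wrapped.columns.tolist())
print(records.columns.tolist())
```

Indexing the parsed JSON before building the frame avoids any post-processing of the DataFrame itself.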
df = df['raw_data'].apply(pd.Series)
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20409 entries, 0 to 20408
Data columns (total 20 columns):
agebracket 20409 non-null object
contractedfromwhichpatientsuspected 20409 non-null object
currentstatus 20409 non-null object
dateannounced 20409 non-null object
detectedcity 20409 non-null object
detecteddistrict 20409 non-null object
detectedstate 20409 non-null object
entryid 20409 non-null object
gender 20409 non-null object
nationality 20409 non-null object
notes 20409 non-null object
numcases 20409 non-null object
patientnumber 20409 non-null object
source1 20409 non-null object
source2 20409 non-null object
source3 20409 non-null object
statecode 20409 non-null object
statepatientnumber 20409 non-null object
statuschangedate 20409 non-null object
typeoftransmission 20409 non-null object
dtypes: object(20)
memory usage: 3.1+ MB
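apply(pd.Series) constructs a Series for every row, which tends to be slow on larger frames. If the one-column DataFrame already exists, a possible alternative is to hand the list of dictionaries straight to the DataFrame constructor. This is a sketch, assuming every cell of raw_data is a flat dictionary; the sample rows and the names expanded and via_apply are hypothetical.

```python
import pandas as pd

# Hypothetical one-column frame shaped like the one in the question.
df = pd.DataFrame({"raw_data": [
    {"agebracket": "", "gender": "F", "numcases": "1"},
    {"agebracket": "30", "gender": "M", "numcases": "1"},
]})

# Build the wide frame from the list of dicts in one constructor call,
# keeping the original row index.
expanded = pd.DataFrame(df["raw_data"].tolist(), index=df.index)

# The row-by-row approach, kept here only for comparison.
via_apply = df["raw_data"].apply(pd.Series)

print(expanded.columns.tolist())
print(via_apply.columns.tolist())
```

Both approaches should produce the same columns; the constructor form simply skips the per-row Series creation.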