How do I read only the rows that meet a specific condition from an Excel spreadsheet into Python?
I need to create a pandas DataFrame in Python by reading an Excel spreadsheet that contains almost 50,000 rows and 81 columns. The file contains information about various kinds of medical professionals: physicians, nurses, nurse practitioners, etc. I only want to read in the rows where the value of the 'PROFTYPE' column is 'NURSEPRACT'.

I'm using Python 3.7.3. I have read in the entire file and then trimmed it down using the PROFTYPE column, but reading it in takes too long. I would like to read in only those rows where PROFTYPE == 'NURSEPRACT'.
import pandas as pd

df_np = pd.read_excel(SourceFile, sheet_name='Data', header=0)
df_np = df_np[df_np['PROFTYPE'] == 'NURSEPRACT']
This code does work, but only because I read in the whole file first. I'm really only interested in the rows that satisfy PROFTYPE == 'NURSEPRACT'.
One idea is that you could:
Load only the 'PROFTYPE' column
Identify the rows that are not nurse practitioners
Load the whole table again, skipping those rows so that only the nurse practitioner rows are kept
This strategy in action:
import pandas as pd

# Pass 1: load just the 'PROFTYPE' column of the following table
#      ID    PROFTYPE  YEARS_IN_PRACTICE
#    1234  NURSEPRACT                 12
#      43       NURSE                 32
#     789  NURSEPRACT                  4
#      34   PHYSICIAN                  2
#      93  NURSEPRACT                 13
df = pd.read_excel(SourceFile,
                   sheet_name='Data',
                   header=0,
                   usecols=['PROFTYPE'])

# File row 0 is the header, so DataFrame index x corresponds to file row x + 1
row_numbers = [x + 1 for x in df[df['PROFTYPE'] != 'NURSEPRACT'].index]

# Pass 2: reload all columns, skipping the rows that are not 'NURSEPRACT'
df = pd.read_excel(SourceFile, sheet_name='Data', header=0, skiprows=row_numbers)
#      ID    PROFTYPE  YEARS_IN_PRACTICE
#    1234  NURSEPRACT                 12
#     789  NURSEPRACT                  4
#      93  NURSEPRACT                 13
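
If you need this for more than one column or value, the same two-pass strategy can be wrapped in a small function. This is only a sketch: `rows_to_skip` and `read_matching_rows` are hypothetical helper names (not part of pandas), and it assumes the header sits on the first row of the sheet, as in the example above. Both passes still open the workbook, so it is worth timing this against the read-then-filter approach on your own file.

```python
from typing import Iterable, List


def rows_to_skip(values: Iterable[object], keep: object) -> List[int]:
    """Return 0-based file row numbers whose value differs from `keep`.

    `values` are the data cells of one column in file order, excluding the
    header. The header occupies file row 0, so data position i is file row i + 1.
    """
    return [i + 1 for i, value in enumerate(values) if value != keep]


def read_matching_rows(path, sheet_name, column, keep):
    """Two-pass read: scan one column, then reload while skipping other rows."""
    import pandas as pd  # imported here so rows_to_skip can be used without pandas

    # Pass 1: load only the column used for filtering.
    col = pd.read_excel(path, sheet_name=sheet_name, header=0, usecols=[column])

    # Pass 2: reload every column, skipping the rows that do not match.
    skip = rows_to_skip(col[column].tolist(), keep)
    return pd.read_excel(path, sheet_name=sheet_name, header=0, skiprows=skip)


if __name__ == "__main__":
    sample = ["NURSEPRACT", "NURSE", "NURSEPRACT", "PHYSICIAN", "NURSEPRACT"]
    print(rows_to_skip(sample, "NURSEPRACT"))  # [2, 4]
```

With the question's file this would be called as read_matching_rows(SourceFile, 'Data', 'PROFTYPE', 'NURSEPRACT'). Empty cells are read as NaN, which never equals the target value, so those rows are skipped as well.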