Python: how to get specific rows from an Excel file (Python, pandas)


I am using pandas to read an Excel file with the following structure:

            A                 B         C       D        E
  ╔════════════════════╦════════════╦═══════╦═══════╦══════════╗
1 ║ Project Name       ║            ║       ║       ║          ║
  ╠════════════════════╬════════════╬═══════╬═══════╬══════════╣
2 ║ Date of generation ║            ║       ║       ║          ║
  ╠════════════════════╬════════════╬═══════╬═══════╬══════════╣
3 ║ ID                 ║ Type       ║ Value ║ Color ║ Capacity ║
  ╠════════════════════╬════════════╬═══════╬═══════╬══════════╣
4 ║ 1                  ║ Car        ║ 1,000 ║ Red   ║ 2        ║
  ╠════════════════════╬════════════╬═══════╬═══════╬══════════╣
5 ║ 2                  ║ Truck      ║ 800   ║ Green ║ 12       ║
  ╠════════════════════╬════════════╬═══════╬═══════╬══════════╣
6 ║ 3                  ║ Helicopter ║ 5,000 ║ White ║ 4        ║
  ╚════════════════════╩════════════╩═══════╩═══════╩══════════╝
As you can see, the data header is in row 3 and the data continues below it. If I read the frame like this:

df = pandas.read_excel("sample.xls", sheetname=0, skiprows=2, header=0)
My DataFrame has the correct data section, but I cannot read the information in the first cells, A1 and A2.

I also tried the following, slicing the new DataFrame after grabbing the first two values. I get an error:

df = pandas.read_excel("sample.xls", sheetname=0, skiprows=0, header=None)

project_name = df.iloc[0][0]
project_date = df.iloc[1][0]

new_header = df.iloc[2]         #grab the third row for the header
df = df[3:]                     #take the data less the header row
df.rename(columns = new_header) #set the header row as the df header

#tidyup   
df = df.fillna('')              #convert NaN (blank) cells to empty string
df.applymap(str)                #convert all columns to str
df.columns = df.columns.str.strip()   #strip whitespace before and after headers
df_obj = df.select_dtypes(['object']) #select all object columns (not int/float columns)
df[df_obj.columns] = df_obj.apply(lambda x: x.str.strip()) #strip (trim whitespace) of all object columns
AttributeError: Can only use .str accessor with string values (i.e. inferred_type is 'string', 'unicode' or 'mixed')
It is not clear to me why, since I use the same "tidyup" code when reading the DataFrame as in the first line of code above.


Is there a way to get the "above the header" information without reading the file twice and without creating an entirely new DataFrame?

There is one problem: you forgot to assign the result back:

df = df.applymap(str)   
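
To illustrate why the missing assignment matters, here is a minimal sketch. The frame `raw` is a hypothetical stand-in for what read_excel(..., header=None) might return, not the asker's actual file. It shows that rename() returns a new DataFrame rather than modifying in place, and that calling .str on the untouched integer column labels is one plausible trigger for the AttributeError in the question.

```python
import pandas as pd

# Hypothetical stand-in for a header=None read: integer column labels,
# with the real header sitting in the first data row.
raw = pd.DataFrame([["ID ", "Type"], [1, " Car"], [2, "Truck "]])

header = raw.iloc[0]
body = raw[1:]

# rename() returns a new DataFrame; without assignment, body is unchanged.
body.rename(columns=header)
labels_unchanged = list(body.columns)  # still the integer labels

# The .str accessor on an integer index raises AttributeError.
try:
    body.columns.str.strip()
    str_accessor_failed = False
except AttributeError:
    str_accessor_failed = True

# Assigning the result back keeps the new labels.
renamed = body.rename(columns=header)
labels_renamed = list(renamed.columns)
```

The same applies to applymap and astype: both return a new object, so the result has to be assigned to a name to have any effect.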
A better (faster) approach is to use astype(str) instead of applymap(str).

Putting it all together:

project_name = df.iloc[0, 0]
project_date = df.iloc[1, 0]

new_header = df.iloc[2]        
df = df[3:]      
df.columns = new_header.str.strip()
df.columns.name = None

df = df.astype(str).reset_index(drop=True)
df = df.apply(lambda x: x.str.strip())
print (df)
  ID        Type  Value  Color Capacity
0  1         Car  1,000    Red        2
1  2       Truck    800  Green       12
2  3  Helicopter  5,000  White        4
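
For readers without the original sample.xls, the same steps can be sketched end to end on an in-memory frame. The helper name `split_metadata_and_table`, its `header_row` parameter, and the `raw` frame are all assumptions made for illustration. `raw` only mimics what a single read with header=None might produce for the sheet shown in the question.

```python
import pandas as pd


def split_metadata_and_table(raw, header_row=2):
    """Split a header=None frame into (cells above the header, tidy table)."""
    # First-column values of the rows above the header (cells A1 and A2 here).
    metadata = raw.iloc[:header_row, 0].tolist()

    # Promote the header row to column labels, trimmed of whitespace.
    columns = raw.iloc[header_row].astype(str).str.strip()
    table = raw.iloc[header_row + 1:].reset_index(drop=True)
    table.columns = columns.tolist()

    # Cast every cell to str, then strip whitespace column by column.
    table = table.astype(str).apply(lambda col: col.str.strip())
    return metadata, table


# Hypothetical in-memory copy of the sheet from the question.
raw = pd.DataFrame([
    ["Project Name", None, None, None, None],
    ["Date of generation", None, None, None, None],
    ["ID", "Type", "Value", "Color", "Capacity"],
    [1, "Car", "1,000", "Red", 2],
    [2, "Truck", "800", "Green", 12],
    [3, "Helicopter", "5,000", "White", 4],
])

(project_name, project_date), table = split_metadata_and_table(raw)
print(project_name, project_date)
print(table)
```

With a real file, `raw` would come from one read_excel call with header=None, so the file is read only once and the metadata cells and the table are both taken from the same frame.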