Converting multiple tables in a CSV/Excel file into a dictionary or DataFrame in Python
Tags: python, excel, csv, dictionary, dataframe

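I need help.

I have an Excel file containing data that I'm trying to get into a DataFrame, but the data is laid out as tables and isn't easy to work with. For example:

Ultimately I'd like to get it into a DataFrame of this form: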

Meal               Food                              Calories
Breakfast          English Muffins                   120
Breakfast          Peanut Butter Spread              190
Morning Snack      Banana                            90
Morning Snack      Nectarine                         59
...                ...                               ...

as well as a separate DataFrame for the daily totals in this form (ignore the "Date" column for now):

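I'm struggling to get this into a DataFrame. Looking at the screenshot of the dataset, it seemed to make sense to store the data in a dictionary first, but that leaves me with a bunch of NaN values because of all the blank cells.

The way I thought of getting the "Meal" column into the shape I want is a forward fill, but that means I would have to be working with a Series or DataFrame, and I haven't gotten that far yet.

This is what I currently have: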

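import pandas as pd

# xls is assumed to be a pd.ExcelFile opened on the same workbook,
# since xls.parse() is called on it further down
xls = pd.ExcelFile('filename.xls')
df = xls.parse('Foods')

# create a list to store the names of the food log sheets
food_logs = []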

# this is code to reformat the string values in a certain column 
# to get the name of the sheets I need to use in the Excel. This can be ignored
for day in df.values:
    if day[1] != '0':
        foodLogSheetName = 'Food Log ' + day[0].replace('-', '')
        food_logs.append(foodLogSheetName)

# 'foods' is now a list of nested dictionaries, one per sheet (think of
# everything in the first screenshot as the outer dictionary, and each of
# the columns as an inner dictionary keyed by row index)
foods = [xls.parse(food_log).to_dict() for food_log in food_logs]
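
This is what 'foods' currently looks like, if I print it with a line between each outer dictionary:

I also have the option of using a CSV file, but then the multiple "tables" would be stacked vertically instead of being spread across multiple sheets, if that makes sense.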


I would greatly appreciate any tips anyone can offer, please.

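I think you have the data collection part right. It sounds like you may just be having trouble handling the missing data. From the example you posted, it looks like you could read the whole thing into one DataFrame, drop all the completely empty rows, forward-fill (ffill) the "Meal" column, and then drop any rows that are still partially empty (or empty on a subset of columns).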
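
To make those steps concrete, here is a minimal sketch on a small in-memory frame. The layout (meal name on its own row, foods listed underneath, a blank row between meals) is an assumption based on the description in the question, since the screenshot is not available; the food names and calories are the ones from the question's target table. The names `raw` and `tidy` are just illustrative.

```python
import numpy as np
import pandas as pd

# Toy frame imitating the assumed sheet layout: the meal name sits on its
# own row, the foods follow beneath it, and a fully blank row separates meals.
raw = pd.DataFrame(
    {
        "Meal": ["Breakfast", np.nan, np.nan, np.nan, "Morning Snack", np.nan, np.nan],
        "Food": [np.nan, "English Muffins", "Peanut Butter Spread", np.nan,
                 np.nan, "Banana", "Nectarine"],
        "Calories": [np.nan, 120, 190, np.nan, np.nan, 90, 59],
    }
)

# 1. Drop rows in which every cell is blank.
tidy = raw.dropna(how="all")
# 2. Carry each meal name down into the blank cells below it.
tidy = tidy.assign(Meal=tidy["Meal"].ffill())
# 3. Drop the header-only rows, i.e. rows with no food or calorie value.
tidy = tidy.dropna(subset=["Food", "Calories"]).reset_index(drop=True)

print(tidy)
```

Using `subset` in the last step, rather than `how='any'` over every column, keeps rows that are only missing a value in some other, optional column.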


With your suggestion, I figured out how to answer my question! Thank you so much!!

import pandas as pd

# file_path_or_buffer, my_sheet_name and other_kwargs are placeholders for your own values
df = pd.read_excel(file_path_or_buffer, sheet_name=my_sheet_name, **other_kwargs)
# You should have a dataframe that looks like
# Meal               Food                              Calories
# Breakfast          
#                    English Muffins                   120
#                    Peanut Butter Spread              190
# ...
# Next drop totally NaN/empty rows
df.dropna(how='all', inplace=True)
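# Forward-fill the meal name into the blank cells below it
# (Series.ffill() replaces fillna(method='ffill'), which is deprecated in pandas 2.x)
df['Meal'] = df['Meal'].ffill()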
# Now you should have something that looks like
# Meal               Food                              Calories
# Breakfast          
# Breakfast          English Muffins                   120
# Breakfast          Peanut Butter Spread              190
# ...
# Drop rows that still have any empty cell (such as the header-only "Breakfast" row).
# If you need to allow for some sparse data, use the subset argument instead.
df.dropna(how='any', inplace=True)
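
Since the question involves several "Food Log" sheets, the same cleaning could be wrapped in a function and the per-sheet results stacked into one DataFrame. The following is only a sketch under assumptions: `clean_food_log` is a hypothetical helper, the sheet names are invented, and the in-memory `sheets` dict stands in for what `pd.read_excel(path, sheet_name=[...])` returns when given a list of sheet names (a dict of DataFrames keyed by sheet name).

```python
import numpy as np
import pandas as pd


def clean_food_log(sheet: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: tidy one sheet using the steps shown above."""
    out = sheet.dropna(how="all").copy()
    out["Meal"] = out["Meal"].ffill()
    return out.dropna(subset=["Food", "Calories"])


# Stand-in for the dict of DataFrames that pd.read_excel returns when
# sheet_name is a list. The sheet names here are invented for illustration.
sheets = {
    "Food Log 20200101": pd.DataFrame(
        {
            "Meal": ["Breakfast", np.nan, np.nan],
            "Food": [np.nan, "English Muffins", "Peanut Butter Spread"],
            "Calories": [np.nan, 120, 190],
        }
    ),
    "Food Log 20200102": pd.DataFrame(
        {
            "Meal": ["Morning Snack", np.nan, np.nan],
            "Food": [np.nan, "Banana", np.nan],
            "Calories": [np.nan, 90, np.nan],
        }
    ),
}

# Clean each sheet, then stack them; the dict keys become a "Sheet" column.
combined = (
    pd.concat(
        {name: clean_food_log(sheet) for name, sheet in sheets.items()},
        names=["Sheet"],
    )
    .reset_index(level="Sheet")
    .reset_index(drop=True)
)

print(combined)
```

Keeping the sheet name as a column leaves room to derive a date from it later, which the question sets aside for now.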