Python 3.x: Searching for specific information in one CSV file derived from two other CSV files

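I have a nutrition database made up of 3 different CSV files. After cleaning, the first file contains two columns: nutrient id and nutrient name. The second file contains two columns: food id and food description (name). The third file contains three columns: nutrient id, food id, and amount (of that nutrient in that food). Because there are millions of rows, I can't open each file by hand every time to check which id corresponds to which nutrient or food. So I tried to write code that reads all three files, searches the id columns for matching nutrients (from the first file) and foods (from the second file), replaces the ids with names, and returns 3 columns: nutrient name, food name, amount.

There is one complication: in files 1 and 2 the rows are sorted by id, while in the third file (the one with amounts) the rows are sorted by nutrient id, which means the food id column is out of order. So I can't simply combine the three files, or swap the id columns in the third file for name columns. Below is a sample of my code, which does not return what I need. I'm stuck on this problem because I can't find an answer online. Thanks.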

#-*- coding: utf-8 -*- 

"""
Created on Fri Nov  8 17:38:45 2019

@author: user
"""
import pandas as pd 
#%% reading csv files 

#read the first csv file with nutrient_name, nutrient_id
df1 = pd.read_csv('nutrient.csv', low_memory=False)
print(df1)

#read specific columns from the first csv file
df1 = pd.read_csv('nutrient.csv', usecols = ['id', 'name'], low_memory=False)
df1.rename(columns={'name' : 'nut_name'}, inplace = True)
print(df1)

#read the second csv file with food_id and food_name , read specific columns 
df2 = pd.read_csv('food.csv', usecols = ['fdc_id', 'description'], low_memory=False)
print(df2)

#read the third csv file with food_id, nutrient_id and nutrient amount
df3 = pd.read_csv('food_nutrient.csv', usecols=['fdc_id','nutrient_id', 'amount'], low_memory=False)
print(df3)

#%% create a list of rows from each csv file 
# Create an empty list 1
Id_list =[] 
Name_list = []

# Iterate over each row in the first csv file 
for index, rows in df1.iterrows(): 

    # append the values to the final lists 
    Id_list.append(rows.id)
    Name_list.append(rows.nut_name)


# Print the list 
print(Id_list[:10])
print(Name_list[:10])  

# Create an empty list 2
Food_id_list =[]
Food_name_list =[] 

# Iterate over each row in the second csv file 
for index, rows in df2.iterrows(): 

    # append the values to the final lists 
    Food_id_list.append(rows.fdc_id)
    Food_name_list.append(rows.description)

print(Food_id_list[:10])
print(Food_name_list[:10])

# Create an empty list 3
Amount_list =[] 
Name_list1 = []
Food_name1 = []

# Iterate over each row in the third csv file 
for index, rows in df3.iterrows(): 

    # append the values to the final lists 
    Amount_list.append(rows.amount)
    Name_list1.append(rows.nutrient_id)
    Food_name1.append(rows.fdc_id)

# Print the list 
print(Amount_list[:10])
print(Name_list1[:10])
print(Food_name1[:10])

#%% search in the third csv only rows, where amount of the certain nut in certain food is not empty value
for i in Name_list:
    #for j in Food_name_list:
    if i in df3['nutrient_id']:
        print(df3.loc[i, 'amount'])

Thanks in advance.

This is exactly what SQL was created for. SQL's JOIN command combines multiple tables.
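As a rough illustration of that idea, here is a minimal sketch using Python's built-in sqlite3 module. The table names mirror the CSV file names from the question, and the column names are the ones the question's code reads; the rows themselves are made-up sample data, not values from the real database. The third table is deliberately listed in nutrient-id order with the food ids out of order, since a JOIN matches on key values rather than row position.

```python
import sqlite3

# Made-up sample rows standing in for the three CSV files (illustrative only).
nutrients = [(1003, "Protein"), (1004, "Total lipid (fat)")]
foods = [(1, "Apple, raw"), (2, "Cheddar cheese")]
# Sorted by nutrient_id, so the food ids are out of order, as in the question.
food_nutrients = [(1003, 2, 25.0), (1003, 1, 0.3), (1004, 1, 0.2), (1004, 2, 33.0)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nutrient (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE food (fdc_id INTEGER PRIMARY KEY, description TEXT)")
conn.execute(
    "CREATE TABLE food_nutrient (nutrient_id INTEGER, fdc_id INTEGER, amount REAL)"
)
conn.executemany("INSERT INTO nutrient VALUES (?, ?)", nutrients)
conn.executemany("INSERT INTO food VALUES (?, ?)", foods)
conn.executemany("INSERT INTO food_nutrient VALUES (?, ?, ?)", food_nutrients)

# Replace both id columns with names by joining on the matching keys.
query = """
SELECT n.name, f.description, fn.amount
FROM food_nutrient AS fn
JOIN nutrient AS n ON n.id = fn.nutrient_id
JOIN food AS f ON f.fdc_id = fn.fdc_id
ORDER BY n.name, f.description
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)

conn.close()
```

For the real files, the CSV rows would first have to be loaded into the tables, for example with the csv module and executemany.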

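Having worked with pandas a lot, I strongly recommend taking a simple SQL course, or at least working through a basic SQL JOIN tutorial first, because this is a very common introductory problem in SQL.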
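For readers who would rather stay in pandas, the same join can be sketched with DataFrame.merge, which also matches rows by key value and so does not care how each file is sorted. This is only a sketch: the column names follow the question's code, while the small in-memory frames are made-up stand-ins for the three CSV files.

```python
import pandas as pd

# Made-up stand-ins for nutrient.csv, food.csv and food_nutrient.csv.
df1 = pd.DataFrame({"id": [1003, 1004], "name": ["Protein", "Total lipid (fat)"]})
df2 = pd.DataFrame({"fdc_id": [1, 2], "description": ["Apple, raw", "Cheddar cheese"]})
# Sorted by nutrient_id, so fdc_id is out of order, as described in the question.
df3 = pd.DataFrame({
    "nutrient_id": [1003, 1003, 1004, 1004],
    "fdc_id": [2, 1, 1, 2],
    "amount": [25.0, 0.3, 0.2, 33.0],
})

# Rename the lookup columns so the key has the same name on both sides.
nutrient_lookup = df1.rename(columns={"id": "nutrient_id", "name": "nut_name"})

# Left joins keep every row of df3 and attach the matching names by key.
result = (
    df3.merge(nutrient_lookup, on="nutrient_id", how="left")
       .merge(df2, on="fdc_id", how="left")
       [["nut_name", "description", "amount"]]
)
print(result)
```

With how="left", any id in the third file that has no match in a lookup file shows up as NaN in the name column instead of being silently dropped, which makes gaps in the data easy to spot.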

Thanks! I'll look into SQL then (I've never used it before).