Python: How to add a column to a table based on data in the table


python pandas dataframe


I am using pandas to read data from a CSV file and print it as a table. This is the code I have so far:


    try:
        df = pd.read_csv("file.csv")
        df_filter = df[['Time', 'ID', 'ItemName', 'PassFailStatus']]
        if df_filter['PassFailStatus'].str.contains('Fail').any():
            finalTable= df_filter[(df_filter.PassFailStatus == 'Fail')]
            if finalTable.empty:
                print("Did not complete")
                sheet1[cellLocLastRow('A')] = "Did not complete"
            else:
                fullFinalTable= finalTable[['Time','ID','ItemName']]
                finalTableFilter = fullFinalTable.to_string()
                print(finalTableFilter)
                lastRow = writeTableToExcel(sheet1, "A", lastRow, fullFinalTable, 'Time') #prints to excel
        else:
            print("Run Successful")
            sheet1[cellLocLastRow('A')] = "Run Successful"
    except FileNotFoundError:
        print("File does not exist")
        sheet1[cellLocLastRow('A')] = "File does not exist"

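However, I want to add a Fix column whose value depends on the ID column: if the ID contains a certain string, the Fix column should show a corresponding value. For example, if the string "Integration" is found in the ID column, the Fix column would show "checkfolder", as shown below. When I try to add another column I get errors. Any help would be appreciated.

This is the table I currently have: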

Time         ID                   ItemName
2020-Aug-07  Integration_comp_14  Integration_System::CheckTest_eos0
2020-Aug-07  Integration_comp_14  Connections_SYSTEM::System_eos0
2020-Aug-07  Integration_comp_9   System::SourceTestExternal_eos0
2020-Aug-07  MainInstrument_2017  Integration::FunctionalTest_eos0
2020-Aug-07  MainInstrument_2020  Integration::TimingLoopbackOddTest_eos0
2020-Aug-07                       Integration::TimingLoopbackEvenTest_eos0
2020-Aug-07  MainInstrument_2022  Integration::TimingLoopbackOddTest_eos0

This is the table I want:


Time         ID                   ItemName                                 Fix
2020-Aug-07  Integration_comp_14  Integration_System::CheckTest_eos0       Folder
2020-Aug-07  Integration_comp_14  Connections_SYSTEM::System_eos0          Folder
2020-Aug-07  Integration_comp_9   System::SourceTestExternal_eos0          Folder
2020-Aug-07  MainInstrument_2017  Integration::FunctionalTest_eos0         Device
2020-Aug-07  MainInstrument_2020  Integration::TimingLoopbackOddTest_eos0  Device
2020-Aug-07                       Integration::TimingLoopbackEvenTest_eos0 None
2020-Aug-07  MainInstrument_2022  Integration::TimingLoopbackOddTest_eos0  Device

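For the Fix column, you can use a combination of numpy and pandas.

Numpy: you can use np.where for the "case when then else" part
Pandas: you can use df['col'].str.contains('string') to test each value for a substring

Putting them together gives: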

df['Fix'] = np.where(df['col'].str.contains('string'),'something','something else')

You can even nest several conditions. Think of it like nested IF statements in Excel:

df['Fix'] = np.where(df['col'].str.contains('string'), 'something', np.where(df['col'].str.contains('other string'), 'another thing', 'something else'))
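As a minimal sketch, the nested form could be applied to the ID column from the question as follows. The substrings "Integration" and "MainInstrument" and the labels "Folder", "Device" and "None" are read off the desired table; representing the empty ID as None is an assumption about how the CSV is loaded.

```python
import numpy as np
import pandas as pd

# Three representative IDs from the question; the empty ID is assumed to load as a missing value.
df = pd.DataFrame({"ID": ["Integration_comp_14", "MainInstrument_2017", None]})

# na=False makes a missing ID count as "no match" instead of producing NaN in the mask.
is_integration = df["ID"].str.contains("Integration", na=False)
is_instrument = df["ID"].str.contains("MainInstrument", na=False)

# Inner np.where is only consulted for rows where the outer condition is False.
df["Fix"] = np.where(is_integration, "Folder",
                     np.where(is_instrument, "Device", "None"))
print(df)
```

Without na=False, a missing ID yields a missing value in the boolean mask, which np.where does not handle as a plain False.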

Here is a working example:

import numpy as np
import pandas as pd
columns=['a','b','c']
rows=['1','2','3']
d_base=np.array(['no','yes','hello'])
data=np.tile(d_base,(3,1))

#create df
df=pd.DataFrame(data,columns=columns,index=rows)

Now let's create a new column called Fix that looks for a string in column a and assigns a category based on it:

df['Fix'] = np.where(df['a'].str.contains('no'),'something','something else')
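When more categories are needed, nested np.where calls become hard to read. One possible alternative, which is not part of the answer above, is np.select: it takes a list of conditions and a parallel list of choices, and the first matching condition wins. The sketch below uses IDs from the question and assumes the same substrings and labels as the desired table.

```python
import numpy as np
import pandas as pd

# Sample IDs from the question; the empty ID is assumed to load as a missing value.
df = pd.DataFrame({
    "ID": ["Integration_comp_14", "MainInstrument_2017", None, "MainInstrument_2022"],
})

# One boolean mask per category, checked in order.
conditions = [
    df["ID"].str.contains("Integration", na=False),
    df["ID"].str.contains("MainInstrument", na=False),
]
choices = ["Folder", "Device"]

# default is used for rows that match none of the conditions.
df["Fix"] = np.select(conditions, choices, default="None")
print(df)
```

Adding a category then only means appending one entry to each list, rather than adding another level of nesting.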

Create a dictionary that maps each substring to its value:

import pandas as pd

df = pd.DataFrame({'ID': ['Integration_comp_14', 'MainInstrument_2017', 'Integration_comp_14']})

replace_map = {'Integration': 'Folder', 'Instrument': 'Device'}
df['Fix'] = df.apply(lambda row: ','.join([replace_map[y] for y in replace_map if y in row.ID]), axis=1)
print(df)

Output:

                    ID     Fix
0  Integration_comp_14  Folder
1  MainInstrument_2017  Device
2  Integration_comp_14  Folder
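Note that the table in the question contains a row with an empty ID, and `y in row.ID` raises a TypeError when the ID is a missing value rather than a string. A hedged variant of the dictionary approach that tolerates such rows might look like this; the helper name lookup_fix and the "None" fallback label are illustrative assumptions, not part of the answer above.

```python
import pandas as pd

# Sample IDs from the question; the empty ID is assumed to load as a missing value.
df = pd.DataFrame({"ID": ["Integration_comp_14", "MainInstrument_2017", None]})

replace_map = {"Integration": "Folder", "Instrument": "Device"}

def lookup_fix(value):
    """Join the labels of all keys found in value; fall back to "None" (hypothetical helper)."""
    if not isinstance(value, str):
        # Missing IDs (None/NaN) cannot be searched, so use the fallback label.
        return "None"
    matches = [label for key, label in replace_map.items() if key in value]
    return ",".join(matches) if matches else "None"

df["Fix"] = df["ID"].apply(lookup_fix)
print(df)
```

Applying the function to the single ID column also avoids the row-wise `axis=1` apply, which is usually slower on large frames.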

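The code references columns that are not included in the sample data:

"PassFailStatus", "Counters"

"PassFailStatus" is only used to filter the table, which is why it does not appear in it.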