Pandas: what is a better way to write the same operations for multiple datasets before concatenating them?


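Below is the code; I am only showing 2015 and 2016 as examples. I am actually writing the same code for each of the 2015-2019 datasets. It works for me, but as you can see it is verbose, because I have to keep repeating the code (all the way to 2019).

Question: 1. What is a better, more efficient way to write this?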

import pandas as pd

df2015 = pd.read_csv('EPL_20152016.csv', parse_dates=['Date'], dayfirst=True,
         usecols=['Date', 'HomeTeam', 'AwayTeam', 'FTHG', 'FTAG', 'FTR', 'BbAv>2.5','BbAv<2.5'])
df2015.rename(columns={'BbAv>2.5': 'Avg>2.5', 'BbAv<2.5': 'Avg<2.5'},inplace=True)
df2015['FTTG'] = df2015['FTHG'] + df2015['FTAG']
df2015['%Avg>2.5'] = 100* (1 / df2015['Avg>2.5'])
df2015['%Avg<2.5'] = 100* (1 / df2015['Avg<2.5'])
df2015['%TotalAvg><2.5'] = df2015['%Avg>2.5'] + df2015['%Avg<2.5']
df2015['%Vig><2.5'] = df2015['%TotalAvg><2.5'] - 100
#df2015 = df2015[['Date', 'HomeTeam', 'AwayTeam', 'FTHG', 'FTAG', 'FTTG', 'FTR','Avg>2.5','Avg<2.5',
df2015 = df2015.reindex(columns=['Date', 'HomeTeam', 'AwayTeam', 'FTHG', 'FTAG', 'FTTG', 'FTR','Avg>2.5','Avg<2.5','%Avg>2.5', '%Avg<2.5', '%TotalAvg><2.5', '%Vig><2.5'])


df2016 = pd.read_csv('EPL_20162017.csv', parse_dates=['Date'], dayfirst=True,
         usecols=['Date', 'HomeTeam', 'AwayTeam', 'FTHG', 'FTAG', 'FTR', 'BbAv>2.5','BbAv<2.5'])
df2016.rename(columns={'BbAv>2.5': 'Avg>2.5', 'BbAv<2.5': 'Avg<2.5'},inplace=True)
df2016['FTTG'] = df2016['FTHG'] + df2016['FTAG']
df2016['%Avg>2.5'] = 100* (1 / df2016['Avg>2.5'])
df2016['%Avg<2.5'] = 100* (1 / df2016['Avg<2.5'])
df2016['%TotalAvg><2.5'] = df2016['%Avg>2.5'] + df2016['%Avg<2.5']
df2016['%Vig><2.5'] = df2016['%TotalAvg><2.5'] - 100
df2016 = df2016.reindex(columns=['Date', 'HomeTeam', 'AwayTeam', 'FTHG', 'FTAG', 'FTTG', 'FTR','Avg>2.5','Avg<2.5','%Avg>2.5', '%Avg<2.5', '%TotalAvg><2.5', '%Vig><2.5'])

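After reading the files into separate DataFrames with pd.read_csv(), you can put them in a list. Then use a for loop over that list to perform the required operations on each DataFrame. For example, you could try something like this: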

df_list = [df2015, df2016, df2017, df2018, df2019]

for i, df in enumerate(df_list):
    # These steps modify each DataFrame in place
    df.rename(columns={'BbAv>2.5': 'Avg>2.5', 'BbAv<2.5': 'Avg<2.5'}, inplace=True)
    df['FTTG'] = df['FTHG'] + df['FTAG']
    df['%Avg>2.5'] = 100 * (1 / df['Avg>2.5'])
    df['%Avg<2.5'] = 100 * (1 / df['Avg<2.5'])
    df['%TotalAvg><2.5'] = df['%Avg>2.5'] + df['%Avg<2.5']
    df['%Vig><2.5'] = df['%TotalAvg><2.5'] - 100
    # reindex returns a new DataFrame; writing `df = df.reindex(...)` would only
    # rebind the loop variable, so store the result back in the list instead
    df_list[i] = df.reindex(columns=['Date', 'HomeTeam', 'AwayTeam', 'FTHG', 'FTAG', 'FTTG', 'FTR', 'Avg>2.5', 'Avg<2.5', '%Avg>2.5', '%Avg<2.5', '%TotalAvg><2.5', '%Vig><2.5'])
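
Another option, since the end goal is concatenation, is to move the reading step into the repeated part as well. What follows is only a minimal sketch, not the answer's own method: the helper names load_season, load_seasons and season_files are hypothetical, and the file-name pattern is an assumption inferred from 'EPL_20152016.csv' and 'EPL_20162017.csv' in the question.

```python
import pandas as pd

# Final column order, copied from the question's reindex call
FINAL_COLUMNS = ['Date', 'HomeTeam', 'AwayTeam', 'FTHG', 'FTAG', 'FTTG', 'FTR',
                 'Avg>2.5', 'Avg<2.5', '%Avg>2.5', '%Avg<2.5',
                 '%TotalAvg><2.5', '%Vig><2.5']


def load_season(source):
    """Read one season (path or file-like object) and add the derived columns."""
    df = pd.read_csv(source, parse_dates=['Date'], dayfirst=True,
                     usecols=['Date', 'HomeTeam', 'AwayTeam', 'FTHG', 'FTAG',
                              'FTR', 'BbAv>2.5', 'BbAv<2.5'])
    df = df.rename(columns={'BbAv>2.5': 'Avg>2.5', 'BbAv<2.5': 'Avg<2.5'})
    df['FTTG'] = df['FTHG'] + df['FTAG']
    df['%Avg>2.5'] = 100 * (1 / df['Avg>2.5'])
    df['%Avg<2.5'] = 100 * (1 / df['Avg<2.5'])
    df['%TotalAvg><2.5'] = df['%Avg>2.5'] + df['%Avg<2.5']
    df['%Vig><2.5'] = df['%TotalAvg><2.5'] - 100
    return df.reindex(columns=FINAL_COLUMNS)


def load_seasons(sources):
    """Process every source and stack the results into a single DataFrame."""
    return pd.concat([load_season(s) for s in sources], ignore_index=True)


def season_files(first_year=2015, last_year=2019):
    """Build the file names; the 'EPL_<year><year+1>.csv' pattern is assumed."""
    return [f'EPL_{year}{year + 1}.csv'
            for year in range(first_year, last_year + 1)]
```

With the CSV files in the working directory, something like all_seasons = load_seasons(season_files()) should then give one combined DataFrame. Keeping the per-file logic in a single function means a change to a formula or a column name only has to be made in one place.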