Python pandas: plotting two DataFrames with a common legend

python pandas matplotlib

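I am trying to plot two DataFrames on a single chart: one holding the average monthly minimum temperature and the other the average monthly maximum temperature, for each decade since 1930, in the city of Canberra. I want the two DataFrames to share a common legend. To add to the challenge, I want the legend to have two columns.

I can get a shared legend, but only in one column. Or I can get a two-column legend in which every entry is duplicated. What I have not managed is a single shared legend in two columns. See the chart below.

The complete code follows, including the data download from the web, but my question concerns the last few lines.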

# Look at temperature data for Canberra

# --- initialise
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import requests
import io

# --- some graphics management
plt.style.use('./bryan.mplstyle')
LOCATION = './Charts/Canb-'

def plot_save_and_close(ax, title, xlabel, ylabel, 
    filename, legend=True, bar_labels=0):
    """Add the usual chart annotations, 
    save to file and close the plot """

    ax.set_title(title)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    fig = ax.figure
    fig.tight_layout(pad=1)
    if legend and not ax.get_legend():
        l = ax.legend(loc='best', fontsize='small')
    if bar_labels:
        for i, t in enumerate(ax.get_xticklabels()):
            if ((i-3) % bar_labels) != 0:
                t.set_visible(False)
    fig.savefig(filename, dpi=125)
    plt.close()


# --- Get the data
# Data from Australian Bureau of Meteorology
url_stem = 'http://www.bom.gov.au/climate/change/hqsites/data/temp/'
canberra = '070351'
url_min = url_stem+'tmin.'+canberra+'.daily.csv'
url_max = url_stem+'tmax.'+canberra+'.daily.csv'

# download minimum temperatures
min_df = pd.read_csv(io.StringIO(requests.get(
    url_min).content.decode('utf-8')), 
    header=0, index_col=0, parse_dates=[0])
min_df = min_df.drop(min_df.index[0])[[min_df.columns[0]]]
min_df.columns = ['Minimum']

# download maximum temperatures
max_df = pd.read_csv(io.StringIO(requests.get(
    url_max).content.decode('utf-8')), 
    header=0, index_col=0, parse_dates=[0])
max_df = max_df.drop(max_df.index[0])[[max_df.columns[0]]]
max_df.columns = ['Maximum']

# combine into a single dataframe 
df = min_df.join(max_df, how='outer')

# let's augment with the latest daily data - TO DO

# provide some grouping tags for the data
df['calendar year'] = df.index.to_period(freq='A-DEC') 
df['winter year'] = df.index.to_period(freq='A-MAY') 
df['decade begining'] = (df.index.year // 10) * 10
df['tri-decade beginning'] = (((df.index.year - 1900) // 30) * 30) + 1900
df['julian day'] = df.index.dayofyear
df['Month'] = df.index.month

# check for missing data by year
#print('Missing max data by calendar year: ') 
#print(df[df['Maximum'].isna()].groupby('calendar year')['Maximum'].size())
#print('Missing min data by calendar year: ') 
#print(df[df['Minimum'].isna()].groupby('calendar year')['Minimum'].size())
# note: substantial missing data for Canberra between 1920-25 inclusive
df = df[df.index >= pd.Timestamp('1926-01-01')].copy() # not a slice
df_saved = df.copy() # keep a copy of the original to return to

# let's plot decadal monthly averages - both maximum and minimums
df = df_saved.copy()
df = df.groupby('decade begining').filter(lambda x: len(x) >= 3300) # close to full decades
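# select both columns with a list: the tuple-style ['Maximum', 'Minimum'] is rejected by newer pandas
df = df.groupby(['decade begining', 'Month'])[['Maximum', 'Minimum']].mean().unstack(level=0)
max = df['Maximum']  # note: shadows the builtin max()
min = df['Minimum']  # note: shadows the builtin min()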
colors = plt.cm.coolwarm(np.linspace(0,1,len(max.columns)))
ax = max.plot(color=colors, legend=True)
legend = ax.legend(title='Decade beginning', ncol=2, loc='best',fontsize='small')
ax = min.plot(ax=ax, color=colors, legend=False)
plot_save_and_close(ax, 'Canberra: Average Monthly Min and Max Temp by Decade', 
    'Month', 'Degrees Celsius', LOCATION+'unsmoothed-decadal-average-monthly.png')

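The problem is probably internal to what happens when .plot is called. The legend appears to be built from all the data attached to the axes. You could fix this by patching pandas, but I don't think that is a good idea. Here is a possible (ugly) workaround that uses two different matplotlib axes: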

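ax = max.plot(color=colors)
ax2 = ax.twinx()  # second y-axis sharing the same x-axis
ax2 = min.plot(color=colors, ax=ax2, legend=False)
ax.legend(title='Decade beginning', ncol=2, loc=1, fontsize='small')

# give both axes the same y-limits so the two sets of lines share one scale
# (np.min/np.max are used because min and max are DataFrames here)
min_ylim = np.min([ax.get_ylim()[0], ax2.get_ylim()[0]])
max_ylim = np.max([ax.get_ylim()[1], ax2.get_ylim()[1]])
for axis in [ax, ax2]:
    axis.set_ylim([min_ylim, max_ylim])
ax2.set_yticks([])  # hide the duplicate tick labels on the right
plt.show()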
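
If the legend really is built from every line attached to the axes, another option that avoids the second axes is to hand ax.legend() only the handles you want. The following is a minimal sketch of that idea, not something taken from the thread: it uses made-up stand-in data instead of the BoM download, the names max_temps and min_temps are hypothetical, and the Agg backend is selected only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 12 months x 4 decades (not the real BoM figures)
decades = [1930, 1940, 1950, 1960]
months = pd.Index(range(1, 13), name="Month")
seasonal = 8 * np.cos(2 * np.pi * np.arange(12) / 12)
max_temps = pd.DataFrame(
    {d: 20 + seasonal + 0.2 * i for i, d in enumerate(decades)}, index=months)
min_temps = max_temps - 12

# One colour per decade, reused for both frames
colors = [tuple(c) for c in plt.cm.coolwarm(np.linspace(0, 1, len(decades)))]

# Draw both frames on the same axes without letting pandas build a legend
ax = max_temps.plot(color=colors, legend=False)
min_temps.plot(ax=ax, color=colors, legend=False)

# The first n lines on the axes belong to max_temps; use only those
n = len(max_temps.columns)
handles = ax.get_lines()[:n]
labels = [str(c) for c in max_temps.columns]
legend = ax.legend(handles, labels, title="Decade beginning",
                   ncol=2, loc="best", fontsize="small")
```

Because the same colour list is used for both frames, one legend entry per decade describes both the maximum and the minimum line for that decade.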

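You can achieve this by calling plt.figlegend() instead of ax.legend(). No guarantees, though.

Thanks. twinx() is where I was going to go next. I edited one line to the following:

min_ylim, max_ylim = [f([ax.get_ylim()[g], ax2.get_ylim()[g]]) for f, g in zip([np.min, np.max], [0, 1])]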
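
For the figure-level legend suggestion, here is one hedged way it might look. This is a sketch under assumptions rather than a tested fix for the original chart: the data is invented, the names highs and lows are hypothetical, and duplicate labels are collapsed with a plain dict before being passed to fig.legend(), which is the method form of plt.figlegend().

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 12 months x 4 decades (not the real BoM figures)
decades = [1930, 1940, 1950, 1960]
months = pd.Index(range(1, 13), name="Month")
seasonal = 8 * np.cos(2 * np.pi * np.arange(12) / 12)
highs = pd.DataFrame(
    {d: 20 + seasonal + 0.2 * i for i, d in enumerate(decades)}, index=months)
lows = highs - 12
colors = [tuple(c) for c in plt.cm.coolwarm(np.linspace(0, 1, len(decades)))]

ax = highs.plot(color=colors, legend=False)
lows.plot(ax=ax, color=colors, legend=False)

# Every decade label appears twice (once per frame); keep one handle per label
handles, labels = ax.get_legend_handles_labels()
unique = dict(zip(labels, handles))

fig = ax.figure
fig_legend = fig.legend(list(unique.values()), list(unique.keys()),
                        title="Decade beginning", ncol=2,
                        loc="upper right", fontsize="small")
```

A figure-level legend is positioned relative to the whole figure rather than the axes, so loc may need adjusting to keep it clear of the plotted lines.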