Python DataFrame min/max range

python pandas numpy dataframe

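Thanks in advance for your help! (Code below.) / Data here:

I'm trying to add two more columns to the DataFrame that represent the range of the data in the Topsoil column, the same way mean['maxx20']=maxx['20 cm'] and mean['minn20']=minn['20 cm'] do for the 20 cm column.

I tried to do this by adding the following: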

mean['topsoilMax']=maxx['Topsoil']
mean['topsoilMin']=minn['Topsoil']

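This did not add the extra columns as I had hoped. Instead it raised KeyError: 'Topsoil', even though Topsoil is already a column in the DataFrame, just as 20 cm was when I added its range.

Why am I getting this error, and what is the correct way to add these columns?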

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

#Importing data, creating a copy, and assigning it to a variable
raw_data = pd.read_csv('all-deep-soil-temperatures.csv', index_col=1, parse_dates=True)
df_all_stations = raw_data.copy()

#Setting the program to iterate based off of the station of the user's choice
selected_soil_station = 'Minot'
df_selected_station = df_all_stations[df_all_stations['Station'] == selected_soil_station]
#Forward-fill into a new frame; filling a filtered slice in place can trigger SettingWithCopyWarning
df_selected_station = df_selected_station.ffill()

# Indexes the data by day and creates a column that keeps track of the day
#numeric_only=True skips text columns such as 'Station', which newer pandas versions refuse to average
df_selected_station_D=df_selected_station.resample(rule='D').mean(numeric_only=True)
df_selected_station_D['Day'] = df_selected_station_D.index.dayofyear


#Assigning variable so that mean represents df_selected_station_D but indexed by day
mean=df_selected_station_D.groupby(by='Day').mean()
mean['Day']=mean.index

#This inserts a new column named 'Topsoil' at the end that represents the average between 5 cm, 10 cm, and 20 cm
mean['Topsoil']=mean[['5 cm', '10 cm','20 cm']].mean(axis=1)


#Creating the range in which the line graph will fill in 
maxx=df_selected_station_D.groupby(by='Day').max()
minn=df_selected_station_D.groupby(by='Day').min()

mean['maxx20']=maxx['20 cm']
mean['minn20']=minn['20 cm']

If I understand your question correctly, this is how I would approach it:

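topsoil = [-2.971686, -2.599278, -2.264897, -2.083117, -1.946969]

max_num = max(topsoil)
min_num = min(topsoil)
print(max_num)  # the maximum of the topsoil list
print(min_num)  # the minimum of the topsoil list
print(max_num - min_num)  # the maximum minus the minimum of the topsoil list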

The solution here is probably to add a 'Topsoil' column to the maxx and minn DataFrames as well:

maxx['Topsoil']=maxx[['5 cm', '10 cm','20 cm']].max(axis=1)
minn['Topsoil']=minn[['5 cm', '10 cm','20 cm']].min(axis=1)

After that, these assignments work:

mean['topsoilMax']=maxx['Topsoil']
mean['topsoilMin']=minn['Topsoil']
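To see why the KeyError appears, here is a minimal sketch. It assumes a small made-up table in place of the real soil data (the values and the two 'Day' numbers are invented for illustration). mean, maxx and minn come from three separate groupby calls, so a column added to mean never shows up in the other two.

```python
import pandas as pd

# Made-up daily readings standing in for the real CSV (assumption).
daily = pd.DataFrame({
    "Day":   [1, 1, 2, 2],
    "5 cm":  [-3.0, -1.0, -2.0, 0.0],
    "10 cm": [-1.5, -2.5, -1.0, 1.0],
    "20 cm": [-2.0, -1.0, -0.5, 0.5],
})
depths = ["5 cm", "10 cm", "20 cm"]

# Three independent DataFrames, all indexed by 'Day'.
mean = daily.groupby("Day").mean()
maxx = daily.groupby("Day").max()
minn = daily.groupby("Day").min()

# 'Topsoil' is created on `mean` only.
mean["Topsoil"] = mean[depths].mean(axis=1)

# Looking it up on maxx fails, because maxx never received that column.
try:
    maxx["Topsoil"]
    raised = False
except KeyError:
    raised = True

has_topsoil = {
    "mean": "Topsoil" in mean.columns,
    "maxx": "Topsoil" in maxx.columns,
    "minn": "Topsoil" in minn.columns,
}

# The fix from this answer: give maxx and minn their own 'Topsoil' column first.
maxx["Topsoil"] = maxx[depths].max(axis=1)
minn["Topsoil"] = minn[depths].min(axis=1)
mean["topsoilMax"] = maxx["Topsoil"]
mean["topsoilMin"] = minn["Topsoil"]
```

Note that with this fix topsoilMax is the largest of the three per-depth maxima and topsoilMin is the smallest of the three minima, so the band is the outer envelope of the three depths rather than the range of their average.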

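I think this picks the minimum and maximum out of those three values, but the Topsoil column is supposed to be the average of the three. So I think the logic should be that the new columns give the range of the three columns' average for that day. Something like this (not how you would actually code it, haha):

maxx['Topsoil']=maxx[[Average('5 cm', '10 cm', '20 cm')]].max(axis=1)

Maybe I'm not following, but why not use

maxx['Topsoil']=mean[['5 cm', '10 cm','20 cm']].max(axis=1)

? mean[['5 cm', '10 cm', '20 cm']] should already contain the averages.

I figured it out. I ended up adding more columns for the 5 cm and 10 cm ranges, like this:

mean['maxx05']=maxx['5 cm']
mean['minn05']=minn['5 cm']
mean['maxx10']=maxx['10 cm']
mean['minn10']=minn['10 cm']

Then I averaged the three and got the result I wanted. I couldn't have done it without your guidance. Thank you!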
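For anyone landing here later, the approach described in the last comment might look roughly like the sketch below. The data is made up, and the names depths, alt_max and alt_min are hypothetical. The second half is an alternative that was not posted in the thread: it averages the three depths per reading first and then takes the per-day max and min of that average, which is one way to read "the range of the average for that day".

```python
import pandas as pd

# Made-up daily readings standing in for the real CSV (assumption).
daily = pd.DataFrame({
    "Day":   [1, 1, 2, 2],
    "5 cm":  [-3.0, -1.0, -2.0, 0.0],
    "10 cm": [-1.5, -2.5, -1.0, 1.0],
    "20 cm": [-2.0, -1.0, -0.5, 0.5],
})
depths = ["5 cm", "10 cm", "20 cm"]

grouped = daily.groupby("Day")[depths]
mean = grouped.mean()
maxx = grouped.max()
minn = grouped.min()
mean["Topsoil"] = mean[depths].mean(axis=1)

# Per-depth range columns, as in the comment above.
mean["maxx05"] = maxx["5 cm"]
mean["minn05"] = minn["5 cm"]
mean["maxx10"] = maxx["10 cm"]
mean["minn10"] = minn["10 cm"]
mean["maxx20"] = maxx["20 cm"]
mean["minn20"] = minn["20 cm"]

# Topsoil band = average of the three per-depth maxima / minima.
mean["topsoilMax"] = mean[["maxx05", "maxx10", "maxx20"]].mean(axis=1)
mean["topsoilMin"] = mean[["minn05", "minn10", "minn20"]].mean(axis=1)

# Alternative (not from the thread): average the depths per reading,
# then take the per-day max and min of that averaged series.
topsoil_per_reading = daily[depths].mean(axis=1)
alt_max = topsoil_per_reading.groupby(daily["Day"]).max()
alt_min = topsoil_per_reading.groupby(daily["Day"]).min()
```

The two versions generally give different bands. Averaging the per-depth extremes can combine readings taken at different times, so it is never narrower than the band built from the per-reading average.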