Python: different versions of code
I am trying to mean-normalize my data frame. When I run the first version of the code I get normalized values, but when I run version 2 I get an error called StopIteration. ["1B", "2B", "3B", "HR", "BB"] are columns in my data frame.
Version 1:

    def meanNormalizeRates(df):
        subRates = df[["1B","2B","3B","HR","BB"]]
        df[["1B","2B","3B","HR","BB"]] = subRates - subRates.mean(axis=0)
        return df

    stats = stats.groupby('yearID').apply(meanNormalizeRates)
    stats.head()
Version 2:

    def mean(df):
        for val in ["1B","2B","3B","HR","BB"]:
            stats[val] = stats[val] - stats[val].mean(axis=0)

    stats = stats.groupby('yearID').apply(mean)
    stats.head()
I am not able to understand the difference between the two versions.
A good example:

    import pandas as pd

    data = {'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
            'year': [2000, 2001, 2002, 2001, 2002],
            'pop': [1.5, 1.7, 3.6, 2.4, 2.9],
            'gate': [9, 7, 4, 6, 9]}
    frame = pd.DataFrame(data)
    frame.head()

       gate  pop   state  year
    0     9  1.5    Ohio  2000
    1     7  1.7    Ohio  2001
    2     4  3.6    Ohio  2002
    3     6  2.4  Nevada  2001
    4     9  2.9  Nevada  2002

Version 1.1

    def std(df):
        temp = df[['gate', 'pop']]
        df[['gate', 'pop']] = temp - temp.mean(axis=0)
        return df

    frame.groupby('year').apply(std)
Version 1.2

    def mean(df):
        for val in ['gate', 'pop']:
            df[val] = df[val] - df[val].mean(axis=0)

    frame.groupby('year').apply(mean)

error: stop iteration
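OK, because there is no return statement in the mean() function (in example 1.2), the function just returns None for each group. The StopIteration error you get is not very clear, but what happens is:

- apply() calls the function mean() for each group
- Each call returns None
- The results are put into a list, so here it is a list of all Nones
- As part of stitching the results together, apply() tries to find a non-None value in the list, which throws a StopIteration exception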
This is essentially the same error you get from:

    eg_list = [None, None, None]
    v = next(v for v in eg_list if v is not None)

    ---------------------------------------------------------------------------
    StopIteration                             Traceback (most recent call last)
    <ipython-input-12-93b31b7a51e4> in <module>()
    ----> 1 v = next(v for v in eg_list if v is not None)
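A minimal sketch of the fix for version 1.2, assuming the example `frame` from the question. The only change that matters is that the function returns the group. The name `mean_normalize`, the `.copy()` (used so the group passed in by `apply()` is not mutated) and `group_keys=False` (used so the result keeps the original row labels) are my own choices, not part of the original code.

```python
import pandas as pd

# Example data from the question.
frame = pd.DataFrame({'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
                      'year': [2000, 2001, 2002, 2001, 2002],
                      'pop': [1.5, 1.7, 3.6, 2.4, 2.9],
                      'gate': [9, 7, 4, 6, 9]})

def mean_normalize(df):
    # Work on a copy so the group handed over by apply() stays untouched.
    out = df.copy()
    for val in ['gate', 'pop']:
        # Subtract the per-group mean of each column.
        out[val] = out[val] - out[val].mean()
    # Without this return, the function yields None for every group.
    return out

result = frame.groupby('year', group_keys=False).apply(mean_normalize)
print(result[['gate', 'pop']])
```

Depending on the pandas version, the grouping column `year` may or may not be present in `result` (newer releases can warn about or exclude it), so the sketch only looks at `gate` and `pop`.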
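Can you try to include some example data? It is hard to say why your results might differ without it.

@Marius - I have given an example. Aren't frame.groupby('year')[['gate','pop']] and frame.groupby('year') equal??

I'm not sure, but the way you are doing things in the function is just not how groupby and apply work.

frame.groupby('year')[['gate','pop']] is almost the same as frame.groupby('year'); it just excludes the state column.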
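To illustrate the point about column selection, here is a small sketch, again assuming the example `frame` from the question. It selects the two numeric columns on the grouped object and uses `transform()` instead of `apply()`; `transform()` returns a result aligned with the original rows, so no explicit return of the whole group is needed. The variable name `normalized` is just for illustration.

```python
import pandas as pd

# Example data from the question.
frame = pd.DataFrame({'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
                      'year': [2000, 2001, 2002, 2001, 2002],
                      'pop': [1.5, 1.7, 3.6, 2.4, 2.9],
                      'gate': [9, 7, 4, 6, 9]})

# Selecting columns on the GroupBy object restricts the work to those columns;
# 'state' is not part of the result.
grouped = frame.groupby('year')[['gate', 'pop']]

# Subtract the per-year mean from every value in each selected column.
normalized = grouped.transform(lambda col: col - col.mean())

print(list(normalized.columns))
print(normalized)
```

The result can be written back with `frame[['gate', 'pop']] = normalized` if the normalized values should replace the originals.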