Python Pandas series.apply raises an overflow error
I have a function that works fine on single values, but when I use it with pandas series.apply() it raises an overflow error. Setup:
from __future__ import division
import pandas as pd
import numpy as np
birthdays = pd.DataFrame(np.empty([365,2]), columns = ['k','probability'], index = range(1,366))
birthdays['k'] = birthdays.index
I wrote a function:
def probability_of_shared_bday(k):
    end_point = 366 - k
    numerator = 1
    for i in range(end_point, 366):
        numerator = numerator*i
    denominator = 365**k
    probability_of_no_match = (1 - numerator/denominator)
    return probability_of_no_match
When I try it with a single integer, it works fine:
probability_of_shared_bday(1)
0.0
0.9999996927510721
But when I try to use this function with apply:
birthdays['probability'] = birthdays['k'].apply(probability_of_shared_bday, convert_dtype=False)
OverflowError: integer division result too large for a float
This happens whether convert_dtype is True or False.
Checking
birthdays['k'].dtypes
I get dtype('int64').
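I don't know why you run into this problem with apply, but you shouldn't write the function the way you did. Here is a suggestion that avoids dividing one huge number by another: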
def probability_of_shared_bday(k):
    end_point = 366 - k
    ratio = 1
    for i in range(end_point, 366):
        ratio *= i / 365
    probability_of_no_match = (1 - ratio)
    return probability_of_no_match
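As a sanity check on the running-ratio idea, here is a minimal sketch that compares it against exact rational arithmetic. It assumes Python 3 (so `/` is true division); `exact_probability` is a hypothetical reference helper written for this comparison and is not part of the original thread.

```python
from fractions import Fraction


def probability_of_shared_bday(k):
    # Running-ratio version: every factor i / 365 is at most 1,
    # so the intermediate product never grows large.
    ratio = 1.0
    for i in range(366 - k, 366):
        ratio *= i / 365
    return 1 - ratio


def exact_probability(k):
    # Hypothetical reference helper: same product, but kept as an
    # exact fraction and converted to float only at the very end.
    no_match = Fraction(1)
    for i in range(366 - k, 366):
        no_match *= Fraction(i, 365)
    return float(1 - no_match)


# Print both results side by side for a few group sizes.
for k in (1, 23, 100, 365):
    print(k, probability_of_shared_bday(k), exact_probability(k))
```

The two columns should agree to within floating-point rounding, including at k = 365, where the original numerator and denominator are both far too large to convert to a float on their own.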
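That solves the problem.
What is birthdays['k'].max()?
birthdays['k'].max() is 365.
This is an interesting problem, but honestly, you should rewrite your function to divide by 365 as you go through the loop.
I'm not sure what you mean, Ian. Can you give an example?
Actually, I just posted an answer :)
That is a better way to write the function, thanks. It would still be nice to know why pandas does this ;)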
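The thread leaves the cause open. One possible explanation, offered as an assumption and not confirmed anywhere above: if apply hands the function numpy.int64 scalars instead of plain Python integers (this may depend on the pandas version), then 365**k would be evaluated in fixed-width 64-bit arithmetic, which cannot hold the result for larger k. The sketch below uses only the standard library to show the sizes involved; `INT64_MAX` and `first_overflow_k` are illustrative names.

```python
import math

INT64_MAX = 2**63 - 1  # largest value a signed 64-bit integer can hold

# Smallest k for which 365**k no longer fits in a signed 64-bit integer.
first_overflow_k = next(k for k in range(1, 366) if 365**k > INT64_MAX)
print(first_overflow_k)

# Plain Python integers are arbitrary precision, so the original numerator
# for k = 365 (which equals 365!) is stored exactly, but it is far beyond
# the largest finite float and cannot be converted directly.
numerator = math.factorial(365)
try:
    float(numerator)
except OverflowError as exc:
    print("OverflowError:", exc)
```

If that assumption holds, converting the argument with k = int(k) at the top of the original function would keep the arithmetic in Python integers. The running-ratio version avoids the question altogether, since it never builds a large intermediate value.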