Python: calculating the age of living and deceased people

Tags: python, python-3.x, pandas

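As suggested, I have provided a verifiable example. If you take pandas out of it and plug in raw values instead of DataFrame values, it works perfectly.

If you put pandas back in, as I have below, the program runs and print(true_age) returns 0.

Right now I have it designed as nested if statements, but I am going wrong somewhere. If the person is alive, everything works; when I print the age, it outputs the correct age. If the person is deceased, however, the printed age is always zero. Here are the nested if statements and the associated print statement: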

#Here are the nested if statements:

    if died_year is None:   

            if bmonth > now_month:
                    if bday > now_day:
                            true_age = age_raw - 1
                    elif bday < now_day:
                            true_age = age_raw
            elif bmonth < now_month:

                    true_age = age_raw

    elif died_year is not None:

            died_year = int(died_year)
            died_month = int(died_month)
            died_day = int(died_day)

            age_raw = died_year - byear

            if bmonth > died_month:
                    if bday > died_day:
                            true_age = age_raw - 1
                    elif bday < died_day:
                            true_age = age_raw
            elif bmonth < died_month:
                    true_age = age_raw

#And now the print statement:

print("DOB: "+str(bmonth)+"/"+str(bday)+"/"+str(byear)+" ("+str(true_age)+" years old)")

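Note that I do not convert the variables died_year, died_month, and died_day to integers until the appropriate condition is met; doing so outside the if statement would trigger an error, because a null value cannot be passed to int(). I feel like I am missing something really obvious, but maybe not. Also, if anyone has a better way to do all of this, I am always open to learning how to be more efficient.

It would be much easier to convert these values to datetime objects and then do the if/elif filtering.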

import datetime

bmonth = int(storage.iloc[0]['birthMonth'])
bday = int(storage.iloc[0]['birthDay'])
byear = int(storage.iloc[0]['birthYear'])

# assumes the death fields are present for this row
died_year = int(storage.iloc[0]['deathYear'])
died_month = int(storage.iloc[0]['deathMonth'])
died_day = int(storage.iloc[0]['deathDay'])

start = datetime.datetime(year=byear, month=bmonth, day=bday)
end = datetime.datetime(year=died_year, month=died_month, day=died_day)
(end - start).days  # number of days between birth and death
You can also take datetime.now() into account.
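
As a rough sketch of how datetime.now() could be factored in: when no death date is available, the current time stands in as the end point. The function name age_in_days and its parameters are hypothetical and not part of the answer above.

```python
from datetime import datetime


def age_in_days(birth, death=None, now=None):
    """Days from birth to death, or to `now` when death is None.

    `now` is an optional override (hypothetical parameter) so the
    result can be reproduced; it defaults to datetime.now().
    """
    if death is not None:
        end = death
    else:
        end = now if now is not None else datetime.now()
    return (end - birth).days


birth = datetime(year=1932, month=8, day=17)
death = datetime(year=1980, month=3, day=22)

print(age_in_days(birth, death))  # deceased: birth to death
print(age_in_days(birth))         # living: birth to the current moment
```

The result is a day count, so it still has to be converted to years separately.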

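Hope this helps; it should make your workflow a lot smoother.

Edited for compatibility with the minimal example. You can define a function that calculates a person's age: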

from datetime import date 


def calc_age(row):
    bm = row['bornMonth']
    bd = row['bornDay']
    by = row['bornYear']

    dm = row['diedMonth']
    dd = row['diedDay']
    dy = row['diedYear']

    birth_date = date(*[int(i) for i in (by, bm, bd)])  # assumes none of the birth fields is missing
    try:
        end_date = date(*[int(i) for i in (dy, dm, dd)])
    except (TypeError, ValueError):  # if death date is None
        end_date = date.today()

    # True (== 1) if the birthday has not yet occurred in the end year, else False (== 0)
    is_next_year = ((end_date.month, end_date.day) < (birth_date.month, birth_date.day))
    age = end_date.year - birth_date.year - is_next_year 
    return age
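If there is no missing data, df.apply(calc_age, axis=1) returns a pd.Series with the ages of all persons. You can attach it to the DataFrame: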

df['personsAge'] = df.apply(calc_age, axis=1)
Then add another column with the status and print the results:

def is_dead(row):
    dm = row['diedMonth']
    dd = row['diedDay']
    dy = row['diedYear']
    try:
        date(*[int(i) for i in (dy, dm, dd)])
        return True
    except (TypeError, ValueError):  # missing or invalid death date
        return False

df['is_dead'] = df.apply(is_dead, axis=1)

def print_status(row):
    bm = row['bornMonth']
    bd = row['bornDay']
    by = row['bornYear']

    dm = row['diedMonth']
    dd = row['diedDay']
    dy = row['diedYear']
    age = row['personsAge']

    print("DOB: "+str(bm)+"/"+str(bd)+"/"+str(by)+" ("+str(age)+" years old)")
    if row['is_dead']:
        print("*DECEASED: "+str(dm)+"/"+str(dd)+"/"+str(dy))

df.apply(print_status, axis=1)

stdout:
DOB: 8/17/1932 (47 years old)
*DECEASED: 3/22/1980
DOB: 4/12/1950 (68 years old)

If you do not like the copy-pasted date extraction, replace it with the datetime approach from the other solution.
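
The nested if statements in the question have no branch for a birth month equal to the end month. The tuple comparison in calc_age covers that case, as the small standalone check below shows using plain date objects. The helper name years_between is made up for illustration; its logic mirrors calc_age.

```python
from datetime import date


def years_between(birth_date, end_date):
    """Completed years between two dates (same logic as calc_age)."""
    # True (== 1) when the birthday has not yet occurred in end_date's year
    before_birthday = (end_date.month, end_date.day) < (birth_date.month, birth_date.day)
    return end_date.year - birth_date.year - before_birthday


born = date(1932, 8, 17)
print(years_between(born, date(1980, 3, 22)))  # 47: died before the birthday
print(years_between(born, date(1980, 8, 16)))  # 47: same month, one day early
print(years_between(born, date(1980, 8, 17)))  # 48: exactly on the birthday
```

Comparing (month, day) tuples orders the dates lexicographically, so no separate month and day branches are needed.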

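pandas has excellent support for time series, so it is a good idea to take advantage of the proper tools. After converting the columns into a single Datetime column, we can perform time arithmetic on it: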

import pandas as pd

# demo dataframe
df = pd.DataFrame({
    'birthMonth': [5, 2],
    'birthDay': [4, 24],
    'birthYear': [1924, 1997],
    'deathMonth': [3, None],
    'deathDay': [1, None],
    'deathYear': [2008, None]
})

# convert birth dates to datetimes
birth = pd.to_datetime(df[['birthMonth', 'birthDay', 'birthYear']]
                       .rename(columns={'birthMonth': 'month', 'birthDay': 'day', 'birthYear': 'year'}))
# convert death dates to datetimes
death = pd.to_datetime(df[['deathMonth', 'deathDay', 'deathYear']]
                       .rename(columns={'deathMonth':'month', 'deathDay': 'day', 'deathYear': 'year'}))

# calculate age in days, normalizing 'now' to midnight of today
age = (pd.Timestamp.now().normalize() - birth).where(death.isnull(), other=death-birth)

Edit: see the discussion with @ALollz below about normalizing the Timestamp.

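Do you think you could produce a verifiable example?

It seems you could use DataFrame.loc for this and avoid all the loops. Converting the dates to datetime would make the subtraction very simple.

I agree with ALollz, and would add that it may be convenient to have a boolean indicating whether the person is dead.

To add to what @Alessi42 said, I suggest you check that died_year is what you expect. Try printing died_year and check whether it holds the wrong value.

ValueError: invalid literal for int() with base 10: '', which is expected.

Nice solution, +1. You might consider normalizing the pd.Timestamp to midnight (00:00), since there is no time-of-day information for the deceased.

This looks like a solid solution; I will test it and see what happens.

This works very well. Using your solution, I took 'age', added the dt.days attribute to it, divided by 365.2422, and output the result as an integer. Problem solved. Thanks!

Good suggestion, I had not thought of that. Without it, the code above produces different output at different times of day.

@adrysdale another way to convert to years is age.astype('timedelta64[Y]')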
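
The day-count conversion described in the comments can be sketched with the standard library alone. The helper name days_to_years is hypothetical; with pandas, the day counts would come from age.dt.days as described above. Dividing by 365.2422 is an approximation, so very close to a birthday the result may differ by one from the calendar age. The year-unit astype conversion mentioned in the last comment may also not be supported in every pandas version.

```python
from datetime import date

DAYS_PER_YEAR = 365.2422  # average year length used in the comment above


def days_to_years(days):
    """Approximate whole years for a given number of days."""
    return int(days / DAYS_PER_YEAR)


# first row of the demo dataframe: born 1924-05-04, died 2008-03-01
age_days = (date(2008, 3, 1) - date(1924, 5, 4)).days
print(days_to_years(age_days))
```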