Pandas: writing a new column using <, >, and <= x <=

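I have almost finished writing a program that works through a CSV file built from two CSV files. I am stuck on the last column, which is supposed to label damage_done > 700000 as "High", damage_done < 300000 as "Low", and 300000 <= damage_done <= 699999 as "Medium". My attempt is a function applied row by row: if (row['damage_done'] > 700000): df3['dps_quality'] = 'High'; if (row['damage_done'] < 300000): df3['dps_quality'] = 'Low'; if (300000 <= row['damage_done'] <= 699999): df3['dps_quality'] = 'Medium'.
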
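  • damage_done should hold numeric objects (int, float), not strings.

  • The .apply method calls the quality function once per row. The values the function returns make up the Series that .apply returns, and the code assigns that Series to the dps_quality column of the DataFrame. So there is no need to use the column name inside the function.

  • With these two points in mind, a possible solution is: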

    def quality(damage_done):
        # this line assures that the value will be interpreted as an integer
        damage_done = int(damage_done)
        if damage_done > 700000:
            # now we are returning a value, instead of assigning it directly to the column
            return 'High'
        if damage_done < 300000:
            return 'Low'
        # removing the last check as it is not necessary
        return 'Medium'
    
    # we are using the .apply method only on a series. This makes the reading easier
    df3['dps_quality'] = df3['damage_done'].apply(quality)
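
To check the function above without downloading the CSV files, here is a minimal sketch on a small made-up DataFrame. The name `toy` and its values are assumptions for illustration only; one value is deliberately a string, as it might arrive from a CSV.

```python
import pandas as pd

def quality(damage_done):
    # int() also accepts numeric strings such as '120000'
    damage_done = int(damage_done)
    if damage_done > 700000:
        return 'High'
    if damage_done < 300000:
        return 'Low'
    return 'Medium'

# Toy data, invented for this example (not taken from the real files)
toy = pd.DataFrame({'damage_done': [750000, '120000', 300000, 700000]})

# Series.apply calls quality once per value and collects the return values
toy['dps_quality'] = toy['damage_done'].apply(quality)
labels = toy['dps_quality'].tolist()
print(labels)
```

Note that 700000 itself is labelled 'Medium' here, because the first test is a strict `>`.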
    
    You can try .astype('int') or the pd.to_numeric function.

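    # use .loc with a boolean mask instead of chained indexing
    df3.loc[df3['damage_done'] > 700000, 'dps_quality'] = 'High'
    df3.loc[df3['damage_done'] < 300000, 'dps_quality'] = 'Low'
    # a chained comparison does not work on a Series; .between is inclusive on both ends
    df3.loc[df3['damage_done'].between(300000, 700000), 'dps_quality'] = 'Medium'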
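
The conversion and mask-based assignment suggested here can be sketched on toy data as follows. The DataFrame `toy`, its values, and the choice to turn unparseable entries into 0 (mirroring the `fillna(0)` in the question) are assumptions for illustration, not part of the original answer.

```python
import pandas as pd

# Toy data, invented for this example: strings as they might come from a CSV
toy = pd.DataFrame({'damage_done': ['750000', '120000', '450000', 'n/a']})

# errors='coerce' turns unparseable entries into NaN, which are then set to 0
toy['damage_done'] = pd.to_numeric(toy['damage_done'], errors='coerce').fillna(0)

# Python's chained comparison calls bool() on a Series, which raises ValueError
try:
    300000 <= toy['damage_done'] <= 699999
    chained_failed = False
except ValueError:
    chained_failed = True

# Start from the middle label, then overwrite the two tails with boolean masks
toy['dps_quality'] = 'Medium'
toy.loc[toy['damage_done'] > 700000, 'dps_quality'] = 'High'
toy.loc[toy['damage_done'] < 300000, 'dps_quality'] = 'Low'
result = toy['dps_quality'].tolist()
print(result)
```

Setting the default label first avoids writing the range test at all; if an explicit range is preferred, `Series.between` or `(s >= lo) & (s <= hi)` are the usual alternatives.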
    
    import pandas as pd
    import io
    import requests as r
    
    url = 'http://drd.ba.ttu.edu/isqs6339/hw/hw2/'
    path = '/Users/jeredwilloughby/Desktop/Business Intelligence/'
    file1 = 'players.csv'
    file2 = 'player_sessions.csv'
    fileout = 'pandashw.csv'
    
    res1 = r.get(url + file1)
    res1.status_code
    df1 = pd.read_csv(io.StringIO(res1.text), delimiter='|')
    df1
    
    res2 = r.get(url + file2)
    res2.status_code
    df2 = pd.read_csv(io.StringIO(res2.text), delimiter=',')
    df2.head(5)
    df2.tail(5)
    
    df3 = df1.merge(df2, how="left", on="playerid")
    df3.describe()
    list(df3)
    df3.count()
    
    df3['damage_done'].fillna(0, inplace=True)
    df3.count()
    
    df3.to_csv(path + fileout)
    
    def performance(row):
        return (row['damage_done']*2.5 + row['healing_done']*4.5)/4
    
    df3['player_performance_metric'] = df3.apply(performance, axis = 1)
    df3
    df3.to_csv(path + fileout)
    
    def quality(row):
        if (row['damage_done'] > 700000):
            df3['dps_quality'] = 'High'
        if (row['damage_done'] < 300000):
            df3['dps_quality'] = 'Low'
        if (300000 <= row['damage_done'] <= 699999):
            df3['dps_quality'] = 'Medium'
    
    df3['dps_quality'] = df3.apply(quality, axis = 1)
    df3
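
A small sketch of why the attempt above does not produce the expected labels: a function that never returns a value implicitly returns None, so `apply` builds a Series of None. The toy DataFrame and both function names below are hypothetical stand-ins, simplified from the code above.

```python
import pandas as pd

# Toy data, invented for this example
toy = pd.DataFrame({'damage_done': [750000, 120000, 450000]})

def label_without_return(row):
    # Simplified stand-in for the attempt above: a label is chosen but never returned
    if row['damage_done'] > 700000:
        label = 'High'

def label_with_return(row):
    # Same row-wise shape, but each branch returns the label for that row
    if row['damage_done'] > 700000:
        return 'High'
    if row['damage_done'] < 300000:
        return 'Low'
    return 'Medium'

broken = toy.apply(label_without_return, axis=1)
fixed = toy.apply(label_with_return, axis=1).tolist()
print(broken.isna().all(), fixed)
```

In the original attempt, the assignments inside the function also overwrite the entire dps_quality column on every row, and the outer assignment then replaces it with the all-None result.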
    