Pandas: writing a new column using <, >, and <= x <=


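I've almost finished a program that works through a CSV file built from two other CSV files. I'm stuck on the last column, which is supposed to label damage_done > 700000 as "High", damage_done < 300000 as "Low", and 300000 <= damage_done <= 699999 as "Medium". My attempt is the quality(row) function near the end of the script below, which assigns df3['dps_quality'] inside each if branch.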

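The column damage_done should hold numeric objects (int or float), not strings.

The .apply method calls the function quality once for each row. The values the function returns make up the Series that the method returns. In the code as written, that Series is then assigned to the dps_quality column of the DataFrame, so there is no need to use the column name inside the function.

With these two points in mind, a possible solution is: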
def quality(damage_done):
    # this line assures that the value will be interpreted as an integer
    damage_done = int(damage_done)
    if damage_done > 700000:
        # now we are returning a value, instead of assigning it directly to the column
        return 'High'
    if damage_done < 300000:
        return 'Low'
    # removing the last check as it is not necessary
    return 'Medium'

# we are using the .apply method only on a series. This makes the reading easier
df3['dps_quality'] = df3['damage_done'].apply(quality)
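
To see the approach run end to end, here is a minimal sketch on a small made-up DataFrame. The sample values are hypothetical and not taken from the original CSV files; they are stored as strings on purpose, to show why the int() conversion matters.

```python
import pandas as pd

# Hypothetical sample data standing in for the merged df3
df3 = pd.DataFrame({"damage_done": ["750000", "700000", "300000", "299999", "0"]})

def quality(damage_done):
    # Convert first so string values are compared as numbers
    damage_done = int(damage_done)
    if damage_done > 700000:
        return "High"
    if damage_done < 300000:
        return "Low"
    return "Medium"

# Series.apply calls quality once per value and returns a new Series
df3["dps_quality"] = df3["damage_done"].apply(quality)
print(df3)
```

Note that a value of exactly 700000 lands in "Medium" here, since only values strictly above 700000 count as "High".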

You can try .astype('int') or the pd.to_numeric function.
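
As a rough sketch of both conversions, assuming damage_done was read in as text (the sample values below are made up): astype("int") only works when every entry is a clean integer string, while pd.to_numeric with errors="coerce" turns unparseable entries into NaN so they can be filled afterwards.

```python
import pandas as pd

# Hypothetical string column, including one entry that is not a number
raw = pd.Series(["750000", "120000", "n/a"])

# astype works when every entry is a clean integer string
clean = raw.iloc[:2].astype("int")

# to_numeric with errors="coerce" replaces unparseable entries with NaN
coerced = pd.to_numeric(raw, errors="coerce")

# Fill the gaps before comparing or casting to int
filled = coerced.fillna(0).astype("int")
print(clean.tolist(), filled.tolist())
```

Whether 0 is a sensible fill value depends on the data; it mirrors the fillna(0) call in the question's script.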
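# .loc with a boolean mask avoids chained indexing and creates the column if it is missing
df3.loc[df3['damage_done'] > 700000, 'dps_quality'] = 'High'
df3.loc[df3['damage_done'] < 300000, 'dps_quality'] = 'Low'
# between() is inclusive on both ends; a chained comparison on a Series raises ValueError
df3.loc[df3['damage_done'].between(300000, 700000), 'dps_quality'] = 'Medium'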
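
A range test on a whole column cannot be written as a Python chained comparison of the form 300000 <= s <= 700000, because that asks for the truth value of a Series and pandas raises a ValueError. The following is a minimal sketch of mask-based labeling that avoids this, using Series.between and .loc on made-up data (the sample values are assumptions, not the original data):

```python
import pandas as pd

# Hypothetical sample data standing in for the merged df3
df3 = pd.DataFrame({"damage_done": [750000, 700000, 300000, 299999]})

damage = df3["damage_done"]

# Each mask is a boolean Series; .loc assigns only where the mask is True
df3.loc[damage > 700000, "dps_quality"] = "High"
df3.loc[damage < 300000, "dps_quality"] = "Low"
# between() is inclusive on both ends by default
df3.loc[damage.between(300000, 700000), "dps_quality"] = "Medium"
print(df3)
```

Because the masks are evaluated on the whole column at once, no Python-level function is called per row.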

import pandas as pd
import io
import requests as r

url = 'http://drd.ba.ttu.edu/isqs6339/hw/hw2/'
path = '/Users/jeredwilloughby/Desktop/Business Intelligence/'
file1 = 'players.csv'
file2 = 'player_sessions.csv'
fileout = 'pandashw.csv'

res1 = r.get(url + file1)
res1.status_code
df1 = pd.read_csv(io.StringIO(res1.text), delimiter='|')
df1

res2 = r.get(url + file2)
res2.status_code
df2 = pd.read_csv(io.StringIO(res2.text), delimiter=',')
df2.head(5)
df2.tail(5)

df3 = df1.merge(df2, how="left", on="playerid")
df3.describe()
list(df3)
df3.count()

# assign the result back; inplace=True on a selected column may not update df3
df3['damage_done'] = df3['damage_done'].fillna(0)
df3.count()

df3.to_csv(path + fileout)

def performance(row):
    return (row['damage_done']*2.5 + row['healing_done']*4.5)/4

df3['player_performance_metric'] = df3.apply(performance, axis = 1)
df3
df3.to_csv(path + fileout)

def quality(row):
    if (row['damage_done'] > 700000):
        df3['dps_quality'] = 'High'
    if (row['damage_done'] < 300000):
        df3['dps_quality'] = 'Low'
    if (300000 <= row['damage_done'] <= 699999):
        df3['dps_quality'] = 'Medium'

df3['dps_quality'] = df3.apply(quality, axis = 1)
df3
