Python: warning received when updating a column with .loc[row_indexer, col_indexer] = value
I am trying to clean up some data I received. The code is as follows:
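import pandas as pd

def cleanup(df: pd.DataFrame) -> pd.DataFrame:
    # Remove entries from the IT dept
    mask = (df['dept'] != 'IT')
    df = df[mask]
    # Rename the dept from marketing to comms for the remaining rows
    mask = df['dept'] == 'marketing'
    df.loc[mask, 'dept'] = "comms"
    # The warning occurs here...
    # Rename the dept from accounting to finance for the remaining rows
    mask = df['dept'] == 'accounting'
    df.loc[mask, 'dept'] = 'finance'
    return df

data = [[1,"marketing"],[2,"accounting"],[3,"marketing"],[4,"IT"],[5,"IT"],[6,"board"]]
df = pd.DataFrame(data, columns = ['id', 'dept'])
df = cleanup(df)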
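I get the following warning:
/opt/conda/lib/python3.7/site-packages/pandas/core/indexing.py:480: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation:
  self.obj[item] = s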
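I am a bit concerned about this warning, because the returned data is correct and the warning does not seem to apply in this case.
Is there anything wrong with my code?
Or can I safely ignore the warning?

The warning says it all. Your df is a slice of another frame, produced by df = df[mask]. pandas flags the later .loc assignment because it cannot tell whether you meant to modify the original frame or only the filtered result. Try updating the original frame instead of the slice: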
import pandas as pd

def cleanup(df: pd.DataFrame) -> pd.DataFrame:
    # Rows that are not in the IT dept (nothing is dropped here)
    mask1 = (df['dept'] != 'IT')
    # Rename the dept from marketing to comms for the non-IT rows
    mask2 = df['dept'] == 'marketing'
    df.loc[mask1 & mask2, 'dept'] = "comms"
    # Rename the dept from accounting to finance for the non-IT rows
    mask2 = df['dept'] == 'accounting'
    df.loc[mask1 & mask2, 'dept'] = 'finance'
    return df

data = [[1,"marketing"],[2,"accounting"],[3,"marketing"],[4,"IT"],[5,"IT"],[6,"board"]]
df = pd.DataFrame(data, columns = ['id', 'dept'])
df = cleanup(df)
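The modified function updates the frame it was given and returns it, so the result still contains the rows from the IT dept. If you do not want those records, you can copy the filtered frame and update the copy instead: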
def cleanup(df: pd.DataFrame) -> pd.DataFrame:
    # Remove entries from the IT dept
    mask = (df['dept'] != 'IT')
    # we copy the data frame here so it's no longer a slice
    df = df[mask].copy()
    # Rename the dept from marketing to comms for the remaining rows
    mask = df['dept'] == 'marketing'
    df.loc[mask, 'dept'] = "comms"
    # Rename the dept from accounting to finance for the remaining rows
    mask = df['dept'] == 'accounting'
    df.loc[mask, 'dept'] = 'finance'
    return df
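To make the difference between the two approaches concrete, here is a minimal sketch of the copy-based variant, run against the sample data from the question. The names cleanup_copy, original and cleaned are only illustrative and do not come from the question. It shows that the IT rows are gone from the result while the frame passed in is left untouched.

```python
import pandas as pd

def cleanup_copy(df: pd.DataFrame) -> pd.DataFrame:
    # Filter out the IT rows and take an explicit copy, so that the
    # assignments below write to an independent frame.
    out = df[df['dept'] != 'IT'].copy()
    out.loc[out['dept'] == 'marketing', 'dept'] = 'comms'
    out.loc[out['dept'] == 'accounting', 'dept'] = 'finance'
    return out

data = [[1, "marketing"], [2, "accounting"], [3, "marketing"],
        [4, "IT"], [5, "IT"], [6, "board"]]
original = pd.DataFrame(data, columns=['id', 'dept'])
cleaned = cleanup_copy(original)

print(cleaned['dept'].tolist())   # ['comms', 'finance', 'comms', 'board']
print(original['dept'].tolist())  # ['marketing', 'accounting', 'marketing', 'IT', 'IT', 'board']
```

If you actually want the caller's frame to be modified in place, the mask1 & mask2 version is the one to use. If you want a cleaned result and an untouched input, the copy is the safer choice.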
Regarding df = df[mask].copy(): so you are no longer working with a slice?
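One way to check this yourself is to record the warnings raised while assigning into the copied frame. The snippet below is only a sketch: the variable names are made up for illustration, and it matches the warning by class name because, depending on your pandas version, SettingWithCopyWarning may not be emitted at all.

```python
import warnings
import pandas as pd

data = [[1, "marketing"], [2, "accounting"], [4, "IT"]]
df = pd.DataFrame(data, columns=['id', 'dept'])

with warnings.catch_warnings(record=True) as caught:
    # Record every warning instead of printing it
    warnings.simplefilter("always")
    # .copy() gives an independent frame rather than a slice of df
    subset = df[df['dept'] != 'IT'].copy()
    subset.loc[subset['dept'] == 'marketing', 'dept'] = 'comms'

# Keep only the warnings of the kind shown in the question
copy_warnings = [w for w in caught
                 if w.category.__name__ == 'SettingWithCopyWarning']

print(len(copy_warnings))       # 0
print(subset['dept'].tolist())  # ['comms', 'accounting']
```

So yes: after .copy() the assignment targets a frame of its own, and pandas has no reason to warn about it.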