Modifying a DataFrame inside a Python function
I need to append a row to a DataFrame inside a function, using the values passed in as arguments.
import pandas as pd

# Declare global DataFrame
global df
df = pd.DataFrame([['1','2','3']], columns=['x','y','z'])

def append_row(a,b,c):
    vlist = [a,b,c]
    cols = ['x','y','z']
    # using zip() to convert lists to dictionary
    res = dict(zip(cols, vlist))
    # Create pandas DataFrame for new row addition
    df = df.append(res, ignore_index=True)
    print("New row added", df.tail(1))
    return df
Expected output:
New row appended to `df`
x y z
1 2 3
a b c
When I run this code, I get:
Python 3: UnboundLocalError: local variable `df` referenced before assignment.
How would I be able to modify pandas DataFrame and add a new row by referencing a dataframe that's read outside the function?
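Additional context: the function is called from a different script, and the DataFrame is read in the same script where the function is declared.
`global df` should be inside the function: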
import pandas as pd

df = pd.DataFrame([['1','2','3']], columns=['x','y','z'])

def append_row(a,b,c):
    global df
    vlist = [a,b,c]
    cols = ['x','y','z']
    # using zip() to convert lists to dictionary
    res = dict(zip(cols, vlist))
    # Create pandas DataFrame for new row addition
    # (DataFrame.append was removed in pandas 2.0, so this needs pandas < 2.0)
    df = df.append(res, ignore_index=True)
    print("New row added", df.tail(1))
    return df

append_row(1,2,3)
If you want to insert row by row, you can simply assign the new values as a tuple:
def append_row(a, b, c):
    global df
    df.loc[df.shape[0], :] = a, b, c
    return df
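On the other hand, since you return df anyway, I don't see any reason for it to be global. You can pass the DataFrame to the function as an argument, together with a tuple of the new values: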
def append_row(df: pd.DataFrame, new_data: tuple) -> pd.DataFrame:
    df.loc[df.shape[0], :] = new_data
    return df
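A short usage sketch of the argument-passing approach may help. It is written with `df.loc[len(df)]`, a close variant of the assignment above, and assumes the DataFrame has a default integer index so that `len(df)` is the next free label. The names `frame` and `result` are illustrative only.

```python
import pandas as pd

def append_row(df: pd.DataFrame, new_data: tuple) -> pd.DataFrame:
    # len(df) is the next unused label on a default RangeIndex,
    # so this assignment enlarges the DataFrame by one row.
    df.loc[len(df)] = list(new_data)
    return df

frame = pd.DataFrame([['1', '2', '3']], columns=['x', 'y', 'z'])
result = append_row(frame, ('a', 'b', 'c'))

print(result)
# The row is added in place, so the caller's object is the one returned.
print(result is frame)
```

Because `.loc` assignment modifies the DataFrame in place, the caller's object changes even if the return value is ignored. Pass `df.copy()` if the original must stay untouched.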
Put `global` inside the function. That said, modifying global state is poor programming practice, because it makes debugging harder later on.
import pandas as pd

# Declare DataFrame
df = pd.DataFrame([['1','2','3']], columns=['x','y','z'])

def append_row(a,b,c):
    vlist = [a,b,c]
    cols = ['x','y','z']
    # using zip() to convert lists to dictionary
    res = dict(zip(cols, vlist))
    # Create pandas DataFrame for new row addition and assign to global df
    # (DataFrame.append was removed in pandas 2.0, so this needs pandas < 2.0)
    global df
    df = df.append(res, ignore_index=True)
    print("New row added", df.tail(1))
    return df

append_row('a','b','c')
df
There are two issues:
import pandas as pd

def append_row(dataframe, args):
    # Pair each column name with the matching value
    row = dict(zip(dataframe.columns.to_list(), args))
    # DataFrame.append was removed in pandas 2.0, so this needs pandas < 2.0
    return dataframe.append(row, ignore_index=True)

# usage
df = pd.DataFrame([['1','2','3']], columns=['x','y','z'])
df = append_row(df, [4,5,6])
df = append_row(df, [7, '8 as text', [9, 'in a list']])
print(df)
This solution uses `*args` and accepts multiple input arguments, as in the original code example:
import pandas as pd

def append_row(dataframe, *args):
    # Pair each column name with the matching positional argument
    row = dict(zip(dataframe.columns.to_list(), args))
    # DataFrame.append was removed in pandas 2.0, so this needs pandas < 2.0
    return dataframe.append(row, ignore_index=True)

# usage
df = pd.DataFrame([['1','2','3']], columns=['x','y','z'])
df = append_row(df, 4, 5, 6)
df = append_row(df, 7, '8 as text', [9, 'in a list'])
print(df)
Both produce the same output:
   x          y               z
0  1          2               3
1  4          5               6
2  7  8 as text  [9, in a list]
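On pandas 2.0 and later, `DataFrame.append` no longer exists, so the functions above raise `AttributeError` there. The following is a minimal sketch of the same `*args` idea using `pd.concat` instead. It is not taken from the answers above, and it assumes the number of values matches the number of columns.

```python
import pandas as pd

def append_row(dataframe: pd.DataFrame, *args) -> pd.DataFrame:
    # Build a one-row DataFrame with the same columns as the target.
    new_row = pd.DataFrame([list(args)], columns=dataframe.columns)
    # concat returns a new DataFrame; the input is left unchanged.
    return pd.concat([dataframe, new_row], ignore_index=True)

# usage
df = pd.DataFrame([['1', '2', '3']], columns=['x', 'y', 'z'])
df = append_row(df, 'a', 'b', 'c')
print(df)
```

Like the `append`-based versions, this returns a new object, so the caller has to reassign the result (`df = append_row(df, ...)`) for the change to be visible.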
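Hope this helps. Happy Pythoning :)
There is no `return` in the function. Also, what is your expected output?
If you want to modify a global variable, you need to add the `global` keyword. A better approach, though, is to pass `df` to the function as an argument.
@sammywemmy The question has been updated with a `return` statement and the expected output.
Perfect answer.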