Python: how to apply a min-max scaler separately to each category of a DataFrame
I have a DataFrame like this:
import pandas as pd
df = pd.DataFrame({
'category': ['fruits','fruits','fruits','fruits','fruits','vegetables','vegetables','vegetables','vegetables','vegetables'],
'product' : ['apple','orange','durian','coconut','grape','cabbage','carrot','spinach','grass','potato'],
'sales' : [10,20,30,40,100,10,30,50,60,100]
})
df.head(15)
Current approach: normalize a single category in df, manually:
from sklearn import preprocessing
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
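df_fruits = df[df['category'] == "fruits"].copy()  # copy to avoid SettingWithCopyWarning
df_fruits['sales'] = scaler.fit_transform(df_fruits[['sales']])
df_fruits.head()
# to_csv is a DataFrame method, not a function of the pandas module
df_fruits.to_csv('minmax/output/category-{}-minmax.csv'.format('XX'))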
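Questions: - How can I loop over all the categories in df?
- How can I then export one CSV file per category, with the category name in the file name?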
Many thanks.
Use Series.unique:
for i in df["category"].unique():
    cat = df[df['category'] == i].copy()  # copy to avoid SettingWithCopyWarning
    cat['sales'] = scaler.fit_transform(cat[['sales']])
    cat.to_csv('minmax/output/category-{}-minmax.csv'.format(i))
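One practical detail the loop above leaves open is that `to_csv` does not create missing directories, so the `minmax/output` folder has to exist first. A minimal sketch of the same loop wrapped in a function, assuming a hypothetical helper name `export_scaled_by_category` and a temporary directory as the output root (both are illustration choices, not part of the original answer):

```python
import os
import tempfile

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.DataFrame({
    'category': ['fruits'] * 5 + ['vegetables'] * 5,
    'product': ['apple', 'orange', 'durian', 'coconut', 'grape',
                'cabbage', 'carrot', 'spinach', 'grass', 'potato'],
    'sales': [10, 20, 30, 40, 100, 10, 30, 50, 60, 100],
})


def export_scaled_by_category(frame, out_dir):
    """Hypothetical helper: scale 'sales' within each category, write one CSV each."""
    os.makedirs(out_dir, exist_ok=True)  # to_csv will not create the folder itself
    paths = {}
    for name in frame['category'].unique():
        cat = frame[frame['category'] == name].copy()
        # A fresh scaler per category, so min and max come from that category only
        scaled = MinMaxScaler().fit_transform(cat[['sales']])
        cat['sales'] = scaled.ravel()  # flatten the (n, 1) array to one column
        path = os.path.join(out_dir, f'category-{name}-minmax.csv')
        cat.to_csv(path, index=False)
        paths[name] = path
    return paths


# Assumption: write under a temp directory instead of the working directory
out_dir = os.path.join(tempfile.mkdtemp(), 'minmax', 'output')
paths = export_scaled_by_category(df, out_dir)
```

Passing `index=False` is a design choice here; drop it if the original row index should be kept in the exported files.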
It looks like you need a bit of function gymnastics to make this work.
Your DataFrame:
import pandas as pd
df = pd.DataFrame({
'category': ['fruits','fruits','fruits','fruits','fruits','vegetables','vegetables','vegetables','vegetables','vegetables'],
'product' : ['apple','orange','durian','coconut','grape','cabbage','carrot','spinach','grass','potato'],
'sales' : [10,20,30,40,100,10,30,50,60,100]
})
Now apply it to the grouped DataFrame:
df['scaled_sales'] = df.groupby('category')['sales'].transform(minmax_wrapper)
Voilà.
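The definition of `minmax_wrapper` does not appear in the answer as captured here. A minimal sketch of what such a wrapper might look like, assuming its job is to reshape each group's 1D Series into the 2D array that `MinMaxScaler` expects and then flatten the result again (the body below is a guess, not the answerer's original code):

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler


def minmax_wrapper(series):
    """Hypothetical helper: min-max scale one group's values to [0, 1]."""
    # MinMaxScaler expects 2D input, so turn the group into a single column
    values = series.to_numpy(dtype=float).reshape(-1, 1)
    scaled = MinMaxScaler().fit_transform(values)
    # Flatten back to 1D and keep the original index so rows stay aligned
    return pd.Series(scaled.ravel(), index=series.index)


df = pd.DataFrame({
    'category': ['fruits'] * 5 + ['vegetables'] * 5,
    'product': ['apple', 'orange', 'durian', 'coconut', 'grape',
                'cabbage', 'carrot', 'spinach', 'grass', 'potato'],
    'sales': [10, 20, 30, 40, 100, 10, 30, 50, 60, 100],
})

# Each category is scaled against its own minimum and maximum
df['scaled_sales'] = df.groupby('category')['sales'].transform(minmax_wrapper)
```

Because a new scaler is created on every call, no fitted state leaks from one category into the next.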
You can use groupby to export each category:
# I believe this should work; I haven't tried it out
for category, grouped in df.groupby('category'):
    grouped.to_csv(f"minmax/output/category-{category}-minmax.csv")
AttributeError: module 'pandas' has no attribute 'to_csv'. Which part did I get wrong?
Sorry, a simple typo. Edited above.
Have you tried using groupby? For example: df.groupby('category')['sales'].agg(scaler.fit_transform)
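The groupby one-liner in the comment above will likely not run as written, since `fit_transform` expects 2D input and `agg` is meant for reductions to one value per group. A sketch of the same idea using only pandas, assuming the usual min-max formula (x - min) / (max - min) computed per category, with `scaled_sales` as an illustrative column name:

```python
import pandas as pd

df = pd.DataFrame({
    'category': ['fruits'] * 5 + ['vegetables'] * 5,
    'product': ['apple', 'orange', 'durian', 'coconut', 'grape',
                'cabbage', 'carrot', 'spinach', 'grass', 'potato'],
    'sales': [10, 20, 30, 40, 100, 10, 30, 50, 60, 100],
})

grouped_sales = df.groupby('category')['sales']
# transform broadcasts each group's min and max back onto its rows
group_min = grouped_sales.transform('min')
group_max = grouped_sales.transform('max')

# Note: a category whose sales are all equal would divide by zero and give NaN
df['scaled_sales'] = (df['sales'] - group_min) / (group_max - group_min)
```

This avoids the scikit-learn dependency entirely, at the cost of not having a fitted scaler object to reuse on new data.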