Python 3.x: Change column values based on the calculated mean of other columns in Python
I am new to pandas. I have gone through many questions but could not find an answer. I have the following dataset:
Name || Price || Cuisine Category || City || Region || Cuisine Types || Rating Types || Rating
Pizza || 600 || Fast Food,Pizza || Ajmer || Ana Saga || Quick Bites || Good || 3.9
... ... ... ... ... ... ... ... ...
Chawla's || 300 || Beverages || Ajmer || Sagar Lake || Cafe || Average || 3.3
Masala || 0 || North,South Indian || Ajmer || Ram Ganj || Mess || None || NEW
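I want to change the following values:
- The rating, based on the mean rating of that particular cuisine type, and then the rating type of the new entries based on the computed rating
- The price, where it is 0, based on the mean price of that particular region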
import pandas as pd
data = pd.read_csv('/content/Ajmer.csv')
Calculating the mean price per region
gregion = round(data.groupby('Region')['Price'].mean())
Trying to replace the 0s in the Price column
data['Price'] = data['Price'].replace(0, gregion[data['Region']])
But my Price column is unchanged.
I tried to change the rating:
Reading the CSV file
data2 = pd.read_csv('/content/Ajmer.csv')
Creating a separate data frame so that these rows do not affect the mean
filtered_rating = data2[(data2['Rating'] == 'NEW') | (data2['Rating'] == '-') | (data2['Rating'] == 'Opening')]
Dropping those rows from the original data2
data2.drop(data2.loc[data['Rating']=='NEW'].index, inplace=True)
data2.drop(data2.loc[data['Rating']=='-'].index, inplace=True)
data2.drop(data2.loc[data['Rating']=='Opening'].index, inplace=True)
Calculating the mean rating per cuisine type
c = round(data2.groupby('Cuisine Types')['Rating'].mean(),1)
This gives me output like the following:
Cuisine Types
Bakery 3.4
Confectionery 3.4
Dessert Parlor 3.5
...
Quick Bites 3.4
Sweet Shop 3.4
Name: Rating, dtype: float64
Trying to replace the values
filtered_rating['Rating'].replace('NEW', c[data2['Region']], inplace=True)
filtered_rating['Rating'].replace('-', c[data2['Region']], inplace=True)
filtered_rating['Rating'].replace('Opening', c[data2['Region']], inplace=True)
But my Rating column does not change.
Expected output
- The mean price of the particular region for rows where the Price column is zero
Any help is much appreciated.
You can try the following code:
# mean price per region, rounded to whole numbers
gregion = round(data.groupby('Region')['Price'].mean())
# convert the groupby result to a DataFrame with columns Region and Price
gregion = pd.DataFrame(gregion)
gregion.reset_index(inplace=True)
# merge the data, then drop the original price column (renamed Price_x by the merge)
data = data.merge(gregion, on='Region', suffixes=('_x', ''))
data = data.drop(columns=['Price_x'])
# rows whose rating is a placeholder rather than a number
filtered_rating = data[(data['Rating'] == 'NEW') | (data['Rating'] == '-') | (data['Rating'] == 'Opening')]
# you don't need to read the file again
data2 = data.copy()
data2.drop(data2.loc[data2['Rating']=='NEW'].index, inplace=True)
data2.drop(data2.loc[data2['Rating']=='-'].index, inplace=True)
data2.drop(data2.loc[data2['Rating']=='Opening'].index, inplace=True)
# the placeholders made pandas read Rating as text, so convert it before averaging
data2['Rating'] = pd.to_numeric(data2['Rating'], errors='coerce')
# do the same with c
c = round(data2.groupby('Cuisine Types')['Rating'].mean(),1)
c = pd.DataFrame(c)
c.reset_index(inplace=True)
# attach the mean rating of each cuisine type and drop the placeholder column (Rating_x)
filtered_rating = filtered_rating.merge(c, on='Cuisine Types', how='left', suffixes=('_x', ''))
filtered_rating = filtered_rating.drop(columns=['Rating_x'])
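Note that the merge on Region above assigns the regional mean to every row, not only to rows whose price is 0, and that mean still includes the zeros. If only the zeros and the placeholder ratings should change, one possible alternative is sketched below. This is not the code from this answer: it swaps in groupby(...).transform('mean') together with fillna, and the sample rows are made up purely for illustration.

```python
import pandas as pd

# Hypothetical sample rows; the real data would come from Ajmer.csv
data = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D', 'E'],
    'Price': [600, 300, 0, 200, 0],
    'Region': ['Ana Saga', 'Sagar Lake', 'Ana Saga', 'Sagar Lake', 'Sagar Lake'],
    'Cuisine Types': ['Quick Bites', 'Cafe', 'Quick Bites', 'Cafe', 'Cafe'],
    'Rating': ['3.9', '3.3', 'NEW', '3.7', '-'],
})

# Turn 0 into NaN so the zeros do not pull the regional mean down
price = data['Price'].mask(data['Price'] == 0)
# Mean of the non-missing prices in each row's region, aligned to the rows
region_mean = price.groupby(data['Region']).transform('mean')
data['Price'] = price.fillna(region_mean).round()

# Placeholders such as NEW, - and Opening become NaN
rating = pd.to_numeric(data['Rating'], errors='coerce')
# Mean of the numeric ratings in each row's cuisine type
cuisine_mean = rating.groupby(data['Cuisine Types']).transform('mean')
data['Rating'] = rating.fillna(cuisine_mean).round(1)

print(data)
```

Because transform returns a result aligned to the original rows, no merge or column drop is needed, and rows that already have a price or a numeric rating keep their values.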
Hope this helps.
Suppose you have data like the following:
data
    name           region  price  cuisine_type   rating_type  rating
0   pizza          NY      500    fast food      average      3.3
1   burger         NY      350    fast food      good         4.1
2   lobster        LA      1500   seafood        good         4.5
3   mussels        LA      1000   seafood        average      3.9
4   shawarma       NY      300    mediterranean  average      3.4
5   kabab          LA      600    mediterranean  good         4
6   pancake        NY      250    breakfast      average      3.7
7   waffle         LA      450    breakfast      good         4.2
8   fries          NY      0      fast food      None         NEW
9   crab           LA      0      seafood        None         Opening
10  tuna sandwich  NY      0      seafood        None         NEW
11  onion rings    LA      0      fast food      None         Opening
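Now, as per your question, we need to replace the NEW or Opening ratings with the mean rating of the corresponding cuisine type, and the price, where it is 0, with the mean price of the corresponding region. Finally, we update the rating_type values that are None.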
#get a list of cuisine types
cuisine_type_list=data.cuisine_type.unique().tolist()
cuisine_type_list
['fast food', 'seafood', 'mediterranean', 'breakfast']
#get a list of regions
region_list=data.region.unique().tolist()
region_list
['NY', 'LA']
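The code that performs the replacement itself is not shown here, so the following is only a sketch of how the two lists could be used, not the answer's own code. It rebuilds the sample data, loops over cuisine_type_list and region_list, and fills in the missing values. The rule that a rating of 4.0 or above counts as "good" is an assumption inferred from the sample rows, not something stated in the answer.

```python
import pandas as pd

# The sample data from this answer
data = pd.DataFrame({
    'name': ['pizza', 'burger', 'lobster', 'mussels', 'shawarma', 'kabab',
             'pancake', 'waffle', 'fries', 'crab', 'tuna sandwich', 'onion rings'],
    'region': ['NY', 'NY', 'LA', 'LA', 'NY', 'LA', 'NY', 'LA', 'NY', 'LA', 'NY', 'LA'],
    'price': [500, 350, 1500, 1000, 300, 600, 250, 450, 0, 0, 0, 0],
    'cuisine_type': ['fast food', 'fast food', 'seafood', 'seafood', 'mediterranean',
                     'mediterranean', 'breakfast', 'breakfast', 'fast food', 'seafood',
                     'seafood', 'fast food'],
    'rating_type': ['average', 'good', 'good', 'average', 'average', 'good',
                    'average', 'good', None, None, None, None],
    'rating': ['3.3', '4.1', '4.5', '3.9', '3.4', '4', '3.7', '4.2',
               'NEW', 'Opening', 'NEW', 'Opening'],
})
cuisine_type_list = data.cuisine_type.unique().tolist()
region_list = data.region.unique().tolist()

# NEW and Opening become NaN, so mean() skips them
data['rating'] = pd.to_numeric(data['rating'], errors='coerce')
# Regional means can be fractional, so store prices as floats
data['price'] = data['price'].astype(float)

# Fill missing ratings with the mean rating of the same cuisine type
for cuisine in cuisine_type_list:
    in_cuisine = data.cuisine_type == cuisine
    mean_rating = round(data.loc[in_cuisine, 'rating'].mean(), 1)
    data.loc[in_cuisine & data.rating.isna(), 'rating'] = mean_rating

# Replace zero prices with the mean of the non-zero prices in the same region
for region in region_list:
    in_region = data.region == region
    mean_price = data.loc[in_region & (data.price > 0), 'price'].mean()
    data.loc[in_region & (data.price == 0), 'price'] = mean_price

# Assumed rule (inferred from the sample rows only): 4.0 and above is "good"
missing_type = data.rating_type.isna()
data.loc[missing_type, 'rating_type'] = data.loc[missing_type, 'rating'].apply(
    lambda r: 'good' if r >= 4 else 'average')

print(data)
```

On the sample rows this gives prices of 350 for NY and 887.5 for LA, and ratings of 3.7 for fast food and 4.2 for seafood.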
Here is the updated data:
data
    name           region  price  cuisine_type   rating_type  rating
0   pizza          NY      500    fast food      average      3.3
1   burger         NY      350    fast food      good         4.1
2   lobster        LA      1500   seafood        good         4.5
3   mussels        LA      1000   seafood        average      3.9
4   shawarma       NY      300    mediterranean  average      3.4
5   kabab          LA      600    mediterranean  good         4
6   pancake        NY      250    breakfast      average      3.7
7   waffle         LA      450    breakfast      good         4.2
8   fries          NY      350    fast food      average      3.7
9   crab           LA      887.5  seafood        good         4.2
10  tuna sandwich  NY      350    seafood        good         4.2
11  onion rings    LA      887.5  fast food      average      3.7
Thanks! This is a brilliant approach.