Python: how to find the minimum of the division of two columns in a dataframe
I want to find the minimum of the division of two columns, but only for rows whose value in a third column appears in a list. My dataframe is:
ID size price
0 1 5 300
1 2 10 500
2 3 20 600
3 4 35 800
4 5 65 900
5 6 70 1000
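I only want the lowest price/size among the IDs that appear in this list:
ids_wanted = [1,4,6]
I wrote the code below and it works, but creating new dataframes for this task feels expensive and unnecessary.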
import numpy as np
import pandas as pd
index = [0,1,2,3,4,5]
i = pd.Series([1,2,3,4,5,6], index=index)
s = pd.Series([5,10,20,35,65,70],index= index)
p = pd.Series([300,500,600,800,900,1000],index= index)
df = pd.DataFrame(np.c_[i,s,p],columns = ["ID","size","price"])
print("original df:\n",df,"\n")
ids_wanted = [1,4,6]
df_with_ids_wanted = df.loc[df['ID'].isin(ids_wanted)]
print("df with ids wanted:\n",df_with_ids_wanted,"\n")
price_per_byte = df_with_ids_wanted['price'] / df_with_ids_wanted['size']
df_with_ids_wanted_ppb = df_with_ids_wanted.assign(pricePerByte=price_per_byte)
print("df with ids wanted and price/size column:\n",df_with_ids_wanted_ppb,"\n")
min_ppb = df_with_ids_wanted_ppb['pricePerByte'].min()
print("min price per byte:",min_ppb)
Output:
original df:
ID size price
0 1 5 300
1 2 10 500
2 3 20 600
3 4 35 800
4 5 65 900
5 6 70 1000
df with ids wanted:
ID size price
0 1 5 300
3 4 35 800
5 6 70 1000
df with ids wanted and price/size column:
ID size price pricePerByte
0 1 5 300 60.000000
3 4 35 800 22.857143
5 6 70 1000 14.285714
min price per byte: 14.285714285714286
If you want something concise, you can try the following:
i = range(1,7)
s = [5,10,20,35,65,70]
p = [300,500,600,800,900,1000]
df = pd.DataFrame({"ID":i,"size":s,"price":p})
df
Output:
ID size price
0 1 5 300
1 2 10 500
2 3 20 600
3 4 35 800
4 5 65 900
5 6 70 1000
The next line would then look like this:
id_chosen = [1,4,6]
(df[df.ID.isin(id_chosen)]["price"]/df[df.ID.isin(id_chosen)]["size"]).min()
Output:
14.285714285714286
This way, you don't have to create a new dataframe.
Hope this helps.
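As a possible refinement of the one-liner above, the `isin` filter can be evaluated once and reused for both columns. This is only a sketch: the names `mask`, `ratio`, and `min_ratio` are made up for illustration, and the frame is rebuilt here so the snippet runs on its own.

```python
import pandas as pd

# Same example data as in the answer above
df = pd.DataFrame({
    "ID": range(1, 7),
    "size": [5, 10, 20, 35, 65, 70],
    "price": [300, 500, 600, 800, 900, 1000],
})
id_chosen = [1, 4, 6]

# Evaluate the membership test once instead of twice
mask = df["ID"].isin(id_chosen)

# Divide only the selected rows; the result is a Series, not a new DataFrame
ratio = df.loc[mask, "price"] / df.loc[mask, "size"]
min_ratio = ratio.min()
print(min_ratio)
```

The result is the same as the one-liner; the only difference is that the boolean mask is computed a single time.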
I would do it like this:
import pandas as pd

# Build the example frame; the name 'data' avoids shadowing the built-in dict
data = {'id': [1, 2, 3, 4, 5, 6],
        'size': [5, 10, 20, 35, 65, 70],
        'price': [300, 500, 600, 800, 900, 1000]
        }
df = pd.DataFrame(data)
# Price per unit of size for every row
df['price/byte'] = df['price'] / df['size']
ids_wanted = [1, 4, 6]
# Keep only the rows whose id is in the list
subset = df[df['id'].isin(ids_wanted)]
# Sort ascending so the smallest ratio comes first
sorted_values = subset.sort_values(by='price/byte', ascending=True)
print(sorted_values['price/byte'].iloc[0])
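If the row that produces the minimum is also of interest, sorting is not strictly required. The sketch below assumes the same column names as the answer above (`id`, `size`, `price`) and uses `Series.idxmin` to get the index label of the smallest ratio. The names `best_label` and `best_row` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "size": [5, 10, 20, 35, 65, 70],
    "price": [300, 500, 600, 800, 900, 1000],
})
ids_wanted = [1, 4, 6]

# Restrict to the wanted ids, then compute the ratio on that subset only
subset = df[df["id"].isin(ids_wanted)]
ratio = subset["price"] / subset["size"]

# idxmin returns the index label of the smallest ratio
best_label = ratio.idxmin()
best_row = df.loc[best_label]
print(best_row["id"], ratio.loc[best_label])
```

This does not add a column to `df`, and it finds the minimum in a single pass rather than sorting the whole subset.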
I recommend avoiding for loops in pandas/NumPy whenever possible. Faster approaches abound.
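To illustrate that advice, here is a rough sketch that puts a row-by-row loop next to a vectorized version of the same computation. Both function names are hypothetical and the data is the example frame from the question. No timings are claimed here; on larger frames the vectorized form is usually the faster one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "size": [5, 10, 20, 35, 65, 70],
    "price": [300, 500, 600, 800, 900, 1000],
})
ids_wanted = [1, 4, 6]


def min_ratio_loop(frame, ids):
    """Row-by-row version: the pattern the advice above says to avoid."""
    best = float("inf")
    for _, row in frame.iterrows():
        if row["ID"] in ids:
            best = min(best, row["price"] / row["size"])
    return best


def min_ratio_vectorized(frame, ids):
    """Vectorized version: one boolean mask, one array division."""
    mask = np.isin(frame["ID"].to_numpy(), ids)
    ratios = frame["price"].to_numpy()[mask] / frame["size"].to_numpy()[mask]
    return ratios.min()


loop_result = min_ratio_loop(df, ids_wanted)
vectorized_result = min_ratio_vectorized(df, ids_wanted)
print(loop_result, vectorized_result)
```

Both functions return the same value for this data. Note that the vectorized version as written assumes at least one ID matches, since taking the minimum of an empty array raises an error.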