Python: how to loop over a DataFrame and build a list

Tags: python, function, loops, class, dictionary

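I have the data below. I want to loop over the DataFrame, apply some functions, and finally save the results of those functions in a list. I can't get the list built correctly: I only get one value in the list instead of the two values I expect. If anyone knows a more efficient way to solve this, please share.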


import numpy as np
import pandas as pd

dicts = {'PassengerId' : [0.0, 0.001, 0.002, 0.003, 0.004, 0.006, 0.007, 0.008, 0.009, 0.01],
         'Survived' : [0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0],
         'Pclass' : [1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.5],
         'Age' : [0.271, 0.472, 0.321, 0.435, 0.435, np.nan, 0.673, 0.02, 0.334, 0.171],
         'SibSp' : [0.125, 0.125, 0.0, 0.125, 0.0, 0.0, 0.0, 0.375, 0.0, 0.125],
         'Parch' : [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.167, 0.333, 0.0],
         'Fare' : [0.014, 0.139, 0.015, 0.104, 0.016, 0.017, 0.101, 0.041, 0.022, 0.059]}

dicts = pd.DataFrame(dicts, columns = dicts.keys())

def Mean():
    list_mean = []
    list_all = []
    for i, row in dicts.iterrows():
        if (row['Age'] > 0.2) & (row['Fare'] < 0.1):
            list_all.append(row['PassengerId'])
        elif (row['Age'] > 0.2) & (row['Fare'] > 0.1):
            list_all.clear()
            list_all.append(row['PassengerId'])
    return list_mean.append(np.mean(list_all))

Mean()



Please help.

To solve this, you need to make a few changes to your solution. For a vectorized answer, see the code section of my answer.

1. The return statement `return list_mean` should be placed in the function block, not in the if block.

Change:

. . .
    if (row['Age'] > self.age) & (row['Fare'] < self.fare):
        list_mean.append(row['PassengerId'])
        return list_mean
. . .

to:

. . .
list_mean = []
for i, row in dicts.iterrows():
    if (row['Age'] > self.age) & (row['Fare'] < self.fare):
        list_mean.append(row['PassengerId'])
return list_mean
. . .
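The effect of the return placement can be seen in a small self-contained sketch. The function names `first_match_only` and `all_matches`, the thresholds passed as plain arguments (instead of `self.age` and `self.fare`), and the three-row frame are assumptions made for illustration; they are not part of the original code.

```python
import pandas as pd

# Hypothetical three-row frame; every row satisfies the condition below.
df = pd.DataFrame({
    "PassengerId": [0.0, 0.002, 0.004],
    "Age": [0.271, 0.321, 0.435],
    "Fare": [0.014, 0.015, 0.016],
})

def first_match_only(frame, age, fare):
    """Return inside the if block: the loop ends at the first matching row."""
    matches = []
    for _, row in frame.iterrows():
        if row["Age"] > age and row["Fare"] < fare:
            matches.append(row["PassengerId"])
            return matches
    return matches

def all_matches(frame, age, fare):
    """Return after the loop: every matching row is collected."""
    matches = []
    for _, row in frame.iterrows():
        if row["Age"] > age and row["Fare"] < fare:
            matches.append(row["PassengerId"])
    return matches

print(len(first_match_only(df, 0.2, 0.1)))  # 1
print(len(all_matches(df, 0.2, 0.1)))       # 3
```

With the return inside the if block the list never grows past one element, which matches the symptom described in the question.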

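If I understand the question correctly, you only get one item in the list because you return as soon as the if condition is satisfied for the first value in the DataFrame. I believe you should return the final value, that is, return once the for loop has finished.

@SomuSinhhaa Thanks for your answer, but I was able to solve that problem. I now have a new challenge; could you take a look at it for me? I have modified the code.

Sorry, I still see your old code, where you are trying to return inside the if block. You should return as explained in one of the answers, that is, only after all the list elements have been stored in list_mean, once the for loop has finished. Also, if you have a different question, I suggest opening a new thread.

@SomuSinhhaa I have edited it, you can check it now.

Could you elaborate on this line? It is not clear: "I only get one value in the list instead of the two values I expect." I am guessing, but not sure, that you want to append to the list if either of your conditions matches. In that case you have to combine the two conditions with a logical or instead of using elif.

Thank you very much @Somusinhai. I have edited the question; can you help me find the most efficient way to solve it?

@deeplearningEngineer Please check the solution and let me know if there are any issues.

Yes, it looks okay. However, I was hoping there might be a way to do it with a loop? The code is part of a larger codebase, and a loop would be more efficient than using large chunks of code in explorore X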
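The suggestion in the comments to combine the two conditions with a logical or, so that a row is appended when either one matches, could look roughly like the sketch below. The function name `collect_either` and the four-row frame are assumptions made for illustration; the thresholds are the ones from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical four-row frame with one NaN age and one age below the threshold.
df = pd.DataFrame({
    "PassengerId": [0.0, 0.001, 0.002, 0.008],
    "Age": [0.271, 0.472, np.nan, 0.02],
    "Fare": [0.014, 0.139, 0.015, 0.041],
})

def collect_either(frame):
    """Collect PassengerId for rows matching either condition into one list."""
    matched = []
    for _, row in frame.iterrows():
        low_fare = row["Age"] > 0.2 and row["Fare"] < 0.1
        high_fare = row["Age"] > 0.2 and row["Fare"] > 0.1
        if low_fare or high_fare:
            matched.append(row["PassengerId"])
    return matched

result = collect_either(df)
print(len(result))  # 2
```

Rows with a NaN age are skipped because any comparison with NaN evaluates to False. This variant produces a single combined list, so it only fits if one list is actually what is wanted; the if/elif form is needed to keep the two groups apart.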
import pandas as pd
import numpy as np

# Named `data` rather than `dict` to avoid shadowing the built-in type.
data = {'PassengerId' : [0.0, 0.001, 0.002, 0.003, 0.004, 0.006, 0.007, 0.008, 0.009, 0.01],
        'Survived' : [0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0],
        'Pclass' : [1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.5],
        'Age' : [0.271, 0.472, 0.321, 0.435, 0.435, np.nan, 0.673, 0.02, 0.334, 0.171],
        'SibSp' : [0.125, 0.125, 0.0, 0.125, 0.0, 0.0, 0.0, 0.375, 0.0, 0.125],
        'Parch' : [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.167, 0.333, 0.0],
        'Fare' : [0.014, 0.139, 0.015, 0.104, 0.016, 0.017, 0.101, 0.041, 0.022, 0.059]}

df = pd.DataFrame(data, columns = data.keys())

def calculate_mean():
    # l1: Age > 0.2 and Fare < 0.1; l2: Age > 0.2 and Fare > 0.1
    l1, l2 = [], []
    for i, row in df.iterrows():
        if row['Age'] > 0.2 and row['Fare'] < 0.1:
            l1.append(row['PassengerId'])
        elif row['Age'] > 0.2 and row['Fare'] > 0.1:
            l2.append(row['PassengerId'])
    # Return only after the loop has visited every row.
    return np.mean(l1), np.mean(l2)


print(calculate_mean()) # (0.00375, 0.0036666666666666666)
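For comparison with the loop above, the same two means can also be computed without `iterrows()` by using boolean masks. This is only a sketch: the names `older`, `low_fare_mean` and `high_fare_mean` are assumptions, and only the three columns that the conditions use are reproduced from the data above.

```python
import numpy as np
import pandas as pd

# Only the columns the two conditions need, copied from the data above.
data = {
    "PassengerId": [0.0, 0.001, 0.002, 0.003, 0.004, 0.006, 0.007, 0.008, 0.009, 0.01],
    "Age": [0.271, 0.472, 0.321, 0.435, 0.435, np.nan, 0.673, 0.02, 0.334, 0.171],
    "Fare": [0.014, 0.139, 0.015, 0.104, 0.016, 0.017, 0.101, 0.041, 0.022, 0.059],
}
df = pd.DataFrame(data)

# Boolean mask shared by both conditions; NaN ages compare as False.
older = df["Age"] > 0.2

# Select the matching PassengerId values with .loc and average them.
low_fare_mean = df.loc[older & (df["Fare"] < 0.1), "PassengerId"].mean()
high_fare_mean = df.loc[older & (df["Fare"] > 0.1), "PassengerId"].mean()

print(low_fare_mean, high_fare_mean)
```

The masks are evaluated on whole columns at once, which is usually faster than iterating row by row on large frames. The element-wise `&` operator is required here; the Python keyword `and` does not work on a Series.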