Python: how to filter multiple DataFrames in a loop?
I have a lot of DataFrames and I want to apply the same filter to all of them without copy-pasting the filter condition each time. This is my current code:
df_list_2019 = [df_spain_2019, df_amsterdam_2019, df_venice_2019, df_sicily_2019]
for data in df_list_2019:
    data = data[['host_since','host_response_time','host_response_rate',
                 'host_acceptance_rate','host_is_superhost','host_total_listings_count',
                 'host_has_profile_pic','host_identity_verified',
                 'neighbourhood','neighbourhood_cleansed','zipcode','latitude','longitude','property_type','room_type',
                 'accommodates','bathrooms','bedrooms','beds','amenities','price','weekly_price',
                 'monthly_price','cleaning_fee','guests_included','extra_people','minimum_nights','maximum_nights',
                 'minimum_nights_avg_ntm','has_availability','availability_30','availability_60','availability_90',
                 'availability_365','number_of_reviews','number_of_reviews_ltm','review_scores_rating',
                 'review_scores_checkin','review_scores_communication','review_scores_location', 'review_scores_value',
                 'instant_bookable','is_business_travel_ready','cancellation_policy','reviews_per_month'
                 ]]
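But it doesn't apply the filter to the DataFrames. How can I change the code to do that?
Thanks.
The filter (a column selection) is in fact applied to each DataFrame; the result is simply thrown away, because the assignment only rebinds the name data to the new object.
You need to store the result somewhere, for example in a list: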
cols = ['host_since','host_response_time', ...]
filtered = [df[cols] for df in df_list_2019]
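A minimal, self-contained sketch of the difference, assuming two tiny made-up frames in place of the real data (the names df_a and df_b, their contents, and the shortened column list are hypothetical and only serve the illustration):

```python
import pandas as pd

# Hypothetical stand-ins for the real DataFrames (made-up data).
df_a = pd.DataFrame({"price": [100, 120], "beds": [1, 2], "notes": ["x", "y"]})
df_b = pd.DataFrame({"price": [80], "beds": [3], "notes": ["z"]})
df_list = [df_a, df_b]

cols = ["price", "beds"]  # shortened column list, for illustration only

# Rebinding the loop variable: the selection is computed, then discarded.
for data in df_list:
    data = data[cols]
unchanged_columns = [list(df.columns) for df in df_list]

# Storing each selection in a new list keeps the results.
filtered = [df[cols] for df in df_list]
filtered_columns = [list(df.columns) for df in filtered]

print(unchanged_columns)
print(filtered_columns)
```

If the individual names are still needed afterwards, the list can be unpacked again, for example df_a, df_b = filtered.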
Whenever you write var = new_value, you do not change the original object; you only make the variable refer to a new object.
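The same behaviour can be seen without pandas. The following sketch uses a plain list (all names are made up for the illustration) to contrast rebinding a name with mutating the object it refers to:

```python
# Rebinding vs. mutating, shown with a plain list.
original = [1, 2, 3]
alias = original            # both names refer to the same list object

alias = [0]                 # rebinding: only the name `alias` changes
after_rebinding = list(original)

alias = original            # point `alias` at the shared object again
alias.append(4)             # mutation: the shared object itself changes
after_mutation = list(original)

print(after_rebinding)
print(after_mutation)
```

Only the second operation is visible through original, which is why the loop in the question leaves the DataFrames in the list untouched.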
If you want to change the DataFrames in df_list_2019 themselves, you have to use a method that supports inplace=True. Here you could use drop:
keep = set(['host_since','host_response_time','host_response_rate',
            'host_acceptance_rate','host_is_superhost','host_total_listings_count',
            'host_has_profile_pic','host_identity_verified',
            'neighbourhood','neighbourhood_cleansed','zipcode','latitude','longitude','property_type','room_type',
            'accommodates','bathrooms','bedrooms','beds','amenities','price','weekly_price',
            'monthly_price','cleaning_fee','guests_included','extra_people','minimum_nights','maximum_nights',
            'minimum_nights_avg_ntm','has_availability','availability_30','availability_60','availability_90',
            'availability_365','number_of_reviews','number_of_reviews_ltm','review_scores_rating',
            'review_scores_checkin','review_scores_communication','review_scores_location', 'review_scores_value',
            'instant_bookable','is_business_travel_ready','cancellation_policy','reviews_per_month'
            ])
for data in df_list_2019:
    data.drop(columns=[col for col in data.columns if col not in keep], inplace=True)
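As a rough check that the in-place variant really changes the original objects, here is a small sketch on made-up data (df_a, df_b and the two-column keep set are hypothetical stand-ins, not the real frames):

```python
import pandas as pd

# Hypothetical stand-ins for the real DataFrames (made-up data).
df_a = pd.DataFrame({"price": [100, 120], "beds": [1, 2], "notes": ["x", "y"]})
df_b = pd.DataFrame({"price": [80], "beds": [3], "notes": ["z"]})
df_list = [df_a, df_b]

keep = {"price", "beds"}  # shortened keep set, for illustration only

for data in df_list:
    # Columns that are not in `keep` are removed from the existing object.
    extra = [col for col in data.columns if col not in keep]
    data.drop(columns=extra, inplace=True)

print(list(df_a.columns))
print(list(df_b.columns))
```

Because drop(..., inplace=True) mutates each frame instead of creating a new one, the change is visible through the original names as well as through the list.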
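But beware: pandas experts recommend preferring the df = df.… idiom over df.…(…, inplace=True), because it allows operations to be chained. So you should ask yourself whether the non-in-place approach of storing the results in a new list could be used instead. Either way, this approach should meet your requirements.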
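To illustrate what chaining looks like with the reassignment idiom, here is a short sketch on made-up data (the frame and the particular steps are assumptions chosen only to show the pattern):

```python
import pandas as pd

# Hypothetical frame with one missing price (made-up data).
df = pd.DataFrame({"price": [100, None, 80], "beds": [1, 2, 3], "notes": ["x", "y", "z"]})

# Each method returns a new DataFrame, so the calls can be chained
# and the final result assigned back to the same name.
df = (
    df.drop(columns=["notes"])
      .dropna(subset=["price"])
      .reset_index(drop=True)
)

print(df)
```

With inplace=True each of these calls returns None, so the steps would have to be written as separate statements.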