Python: How do I remove rows containing None from a pandas dict column?

python pandas list dictionary

My DataFrame looks like this:

df
    time                 home_team     away_team           full_time_result                   both_teams_to_score        double_chance                         League
--  -------------------  ------------  ------------------  ---------------------------------  -------------------------  ------------------------------------  ----------------
 0  2021-01-08 19:45:00  Charlton      Accrington Stanley  {'1': 2370, 'X': 3400, '2': 3000}  {'yes': 1900, 'no': 1900}  {'1X': 1360, '12': 1300, '2X': 1530}  England League 1
 1  2021-01-09 12:30:00  Lincoln City  Peterborough        {'1': 2290, 'X': 3400, '2': 3100}  {'yes': 1800, 'no': 1950}  {'1X': 1360, '12': 1300, '2X': 1570}  England League 1
 2  2021-01-09 13:00:00  Gillingham    Burton Albion       {'1': 2200, 'X': 3400, '2': 3300}  {'yes': 1700, 'no': 2040}  {'1X': 1330, '12': 1300, '2X': 1610}  England League 1
 3  2021-01-09 17:30:00  Ipswich       Swindon             {'1': None, 'X': None, '2': None}  {'yes': 1750, 'no': 2000}  {'1X': 1220, '12': 1250, '2X': 1900}  England League 1

How do I delete the rows that contain None? In this example, in the full_time_result column, I want to delete the row containing {'1': None, 'X': None, '2': None}.

Thanks.

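You can create a boolean mask that filters out the full_time_result values with None in '1' and '2'. To extract those values we can use itemgetter, and then check them for equality, i.e. check whether the extracted pair is (None, None).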
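As a minimal sketch of that mask, assuming a two-row frame that mirrors the sample above (the frame built here and the names m and result are illustrative only):

```python
from operator import itemgetter

import pandas as pd

# Small stand-in for the question's frame (illustrative data only)
df = pd.DataFrame({
    "home_team": ["Charlton", "Ipswich"],
    "full_time_result": [
        {"1": 2370, "X": 3400, "2": 3000},
        {"1": None, "X": None, "2": None},
    ],
})

# Pull the '1' and '2' values out of each dict as a tuple,
# then mark the rows where that tuple equals (None, None)
m = df["full_time_result"].map(itemgetter("1", "2")).map((None, None).__eq__)

# Keep only the rows where the mask is False
result = df[~m]
print(result["home_team"].tolist())
```

Only the '1' and '2' keys are inspected here, so a row is dropped only when both of them are None.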

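Details

Timeit results
≈ 2x faster than using apply with lambda x:.

With lambda x: you are going through every row of the specified column. From there, you can perform normal Python operations such as any(), access the values of each row's dictionary with .values(), and check whether any value equals None. This returns True for those rows, so we use ~ to filter those True results out: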
df[~df['full_time_result'].apply(lambda x: any([True for v in x.values() if v == None]))]
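A variant of the same idea can be sketched with a generator and an identity check (is None) instead of building a list with == None. Note that any() flags a row when at least one value is None, whereas a comparison against (None, None) only flags rows where both '1' and '2' are None. The frame, the helper has_none, and the partially filled third row are hypothetical and only there to show the difference:

```python
import pandas as pd

df = pd.DataFrame({
    "home_team": ["Charlton", "Ipswich", "Example FC"],
    "full_time_result": [
        {"1": 2370, "X": 3400, "2": 3000},
        {"1": None, "X": None, "2": None},
        {"1": 2100, "X": None, "2": 3500},  # hypothetical partially filled row
    ],
})


def has_none(d):
    """Return True if at least one value in the dict is None."""
    return any(v is None for v in d.values())


# True for every row whose dict holds at least one None
mask = df["full_time_result"].apply(has_none)

# Invert the mask to keep only fully populated rows
cleaned = df[~mask]
print(cleaned["home_team"].tolist())
```

Whether the partially filled row should be dropped depends on the data; if it should be kept, the (None, None) comparison is the closer fit.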

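When I print(df[m]) I get the row with the values {'1': None, 'X': None, '2': None}, but I can't delete it.

df[~m], my bad @PyNoob.

As an FYI, since you have already expanded the dict columns as in your previous approach, your best option is to use df_normalized = df_normalized.dropna() after normalizing the columns. That will be much faster than the solutions provided.

That is exactly what I did while waiting for your solution. However, I wanted a more robust way to handle this in code, so I went with @david erickson's solution.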
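The dropna() suggestion from the comments could look roughly like the sketch below. It assumes the dict column is expanded with pd.json_normalize, which is one possible way to normalize it; the thread does not show how the columns were actually expanded, and the names expanded, kept, and df_clean are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "home_team": ["Charlton", "Ipswich"],
    "full_time_result": [
        {"1": 2370, "X": 3400, "2": 3000},
        {"1": None, "X": None, "2": None},
    ],
})

# Expand the dict column into one column per key ('1', 'X', '2')
expanded = pd.json_normalize(df["full_time_result"].tolist())
expanded.index = df.index  # keep row labels aligned with df

# dropna() removes every row with a missing value in any expanded column
kept = expanded.dropna()

# Select the surviving rows from the original frame
df_clean = df.loc[kept.index]
print(df_clean["home_team"].tolist())
```

Like the any() approach, this drops a row as soon as one of its values is missing.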
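from operator import itemgetter

col = df['full_time_result']
# True where both '1' and '2' are None
col.map(itemgetter('1', '2')).map((None, None).__eq__)
# All of this can be written using lambda in a single line.
col.map(lambda x: itemgetter('1', '2')(x).__eq__((None, None)))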

example_dict = {'1': 10, '2': 20}
itemgetter('1', '2')(example_dict)
# (10, 20)

# Since you want to identify values that are `None`, we can leverage __eq__
itemgetter('1', '2')(example_dict).__eq__((10, 20))
# True # equivalent to (10, 20) == (10, 20)

# Benchmarking setup
import pandas as pd
s = pd.Series([{'1':10, '2':20}, {'1':None, '2':None}, {'1':1, '2':2}])
df = s.repeat(1_000_000).to_frame('full_time_result')
df.shape
# (3000000, 1) # 3 million rows, 1 column


# @david's
In [33]: %timeit df[~df['full_time_result'].apply(lambda x: any([True for v in x.values() if v == None]))]
1.59 s ± 82.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# @Ch3steR's
In [34]: %%timeit
    ...: m = df['full_time_result'].map(itemgetter('1', '2')).map((None, None).__eq__)
    ...: df[~m]
    ...:
    ...:
834 ms ± 16.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
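To rerun a comparison like this outside IPython, a rough sketch with the standard timeit module follows. The row count is deliberately smaller than the 3 million rows used above, the function names are made up for this example, and the timings will vary by machine and pandas version, so no figures are claimed here:

```python
from operator import itemgetter
from timeit import timeit

import pandas as pd

s = pd.Series([{"1": 10, "2": 20}, {"1": None, "2": None}, {"1": 1, "2": 2}])
df = s.repeat(10_000).to_frame("full_time_result")  # 30,000 rows


def with_apply():
    # apply + lambda approach, as benchmarked above
    col = df["full_time_result"]
    return df[~col.apply(lambda x: any([True for v in x.values() if v == None]))]


def with_itemgetter():
    # itemgetter + __eq__ approach, as benchmarked above
    m = df["full_time_result"].map(itemgetter("1", "2")).map((None, None).__eq__)
    return df[~m]


# Total wall-clock time for 5 calls of each function
t_apply = timeit(with_apply, number=5)
t_itemgetter = timeit(with_itemgetter, number=5)
print(f"apply + lambda: {t_apply:.3f} s, itemgetter: {t_itemgetter:.3f} s")
```

On this data both functions return the same rows, so the comparison only measures speed.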
