Python: Masking/modifying values using pandas advanced indexing
Tags: python, pandas, dataframe, multi-index


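I am trying to update a DataFrame with MultiIndex columns by masking some of its values, as shown below. I could not find the right syntax for it. Is there a way to reindex states_df so that it has both column levels at the same time? Or is there a simpler way?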

# -*- coding: utf-8 -*-
"""
Created on Thu Jul  2 18:31:31 2020

@author: ancollet
"""

import numpy as np
import pandas as pd

def generate_series():
    return pd.Series(np.random.randn(1, 5)[0], [1, 2, 3, 4, 5])

# labels for the two column levels
iterables = [['U', 'acidity', 'Al'], ['TSU16_PR']]
# build the MultiIndex as the Cartesian product of both levels
columns = pd.MultiIndex.from_product(iterables, names=['elment', 'asset'])
# create an empty DataFrame with those MultiIndex columns
df = pd.DataFrame(columns=columns)

# Add data
df['U', 'TSU16_PR'] = generate_series()
df['acidity', 'TSU16_PR'] = generate_series()
df['Al', 'TSU16_PR'] = generate_series()
df['U', 'TSU17_PR'] = generate_series()
df['U', 'TSU18_PR'] = generate_series()

states_df = pd.DataFrame([[0.0, 1.0, 0.0],
                          [1.0, 1.0, 1.0],
                          [1.0, 0.0, 1.0],
                          [1.0, 1.0, 1.0],
                          [0.0, 1.0, 1.0]],
                         columns=['TSU16_PR', 'TSU17_PR', 'TSU18_PR'],
                         index=[1, 2, 3, 4, 5])

# This does not work: df has two column levels, while states_df has only one
df.loc[:, (slice(None),slice(None))].where(states_df != 0, np.nan, inplace=True)
I know I can achieve it the following way, so this is probably not a big deal. Here is the desired output:

arrays = [['U', 'acidity', 'Al', 'U', 'U'],
          ['TSU16_PR', 'TSU16_PR', 'TSU16_PR', 'TSU17_PR', 'TSU18_PR']]

tuples = list(zip(*arrays))

columns = pd.MultiIndex.from_tuples(tuples, names=['elment', 'asset'])

states_df_2 = pd.DataFrame([[0.0, 0.0, 0.0, 1.0, 0.0],
                           [1.0, 1.0, 1.0, 1.0, 1.0],
                           [1.0, 1.0, 1.0, 0.0, 1.0],
                           [1.0, 1.0, 1.0, 1.0, 1.0],
                           [0.0, 0.0, 0.0, 1.0, 1.0]],
                           columns=columns,
                           index=[1, 2, 3, 4, 5])

df.where(states_df_2 != 0, np.nan, inplace = True)

In[1]: df
Out[1]: 
elment         U   acidity        Al         U          
asset   TSU16_PR  TSU16_PR  TSU16_PR  TSU17_PR  TSU18_PR
1            NaN       NaN       NaN  0.188960       NaN
2       1.920012 -1.355612  0.514419 -0.648037  0.461363
3       0.196968 -1.292682 -0.484867       NaN  0.373522
4      -0.340107  0.764010  1.081631 -0.141903  0.530718
5            NaN       NaN       NaN -0.732350 -1.148502
You can use DataFrame.reindex to build the mask and pass it to DataFrame.where:

df = df.where(states_df.reindex(df.columns, level=1, axis=1) != 0)
print(df)
elment         U   acidity        Al         U          
asset   TSU16_PR  TSU16_PR  TSU16_PR  TSU17_PR  TSU18_PR
1            NaN       NaN       NaN -0.434351       NaN
2       0.997345 -2.426679 -0.094709  2.205930  1.490732
3       0.282978 -0.428913  1.491390       NaN -0.935834
4      -1.506295  1.265936 -0.638902  1.004054  1.175829
5            NaN       NaN       NaN  0.386186 -1.253881

Details:

print(states_df.reindex(df.columns, level=1, axis=1) != 0)
elment        U  acidity       Al        U         
asset  TSU16_PR TSU16_PR TSU16_PR TSU17_PR TSU18_PR
1         False    False    False     True    False
2          True     True     True     True     True
3          True     True     True    False     True
4          True     True     True     True     True
5         False    False    False     True     True

Comment: Works like a charm! Very useful too. MultiIndexes can be a bit hard to grasp when it comes to slicing and indexing... Thanks!
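
The reindex call above broadcasts the single-level states_df across the 'asset' level of df's columns. As a hedged alternative sketch, the same expansion can be written out by hand: select one states_df column per column of df, then give the result df's own columns. The deterministic values built with np.arange and the names expanded and masked are assumptions made for illustration, not part of the original post.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random data used in the question (assumption).
columns = pd.MultiIndex.from_tuples(
    [('U', 'TSU16_PR'), ('acidity', 'TSU16_PR'), ('Al', 'TSU16_PR'),
     ('U', 'TSU17_PR'), ('U', 'TSU18_PR')],
    names=['elment', 'asset'])
df = pd.DataFrame(np.arange(1.0, 26.0).reshape(5, 5),
                  index=[1, 2, 3, 4, 5], columns=columns)

states_df = pd.DataFrame([[0.0, 1.0, 0.0],
                          [1.0, 1.0, 1.0],
                          [1.0, 0.0, 1.0],
                          [1.0, 1.0, 1.0],
                          [0.0, 1.0, 1.0]],
                         columns=['TSU16_PR', 'TSU17_PR', 'TSU18_PR'],
                         index=[1, 2, 3, 4, 5])

# Pick one states_df column per column of df, keyed on the 'asset' level.
# Repeated labels simply repeat the corresponding column.
expanded = states_df.loc[:, df.columns.get_level_values('asset')]
# Relabel with df's two-level columns so both frames align exactly.
expanded.columns = df.columns

# Keep values where the state is non-zero, set the rest to NaN.
masked = df.where(expanded != 0)
print(masked)
```

This version is longer than the reindex one-liner, but it makes the column-by-column correspondence explicit, which may help when debugging alignment problems.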