Python: Groupby with mask and transform

Tags: python, pandas

I have a DataFrame like this:

POLY_KEY_I      Class     SP_Percent             
FS01080100SM001 NA               5.0
                MTGP            67.5
                Meadow          25.0
                Woodland         2.5
FS01080100SM002 PHP             85.0
                SP              15.0
For each unique POLY_KEY_I, if Class = Meadow has SP_Percent >= 20, I want to change MTGP to WMTGP.

The output I want is:

POLY_KEY_I      Class     SP_Percent             
FS01080100SM001 NA               5.0
                WMTGP           67.5
                Meadow          25.0
                Woodland         2.5
FS01080100SM002 PHP             85.0
                SP              15.0
The code I am trying is:

df ['mask'] = ((df['Class'] == 'Meadow') & df['SP_Percent'] >=20)
mask = df.groupby(['POLY_KEY_I'])['mask'].transform('MTGP')
df.loc[mask,'Class']='WMTGP'
print(df)
But this returns an error:

mask=final.groupby(['POLY_KEY_I'])['mask'].transform('MTGP'))

  File "C:\Users\Stefano\Anaconda2\lib\site-packages\pandas\core\groupby.py", line 2439, in transform
    return self._transform_fast(lambda : getattr(self, func)(*args, **kwargs))

  File "C:\Users\Stefano\Anaconda2\lib\site-packages\pandas\core\groupby.py", line 2484, in _transform_fast
    values = func().values

  File "C:\Users\Stefano\Anaconda2\lib\site-packages\pandas\core\groupby.py", line 2439, in <lambda>
    return self._transform_fast(lambda : getattr(self, func)(*args, **kwargs))

  File "C:\Users\Stefano\Anaconda2\lib\site-packages\pandas\core\groupby.py", line 520, in __getattr__
    (type(self).__name__, attr))

AttributeError: 'SeriesGroupBy' object has no attribute 'MTGP'

Edit:

I don't know if this helps, but if I change this line:

mask=df.groupby(['POLY_KEY_I'])['mask'].transform('MTGP')

to this:

mask=df.groupby(['POLY_KEY_I'])['mask'].transform('any')


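it changes every value for the corresponding POLY_KEY_I to WMTGP, but I only want to change it where the value is MTGP.

I changed your solution completely to a groupby with apply and a custom function f. For checking string values it is best to use isin, which is what f does.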

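Input (row 5, FS01080100SM002 / MTGP / 85.0, was added for testing; it is part of the temp string in the timing code):

Edit 1:

Added timings: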

%timeit df.groupby(['POLY_KEY_I']).apply(f)
100 loops, best of 3: 4.78 ms per loop

%timeit shahram(df)
10 loops, best of 3: 38.2 ms per loop
Code for timings:

import io

import pandas as pd

temp = u"""POLY_KEY_I;Class;SP_Percent
FS01080100SM001;NA;5.0
FS01080100SM001;MTGP;67.5
FS01080100SM001;Meadow;25.0
FS01080100SM001;Woodland;2.5
FS01080100SM002;PHP;85.0
FS01080100SM002;MTGP;85.0
FS01080100SM002;SP;15.0"""

df = pd.read_csv(io.StringIO(temp), sep=";", index_col=None, parse_dates=False)
print(df)
print(df.dtypes)
print(df.index)

def shahram(df):
    # Flag rows where Class is Meadow and SP_Percent is at least 20
    df['mask'] = (df['Class'] == 'Meadow') & (df['SP_Percent'] >= 20)
    # Keys of the flagged rows, marked with a helper column for the merge
    df2 = df.loc[df['mask'], ['POLY_KEY_I']].copy()
    df2['mask2'] = True
    # The left merge on POLY_KEY_I spreads mask2 to every row of a flagged key
    df = pd.merge(df, df2, how='left')
    # .loc replaces .ix, which has been removed from pandas
    df.loc[(df['mask2'] == True) & (df['Class'] == 'MTGP'), 'Class'] = 'WMTGP'
    return df

def f(g):
    # Work on a copy so apply never mutates the group it was handed
    g = g.copy()
    # Rename MTGP only if the group has a Meadow row with SP_Percent >= 20
    if (g['Class'].isin(['Meadow']) & (g['SP_Percent'] >= 20)).any():
        g.loc[g['Class'].isin(['MTGP']), 'Class'] = 'WMTGP'
    return g

print(df.groupby(['POLY_KEY_I']).apply(f))
print(shahram(df))
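
For comparison, the mask-plus-transform idea from the question can probably be made to work without apply. The sketch below is not part of either answer, and the names trigger and group_has_trigger are only illustrative. It relies on two points: transform expects the name of an existing GroupBy method such as 'any' (which is why transform('MTGP') raised the AttributeError), and 'any' broadcasts the per-group result to every row, so it still has to be combined with a Class == 'MTGP' check. Each comparison is parenthesised because & binds more tightly than >=.

```python
import io

import pandas as pd

temp = u"""POLY_KEY_I;Class;SP_Percent
FS01080100SM001;NA;5.0
FS01080100SM001;MTGP;67.5
FS01080100SM001;Meadow;25.0
FS01080100SM001;Woodland;2.5
FS01080100SM002;PHP;85.0
FS01080100SM002;MTGP;85.0
FS01080100SM002;SP;15.0"""

df = pd.read_csv(io.StringIO(temp), sep=";")

# Rows that satisfy the trigger condition (Meadow with SP_Percent >= 20)
trigger = (df["Class"] == "Meadow") & (df["SP_Percent"] >= 20)

# Broadcast "this group contains a trigger row" to every row of the group
group_has_trigger = trigger.groupby(df["POLY_KEY_I"]).transform("any")

# Relabel only the MTGP rows that sit inside a triggered group
df.loc[group_has_trigger & (df["Class"] == "MTGP"), "Class"] = "WMTGP"
print(df)
```

With the test input, only the MTGP row of FS01080100SM001 should change, because FS01080100SM002 has no Meadow row.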
This is how I did it:

df['mask'] = (df['Class'] == 'Meadow') & (df['SP_Percent'] >= 20)
df2 = df[df['mask']][['POLY_KEY_I']].copy()
df2['mask2'] = True
df = pd.merge(df, df2, how='left')
df.loc[(df['mask2'] == True) & (df['Class'] == 'MTGP'), 'Class'] = 'WMTGP'
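
The merge in this approach only serves to mark every row whose POLY_KEY_I has a qualifying Meadow row. A minimal sketch of the same idea using Series.isin instead of a merge follows. It is an assumed variant, not the answerer's code, and relabel_mtgp and its threshold parameter are hypothetical names. It adds no helper columns, and it cannot duplicate rows when one key has several qualifying Meadow rows, which a left merge would do.

```python
import pandas as pd


def relabel_mtgp(df, threshold=20):
    """Return a copy in which MTGP becomes WMTGP for keys with Meadow >= threshold."""
    out = df.copy()
    # Rows that trigger the relabelling
    trigger = (out["Class"] == "Meadow") & (out["SP_Percent"] >= threshold)
    # Keys that own at least one trigger row
    keys = out.loc[trigger, "POLY_KEY_I"].unique()
    # MTGP rows belonging to one of those keys
    target = out["POLY_KEY_I"].isin(keys) & (out["Class"] == "MTGP")
    out.loc[target, "Class"] = "WMTGP"
    return out


df = pd.DataFrame({
    "POLY_KEY_I": ["FS01080100SM001"] * 4 + ["FS01080100SM002"] * 3,
    "Class": ["NA", "MTGP", "Meadow", "Woodland", "PHP", "MTGP", "SP"],
    "SP_Percent": [5.0, 67.5, 25.0, 2.5, 85.0, 85.0, 15.0],
})

result = relabel_mtgp(df)
print(result)
```

The function returns a new frame and leaves its input untouched, so it can be timed or re-run on the same df without side effects.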