Python: how to generate several variables in one class
I have the following dataset:
import pandas as pd
import sklearn
df = pd.DataFrame({'color': ['red', 'red,blue', 'red,blue,yellow', 'red,yellow', 'blue,yellow']})
I create a new variable like this:
df['red'] = 0
df.loc[df['color'].str.contains("red") == True, 'red'] = 1
In the same way I can get df['blue'] and df['yellow'].
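Repeating those two lines for every color does not scale to dozens of colors. As a rough sketch, the same idea can be written as a loop. The `colors` list is an assumption for illustration; in a real dataset it would have to be supplied or derived from the data. Note that `str.contains` does substring matching, so a color such as "red" would also match a value like "darkred".

```python
import pandas as pd

df = pd.DataFrame({'color': ['red', 'red,blue', 'red,blue,yellow', 'red,yellow', 'blue,yellow']})

# Hypothetical list of colors to turn into indicator columns.
colors = ['red', 'blue', 'yellow']

for c in colors:
    # True/False per row, converted to 1/0; regex=False treats c as plain text.
    df[c] = df['color'].str.contains(c, regex=False).astype(int)

print(df)
```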
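Then I have to use it inside a class, because I want to apply a Pipeline.
It works, but I would like a class that also generates 'blue' and 'yellow'. Do I have to write a separate class for each color? The real dataset has dozens of colors.
I am new to this, so please tell me how to combine the generation of several variables in one class.

To my surprise, this works: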
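from sklearn.base import BaseEstimator, TransformerMixin

class Red(BaseEstimator, TransformerMixin):
    def transform(self, X, y=None, **fit_params):
        X = X.copy()  # do not modify the caller's DataFrame
        X['red'] = 0
        X.loc[X['color'].str.contains("red", na=False), 'red'] = 1
        X['blue'] = 0
        X.loc[X['color'].str.contains("blue", na=False), 'blue'] = 1
        # One row per sample, one column per color.
        return X[['red', 'blue']].values

    def fit_transform(self, X, y=None, **fit_params):
        # TransformerMixin already provides an equivalent method.
        self.fit(X, y, **fit_params)
        return self.transform(X)

    def fit(self, X, y=None, **fit_params):
        # Nothing to learn from the data.
        return self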
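The class above still hard-codes each color. One possible generalisation, offered as a sketch rather than a definitive answer, learns the set of colors in `fit` and emits one indicator column per color in `transform`. The class name `ColorIndicators` and its `column` and `sep` parameters are hypothetical; the sketch assumes every value is a list of colors joined by `sep` with no extra spaces, and relies on pandas' `Series.str.get_dummies` to do the splitting.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class ColorIndicators(BaseEstimator, TransformerMixin):
    """Hypothetical transformer: one 0/1 column per color seen during fit."""

    def __init__(self, column='color', sep=','):
        self.column = column
        self.sep = sep

    def fit(self, X, y=None):
        # Learn the distinct colors from the training data.
        dummies = X[self.column].str.get_dummies(sep=self.sep)
        self.colors_ = list(dummies.columns)
        return self

    def transform(self, X):
        dummies = X[self.column].str.get_dummies(sep=self.sep)
        # Align to the colors learned in fit: colors not seen in fit are
        # dropped, and learned colors absent from X become all-zero columns.
        return dummies.reindex(columns=self.colors_, fill_value=0).values


df = pd.DataFrame({'color': ['red', 'red,blue', 'red,blue,yellow', 'red,yellow', 'blue,yellow']})

encoder = ColorIndicators()
features = encoder.fit_transform(df)  # fit_transform comes from TransformerMixin
print(encoder.colors_)
print(features)
```

Because the colors are stored in `fit`, the output keeps the same columns when the transformer is later applied to new data inside a Pipeline, even if that data contains colors that were not present during training.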