Featuretools (Python 3.x): how do I add two columns together?


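I'm stuck. Using Featuretools, all I want to do is create a new column that adds two columns of my dataset together, creating a sort of "stacked" feature, and do this for all columns in the dataset.

My code looks like this: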

import warnings

import featuretools as ft
import pandas as pd

# Define the function
def feature_engineering_dataset(df):

    es = ft.EntitySet(id = 'stockdata')
    
    # Make the "Date" index an actual column cuz defining it as the index below throws
    # a "can't find Date in index" error for some reason.
    df = df.reset_index()

    # Save some columns not used in Featuretools to concat back later
    dates = df['Date']
    tickers = df['Ticker']
    dailychange = df['DailyChange']
    classes = df['class']

    dataframe = df.drop(['Date', 'Ticker', 'DailyChange', 'class'],axis=1)

    # Define the entity
    es.entity_from_dataframe(entity_id='data', dataframe=dataframe, index='Date') # Won't find Date so uses a numbered index. We'll re-define date as index later

    # Pesky warnings
    warnings.filterwarnings("ignore", category=RuntimeWarning) 
    warnings.filterwarnings("once", category=ImportWarning)

    # Run deep feature synthesis
    feature_matrix, feature_defs = ft.dfs(n_jobs=-2,entityset=es, target_entity='data', 
                                           chunk_size=0.015,max_depth=2,verbose=True,
                    agg_primitives = ['sum'],
                    trans_primitives = []
                    ) 

    # Now re-add previous columns because featuretools...
    df = pd.concat([dates, tickers, feature_matrix, dailychange, classes], axis=1)

    df = df.set_index(['Date'])

    # Return our new dataset!
    return df

# Now run that defined function
df = feature_engineering_dataset(df)

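I'm not sure exactly what is happening here, but I set a depth of 2, so my understanding was that for every combination of pairs of columns in my dataset, it would create a new column that adds the two together.

My original dataframe has 3101 columns, and when I run this command it reports

Built 3098 features

and the final df has 3098 columns after concat'ing, which isn't right: it should have all my original features plus the engineered ones.

How can I achieve what I'm after? The examples on the Featuretools page and in the API docs are very confusing and cover a lot of dated examples, like the "time_since_last" trans primitives and other things that don't seem to apply here. Thanks!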

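Thanks for the question. You can use the transform primitive add_numeric to create new columns that sum two columns. I'll quickly walk through an example using this data.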

id                time      open      high       low     close
 0 2019-07-10 07:00:00  1.053362  1.053587  1.053147  1.053442
 1 2019-07-10 08:00:00  1.053457  1.054057  1.053457  1.053987
 2 2019-07-10 09:00:00  1.053977  1.054192  1.053697  1.053917
 3 2019-07-10 10:00:00  1.053902  1.053907  1.053522  1.053557
 4 2019-07-10 11:00:00  1.053567  1.053627  1.053327  1.053397
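
To follow along, the sample rows above could be loaded into a pandas DataFrame roughly like this. This is only a sketch: the variable name df is chosen to match what the later calls expect, and parsing the time column with pd.to_datetime is an assumption about how the data was prepared.

```python
import pandas as pd

# Sample rows copied from the table above.
df = pd.DataFrame(
    {
        "id": [0, 1, 2, 3, 4],
        "time": pd.to_datetime(
            [
                "2019-07-10 07:00:00",
                "2019-07-10 08:00:00",
                "2019-07-10 09:00:00",
                "2019-07-10 10:00:00",
                "2019-07-10 11:00:00",
            ]
        ),
        "open": [1.053362, 1.053457, 1.053977, 1.053902, 1.053567],
        "high": [1.053587, 1.054057, 1.054192, 1.053907, 1.053627],
        "low": [1.053147, 1.053457, 1.053697, 1.053522, 1.053327],
        "close": [1.053442, 1.053987, 1.053917, 1.053557, 1.053397],
    }
)

# Show the column types to confirm that "time" was parsed as a datetime.
print(df.dtypes)
```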

First, we create the entity set for our data.

import featuretools as ft

es = ft.EntitySet('stockdata')
es.entity_from_dataframe(
    entity_id='data',
    dataframe=df,
    index='id',
    time_index='time',
)

Now, we apply DFS using the transform primitive to add the numeric columns.

feature_matrix, feature_defs = ft.dfs(
    entityset=es,
    target_entity='data',
    trans_primitives=['add_numeric'],
)

Then, the new engineered features are returned along with the original features.

feature_matrix
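
For intuition about what "summing every pair of numeric columns" means, here is a minimal sketch that does the same thing with plain pandas on two rows of the sample data. It is not how Featuretools computes the features internally, and the "open + high" style column names are an assumption made for illustration; Featuretools may name or order its columns differently.

```python
from itertools import combinations

import pandas as pd

# Two rows of the sample data (values copied from the table in the answer).
df = pd.DataFrame(
    {
        "open": [1.053362, 1.053457],
        "high": [1.053587, 1.054057],
        "low": [1.053147, 1.053457],
        "close": [1.053442, 1.053987],
    }
)

numeric_columns = ["open", "high", "low", "close"]

# Build one new column per unordered pair of numeric columns.
pair_sums = {
    f"{left} + {right}": df[left] + df[right]
    for left, right in combinations(numeric_columns, 2)
}

# Keep the original columns and append the engineered ones.
stacked = pd.concat([df, pd.DataFrame(pair_sums)], axis=1)
print(stacked.columns.tolist())
```

With four numeric columns there are six unordered pairs, so the result has ten columns. The number of pairs grows quadratically with the number of columns, which is worth keeping in mind for a dataset with thousands of columns.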

You can see a list of all the built-in primitives by calling the function ft.list_primitives().

Thanks! I knew I was close. I'll add ft.list_primitives() to my workflow so I can get familiar with the transforms.