Python 3.x: Using apply to create new columns in a DataFrame

I want to use apply to create new columns in a pandas DataFrame based on the values of other columns. I get this error and I don't understand why:

File "C:\dev\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2448, in _setitem_array
    raise ValueError('Columns must be same length as key')
ValueError: Columns must be same length as key
Am I misunderstanding the apply function? Is it possible to update or create multiple columns with a single apply call?

Here is my sample data:

import pandas as pd

x = pd.DataFrame({'VP': ['Brian', 'Sarah', 'Sarah', 'Brian', 'Sarah'],
                  'Director': ['Jim', 'Ian', 'Ian', 'Jim', 'Jerry'],
                  'Requester': ['Kelly', 'Dave', 'Jordan', 'Matt', 'Rob'],
                  'VP from Query': ['Jordan', 'Justin', 'Sarah', 'Brian', 'Sarah'],
                  'Director from Query': ['Other', 'Other', 'Ian', 'Jim', 'Jerry'],
                  'Requester from Query': ['Kelly', 'Dave', 'Jordan', 'Matt', 'Rob']
                  })
x = x[['VP', 'Director', 'Requester', 'VP from Query', 'Director from Query', 'Requester from Query']]


def set_suggested_hierarchy(row):
    if row['VP'] != row['VP from Query']:
        return row[['VP', 'Director']]
    else:
        return row[['VP from Query', 'Director from Query']]


x[['Suggested VP', 'Suggested Director']] = x.apply(lambda row: set_suggested_hierarchy(row), axis=1)
Thanks very much.

I found the answer here:

Basically, I needed to change the function called by the lambda so that it returns a Series:

def set_suggested_hierarchy(row):
    if row['VP'] != row['VP from Query']:
        return pd.Series([row['VP'], row['Director']])
    else:
        return pd.Series([row['VP from Query'], row['Director from Query']])
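
It may help to see why this change removes the error. In the question's version the two branches return slices of the row that carry different labels (VP/Director in one branch, VP from Query/Director from Query in the other), so apply aligns them into four columns, and four columns cannot be assigned to two names. A Series built from a plain list gets the positional labels 0 and 1 in both branches. The following is a minimal sketch on a shortened copy of the sample data; the helper names by_label and by_position are made up for illustration:

```python
import pandas as pd

# Shortened copy of the sample data (assumption: three rows are enough to hit both branches).
x = pd.DataFrame({
    'VP': ['Brian', 'Sarah', 'Sarah'],
    'Director': ['Jim', 'Ian', 'Jerry'],
    'VP from Query': ['Jordan', 'Sarah', 'Sarah'],
    'Director from Query': ['Other', 'Ian', 'Jerry'],
})


def by_label(row):
    # Question's approach: each branch returns a slice with its own labels.
    if row['VP'] != row['VP from Query']:
        return row[['VP', 'Director']]
    return row[['VP from Query', 'Director from Query']]


def by_position(row):
    # Answer's approach: both branches return a Series labelled 0 and 1.
    if row['VP'] != row['VP from Query']:
        return pd.Series([row['VP'], row['Director']])
    return pd.Series([row['VP from Query'], row['Director from Query']])


# The differing labels are aligned into four columns, padded with NaN.
wide = x.apply(by_label, axis=1)
print(wide.shape)

# Two positional columns line up with the two new column names.
x[['Suggested VP', 'Suggested Director']] = x.apply(by_position, axis=1)
print(x[['Suggested VP', 'Suggested Director']])
```

On pandas 0.23 or later, returning a plain list and passing result_type='expand' to apply should have a similar effect.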

One solution is to return the entire row of the DataFrame, since you are applying this function to the full DataFrame:

def set_suggested_hierarchy(row):

    if row['VP'] != row['VP from Query']:
        row['Suggested VP'] = row['VP']
        row['Suggested Director'] = row['Director']
    else:
        row['Suggested VP'] = row['VP from Query']
        row['Suggested Director'] = row['Director from Query']

    return row

x = x.apply(lambda row: set_suggested_hierarchy(row), axis=1)
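
A variation on this idea, sketched here as an assumption rather than as part of the original answer: the pandas documentation cautions that functions which mutate the object passed in by apply are not supported, so it may be safer to build a new, labelled Series and join it back onto the frame. The function name suggested_hierarchy and the shortened data are made up for illustration:

```python
import pandas as pd

# Shortened copy of the sample data (assumption: three rows are enough to hit both branches).
x = pd.DataFrame({
    'VP': ['Brian', 'Sarah', 'Sarah'],
    'Director': ['Jim', 'Ian', 'Jerry'],
    'VP from Query': ['Jordan', 'Sarah', 'Sarah'],
    'Director from Query': ['Other', 'Ian', 'Jerry'],
})


def suggested_hierarchy(row):
    # Build a new labelled Series instead of modifying the row that apply passes in.
    if row['VP'] != row['VP from Query']:
        vp, director = row['VP'], row['Director']
    else:
        vp, director = row['VP from Query'], row['Director from Query']
    return pd.Series({'Suggested VP': vp, 'Suggested Director': director})


# apply returns a two-column DataFrame whose columns come from the Series labels;
# join attaches it to the original frame by index.
result = x.join(x.apply(suggested_hierarchy, axis=1))
print(result)
```

Because the returned Series names its own columns, the original frame is left untouched and no column list has to be repeated on the left-hand side of an assignment.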

I think you should drop apply(axis=1) entirely. Your logic can apparently be implemented as:

import numpy as np

x['Suggested VP'] = x.VP
x['Suggested Director'] = np.where(x.VP != x['VP from Query'], 
                                   x.Director, x['Director from Query'])
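
The shortcut for Suggested VP works because whenever the two VP columns differ the function picks VP, and whenever they match both hold the same value anyway. As a rough check, the sketch below writes out the general form with one shared mask and compares it with the row-wise logic; the names differs and row_wise, and the shortened data, are made up for illustration:

```python
import numpy as np
import pandas as pd

# Shortened copy of the sample data (assumption: three rows are enough to hit both branches).
x = pd.DataFrame({
    'VP': ['Brian', 'Sarah', 'Sarah'],
    'Director': ['Jim', 'Ian', 'Jerry'],
    'VP from Query': ['Jordan', 'Sarah', 'Sarah'],
    'Director from Query': ['Other', 'Ian', 'Jerry'],
})

# One boolean mask, computed once for the whole frame.
differs = x['VP'] != x['VP from Query']

# General form: pick each suggested column with the same mask.
x['Suggested VP'] = np.where(differs, x['VP'], x['VP from Query'])
x['Suggested Director'] = np.where(differs, x['Director'], x['Director from Query'])

# Where the mask is False the two VP columns are equal, so this always matches VP.
vp_is_unchanged = bool((x['Suggested VP'] == x['VP']).all())


def row_wise(row):
    # Same decision as the question's function, returned as a positional Series.
    if row['VP'] != row['VP from Query']:
        return pd.Series([row['VP'], row['Director']])
    return pd.Series([row['VP from Query'], row['Director from Query']])


# Compare the vectorized columns with the row-by-row result.
expected = x.apply(row_wise, axis=1)
same_vp = bool((expected[0] == x['Suggested VP']).all())
same_director = bool((expected[1] == x['Suggested Director']).all())
print(vp_is_unchanged, same_vp, same_director)
```

The vectorized form avoids calling a Python function once per row, which usually matters on larger frames.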