Python 3.x: Pandas, concatenating values of columns

Tags: python-3.x, pandas, dataframe, concatenation, unary-operator

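I have found answers to this question here before, but none of them seem to work for me. Right now I have a DataFrame with a list of customers and their addresses. However, each address is split across many columns, and I am trying to put them all under a single column.

The code I have so far reads as follows: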

data1_df['Address'] = data1_df['Address 1'].map(str) + ", " + data1_df['Address 2'].map(str) + ", " +  data1_df['Address 3'].map(str) + ", " + data1_df['city'].map(str) + ", " +  data1_df['city'].map(str) + ", " +  data1_df['Province/State'].map(str) + ", " +  data1_df['Country'].map(str) + ", " +  data1_df['Postal Code'].map(str)  
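However, the error I get is: TypeError: Unary plus expects numeric dtype, not object

I'm not sure why it doesn't take the strings as they are and use the + operator. Shouldn't the plus sign accommodate objects?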
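
One thing worth checking, offered as an assumption because the question shows the statement on a single line: Python distinguishes the binary `+` (which concatenates) from the unary `+` (a sign operator). If a long expression is split across lines without enclosing parentheses, a line that begins with `+` is parsed as a separate statement applying unary plus, which strings do not support. A minimal plain-Python sketch with made-up variable names:

```python
street = "1 Main St"
city = "Springfield"

# Binary plus inside parentheses: one expression spanning two lines.
joined = (street + ", "
          + city)

# Unary plus applied to a string is a different operation and is not defined.
try:
    +city
    unary_error = None
except TypeError as exc:
    unary_error = exc

print(joined)
print(type(unary_error).__name__)
```

If this is what happened, wrapping the whole right-hand side in parentheses should let the line breaks fall anywhere.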

Hopefully you will find this example helpful:

import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1,2,3],
                   'B': list('ABC'),
                   'C': [4,5,np.nan],
                   'D': ['One', np.nan, 'Three']})

addColumns = ['B', 'C', 'D']

df['Address'] = df[addColumns].astype(str).apply(lambda x: ', '.join([i for i in x if i != 'nan']), axis=1)

df

#   A  B    C      D      Address
#0  1  A  4.0    One  A, 4.0, One
#1  2  B  5.0    NaN       B, 5.0
#2  3  C  NaN  Three     C, Three
The above works because the str representation of NaN is 'nan'.

Alternatively, you can fill NaN with an empty string first:

df['Address'] = df[addColumns].fillna('').astype(str).apply(lambda x: ', '.join([i for i in x if i]), axis=1)
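
Applied to address columns like those in the question, the same idea might look like the sketch below. The sample rows are invented for illustration and only a subset of the question's column names is used. `dropna()` is swapped in for the `'nan'` string comparison, so missing parts are skipped before anything is converted to `str`.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; the column names follow the question.
data1_df = pd.DataFrame({
    'Address 1': ['1 Main St', '22 Oak Ave'],
    'Address 2': ['Unit 5', np.nan],
    'city': ['Toronto', 'Boston'],
    'Postal Code': ['M5V 2T6', '02110'],
})

address_cols = ['Address 1', 'Address 2', 'city', 'Postal Code']

def join_parts(row):
    # Drop missing parts first, then convert the rest to str and join them.
    return ', '.join(str(part) for part in row.dropna())

data1_df['Address'] = data1_df[address_cols].apply(join_parts, axis=1)
print(data1_df['Address'].tolist())
```

Because the missing entries are removed per row, no stray separators are left behind where a part is absent.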

If the columns to be added together contain NaN values, here is some logic:

def add_cols_w_nan(df, col_list, space_char, new_col_name):
    """ Add together multiple columns where some of the columns 
    may contain NaN, with the appropriate amount of spacing between columns. 

    Examples:
        'Mr.' + NaN + 'Smith' becomes 'Mr. Smith'
        'Mrs.' + 'J.' + 'Smith' becomes 'Mrs. J. Smith'
        NaN + 'J.' + 'Smith' becomes 'J. Smith'

    Args:
        df: pd.DataFrame
            DataFrame for which strings are added together.
        col_list: ORDERED list of column names, eg. ['first_name', 
            'middle_name', 'last_name']. The columns will be added in order. 
        space_char: str
            Character to insert between concatenation of columns.
        new_col_name: str
            Name of the new column after adding together strings.

    Returns: pd.DataFrame with a string addition column

    """
    df2 = df[col_list].copy()

    # Convert to strings, leave nulls alone
    df2 = df2.where(df2.isnull(), df2.astype('str'))

    # Add space character, NaN remains NaN, which is important
    df2.loc[:, col_list[1:]] = space_char + df2.loc[:, col_list[1:]]

    # Fix rows where leading columns are null
    to_fix = df2.notnull().idxmax(1)
    for col in col_list[1:]:
        m = to_fix == col
        # Strip only the leading separator, not every occurrence in the value
        df2.loc[m, col] = df2.loc[m, col].str[len(space_char):]

    # So that summation works
    df2[col_list] = df2[col_list].replace(np.nan, '')

    # Add together all columns
    df[new_col_name] = df2[col_list].sum(axis=1)
    # If all are missing replace with missing
    df[new_col_name] = df[new_col_name].replace('', np.nan)

    del df2
    return df
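
The "convert to strings, leave nulls alone" step is probably the least obvious line in the function. A small sketch on a single, made-up Series shows what `where` does there: entries where the condition is true (the nulls) are kept, and every other entry is taken from the string-converted copy.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric column with one missing value.
s = pd.Series([4.0, 5.0, np.nan])

# Keep entries where s is null; take all others from the str-converted copy.
converted = s.where(s.isnull(), s.astype(str))

print(converted.tolist())
print(converted.isnull().tolist())
```

Keeping real missing values at this point is what lets the later steps detect which leading columns are empty.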
Sample data:
Are any of your columns missing data? That can be very annoying and can break your approach.

@ScottBoston Thank you very much for your help, but I found the answer.

@Alolz Nevertheless, the proposed solution worked for me. However, you were right to some extent, since I get NaN at certain places in the new column. I wonder how I can get around this.

Thanks! Do you know how I can handle the case where some columns contain NaN values? I mean excluding those from the actual concatenation?