Python: How to split column values in a DataFrame


python pandas dataframe


How do I split a single string column in a DataFrame without creating more columns, and get rid of the brackets (the "|" characters)?

For example, the data looks like this:

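import pandas as pd

# All values are scalars, so pandas needs an explicit index (otherwise it raises a ValueError)
df = pd.DataFrame({'Ala Carte':'||LA1: 53565 \nCH2: 54565', 
                'Blistex':'|Cust: 65565\nCarrier: 2565|', 
                'Dermatology':'||RTR1\n65331\n\nRTR2\n65331'}, index=[0])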

I would like the output DataFrame to look like this, where each Information value is a single string:

Customer      Information

Ala Carte     LA1: 53565 
              CH2: 54565

Blistex       Cust: 65565
              Carrier: 2565

Dermatology   RTR1: 65331
              RTR2: 65331

All of this should stay within the same Information column.

This should do it:

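import pandas as pd
### Create the DataFrame
df = pd.DataFrame({'name': ['alacarte', 'Blistex'],
                   'information': ['||LA1: 53565\nCH2: 54565',
                                   '|Cust: 65565\nCarrier: 2565|']
})
### Split the column into lists
df['information'] = df['information'].apply(lambda x: x.replace("|", "").split("\n"))
### Explode the column (one row per list element)
df.explode('information')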
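As a self-contained sketch of the same split-then-explode idea, here it is applied to all three sample strings from the question. The column names `Customer` and `Information` are taken from the desired output, and `split_info` is a hypothetical helper, not part of pandas. It assumes pandas 0.25 or later, where `DataFrame.explode` is available.

```python
import pandas as pd

df = pd.DataFrame({
    "Customer": ["Ala Carte", "Blistex", "Dermatology"],
    "Information": [
        "||LA1: 53565 \nCH2: 54565",
        "|Cust: 65565\nCarrier: 2565|",
        "||RTR1\n65331\n\nRTR2\n65331",
    ],
})

def split_info(text):
    """Hypothetical helper: drop '|' and return the non-empty, trimmed lines."""
    lines = text.replace("|", "").split("\n")
    return [line.strip() for line in lines if line.strip()]

# Turn each string into a list, then give every list element its own row
exploded = (
    df.assign(Information=df["Information"].apply(split_info))
      .explode("Information")
      .reset_index(drop=True)
)
print(exploded)
```

Note that this produces one row per line of text rather than one string per customer, and the Dermatology labels and numbers stay on separate rows because they sit on separate lines in the input.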

I decided to replace "\n" with " || " as a way to separate the two different values, and used this definition to combine the two columns:

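import numpy as np

def combine_with_nan(x, cols):
    """Join the values of `cols` in row `x` with ' || ', treating NaN as ''."""
    combined = ''
    for column in cols:
        try:
            np.isnan(x[column])  # raises TypeError when the value is a string
            temp = ''            # the value is numeric (NaN): contribute nothing
        except TypeError:
            temp = x[column]     # the value is a string: keep it
        combined = combined + ' || ' + temp

    return combined

cols = ['Columns you want to merge']  # placeholder for the real column names
# As a regex, r"\\n" matches a literal backslash followed by "n"; use r"\n" to match real newlines
practicedf = practicedf.apply(combine_with_nan, axis=1, args=(cols,)).to_frame().replace(r"\\n", " || ", regex=True)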
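One possible way to avoid the empty " || || " runs that appear when some columns are blank is to join only the non-missing values. The sketch below is an assumption about what the data looks like: the sample frame and the column names `a`, `b`, `c` are invented for illustration, and `join_non_null` is a hypothetical helper.

```python
import numpy as np
import pandas as pd

def join_non_null(row, cols, sep=" || "):
    """Join the non-missing values of `cols` in `row`, skipping NaN/None."""
    parts = [str(row[c]) for c in cols if pd.notna(row[c])]
    return sep.join(parts)

# Hypothetical sample frame; the real column names are not given in the post
practicedf = pd.DataFrame({
    "a": ["LA1: 53565\nCH2: 54565", np.nan],
    "b": [np.nan, "Cust: 65565"],
    "c": ["RTR1", "Carrier: 2565"],
})
cols = ["a", "b", "c"]

combined = (
    practicedf.apply(join_non_null, axis=1, args=(cols,))
              .str.replace("\n", " || ", regex=False)  # real newlines, not the two characters "\" + "n"
              .to_frame(name="Information")
)
print(combined)
```

Because the separator is only placed between values that exist, there is no leading " || " and no doubled separator for missing cells.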

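In this case it would help to show the dictionary output of your sample input data, df.to_dict(), or to write code that generates the input DataFrame in the question.

This is more a question about strings than about DataFrames, isn't it? Could you include more of the program? A DataFrame might not be the best data structure.

Is \n the only separator?

I think so, since the numbers you include would be treated as strings. In your position, I would first strip the brackets with str.strip('|'), then split on \n, but that turns it into a Series. From there you could split on the colon, but that produces more columns. At that point you could iterate over the customers and repeat each one for its number of entries, so that you have the information string in one column and the numbers in the next. If you can provide more information, we can see where to go from there.

What kind of information do you need?

@CodeMonkey The rest of the program, the data, …

Unless some of the data we want to keep contains "|", right?

Yes, I'm not sure whether they indicate columns or are actually part of the string.

I need a way to handle the blanks in the list of columns (cols) before combining them. At the moment I end up with blanks (" || || || ").

Why are you replacing the line endings? What does the function do?

@Alexander Cécile "\n" does not act as a line separator in the DataFrame, so I replaced it with " || " to indicate a new line. The data is mainly for viewing; I don't need to do any analysis on it. The function combines the columns in the list of columns I am trying to merge.

I found a way to leave the blanks blank, and the code seems to work as expected. I appreciate your help.

I haven't really thought it through, but it should be possible to write a custom function that converts the strings into a DataFrame and handles all the special requirements of this data, right?

Right, but in my view that is a less efficient approach. So far this method has worked best for me.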
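The custom-function idea from the comments could look roughly like the sketch below, which keeps one row per customer and a single string in the Information column, as in the desired output. `normalize_info` is a hypothetical helper, and it relies on an assumption not confirmed in the thread: a line without a colon is a label whose value sits on the next line (as in the Dermatology sample).

```python
import pandas as pd

def normalize_info(text):
    """Hypothetical cleaner: return 'label: value' lines joined by newlines."""
    # Strip the surrounding pipes, then keep the non-empty, trimmed lines
    lines = [ln.strip() for ln in text.strip("|").split("\n") if ln.strip()]
    out = []
    i = 0
    while i < len(lines):
        if ":" in lines[i] or i + 1 == len(lines):
            out.append(lines[i])  # already 'label: value', or a lone last line
            i += 1
        else:
            # Assumption: label on this line, value on the next one
            out.append(f"{lines[i]}: {lines[i + 1]}")
            i += 2
    return "\n".join(out)

raw = pd.Series({
    "Ala Carte": "||LA1: 53565 \nCH2: 54565",
    "Blistex": "|Cust: 65565\nCarrier: 2565|",
    "Dermatology": "||RTR1\n65331\n\nRTR2\n65331",
})

# One row per customer, with the cleaned text kept as a single string
result = raw.apply(normalize_info).rename_axis("Customer").reset_index(name="Information")
print(result)
```

If some of the data to keep can itself contain "|", as one comment asks, `str.strip("|")` only touches the ends of the string, so interior pipes would survive.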