Python: how to concatenate string columns in a DataFrame with a space separator?

python pandas

I have a pandas DataFrame as follows:

import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3],
    'txt1': ['Hello there1', 'Hello there2', 'Hello there3'],
    'txt2': ['Hello there4', 'Hello there5', 'Hello there6'],
    'txt3': ['Hello there7', 'Hello there8', 'Hello there9']
})
df

id  txt1            txt2            txt3
1   Hello there1    Hello there4    Hello there7
2   Hello there2    Hello there5    Hello there8
3   Hello there3    Hello there6    Hello there9

I want to concatenate the columns txt1, txt2 and txt3. So far I have been able to achieve the following:

df['alltext'] = df['txt1'] + df['txt2'] + df['txt3']
df

id  txt1            txt2            txt3            alltext
1   Hello there1    Hello there4    Hello there7    Hello there1Hello there4Hello there7
2   Hello there2    Hello there5    Hello there8    Hello there2Hello there5Hello there8
3   Hello there3    Hello there6    Hello there9    Hello there3Hello there6Hello there9

But how do I introduce a space character between the column strings while concatenating them in pandas?

I have only just started learning pandas.

You can add a separator between the columns:

df['alltext'] = df['txt1'] + ' ' + df['txt2'] + ' ' + df['txt3']
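One thing to watch for with + is missing values: if any column holds NaN in a row, the whole concatenated value for that row becomes NaN. A minimal sketch, assuming a hypothetical frame with one missing entry (not part of the question's data), that uses Series.str.cat with sep and na_rep instead:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value, for illustration only.
df = pd.DataFrame({
    'txt1': ['Hello there1', 'Hello there2'],
    'txt2': ['Hello there4', np.nan],
    'txt3': ['Hello there7', 'Hello there8'],
})

# Plain + propagates NaN: the second row becomes NaN as a whole.
plus_result = df['txt1'] + ' ' + df['txt2'] + ' ' + df['txt3']

# str.cat joins with a separator; na_rep substitutes a value for missing entries.
cat_result = df['txt1'].str.cat([df['txt2'], df['txt3']], sep=' ', na_rep='')
```

With na_rep='' the separator is still inserted around the empty value, so a row with a missing middle column ends up with two consecutive spaces.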

Or filter only the columns whose names contain txt, and use apply with join to combine the values of each row:

df['alltext'] = df.filter(like='txt').apply(' '.join, axis=1)

Or filter only the object columns. In most cases a Series with object dtype will hold strings, but it can hold any Python object:
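A minimal sketch of dtype-based selection, assuming DataFrame.select_dtypes is used to pick the object columns (the exact call is an assumption, not taken from the answer). The text columns are built with dtype=object explicitly, because some pandas versions may infer a dedicated string dtype rather than object:

```python
import pandas as pd

# dtype=object is set explicitly so the sketch does not depend on how
# the installed pandas version infers the dtype of string data.
df = pd.DataFrame({
    'id': [1, 2, 3],
    'txt1': pd.Series(['Hello there1', 'Hello there2', 'Hello there3'], dtype=object),
    'txt2': pd.Series(['Hello there4', 'Hello there5', 'Hello there6'], dtype=object),
    'txt3': pd.Series(['Hello there7', 'Hello there8', 'Hello there9'], dtype=object),
})

# Keep only the object-dtype columns (this drops the integer id column),
# then join each row's values with a space.
text_cols = df.select_dtypes(include='object')
df['alltext'] = text_cols.apply(' '.join, axis=1)
```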

Or select the columns by position, taking all columns except the first:
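A minimal sketch of positional selection, assuming iloc is used to skip the first column (the exact call is an assumption, not taken from the answer):

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3],
    'txt1': ['Hello there1', 'Hello there2', 'Hello there3'],
    'txt2': ['Hello there4', 'Hello there5', 'Hello there6'],
    'txt3': ['Hello there7', 'Hello there8', 'Hello there9'],
})

# iloc[:, 1:] selects every column except the first one (id) by position.
df['alltext'] = df.iloc[:, 1:].apply(' '.join, axis=1)
```

Positional selection depends on column order: running the last line a second time would also pick up the new alltext column.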

Thanks to @Jon Clements for a solution that matches the column names more precisely, as txt followed by digits:

df['alltext'] = df.filter(regex=r'^txt\d+$').apply(' '.join, axis=1)
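To see why the anchored regex is the safer choice, here is a small sketch with a hypothetical extra column named txt_lang (not part of the question's data) that like='txt' would pick up by accident:

```python
import pandas as pd

# 'txt_lang' is a hypothetical extra column, added only to show the difference.
df = pd.DataFrame({
    'id': [1, 2],
    'txt1': ['Hello there1', 'Hello there2'],
    'txt2': ['Hello there4', 'Hello there5'],
    'txt_lang': ['en', 'en'],
})

# like='txt' matches any label that contains the substring 'txt'.
loose = list(df.filter(like='txt').columns)

# The anchored regex matches only 'txt' followed by one or more digits.
strict = list(df.filter(regex=r'^txt\d+$').columns)
```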

Just add a space in between:

df['alltext'] = df['txt1'] + ' ' + df['txt2'] + ' ' + df['txt3']

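One might want to consider .filter(regex=r'^txt\d+$'), just to be explicit about which columns are wanted rather than run the off chance of like='txt' picking up something unwanted... (although that is unlikely here)

Slight nitpick... "obviously objects are strings" is not quite correct... they are objects that are not a numpy type... they could be (although in most cases it is unlikely) anything other than strings, so applying str.join to them would break. (Mind you, if you are storing silly things in a DataFrame/array, that is a completely different problem in itself :p)

@JonClements - yes, I think it is best to spell it out. Maybe "the objects here are strings" is better? Or "most things in the pandas world are strings"?

Yes... not sure how to word it either... I think something like "in most cases a Series with object dtype will be strings, but it can be any Python object"...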
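The nitpick raised in the comments can be reproduced with a small sketch. The frame below is hypothetical: its object column mixes a string with an integer, so str.join fails on that row, and converting with astype(str) first is one possible workaround:

```python
import pandas as pd

# Hypothetical frame whose object column mixes a string with a non-string value.
df = pd.DataFrame({
    'txt1': ['Hello there1', 'Hello there2'],
    'txt2': pd.Series(['Hello there4', 42], dtype=object),
})

# str.join only accepts strings, so the row containing 42 raises TypeError.
try:
    df.apply(' '.join, axis=1)
    join_failed = False
except TypeError:
    join_failed = True

# Converting every value to str first makes the join safe.
alltext = df.astype(str).apply(' '.join, axis=1)
```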