Insert a comma after the first and second word of each line using Python?


python regex csv text nlp


I have a .txt file that I need to convert to CSV.
This is the code I use to convert the file:

import pandas as pd

wb = pd.read_csv('12.txt', encoding='utf-8', delimiter = '،', header = None)

wb.to_csv('12.csv',encoding='utf-8-sig', index = None)

The problem is that in each line the first and second words need to be in separate cells, but they are not separated by commas:

This is an, example, to show, you
The second line, is, the, same
My file contains, thousands of, sentences

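As the example shows, only the first and second words of each line should end up in separate cells (the other cells may contain more than one word!). How can I use Python to add a comma only after the first and second words of each line?

Thanks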

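I would use str.replace here:

wb['col'] = wb['col'].str.replace(r'^(\S+) (\S+)', r'\1, \2,', regex=True)

The raw strings keep \1 and \2 as backreferences, and regex=True is needed on recent pandas versions, where str.replace treats the pattern as a literal string by default.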
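To see what that pattern does on the sample lines from the question, here is a minimal sketch using Python's re module instead of pandas. The helper name add_commas is made up for illustration, and it assumes the first two words are separated by a single space.

```python
import re

def add_commas(line):
    # Capture the first two whitespace-free tokens at the start of the line
    # and put a comma after each of them; the rest of the line is untouched.
    return re.sub(r'^(\S+) (\S+)', r'\1, \2,', line)

lines = [
    "This is an, example, to show, you",
    "The second line, is, the, same",
    "My file contains, thousands of, sentences",
]
converted = [add_commas(line) for line in lines]

for row in converted:
    print(row)
```

Note that a first word that already ends in a comma would get a second one, so this only fits input shaped like the sample.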

If you want every word in its own cell, you can apply the following to each line:

line = "This is an, example, to, show, you"

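parts = line.split(",")

# split each comma-separated piece on spaces and flatten the result
x = [item for sublist in [k.split(" ") for k in parts] for item in sublist]
# drop the empty strings left behind by the space after each comma
y = list(filter(lambda w: w != "", x))

output: ['This', 'is', 'an', 'example', 'to', 'show', 'you']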
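A shorter way to get the same list, as a sketch: turn the commas into spaces and let str.split() with no arguments collapse the runs of whitespace, so no empty strings need filtering. The function name split_words is hypothetical.

```python
def split_words(line):
    # Replace commas with spaces, then split on any run of whitespace.
    return line.replace(",", " ").split()

line = "This is an, example, to, show, you"
words = split_words(line)
print(words)
```

Joining the result with ",".join(words) gives one CSV row with a single word per cell.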

Thank you very much. Shouldn't I define "col" before this line?

I assumed you already have a column col and just want to overwrite it.