Python: converting a column of lists to strings

Tags: python, pandas, dataframe

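My dataset looks like this:

id  keyPhrases
0   [word1, word2]
1   [word4, word 5 and 6, word7]
2   [word8, etc

Each value in "keyPhrases" is a list. I want to expand each list into new rows (as strings).

The "id" column is not important for now.

I have already tried df.values, from_records, etc.

Expected:

keyPhrases
word1
word2
word3
word4

You can use itertools.chain.from_iterable together with DataFrame column selection: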

import itertools

import pandas as pd

df = pd.DataFrame({
    'keyPhrases': [
        ['word1', 'word2'],
        ['word4', 'word5', 'word7'],
        ['word8', 'word9']
    ],
    'id': [1,2,3]
})

for elem in itertools.chain.from_iterable(df['keyPhrases'].values):
    print(elem)
This prints:

word1
word2
word4
word5
word7
word8
word9
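If the goal is a new set of rows rather than printed output, the flattened values can be wrapped back into a DataFrame. The following is a minimal sketch, assuming the same sample data as above; the name `flat_df` is made up for illustration.

```python
import itertools

import pandas as pd

df = pd.DataFrame({
    'keyPhrases': [
        ['word1', 'word2'],
        ['word4', 'word5', 'word7'],
        ['word8', 'word9']
    ],
    'id': [1, 2, 3]
})

# Flatten the lists, then build a one-column frame with one string per row.
# `flat_df` is a hypothetical name, not something from the original answer.
flat_df = pd.DataFrame({
    'keyPhrases': list(itertools.chain.from_iterable(df['keyPhrases']))
})
print(flat_df)
```

The "id" column is dropped here on purpose, since the question says it does not matter.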



A fun way, but not recommended:

df.keyPhrases.sum()
Out[520]: ['word1', 'word2', 'word4', 'word5', 'word7', 'word8', 'word9']
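One likely reason for the "not recommended" label: summing lists concatenates them pairwise and builds a fresh list at every step, so the work grows roughly quadratically with the number of rows, while itertools.chain.from_iterable visits each element once. The sketch below assumes the same sample column and uses the built-in `sum(..., [])`, which relies on the same list concatenation, to show that both routes give the same result.

```python
import itertools

import pandas as pd

df = pd.DataFrame({
    'keyPhrases': [
        ['word1', 'word2'],
        ['word4', 'word5', 'word7'],
        ['word8', 'word9']
    ]
})

# Repeated list concatenation: simple, but copies the growing list each time.
via_sum = sum(df['keyPhrases'], [])

# Single pass over every element, no intermediate copies.
via_chain = list(itertools.chain.from_iterable(df['keyPhrases']))

print(via_sum == via_chain)
```

For a handful of rows the difference is negligible; it only starts to matter on large columns.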

The numpy-based answers above are indeed very good, but I am adding a plain-loop version, aimed not at performance but at being the simplest to understand.

import pandas as pd

lista = [[['word1', 'word2']], [['word4', 'word5', 'word6', 'word7']], [['word8', 'word9', 'word10']]]
df = pd.DataFrame(lista, columns=['keyPhrases'])

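# Collect every element into one flat list. The name `words` is used
# instead of `list` so the built-in is not shadowed.
words = []
for key in df.keyPhrases:
    for element in key:
        words.append(element)
print(words)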

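Both the numpy and the itertools approaches work very well.

I ended up using the itertools approach, with a for loop that writes each row to a file.

It saved me a lot of time and code.

Thank you very much.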


with open("keyphrases.txt", "w") as textfile:  # file name is only an example
    for elem in itertools.chain.from_iterable(df['keyPhrases'].values):
        textfile.write(elem + "\n")


Found another way:

df['keyPhrases'] = df['keyPhrases'].str.split(',')  # to make arrays
df['keyPhrases'] = df['keyPhrases'].astype(str)  # back to strings
s = ''.join(df.keyPhrases).replace('[', '').replace(']', '\n').replace(',', '\n')  # replace magic
print(s)


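I am not sure whether there is an existing function that does this in one line. The workaround below should cover your need. If there is a built-in function that does this more easily, I would be glad to know.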

import pandas as pd

#Existing DF where the data is in the form of list
df = pd.DataFrame(columns=['ID', 'value_list'])
#New DF where the data should be atomic
df_new = pd.DataFrame(columns=['ID', 'value_single'])

#Sample Data
row_1 = ['A', 'B', 'C', 'D']
row_2 = ['D', 'E', 'F']
row_3 = ['F', 'G']
row_4 = ['H', 'I']
row_5 = ['J']

#Data Push to existing DF (built directly, without eval on variable names)
df = pd.DataFrame({
    'ID': range(5),
    'value_list': [row_1, row_2, row_3, row_4, row_5],
})

#Data Push to new DF where list is pushed as atomic data
counter = 0
i=0
while(i<len(df)):
    j=0
    while(j<len(df['value_list'][i])):
        df_new.loc[counter, 'ID'] = df['ID'][i]
        df_new.loc[counter, 'value_single'] = df['value_list'][i][j]
        counter = counter + 1
        j = j+1
    i = i+1

print(df_new)
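For what it's worth, pandas 0.25 and later ship `DataFrame.explode`, which appears to cover this case in a single call. The following is a minimal sketch, assuming the same `ID` / `value_list` layout as the workaround above; `df_exploded` is a name chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [0, 1, 2, 3, 4],
    'value_list': [
        ['A', 'B', 'C', 'D'],
        ['D', 'E', 'F'],
        ['F', 'G'],
        ['H', 'I'],
        ['J'],
    ],
})

# explode() emits one row per list element and repeats the other columns.
# The rename and reset_index only tidy the result to match df_new above.
df_exploded = (
    df.explode('value_list')
      .rename(columns={'value_list': 'value_single'})
      .reset_index(drop=True)
)
print(df_exploded)
```

Without `reset_index(drop=True)` the exploded rows keep the index of the row they came from, which can be useful when the origin of each value matters.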

However, could you clarify the expected result? Where does word3 come from?

It actually works fine. Why is it not recommended?
Output of the str.replace approach above:

word1
 word2
word4
 word 5 and 6
 word7
word8
 etc
 etc