Python: create a DataFrame from a list of lists with different delimiters


I have a list:

     [['1', 'Toy Story (1995)', "Animation|Children's|Comedy"],
     ['2', 'Jumanji (1995)', "Adventure|Children's|Fantasy"],
     ['3', 'Grumpier Old Men (1995)', 'Comedy|Romance']]
I'd like to end up with a pandas DataFrame containing these columns:

cols = ['MovieID', 'Name', 'Year', 'Adventure', 'Children', 'Comedy', 'Fantasy', 'Romance']
For the 'Adventure', 'Children', 'Comedy', 'Fantasy' and 'Romance' columns, the value should be 1 or 0.

I tried:

for row in movies_list:
    for element in row:
        if '|' in element:
            element = element.split('|')

However, the original list is unchanged. I'm completely stumped here.
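A short sketch of why the loop above leaves the list untouched: assigning to `element` only rebinds the loop variable, it does not write back into `row`. Assigning by index does. The list literal is repeated here so the snippet runs on its own.

```python
movies_list = [['1', 'Toy Story (1995)', "Animation|Children's|Comedy"],
               ['2', 'Jumanji (1995)', "Adventure|Children's|Fantasy"],
               ['3', 'Grumpier Old Men (1995)', 'Comedy|Romance']]

# Rebinding the loop variable: the inner lists are not modified
for row in movies_list:
    for element in row:
        if '|' in element:
            element = element.split('|')
unchanged = movies_list[0][2]  # still the original pipe-delimited string

# Assigning through the index: the inner lists are modified in place
for row in movies_list:
    for i, element in enumerate(row):
        if '|' in element:
            row[i] = element.split('|')

print(unchanged)
print(movies_list[0][2])
```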

Use the DataFrame constructor to build the initial frame.
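The code for this step did not survive in the page, so the following is only a sketch of what it might look like. It assumes the third column is named `Data` (the name used later in this answer) and that `df1` is built with `Series.str.get_dummies`, which is an assumption rather than the answer's confirmed method.

```python
import pandas as pd

movies_list = [['1', 'Toy Story (1995)', "Animation|Children's|Comedy"],
               ['2', 'Jumanji (1995)', "Adventure|Children's|Fantasy"],
               ['3', 'Grumpier Old Men (1995)', 'Comedy|Romance']]

# Build the frame; 'Data' temporarily holds the pipe-delimited genres
df = pd.DataFrame(movies_list, columns=['MovieID', 'Name', 'Data'])

# Assumed step: one 0/1 indicator column per genre found in 'Data'
df1 = df['Data'].str.get_dummies(sep='|')
print(df1)
```

Note that the indicator columns take their names from the data, so this yields `Children's` (not `Children`) and also an `Animation` column.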

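For the Name and Year columns, split Name on the space before the opening parenthesis, strip the trailing ) from Year, and convert Year to int: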

df[['Name','Year']] = df['Name'].str.split(r'\s\(', expand=True)
df['Year'] = df['Year'].str.rstrip(')').astype(int)
Finally, drop the Data column and join df1 back onto the original frame.
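Putting the steps together, a self-contained sketch could look like this. The construction of `df1` via `str.get_dummies` and the use of `drop(...).join(...)` are assumptions, since the original code for those steps is missing from the page. The final rename and column selection are an optional addition to match the column list in the question.

```python
import pandas as pd

movies_list = [['1', 'Toy Story (1995)', "Animation|Children's|Comedy"],
               ['2', 'Jumanji (1995)', "Adventure|Children's|Fantasy"],
               ['3', 'Grumpier Old Men (1995)', 'Comedy|Romance']]
cols = ['MovieID', 'Name', 'Year', 'Adventure', 'Children',
        'Comedy', 'Fantasy', 'Romance']

df = pd.DataFrame(movies_list, columns=['MovieID', 'Name', 'Data'])

# Assumed: 0/1 indicator columns, one per genre
df1 = df['Data'].str.get_dummies(sep='|')

# Split "Title (1995)" into title and "1995)", then clean up the year
df[['Name', 'Year']] = df['Name'].str.split(r'\s\(', expand=True)
df['Year'] = df['Year'].str.rstrip(')').astype(int)

# Drop the raw genre string and attach the indicator columns
df = df.drop(columns='Data').join(df1)

# Optional: match the requested column names and order
result = df.rename(columns={"Children's": 'Children'})[cols]
print(result)
```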


Here is my version. It isn't a one-liner, but hopefully it helps.

import pandas as pd

data = [['1', 'Toy Story (1995)', "Animation|Children's|Comedy"],
     ['2', 'Jumanji (1995)', "Adventure|Children's|Fantasy"],
     ['3', 'Grumpier Old Men (1995)', 'Comedy|Romance']]
cols = ['MovieID', 'Name', 'Year', 'Adventure', 'Children', 'Comedy', 'Fantasy', 'Romance']
final = []
for x in data:
    output = []
    output.append(x[0])  # MovieID
    # Title: everything before the opening parenthesis
    output.append(x[1].split("(")[0].strip())
    # Year: the four characters after the opening parenthesis (kept as a string)
    output.append(x[1].split("(")[1][:4])
    # Substring test per genre; yields True/False rather than 1/0
    for h in ['Adventure', 'Children', 'Comedy', 'Fantasy', 'Romance']:
        output.append(h in x[2])
    final.append(output)

df = pd.DataFrame(final, columns=cols)
print(df)
Output:

  MovieID              Name  Year  Adventure  Children  Comedy  Fantasy  \
0       1         Toy Story  1995      False      True    True    False   
1       2           Jumanji  1995       True      True   False     True   
2       3  Grumpier Old Men  1995      False     False    True    False   

   Romance  
0    False  
1    False  
2     True  
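The output above holds booleans and a string Year, while the question asked for 1 or 0. One possible follow-up, sketched on a small stand-in frame (only two genre columns, values copied from the output above), is to cast with `astype(int)`:

```python
import pandas as pd

# Stand-in for part of the frame printed above
df = pd.DataFrame({
    'MovieID': ['1', '2', '3'],
    'Year': ['1995', '1995', '1995'],
    'Comedy': [True, False, True],
    'Romance': [False, False, True],
})

genre_cols = ['Comedy', 'Romance']
df[genre_cols] = df[genre_cols].astype(int)  # True/False -> 1/0
df['Year'] = df['Year'].astype(int)          # '1995' -> 1995
print(df)
```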

Thanks again, jezrael!