Python: Merging tuples and dictionaries with values into a DataFrame

python pandas numpy dictionary dataframe

Currently, I have data in the following format:

[('ab', {'a': ['apple1'], 'b': ['ball1']}),
 ('cd', {'a': ['apple2'], 'b': ['ball2']})]

i.e.

List[Tuple[Any, dict{'key': List}]]

The goal is to create a pandas DataFrame of the following form:

start   a             b
ab    apple1         ball1
cd    apple2         ball2

I have tried the following:

df = pd.DataFrame(columns=['start', 'a', 'b'])
for start, details in mylist:
    df = df.append({'start' : start}, ignore_index= True)
    df = df.append({'a' : details['a']} , ignore_index= True)
    df = df.append({'b': details['b']}, ignore_index=True)

I am trying to find a more efficient way to do this.

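Pandas works well with a dictionary or a list of dictionaries. You have something in between. In this case, converting to a dictionary is straightforward: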

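import pandas as pd

L = [('ab', {'a' : ['apple1'], 'b': ['ball1']}),
     ('cd', {'a' : ['apple2'], 'b': ['ball2']})]

# dict(L) maps each 'start' value to its inner dict; orient='index' turns those keys into row labels
res = pd.DataFrame.from_dict(dict(L), orient='index')
# each cell holds a one-element list; .str[0] extracts that element
res = res.apply(lambda x: x.str[0])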

print(res)

         a      b
ab  apple1  ball1
cd  apple2  ball2
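The result above keeps the `start` values as the index rather than as a column. If a `start` column is wanted, as in the target layout from the question, one possible follow-up is sketched below. It assumes the same `L` as above; the variable name `out` is only illustrative.

```python
import pandas as pd

L = [('ab', {'a': ['apple1'], 'b': ['ball1']}),
     ('cd', {'a': ['apple2'], 'b': ['ball2']})]

res = pd.DataFrame.from_dict(dict(L), orient='index')
res = res.apply(lambda x: x.str[0])

# Name the index 'start', then move it into a regular column
out = res.rename_axis('start').reset_index()
print(out)
```

`rename_axis` only labels the index; `reset_index` is what turns it into the first column and replaces it with a default integer index.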

Like this:

import pandas as pd

form = [('ab', {'a': ['apple1'], 'b': ['ball1']}),
        ('cd', {'a': ['apple2'], 'b': ['ball2']})]

# separate 'start' from rest of data - inverse zip
start, data = zip(*form)

# create dataframe
df = pd.DataFrame(list(data))

# remove data from lists in each cell
# (applymap is deprecated since pandas 2.1; DataFrame.map is its replacement there)
df = df.applymap(lambda l: l[0])

df.insert(loc=0, column='start', value=start)

print(df)
     start     a      b
0    ab   apple1  ball1
1    cd   apple2  ball2

Or, if you want start to be the index of the DataFrame:

# separate 'start' from rest of data - inverse zip
index, data = zip(*form)

# create dataframe
df = pd.DataFrame(list(data), index=index)
df.index.name = 'start' 

# remove data from lists in each cell
# (applymap is deprecated since pandas 2.1; DataFrame.map is its replacement there)
df = df.applymap(lambda l: l[0])

print(df)
start     a      b
ab   apple1  ball1
cd   apple2  ball2
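Another option, not taken from the answers above but in the same spirit, is to flatten each tuple into a plain record before building the DataFrame, so no per-cell unwrapping is needed afterwards. This is a minimal sketch that assumes every inner list has exactly one element; the name `rows` is only illustrative.

```python
import pandas as pd

form = [('ab', {'a': ['apple1'], 'b': ['ball1']}),
        ('cd', {'a': ['apple2'], 'b': ['ball2']})]

# One flat dict per tuple: the 'start' value plus the first element of each inner list
rows = [{'start': start, **{key: values[0] for key, values in details.items()}}
        for start, details in form]

# Passing columns fixes the column order explicitly
df = pd.DataFrame(rows, columns=['start', 'a', 'b'])
print(df)
```

Building all records first and constructing the DataFrame once also avoids growing it row by row inside a loop.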

Updated with the code I tried.

What is the purpose of res.apply(lambda x: x.str[0])?

Your inner dictionaries contain lists; this is one way to extract the first (and only) element from each list. Nice one @jpp +1
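To see that behaviour in isolation, here is a small sketch on a single column. The Series `s` is a made-up stand-in for one column of `res`, not data from the question.

```python
import pandas as pd

# Hypothetical column whose cells are one-element lists
s = pd.Series([['apple1'], ['apple2']], index=['ab', 'cd'])

# The .str accessor indexes into each cell, so .str[0] works on lists as well as strings
first = s.str[0]
print(first)
```

`res.apply(lambda x: x.str[0])` simply does this for every column of the DataFrame in turn.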
