Python: How to convert a DataFrame to nested NamedTuples with an aggregation level
I'm looking for a way to create nested NamedTuples from a DataFrame. The object d is the expected output. I'm not sure whether I have to do the aggregation directly in Pandas and then convert the result to NamedTuple.
from typing import NamedTuple
from typing import List
import pandas as pd
if __name__ == "__main__":
    data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
    People = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])
    names = list(People[["Name"]].itertuples(name="Names", index=False))
    postal_codes = list(
        People[["PostalCode"]].itertuples(name="PostalCode", index=False)
    )
    # ...
    # ... The code below builds the expected output by hand;
    # the names of the NamedTuple classes don't matter.
    PeopleName = NamedTuple("PeopleName", [("Name", str)])
    PeoplePC = NamedTuple("PeoplePC", [("PostalCode", str)])
    Demography = NamedTuple(
        "Demography", [("names", List[PeopleName]), ("postalcodes", PeoplePC)]
    )
    d = [
        Demography(
            [PeopleName(Name="tom"), PeopleName(Name="juli")],
            PeoplePC(PostalCode="ab 11"),
        ),
        Demography(
            [PeopleName(Name="nick")],
            PeoplePC(PostalCode="ab 22"),
        ),
    ]
You can use groupby() and convert each resulting group into a nested tuple:
from typing import NamedTuple, List
import pandas as pd
data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])
PeopleName = NamedTuple("PeopleName", [("Name", str)])
PeoplePC = NamedTuple("PeoplePC", [("PostalCode", str)])
Demography = NamedTuple("Demography", [("names", List[PeopleName]), ("postalcodes", PeoplePC)])
def to_nested_tuple(k, g):
    # k is the group key (the postal code), g is the sub-DataFrame for that key
    peoples = list(g['Name'].to_frame().itertuples(name='Person', index=False))
    return Demography(peoples, PeoplePC(k))
d = [to_nested_tuple(*item) for item in people.groupby('PostalCode')]
print(d)
Output
[Demography(names=[Person(Name='tom'), Person(Name='juli')], postalcodes=PeoplePC(PostalCode='ab 11')), Demography(names=[Person(Name='nick')], postalcodes=PeoplePC(PostalCode='ab 22'))]
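Note that itertuples(name='Person') creates a fresh namedtuple class called Person, so the list elements in the output above are not PeopleName instances. If that matters, a minimal sketch (one possible variant, not part of the original answer) is to construct PeopleName directly from the grouped column. The class-based NamedTuple syntax used here is equivalent to the functional form in the question.

```python
from typing import List, NamedTuple

import pandas as pd


class PeopleName(NamedTuple):
    Name: str


class PeoplePC(NamedTuple):
    PostalCode: str


class Demography(NamedTuple):
    names: List[PeopleName]
    postalcodes: PeoplePC


data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])

# groupby sorts the group keys by default, so "ab 11" comes before "ab 22".
# Grouping by a single column name yields (scalar key, sub-DataFrame) pairs.
d = [
    Demography([PeopleName(name) for name in group["Name"]], PeoplePC(code))
    for code, group in people.groupby("PostalCode")
]
print(d)
```

With this variant the aggregation still happens in Pandas (via groupby), and only the final step turns each group into NamedTuple objects.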
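- This code assumes that only one attribute is retrieved from the DataFrame. What is the option for retrieving several fields, such as …g[['firstname','lastname']]…? Correct me if I'm wrong, but that does not produce a Series.
- Thanks. If you need multiple fields, drop the to_frame() call. Does that make sense?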
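To make the last comment concrete, here is a small sketch of the multi-field case. Since the sample data has no firstname/lastname columns, the existing Name and Age columns stand in for them (an assumption made purely for illustration), and the names field is loosely typed as List[tuple] because itertuples generates its own Person class.

```python
from typing import List, NamedTuple

import pandas as pd

data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])

PeoplePC = NamedTuple("PeoplePC", [("PostalCode", str)])
Demography = NamedTuple(
    "Demography", [("names", List[tuple]), ("postalcodes", PeoplePC)]
)


def to_nested_tuple(k, g):
    # Selecting with a list of columns already returns a DataFrame,
    # so the to_frame() call is no longer needed.
    persons = list(g[["Name", "Age"]].itertuples(name="Person", index=False))
    return Demography(persons, PeoplePC(k))


d = [to_nested_tuple(k, g) for k, g in people.groupby("PostalCode")]
print(d[0].names[0].Name, d[0].names[0].Age)
```

Each Person tuple now carries one field per selected column, in the order the columns were listed.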