Python: trying to reshape a DataFrame to upload to a Google BigQuery table
I have data like this:
Raw_Title             Custom_Field
Manager               Ben
Manager               Ron
Manager               Liz
Severity              4 - Low
Severity              2 - High
Severity              1 - Urgent
Type of Dataset       Private
Type of Dataset       Public
Type of Dataset       Public
Request Category      Company :: Add
Request Category      User :: Add User
Request Category      User :: Remove User
Incident Category     Pipeline :: Cloud
Incident Category     UI :: Other
Incident Category     UI :: Authentication
Platform Environment  Staging
Platform Environment  Development
Platform Environment  Production
I am trying to reshape it into:
Manager  Severity    Type of Dataset  Request Category     Incident Category     Platform Environment
Ben      4 - Low     Private          Company :: Add       Pipeline :: Cloud     Staging
Ron      2 - High    Public           User :: Add User     UI :: Other           Development
Liz      1 - Urgent  Public           User :: Remove User  UI :: Authentication  Production
I thought the solution would be something like this:
df = pd.DataFrame(filtered_df, columns = ['Manager','Severity','Type of Dataset','Request Category ','Incident Category','Platform Environment'])
print(df)
However, this gives me a completely empty DataFrame.
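All I want to do is take "Raw_Title" and turn it from rows into columns, then list the data points from "Custom_Field" under each "Raw_Title". How can I do that? I have to convert the data into this format so that I can export everything to a Google BigQuery table. Thanks for looking.

You can use DataFrame pivot to achieve this.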
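The resulting DataFrame will have null values in each row for every column that row has no data for. If you have an identifier that links the rows together, use it as the index (index=).

Thanks. This seems to work well. The only problem is that I'm looking for a more dynamic solution. I just want to list the field names, as I showed in the second image, and have the data points populated so that all the rows are filled. It looks like your solution works, but I can't manually type in all the data points on every run. The data is pulled from a database, and I'm using Python to transform it and load it into another database.

I only typed them in manually because I don't have access to your database. As long as the data is in a DataFrame with those column names, the ".pivot" function will work.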
import pandas as pd

# Sample data typed in by hand; in practice this frame would come from the database.
df = pd.DataFrame({'Raw_Title': ['Manager', 'Manager', 'Manager',
                                 'Severity', 'Severity', 'Severity',
                                 'Type of Dataset', 'Type of Dataset', 'Type of Dataset',
                                 'Request Category', 'Request Category', 'Request Category',
                                 'Incident Category', 'Incident Category', 'Incident Category',
                                 'Platform Environment', 'Platform Environment', 'Platform Environment'],
                   'Custom_Field': ['Ben', 'Ron', 'Liz',
                                    '4 - Low', '2 - High', '1 - Urgent',
                                    'Private', 'Public', 'Public',
                                    'Company :: Add', 'User :: Add User', 'User :: Remove User',
                                    'Pipeline :: Cloud', 'UI :: Other', 'UI :: Authentication',
                                    'Staging', 'Development', 'Production']})

# Turn each distinct Raw_Title into a column. No index= is given, so the original
# row index is kept and each row holds a single non-null value.
dfPivoted = df.pivot(columns='Raw_Title', values='Custom_Field')
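
If the source data has no identifier linking the rows, one possible way to get fully populated rows is to derive one. The sketch below is not part of the original answer. It assumes that the n-th occurrence of each "Raw_Title" belongs to the n-th record, which holds for the sample data but may not hold for data coming from the real database. The names `record` and `wide` are made up for illustration.

```python
import pandas as pd

# Same long-format sample data as above, built compactly for this sketch.
titles = ['Manager', 'Severity', 'Type of Dataset',
          'Request Category', 'Incident Category', 'Platform Environment']
df = pd.DataFrame({
    'Raw_Title': [t for t in titles for _ in range(3)],
    'Custom_Field': ['Ben', 'Ron', 'Liz',
                     '4 - Low', '2 - High', '1 - Urgent',
                     'Private', 'Public', 'Public',
                     'Company :: Add', 'User :: Add User', 'User :: Remove User',
                     'Pipeline :: Cloud', 'UI :: Other', 'UI :: Authentication',
                     'Staging', 'Development', 'Production'],
})

# ASSUMPTION: the n-th occurrence of each Raw_Title belongs to the n-th record.
# cumcount() numbers the rows inside each Raw_Title group as 0, 1, 2, ...
df['record'] = df.groupby('Raw_Title').cumcount()

# Pivot on the derived identifier so all values of one record land in one row.
wide = df.pivot(index='record', columns='Raw_Title', values='Custom_Field')

# pivot sorts the new columns alphabetically; restore first-appearance order.
wide = wide[df['Raw_Title'].drop_duplicates().tolist()]

# Drop the leftover axis labels so only plain columns remain.
wide.columns.name = None
wide = wide.reset_index(drop=True)

print(wide)
```

Because the column list is read from the data itself, nothing has to be typed in by hand on each run. If the database does provide a real record key, passing that column as `index=` instead of the derived counter is the safer choice.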