Python: removing elements from a DataFrame row iterator before loading into JSON (to_json)


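I'm trying to remove email and userid from the data_json string. Both are already extracted earlier in the loop, and I don't want these fields listed twice. This is the Python script I'm using: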

credentials = service_account.Credentials.from_service_account_file('/keys/json_poc.txt')
project_id = 'myproject'
bq_conn = bigquery.Client(credentials=credentials,project=project_id)


 # setup sql query
bq_sql = ("""
select email , userid, * except (email , userid) from dataset.usertable
""")

# load up dataframe
df = bq_conn.query(bq_sql).to_dataframe()

# iterate over rows
for i, row in df.iterrows():
    
    # pull out email and userid from df
    email = df.loc[i]['email']
    userid = df.loc[i]['userid']
    
    # build json data fields for iterable
    data_json = df.loc[i].to_json(orient="columns") # remove fields
    
    # json format {"email": "email@yahoo.com","dataFields": {},"userId": "99999"}
    final_json = '{"email": "' + email + '","dataFields": ' + data_json + ',"userId": "' + userid + '"}'
    r = requests.request(method="POST", url=url, headers=headers, data=final_json)
    print('added: ' + email)
The output from the API (GET) looks like this:

b'{"user":{"email":"email@yahoo.com","dataFields":{"email":"email@yahoo.com","userId":"99999","StateofResidence":"CA",......,"categoryid":99},"userId":"99999"}}'
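Above, you can see that the email and userid fields are listed twice. We will have around 500 attributes per user, so building the JSON string by pulling out elements manually is not practical (also, I want to be able to add columns to the source table without updating the Python).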
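
The duplication can be reproduced without BigQuery or the API. A minimal sketch, assuming a hypothetical one-row frame (`sample_df` and its values are made up for illustration): because `to_json` is called on the whole row, `email` and `userid` end up among the data fields as well.

```python
import json
import pandas as pd

# Hypothetical sample data standing in for the BigQuery result
sample_df = pd.DataFrame(
    [{"email": "email@yahoo.com", "userid": "99999", "StateofResidence": "CA", "categoryid": 99}]
)

row = sample_df.loc[0]
# Serializing the full row keeps email and userid among the data fields
data_fields = json.loads(row.to_json())
duplicated = [k for k in ("email", "userid") if k in data_fields]
print(duplicated)
```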


Any ideas would be greatly appreciated.

Sorted it out. I only needed to change the following:

import datetime
import json

def myconverter(o):
    # fallback for json.dumps: cast datetime values to strings
    if isinstance(o, datetime.datetime):
        return o.__str__()

bq_sql = ("""select email , userid, * except (email , userid) from dataset.usertable 
""")

df = bq_conn.query(bq_sql).to_dataframe()

for i, row in df.iterrows():

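    # email and userId stay at the top level; every other column goes under dataFields
    payload_dict = {
        "email": row["email"],
        "dataFields": row.drop(labels=["userid", "email"]).to_dict(),
        "userId": row["userid"],  # key cased as "userId" to match the format shown in the question
    }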

    json_payload = json.dumps(payload_dict, default = myconverter) #cast datetime
    
    r = requests.request(method="POST", url=url, headers=headers, data=json_payload)
    print(r.status_code)
    print(r.content)
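
To check this approach without BigQuery or a live endpoint, the payload construction can be pulled into a small function and run on sample data. This is only a sketch under assumptions: `build_payload`, `to_jsonable`, `sample_df`, and the `signup` column are hypothetical names, and the top-level key is written as `userId` to match the format quoted in the question.

```python
import datetime
import json

import numpy as np
import pandas as pd

def to_jsonable(o):
    # Fallback for json.dumps: render datetimes (including pandas Timestamps) as strings
    if isinstance(o, datetime.datetime):
        return str(o)
    # Convert NumPy scalars to plain Python values, in case any survive to_dict()
    if isinstance(o, np.generic):
        return o.item()
    raise TypeError(f"Cannot serialize {type(o).__name__}")

def build_payload(row):
    # Hypothetical helper: identifiers at the top level, everything else under dataFields
    return {
        "email": row["email"],
        "dataFields": row.drop(labels=["email", "userid"]).to_dict(),
        "userId": row["userid"],
    }

# Hypothetical sample data standing in for the BigQuery result
sample_df = pd.DataFrame([{
    "email": "email@yahoo.com", "userid": "99999", "StateofResidence": "CA",
    "categoryid": 99, "signup": datetime.datetime(2020, 1, 2, 3, 4, 5),
}])

payloads = [
    json.dumps(build_payload(row), default=to_jsonable)
    for _, row in sample_df.iterrows()
]
print(payloads[0])
```

Unlike `myconverter`, which returns `None` for anything that is not a datetime, `to_jsonable` raises for unsupported types, so unexpected values surface as errors instead of being sent as `null`.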