Returning multiple values in Python and appending them to separate columns of a DataFrame
Background: I have a function that fetches a set of attributes from a database. Here is the function:
import json
import requests

def getData(key, full_name, address, city, state, zipcode):
    try:
        url = 'https://personator.melissadata.net/v3/WEB/ContactVerify/doContactVerify'
        payload = {
            'TransmissionReference': "test",  # used by you to keep track of reference
            'Actions': 'Check',
            # requested columns; assumed here to be sent as one comma-separated string
            'Columns': 'Gender,DateOfBirth,DateOfDeath,EthnicCode,EthnicGroup,Education,PoliticalParty,MaritalStatus,HouseholdSize,ChildrenAgeRange,PresenceOfChildren,PresenceOfSenior,LengthOfResidence,OwnRent,CreditCardUser,Occupation,HouseholdIncome',
            'CustomerID': key,  # key
            'Records': [{'FullName': str(full_name), 'AddressLine1': str(address), 'City': str(city), 'State': str(state), 'PostalCode': str(zipcode)}]
        }
        headers = {'Content-Type': 'application/json; charset=utf-8', 'Accept': 'application/json', 'Host': 'personator.melissadata.net', 'Expect': '100-continue', 'Connection': 'Keep-Alive'}
        r = requests.post(url, data=json.dumps(payload), headers=headers)
        dom = json.loads(r.text)
        # every attribute is read from the first (and only) returned record
        Gender = dom['Records'][0]['Gender']
        DateOfBirth = dom['Records'][0]['DateOfBirth']
        DateOfDeath = dom['Records'][0]['DateOfDeath']
        EthnicCode = dom['Records'][0]['EthnicCode']
        EthnicGroup = dom['Records'][0]['EthnicGroup']
        Education = dom['Records'][0]['Education']
        PoliticalParty = dom['Records'][0]['PoliticalParty']
        MaritalStatus = dom['Records'][0]['MaritalStatus']
        HouseholdSize = dom['Records'][0]['HouseholdSize']
        ChildrenAgeRange = dom['Records'][0]['ChildrenAgeRange']
        PresenceOfChildren = dom['Records'][0]['PresenceOfChildren']
        PresenceOfSenior = dom['Records'][0]['PresenceOfSenior']
        LengthOfResidence = dom['Records'][0]['LengthOfResidence']
        OwnRent = dom['Records'][0]['OwnRent']
        CreditCardUser = dom['Records'][0]['CreditCardUser']
        Occupation = dom['Records'][0]['Occupation']
        HouseholdIncome = dom['Records'][0]['HouseholdIncome']
        return Gender  # only one of the attributes is returned
    except Exception:
        return None
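To generate a 'Gender' column, I wrap the function in a lambda, like so:
# axis=1 passes each row (not each column) to the lambda
df['Gender'] = df.apply(lambda row: getData(key, row['Full Name'], row['Address'], row['City'], row['State'], row['Zipcode']), axis=1)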
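Goal:
I want to run this process for all the other attributes listed in the function at the same time. How can I do this in Python?

You can return a dictionary and then expand the resulting Series of dictionary objects: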
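fields = ['Gender', 'DateOfBirth', 'DateOfDeath', 'EthnicCode', 'EthnicGroup',
          'Education', 'PoliticalParty', 'MaritalStatus', 'HouseholdSize',
          'ChildrenAgeRange', 'PresenceOfChildren', 'PresenceOfSenior',
          'LengthOfResidence', 'OwnRent', 'CreditCardUser', 'Occupation',
          'HouseholdIncome']

def getData(key, full_name, address, city, state, zipcode):
    try:
        # your code as before
        dom = json.loads(r.text)
        # one dictionary entry per requested field
        return {k: dom['Records'][0][k] for k in fields}
    # modify below: good practice to specify exactly which error(s) to catch
    except Exception:
        return {}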
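Then expand your Series of dictionaries:
dcts = df.apply(lambda row: getData(key, row['Full Name'], row['Address'], row['City'],
                                    row['State'], row['Zipcode']), axis=1)
# reuse df's index so the join lines up even when df does not have a default 0..n-1 index
df = df.join(pd.DataFrame(dcts.tolist(), index=df.index))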
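To try the expansion step without calling the web service, here is a minimal sketch. `fake_get_data`, the sample names, and the shortened `FIELDS` list are all made up for illustration; only the `apply` / `tolist` / `join` pattern comes from the answer above. It also shows what happens when a lookup fails and `{}` is returned: that row simply gets missing values in the new columns.

```python
import pandas as pd

FIELDS = ['Gender', 'MaritalStatus']  # shortened, hypothetical field list

def fake_get_data(full_name):
    """Hypothetical stand-in for getData: returns a dict, or {} on failure."""
    records = {
        'Ann Lee': {'Gender': 'F', 'MaritalStatus': 'Single', 'Extra': 'x'},
        'Bob Ray': {'Gender': 'M', 'MaritalStatus': 'Married', 'Extra': 'y'},
    }
    try:
        record = records[full_name]
        return {k: record[k] for k in FIELDS}
    except KeyError:
        return {}

# A non-default index, to show that the join still lines up row by row
df = pd.DataFrame({'Full Name': ['Ann Lee', 'Bob Ray', 'Nobody']},
                  index=[10, 20, 30])

# Series holding one dict per row
dcts = df.apply(lambda row: fake_get_data(row['Full Name']), axis=1)

# One column per dict key; rows whose dict is empty get NaN
expanded = pd.DataFrame(dcts.tolist(), index=df.index)
result = df.join(expanded)
print(result)
```

Passing `index=df.index` is a small precaution: without it the expanded frame gets a fresh 0..n-1 index, which only matches `df` if `df` uses the default index too.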
Following @spaniard's comment, if you want all available fields, just use:
return json.loads(r.text)['Records'][0]
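If the whole record is returned, the columns can still be narrowed down afterwards on the pandas side. The sketch below assumes a made-up `fake_full_record` function and invented sample values in place of the real request; `reindex(columns=...)` keeps only the wanted fields and fills any field the record did not contain with NaN instead of raising.

```python
import pandas as pd

def fake_full_record(full_name):
    """Hypothetical stand-in that returns an entire record as a dict."""
    return {
        'Gender': 'F' if full_name == 'Ann Lee' else 'M',
        'Occupation': 'Engineer',
        'RecordID': '1',  # an extra key we do not want as a column
    }

df = pd.DataFrame({'Full Name': ['Ann Lee', 'Bob Ray']})

# One full record per row, expanded into columns
records = [fake_full_record(name) for name in df['Full Name']]
all_fields = pd.DataFrame(records, index=df.index)

# Keep only the wanted fields; fields absent from the records become NaN
wanted = ['Gender', 'Occupation', 'HouseholdIncome']
result = df.join(all_fields.reindex(columns=wanted))
print(result)
```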
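Comments:

Does the database record contain more attributes than the ones you want to fetch? It looks like you already have the columns you want in the payload, right? Maybe he could return dom['Records'][0] from getData() and then use **getData(…)

@spaniard, good point. I was just concerned that the fields might be a subset of all available fields.

Yes, I know. I mentioned it because the fields he wants are the same ones he requests in payload['Columns'].

@spaniard, if your approach is faster, please post it as an answer :)

jpp's solution is good, you should accept it, @snorlaxxx. He can edit my comment into it if he likes, but a new answer makes no sense because it would be 99% the same. I don't think there is a performance difference, at least not a noticeable one.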