Python: dynamically writing a MySQL table schema to Avro format
I want to collect data from a MySQL table and convert it to Avro format using Python. Consider this table in MySQL:
dept_no, dept_name
'd001', 'Marketing'
'd002', 'Finance'
'd003', 'Human Resources'
'd004', 'Production'
'd005', 'Development'
'd006', 'Quality Management'
'd007', 'Sales'
'd008', 'Research'
'd009', 'Customer Service'
mycursor.execute('select * from employees')
results = mycursor.fetchall()
When I fetch the results with the query above, I get them in a tuple-like format (a list of tuples).
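As an illustration of what that tuple-like result looks like and how it can be paired with column names, here is a minimal sketch. The sample rows and the helper name `rows_to_records` are hypothetical; with a real DB-API cursor, the names could be taken from `mycursor.description`, whose entries start with the column name.

```python
# Hypothetical stand-ins for what a cursor would provide:
#   column_names = [d[0] for d in mycursor.description]
#   rows = mycursor.fetchall()
column_names = ["dept_no", "dept_name"]
rows = [("d001", "Marketing"), ("d002", "Finance")]


def rows_to_records(column_names, rows):
    """Pair every row tuple with the column names, giving one dict per row."""
    return [dict(zip(column_names, row)) for row in rows]


records = rows_to_records(column_names, rows)
print(records)
```

Because the names come from the cursor rather than from the code, the same helper should work for any table without hard-coding its columns.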
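To convert to Avro format, however, the schema must be defined in the form shown below.
By hard-coding it, we can produce the following format.
The following is used to generate the Avro file: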
schema = {
    'doc': 'A weather reading.',
    'name': 'Weather',
    'namespace': 'test',
    'type': 'record',
    'fields': [
        {'name': 'dept_no', 'type': 'string'},
        {'name': 'dept_name', 'type': 'string'},
    ],
}
And the records as
records = [
    {u'dept_no': u'd001', u'dept_name': 'Marketing'},
    {u'dept_no': u'd002', u'dept_name': 'Finance'},
    {u'dept_no': u'd003', u'dept_name': 'Human Resources'},
    {u'dept_no': u'd004', u'dept_name': 'Production'},
]
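The post mentions code that generates the Avro file but does not show it. The sketch below is one way that step might look, assuming the third-party `fastavro` package (the post does not say which library is used). It writes to an in-memory buffer and reads the records back; the helper name `avro_roundtrip` is hypothetical, and the block skips the round trip if `fastavro` is not installed.

```python
import io

try:
    # fastavro is a third-party package; its use here is an assumption.
    from fastavro import parse_schema, reader, writer
except ImportError:
    parse_schema = reader = writer = None

schema = {
    "doc": "A weather reading.",
    "name": "Weather",
    "namespace": "test",
    "type": "record",
    "fields": [
        {"name": "dept_no", "type": "string"},
        {"name": "dept_name", "type": "string"},
    ],
}
records = [
    {"dept_no": "d001", "dept_name": "Marketing"},
    {"dept_no": "d002", "dept_name": "Finance"},
]


def avro_roundtrip(schema, records):
    """Serialize the records to Avro in memory, then read them back."""
    buffer = io.BytesIO()
    writer(buffer, parse_schema(schema), records)
    buffer.seek(0)
    return list(reader(buffer))


# Only attempt the round trip when fastavro could be imported.
decoded = avro_roundtrip(schema, records) if writer is not None else None
print(decoded)
```

To produce a file instead, the buffer could be swapped for a file opened in binary write mode, for example `open("departments.avro", "wb")` (a hypothetical filename).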
The question here is how to use Python to dynamically map the schema and the data into the format above.
import mysql.connector

# Connection details are masked in the post
mydb = mysql.connector.connect(
    host="********************",
    user="***********",
    passwd="**********",
    database='***********'
)

def byte_to_string(x):
    # Decode the first column of every row from bytes to str
    temp_table_list = []
    for row in x:
        table = row[0].decode()
        temp_table_list.append(table)
    return temp_table_list
mycursor = mydb.cursor()
#Query to list all the tables
mycursor.execute("show tables")
r = mycursor.fetchall()
r = byte_to_string(r)
print(r)
x = len(r)
#Fetch all the records from table EMPLOYEES using Select *
mycursor.execute('select * from employees')
results = mycursor.fetchall()
print(type(results))
print(results)
#Displays data of table employees record by record
for i in results:
    print(i)
    print(type(i))
#Fetching data from the 2nd table, departments (queried once, then split by column)
mycursor.execute('select * from departments')
rows = mycursor.fetchall()
data = [i[0] for i in rows]
data1 = [i[1] for i in rows]
print(data)
print(data1)
#zipbObj = zip(data,data1)
#dictOfWords = dict(zipbObj)
#print(dictOfWords)
#ORDER BY ORDINAL_POSITION keeps the column names in table order
mycursor.execute('SELECT `COLUMN_NAME` '
                 'FROM `INFORMATION_SCHEMA`.`COLUMNS` '
                 'WHERE `TABLE_SCHEMA`="triggerdb1" '
                 'AND `TABLE_NAME`="departments" '
                 'ORDER BY `ORDINAL_POSITION`')
#Fetching the column names of the table as keys
keys = [i[0] for i in mycursor.fetchall()]
print(keys)
'''
zipbObj = zip(column_schema,data1)
dictOfWords = dict(zipbObj)
print(dictOfWords)
'''
#Finally we get the header as keys and the column values as lists in a dict
#(data holds column 0, dept_no; data1 holds column 1, dept_name)
abc = {}
abc[keys[0]] = data
abc[keys[1]] = data1
print(abc)
The result is in the form of
#print(data)
>>>['d001', 'd002', 'd003', 'd004', 'd005', 'd006', 'd007', 'd008', 'd009']
#print(data1)
>>>['Marketing', 'Finance', 'Human Resources', 'Production', 'Development', 'Quality Management', 'Sales', 'Research', 'Customer Service']
#print(keys)
>>>['dept_no', 'dept_name']
#print(abc)
>>>{'dept_no': ['d001', 'd002', 'd003', 'd004', 'd005', 'd006', 'd007', 'd008', 'd009'], 'dept_name': ['Marketing', 'Finance', 'Human Resources', 'Production', 'Development', 'Quality Management', 'Sales', 'Research', 'Customer Service']}
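A dictionary of columns like the one printed above is column-oriented, while Avro records are row-oriented. A minimal sketch of the conversion, using hypothetical sample data and a hypothetical helper name `columns_to_records`:

```python
# Hypothetical column-oriented data, shaped like the dictionary printed above.
columns = {
    "dept_no": ["d001", "d002", "d003"],
    "dept_name": ["Marketing", "Finance", "Human Resources"],
}


def columns_to_records(columns):
    """Turn {column: [values, ...]} into one {column: value} dict per row."""
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]


records = columns_to_records(columns)
print(records)
```

`zip(*columns.values())` walks the value lists in parallel, so each step yields the values of one row.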
The question here is how to use Python to dynamically map the schema and the data from the generated tuples/dictionary to Avro.
schema = {
    'doc': 'A weather reading.',
    'name': 'Weather',
    'namespace': 'test',
    'type': 'record',
    'fields': [
        {'name': 'dept_no', 'type': 'string'},
        {'name': 'dept_name', 'type': 'string'},
    ],
}
And the records as
records = [
    {u'dept_no': u'd001', u'dept_name': 'Marketing'},
    {u'dept_no': u'd002', u'dept_name': 'Finance'},
    {u'dept_no': u'd003', u'dept_name': 'Human Resources'},
    {u'dept_no': u'd004', u'dept_name': 'Production'},
]
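One possible way to avoid hard-coding a schema like this is to derive it from column metadata. The sketch below assumes the metadata has already been read as `(COLUMN_NAME, DATA_TYPE, IS_NULLABLE)` tuples, for example from `INFORMATION_SCHEMA.COLUMNS`. The type mapping, the `"string"` fallback, and the helper name `build_schema` are all assumptions for illustration, not a complete MySQL-to-Avro mapping.

```python
import json

# Assumed, deliberately incomplete mapping from MySQL DATA_TYPE to Avro types.
MYSQL_TO_AVRO = {
    "char": "string", "varchar": "string", "text": "string",
    "tinyint": "int", "smallint": "int", "int": "int",
    "bigint": "long", "float": "float", "double": "double",
}


def build_schema(table_name, columns, namespace="test"):
    """Build an Avro record schema dict from (name, data_type, is_nullable) tuples."""
    fields = []
    for name, data_type, is_nullable in columns:
        # Falling back to "string" for unmapped types is an assumption.
        avro_type = MYSQL_TO_AVRO.get(data_type.lower(), "string")
        if is_nullable == "YES":
            # Nullable columns become a union with "null".
            avro_type = ["null", avro_type]
        fields.append({"name": name, "type": avro_type})
    return {
        "type": "record",
        "name": table_name,
        "namespace": namespace,
        "fields": fields,
    }


# Hypothetical metadata for the departments table.
columns = [("dept_no", "char", "NO"), ("dept_name", "varchar", "NO")]
schema = build_schema("departments", columns)
print(json.dumps(schema, indent=2))
```

The resulting dictionary has the same shape as the hard-coded schema, so it could be handed to an Avro writer together with the row dictionaries.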
Thanks in advance.
You seem to have a broad question, like "How do I build a website?". Think it over first, then ask a more specific question.
Yes, that's true. Thank you @Eric Fossum, I'll try to phrase it better. Do you have any ideas so far?