
Python: converting JSON documents to a byte stream for Kafka


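I'm having trouble converting my data into the correct byte stream so that it loads into a Spark DataFrame with the right schema.

This is my data. The file contains one JSON document per line: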

{"crime_id": "Crime Id", "original_crime_type_name": "Original Crime Type Name", "report_date": "Report Date", "call_date": "Call Date", "offense_date": "Offense Date", "call_time": "Call Time", "call_date_time": "Call Date Time", "disposition": "Disposition", "address": "Address", "city": "City", "state": "State", "agency_id": "Agency Id", "address_type": "Address Type", "common_location": "Common Location"}
{"crime_id": "192924201", "original_crime_type_name": "Fraud", "report_date": "2019-10-19T00:00:00.000", "call_date": "2019-10-19T00:00:00.000", "offense_date": "2019-10-19T00:00:00.000", "call_time": "23:55", "call_date_time": "2019-10-19T23:55:00.000", "disposition": "REP", "address": "2000 Block Of Mcallister St", "city": "San Francisco", "state": "CA", "agency_id": "1", "address_type": "Premise Address", "common_location": ""}
I read the file and send each line to Kafka with:

    def generate_data(self):
        with open(self.input_file) as f:
            for line in f:
                message = self.dict_to_binary(line)
                self.send(self.topic, message)
                time.sleep(1)

    def dict_to_binary(self, json_dict):
        return json.dumps(json_dict).encode('utf-8')
Kafka then ends up with these messages:

|"{\"Crime Id\": \"Crime Id\", \"Original Crime Type Name\": \"Original Crime Type Name\", \"Report Date\": \"Report Date\", \"Call Date\": \"Call Date\", \"Offense Date\": \"Offense Date\", \"Call Time\": \"Call Time\", \"Call Date Time\": \"Call Date Time\", \"Disposition\": \"Disposition\", \"Address\": \"Address\", \"City\": \"City\", \"State\": \"State\", \"Agency Id\": \"Agency Id\", \"Address Type\": \"Address Type\", \"Common Location\": \"Common Location\"}\n"                                                        |
|"{\"Crime Id\": \"192924201\", \"Original Crime Type Name\": \"Fraud\", \"Report Date\": \"2019-10-19T00:00:00.000\", \"Call Date\": \"2019-10-19T00:00:00.000\", \"Offense Date\": \"2019-10-19T00:00:00.000\", \"Call Time\": \"23:55\", \"Call Date Time\": \"2019-10-19T23:55:00.000\", \"Disposition\": \"REP\", \"Address\": \"2000 Block Of Mcallister St\", \"City\": \"San Francisco\", \"State\": \"CA\", \"Agency Id\": \"1\", \"Address Type\": \"Premise Address\", \"Common Location\": \"\"}\n"                                  |
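The escaped quotes and the trailing \n inside the messages above suggest that each value is a JSON string literal wrapping the original line, rather than a JSON object. That is what json.dumps produces when it is handed a str instead of a dict, and generate_data passes the raw line. A minimal sketch of the effect using only the standard library; line_to_binary is a hypothetical replacement for dict_to_binary, not code from the question, and the sample line is shortened:

```python
import json

# One line as read from the input file (shortened for illustration).
line = '{"crime_id": "192924201", "city": "San Francisco"}\n'


def dict_to_binary(json_dict):
    # Same body as the method in the question.
    return json.dumps(json_dict).encode("utf-8")


def line_to_binary(line):
    # Hypothetical replacement: parse the text first, then serialize the dict.
    record = json.loads(line)
    return json.dumps(record).encode("utf-8")


double_encoded = dict_to_binary(line)
single_encoded = line_to_binary(line)

# Decoding once shows what a consumer actually receives.
print(type(json.loads(double_encoded)).__name__)  # str
print(type(json.loads(single_encoded)).__name__)  # dict
```

If every line in the file is already valid JSON, sending line.strip().encode('utf-8') directly would likely work as well, since no re-serialization is needed.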
But when I try to print the service table to the console, I get:

+--------+------------------------+-----------+---------+------------+---------+--------------+-----------+-------+----+-----+---------+------------+---------------+
|crime_id|original_crime_type_name|report_date|call_date|offense_date|call_time|call_date_time|disposition|address|city|state|agency_id|address_type|common_location|
+--------+------------------------+-----------+---------+------------+---------+--------------+-----------+-------+----+-----+---------+------------+---------------+
|null    |null                    |null       |null     |null        |null     |null          |null       |null   |null|null |null     |null        |null           |
|null    |null                    |null       |null     |null        |null     |null          |null       |null   |null|null |null     |null        |null           |
|null    |null                    |null       |null     |null        |null     |null          |null       |null   |null|null |null     |null        |null           |
All the fields in the schema are StringType(), and I'm not sure why the conversion isn't working correctly. Any help?
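One way to narrow this down before involving Spark is to check each Kafka value against the column names. The sketch below assumes the consumer parses the value with something like Spark's from_json (the question does not show that code), which, as far as I understand, yields null when the value is not a JSON object or when a field name is absent. check_message is a hypothetical helper, and the field list is copied from the table header above:

```python
import json

# Column names copied from the table header in the question.
SCHEMA_FIELDS = [
    "crime_id", "original_crime_type_name", "report_date", "call_date",
    "offense_date", "call_time", "call_date_time", "disposition",
    "address", "city", "state", "agency_id", "address_type",
    "common_location",
]


def check_message(value, fields=SCHEMA_FIELDS):
    """Hypothetical helper: describe why a Kafka value might parse to nulls."""
    try:
        decoded = json.loads(value)
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return "not valid JSON"
    if not isinstance(decoded, dict):
        return "top level is %s, not a JSON object" % type(decoded).__name__
    missing = [name for name in fields if name not in decoded]
    if missing:
        return "missing fields: " + ", ".join(missing)
    return "ok"


# A value shaped like the messages shown in the question (shortened).
print(check_message(b'"{\\"Crime Id\\": \\"192924201\\"}\\n"'))
# A JSON object whose keys do not match the schema names.
print(check_message(b'{"Crime Id": "192924201"}'))
```

Note that the messages shown in the question use keys such as "Crime Id" while the schema columns are named crime_id, so the field names may need to be aligned in addition to fixing the encoding.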
