How do I connect Data Science Experience / Spark as a Service to Message Hub using confluent-kafka Python?

The Bluemix Message Hub documentation points Python users to the confluent-kafka library:

So I tried to install it:

!pip install --user confluent-kafka
However, I ran into the following error:

Collecting confluent-kafka
  Using cached confluent-kafka-0.9.1.2.tar.gz
Installing collected packages: confluent-kafka
  Running setup.py install for confluent-kafka ... - \ error
    Complete output from command /usr/local/src/bluemix_jupyter_bundle.v22/notebook/bin/python -u -c "import setuptools, tokenize;__file__='/gpfs/global_fs01/sym_shared/YPProdSpark/user/xxxx/notebook/tmp/pip-build-N3zDUh/confluent-kafka/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /gpfs/fs01/user/xxxx/notebook/tmp/pip-PyAwq2-record/install-record.txt --single-version-externally-managed --compile --user --prefix=:
    running install
    running build
    running build_py
    creating build
    creating build/lib.linux-x86_64-2.7
    creating build/lib.linux-x86_64-2.7/confluent_kafka
    copying confluent_kafka/__init__.py -> build/lib.linux-x86_64-2.7/confluent_kafka
    creating build/lib.linux-x86_64-2.7/confluent_kafka/kafkatest
    copying confluent_kafka/kafkatest/__init__.py -> build/lib.linux-x86_64-2.7/confluent_kafka/kafkatest
    copying confluent_kafka/kafkatest/verifiable_consumer.py -> build/lib.linux-x86_64-2.7/confluent_kafka/kafkatest
    copying confluent_kafka/kafkatest/verifiable_producer.py -> build/lib.linux-x86_64-2.7/confluent_kafka/kafkatest
    copying confluent_kafka/kafkatest/verifiable_client.py -> build/lib.linux-x86_64-2.7/confluent_kafka/kafkatest
    running build_ext
    building 'confluent_kafka.cimpl' extension
    creating build/temp.linux-x86_64-2.7
    creating build/temp.linux-x86_64-2.7/confluent_kafka
    creating build/temp.linux-x86_64-2.7/confluent_kafka/src
    gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/usr/local/src/bluemix_jupyter_bundle.v22/notebook/include/python2.7 -c confluent_kafka/src/confluent_kafka.c -o build/temp.linux-x86_64-2.7/confluent_kafka/src/confluent_kafka.o
    In file included from confluent_kafka/src/confluent_kafka.c:17:0:
    confluent_kafka/src/confluent_kafka.h:20:32: fatal error: librdkafka/rdkafka.h: No such file or directory
     #include <librdkafka/rdkafka.h>
                                    ^
    compilation terminated.
    error: command 'gcc' failed with exit status 1

    ----------------------------------------
Command "/usr/local/src/bluemix_jupyter_bundle.v22/notebook/bin/python -u -c "import setuptools, tokenize;__file__='/gpfs/global_fs01/sym_shared/YPProdSpark/user/xxxx/notebook/tmp/pip-build-N3zDUh/confluent-kafka/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /gpfs/fs01/user/xxxx/notebook/tmp/pip-PyAwq2-record/install-record.txt --single-version-externally-managed --compile --user --prefix=" failed with error code 1 in /gpfs/global_fs01/sym_shared/YPProdSpark/user/xxxx/notebook/tmp/pip-build-N3zDUh/confluent-kafka/
收集合流卡夫卡
使用缓存的confluent-kafka-0.9.1.2.tar.gz
安装收集的软件包:confluent kafka
正在为confluent kafka运行setup.py安装…-\错误
从命令/usr/local/src/bluemix\u jupyter\u bundle.v22/notebook/bin/python-u-c“导入setuptools,tokenize;\uuuu file\uuuu='/gpfs/global\u fs01/sym\u shared/YPProdSpark/user/xxxx/notebook/tmp/pip-build-N3zDUh/confluent kafka/setup.py';exec(编译(getattr(tokenize,'open,'open,'open)(\uu文件)')).read()。替换('r\n',\n,'exec')”install--record/gpfs/fs01/user/xxxx/notebook/tmp/pip-PyAwq2-record/install-record.txt--外部管理的单一版本--编译--用户--前缀=:
正在运行的安装
运行构建
运行build\u py
创建构建
创建build/lib.linux-x86_64-2.7
创建build/lib.linux-x86\u 64-2.7/confluent\u kafka
正在复制confluent_kafka/_init__.py->build/lib.linux-x86_64-2.7/confluent_kafka
创建build/lib.linux-x86\u 64-2.7/confluent\u kafka/kafkatest
正在复制confluent_kafka/kafkatest/_init__.py->build/lib.linux-x86_64-2.7/confluent_kafka/kafkatest
复制confluent_kafka/kafkatest/verifiable_consumer.py->build/lib.linux-x86_64-2.7/confluent_kafka/kafkatest
复制confluent_kafka/kafkatest/verifiable_producer.py->build/lib.linux-x86_64-2.7/confluent_kafka/kafkatest
复制confluent_kafka/kafkatest/verifiable_client.py->build/lib.linux-x86_64-2.7/confluent_kafka/kafkatest
运行build_ext
建筑“confluent_kafka.cimpl”扩建
创建build/temp.linux-x86_64-2.7
创建build/temp.linux-x86\u 64-2.7/confluent\u kafka
创建build/temp.linux-x86\u 64-2.7/confluent\u kafka/src
gcc-pthread-fno严格别名-g-O2-DNDEBUG-g-fwrapv-O3-Wall-Wstrict原型-fPIC-I/usr/local/src/bluemix_jupyter_bundle.v22/notebook/include/python2.7-c confluent_kafka/src/confluent_kafka.c-o build/temp.linux-x86_64-2.7/confluent_kafka/src/confluent_kafka.o
在confluent_kafka/src/confluent_kafka.c:17:0中包含的文件中:
confluent_kafka/src/confluent_kafka.h:20:32:致命错误:librdkafka/rdkafka.h:没有这样的文件或目录
#包括
^
编译终止。
错误:命令“gcc”失败,退出状态为1
----------------------------------------
命令“/usr/local/src/bluemix_jupyter_bundle.v22/notebook/bin/python-u-c”导入setuptools,标记化__文件“/gpfs/global\u fs01/sym\u shared/YPProdSpark/user/xxxx/notebook/tmp/pip-build-N3zDUh/confluent kafka/setup.py”;exec(compile(getattr(tokenize,'open',open)('uuuuu file,'uuuuuu).read().replace('\r\n','\n'),'uuuuu file,'exec'))“install--record/gpfs/fs01/user/xxxx/notebook/tmp/pip-PyAwq2-record/install-record.txt--外部管理的单一版本--compile----user--prefix=”失败,错误代码为1,位于/gpfs/global_fs01/sym_shared/YPProdSpark/user/xxxx/notebook/tmp/pip-build-N3zDUh/confluent kafka中/
According to Confluent, you need to install the librdkafka native library before you can use their Python client.
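
In a notebook you can attempt that build yourself before retrying pip. A rough sketch follows; the librdkafka version tag, the download URL, and the ~/.local install prefix are my assumptions, not something the Message Hub documentation prescribes:

!wget https://github.com/edenhill/librdkafka/archive/v0.9.1.tar.gz -O librdkafka.tar.gz
!tar xzf librdkafka.tar.gz
!cd librdkafka-0.9.1 && ./configure --prefix=$HOME/.local && make && make install
!C_INCLUDE_PATH=$HOME/.local/include LIBRARY_PATH=$HOME/.local/lib pip install --user confluent-kafka

Note that the kernel also needs to find the shared library at import time (LD_LIBRARY_PATH would have to include ~/.local/lib before the Python process starts), which is awkward in a hosted notebook and is another reason the pure-Python client below may be the easier route.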

As you discovered in another comment, there is an alternative pure-Python library that you may find easier to use for Bluemix applications.
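
For reference, a minimal connection sketch, assuming the alternative library is kafka-python (which gained SASL_SSL/PLAIN support in its 1.3.x releases) and that the username, password, and broker list come from the Message Hub service credentials; every concrete value below is a placeholder:

from kafka import KafkaProducer

# Placeholder broker and credentials: substitute the SASL broker list
# and API key from the Message Hub VCAP_SERVICES entry.
producer = KafkaProducer(
    bootstrap_servers='<broker_host>:9093',
    security_protocol='SASL_SSL',
    sasl_mechanism='PLAIN',
    sasl_plain_username='<username>',
    sasl_plain_password='<password>',
)
producer.send('mytopic', b'hello from DSX')  # placeholder topic
producer.flush()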

Note, I was able to connect using the alternative kafka Python library. I think the main problem for me is that Spark as a Service doesn't provide the driver recommended by its sister service, Message Hub. I'm guessing that's because Python support for Spark Streaming with Kafka over SASL isn't available, which restricts the use case of connecting to Message Hub to programs running only on the driver host.
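
To illustrate that restriction, here is a minimal sketch (again assuming kafka-python, with placeholder broker, topic, and credentials) of consuming Message Hub messages entirely within the driver process, since a Python Spark Streaming receiver with SASL isn't available:

from kafka import KafkaConsumer

# This loop runs only in the notebook/driver process; the executors
# never connect to Message Hub themselves.
consumer = KafkaConsumer(
    'mytopic',                                # placeholder topic
    bootstrap_servers='<broker_host>:9093',   # placeholder broker
    security_protocol='SASL_SSL',
    sasl_mechanism='PLAIN',
    sasl_plain_username='<username>',
    sasl_plain_password='<password>',
)
for message in consumer:
    print(message.value)  # e.g. batch records and hand them to sc.parallelize(...)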