Apache Arrow bus error / segfault when using Python bindings


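I am writing data to a Parquet file. Apache Arrow provides a straightforward example for this, where the data flow is essentially: data => arrow::ArrayBuilder => arrow::Array => arrow::Table => Parquet file. This works fine as standalone C++, but when I bind the same code into a Python module and call it from Python (I'm using Python 3.8), a bus error 10 (or segfault 11) consistently occurs at the arrow::ArrayBuilder => arrow::Array step (i.e. inside the ArrayBuilder::Finish function). Does anyone know why this happens, or how to fix it?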

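To work around this I have tried a few adjustments: static versus dynamic library linking, different overloads of ArrayBuilder::Finish, and different tools for creating the Python module/.so (both pybind11 and boost-python). The error persists. It consistently crashes in arrow::ArrayBuilder::Finish(std::shared_ptr<arrow::Array>*). I am running macOS. This simple .py and .cc code is enough to reproduce the error: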

import pybindtest
pybindtest.python_bind_test()
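
// Header names were lost from the original post; these are the headers this code needs.
#include <arrow/api.h>
#include <arrow/io/api.h>
#include <parquet/arrow/writer.h>
#include <parquet/exception.h>
#include <pybind11/pybind11.h>

// Build a small two-column table (int64 and string) to write out.
std::shared_ptr<arrow::Table> generate_table() {
  arrow::Int64Builder i64builder;
  std::shared_ptr<arrow::Array> i64array;
  PARQUET_THROW_NOT_OK(i64builder.AppendValues({2, 4}));
  PARQUET_THROW_NOT_OK(i64builder.Finish(&i64array));  // crash is reported here

  arrow::StringBuilder strbuilder;
  std::shared_ptr<arrow::Array> strarray;
  PARQUET_THROW_NOT_OK(strbuilder.Append("some"));
  PARQUET_THROW_NOT_OK(strbuilder.Append("content"));
  PARQUET_THROW_NOT_OK(strbuilder.Finish(&strarray));

  std::shared_ptr<arrow::Schema> schema = arrow::schema(
      {arrow::field("int", arrow::int64()),
       arrow::field("str", arrow::utf8())});
  return arrow::Table::Make(schema, {i64array, strarray});
}

// Write the table to a Parquet file with a row-group size of 3.
void write_parquet_file(const arrow::Table& table) {
  std::shared_ptr<arrow::io::FileOutputStream> outfile;
  PARQUET_ASSIGN_OR_THROW(outfile, arrow::io::FileOutputStream::Open("pybindtest.parquet"));
  PARQUET_THROW_NOT_OK(parquet::arrow::WriteTable(table, arrow::default_memory_pool(), outfile, 3));
}

void python_bind_test() {
  std::shared_ptr<arrow::Table> table = generate_table();
  write_parquet_file(*table);
}

// Expose python_bind_test() as pybindtest.python_bind_test().
PYBIND11_MODULE(pybindtest, m) {
  m.def("python_bind_test", &python_bind_test);
}
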
Here is the backtrace from one of the core dumps:

$ lldb -c core.84103 
(lldb) target create --core "core.84103"
Core file '/cores/core.84103' (x86_64) was loaded.

(lldb) bt
* thread #1, stop reason = signal SIGSTOP
  * frame #0: 0x00007fff91b52a58 libc++abi.dylib`vtable for __cxxabiv1::__si_class_type_info + 16
    frame #1: 0x0000000103b1f4c8 libarrow.300.0.0.dylib`arrow::ArrayBuilder::Finish(std::__1::shared_ptr<arrow::Array>*) + 40
    frame #2: 0x0000000103a0c492 pybindtest.cpython-38-darwin.so`generate_table() + 642
    frame #3: 0x0000000103a0e298 pybindtest.cpython-38-darwin.so`python_bind_test() + 24
    frame #4: 0x0000000103a4425f pybindtest.cpython-38-darwin.so`void pybind11::detail::argument_loader<>::call_impl<void, void (*&)(), pybind11::detail::void_type>(void (*&)(), pybind11::detail::index_sequence<>, pybind11::detail::void_type&&) && + 31
    frame #5: 0x0000000103a44136 pybindtest.cpython-38-darwin.so`std::__1::enable_if<std::is_void<void>::value, pybind11::detail::void_type>::type pybind11::detail::argument_loader<>::call<void, pybind11::detail::void_type, void (*&)()>(void (*&)()) && + 54
    frame #6: 0x0000000103a43ff2 pybindtest.cpython-38-darwin.so`void pybind11::cpp_function::initialize<void (*&)(), void, pybind11::name, pybind11::scope, pybind11::sibling>(void (*&)(), void (*)(), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&)::'lambda'(pybind11::detail::function_call&)::operator()(pybind11::detail::function_call&) const + 130
    frame #7: 0x0000000103a43f55 pybindtest.cpython-38-darwin.so`void pybind11::cpp_function::initialize<void (*&)(), void, pybind11::name, pybind11::scope, pybind11::sibling>(void (*&)(), void (*)(), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&)::'lambda'(pybind11::detail::function_call&)::__invoke(pybind11::detail::function_call&) + 21
    frame #8: 0x0000000103a2cb62 pybindtest.cpython-38-darwin.so`pybind11::cpp_function::dispatcher(_object*, _object*, _object*) + 4818
    frame #9: 0x00000001035cf164 python`cfunction_call_varargs + 68
    frame #10: 0x00000001035ce3a7 python`_PyObject_MakeTpCall + 167
    frame #11: 0x0000000103713228 python`_PyEval_EvalFrameDefault + 45944
    frame #12: 0x0000000103706060 python`_PyEval_EvalCodeWithName + 560
    frame #13: 0x0000000103780a7c python`PyRun_FileExFlags + 364
    frame #14: 0x0000000103780171 python`PyRun_SimpleFileExFlags + 529
    frame #15: 0x00000001037a8c5a python`pymain_run_file + 394
    frame #16: 0x00000001037a81b6 python`pymain_run_python + 486
    frame #17: 0x00000001037a7f88 python`Py_RunMain + 24
    frame #18: 0x00000001037a9670 python`pymain_main + 32
    frame #19: 0x00000001035a1cb9 python`main + 57
    frame #20: 0x00007fff6b8b7cc9 libdyld.dylib`start + 1
    frame #21: 0x00007fff6b8b7cc9 libdyld.dylib`start + 1
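
When reading a backtrace like the one above, it can help to list which binaries the frames belong to, since that shows exactly which libarrow build was active at the crash. The following is a minimal sketch in Python, not part of the original post; the function name `modules_in_backtrace` and the regular expression are illustrative assumptions about the lldb `bt` line format shown here.

```python
import re

# Matches lldb "bt" lines such as:
#   frame #1: 0x0000000103b1f4c8 libarrow.300.0.0.dylib`arrow::ArrayBuilder::Finish(...) + 40
# Group 1 is the frame number, group 2 the module (binary) name before the backtick.
FRAME_RE = re.compile(r"frame #(\d+): 0x[0-9a-fA-F]+ ([^`]+)`")


def modules_in_backtrace(text):
    """Return the distinct module names in the order their frames first appear."""
    seen = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match and match.group(2) not in seen:
            seen.append(match.group(2))
    return seen


# A few frames copied from the backtrace in the question.
sample = """\
  * frame #0: 0x00007fff91b52a58 libc++abi.dylib`vtable for __cxxabiv1::__si_class_type_info + 16
    frame #1: 0x0000000103b1f4c8 libarrow.300.0.0.dylib`arrow::ArrayBuilder::Finish(std::__1::shared_ptr<arrow::Array>*) + 40
    frame #2: 0x0000000103a0c492 pybindtest.cpython-38-darwin.so`generate_table() + 642
    frame #9: 0x00000001035cf164 python`cfunction_call_varargs + 68
"""

modules = modules_in_backtrace(sample)
print(modules)
```

For these frames the list contains `libc++abi.dylib`, `libarrow.300.0.0.dylib`, `pybindtest.cpython-38-darwin.so` and `python`. The crash lands in libc++abi immediately after a call from libarrow.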

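After further investigation, this error appears to be triggered by a conflict between the arrow-cpp library that I built from source and the pyarrow package that I had installed from conda-forge. I was able to resolve it by installing pyarrow into my conda env via pip instead of pulling it from the conda-forge channel (and likewise for another package in my case, since it depends on pyarrow).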
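
To see which pyarrow an environment actually picks up, something like the following sketch may help. It is not from the original answer; the helper name `describe_pyarrow` is hypothetical, and it relies only on `pyarrow.__version__`, `pyarrow.__file__` and `pyarrow.get_library_dirs()`. It returns `None` when pyarrow is not installed.

```python
import importlib.util


def describe_pyarrow():
    """Report where the importable pyarrow lives, or None if it is not installed."""
    if importlib.util.find_spec("pyarrow") is None:
        return None
    import pyarrow

    return {
        "version": pyarrow.__version__,
        # Location of the Python package (reveals a conda vs pip site-packages path).
        "module_path": pyarrow.__file__,
        # Directories holding the Arrow shared libraries that pyarrow links against.
        "library_dirs": pyarrow.get_library_dirs(),
    }


info = describe_pyarrow()
print(info if info is not None else "pyarrow is not installed in this environment")
```

Comparing the reported library directories with the install prefix of the source-built Arrow is one way to check whether the two builds could be mixed in a single process.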

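Although I don't know the exact cause of this incompatibility, it may be related to the current macOS caveat, quoted here:

Building Arrow on macOS with conda is complicated by the fact that the conda-forge compilers require an older macOS SDK. Conda offers some installation instructions; the alternative is to use Homebrew and pip.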
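
One way such a conflict could show up is the same Arrow library being loaded from two different locations in one process. This is an assumption; the answer does not establish the exact mechanism. The sketch below takes a list of loaded library paths (on macOS these could be collected by running Python with the `DYLD_PRINT_LIBRARIES=1` environment variable set) and flags Arrow or Parquet libraries that appear more than once. The function name `arrow_conflicts` and all paths in the example are hypothetical.

```python
import posixpath
from collections import defaultdict


def arrow_conflicts(loaded_paths):
    """Map each Arrow/Parquet library stem to its paths when loaded from more than one place."""
    by_stem = defaultdict(set)
    for path in loaded_paths:
        # "libarrow.300.0.0.dylib" -> "libarrow"
        stem = posixpath.basename(path).split(".", 1)[0]
        if stem.startswith(("libarrow", "libparquet")):
            by_stem[stem].add(path)
    return {stem: sorted(paths) for stem, paths in by_stem.items() if len(paths) > 1}


# Hypothetical paths for illustration only: a source build under /usr/local
# next to a conda environment that ships its own libarrow.
example_paths = [
    "/usr/local/lib/libarrow.300.0.0.dylib",
    "/opt/conda/envs/demo/lib/libarrow.300.dylib",
    "/usr/local/lib/libparquet.300.0.0.dylib",
    "/usr/lib/libc++abi.dylib",
]

conflicts = arrow_conflicts(example_paths)
print(conflicts)
```

In this made-up example only `libarrow` is reported, because it is the only Arrow library present at two distinct paths.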


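I get an error on import. Could you explain how you compiled the extension with pybind, especially with respect to the Arrow library?
@Christian Thanks for asking. That prompted me to create a bare-bones build repo, and to my surprise the error did not occur there, which led me to the solution. If you are still interested in the build configuration, see: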