PySpark cannot read a Hive ORC transactional table through sparkContext/hiveContext. Can we update/delete Hive table data using PySpark?

apache-spark hadoop hive pyspark

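I tried to access a Hive ORC transactional table (which has underlying delta files on HDFS) using PySpark, but I am unable to read the transactional table through sparkContext/hiveContext.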

/mydim/delta_0117202_0117202

/mydim/delta_0117203_0117203

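Officially, Spark does not yet support Hive ACID tables. Take a full dump or an incremental dump of the ACID table into a regular Hive ORC/Parquet partitioned table, then read that data with Spark.

There is an open Jira to add support for reading Hive ACID tables.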
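The dump step can be sketched as a small helper that builds the HiveQL to run in Hive. This is only an illustration under stated assumptions: the table names `mydb.mydim` and `mydb.mydim_plain` are hypothetical, the target is assumed to already exist as a non-transactional, non-partitioned ORC table, and the optional `where` filter stands in for whatever incremental condition fits your data.

```python
import re

# Accepts "table" or "db.table" made of letters, digits and underscores.
_TABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


def build_dump_statement(acid_table, plain_table, where=None):
    """Return HiveQL that copies rows from an ACID table into a plain table.

    Without `where`, the statement is a full dump that overwrites the target.
    With `where`, it is an incremental dump that appends the matching rows.
    """
    for name in (acid_table, plain_table):
        if not _TABLE_NAME.fullmatch(name):
            raise ValueError("unexpected table name: {!r}".format(name))
    if where is None:
        return "INSERT OVERWRITE TABLE {} SELECT * FROM {}".format(
            plain_table, acid_table
        )
    return "INSERT INTO TABLE {} SELECT * FROM {} WHERE {}".format(
        plain_table, acid_table, where
    )


# Hypothetical table names, for illustration only.
print(build_dump_statement("mydb.mydim", "mydb.mydim_plain"))
```

After the statement has been run in Hive (for example through beeline), the plain copy can be read from PySpark with `spark.table("mydb.mydim_plain")`, or with `hiveContext.table(...)` on older Spark versions.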

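If you run a major compaction on the ACID table (from Hive), Spark can read only the base_XXX directories, not the delta directories; that limitation is what this Jira addresses.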
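To check whether a table still has delta directories that Spark would skip, the directory names can be classified by prefix. The sketch below is a minimal illustration, not part of the original answer: it assumes the usual `base_N` and `delta_N_M` naming (plus `delete_delta_N_M` on newer Hive versions), and the `base_0117201` path is a made-up example alongside the two delta paths from the question. If deltas are present, a major compaction can be requested from Hive with `ALTER TABLE <table> COMPACT 'major'`.

```python
import re

# Directory name patterns used by Hive ACID tables.
BASE_RE = re.compile(r"^base_\d+")
DELTA_RE = re.compile(r"^(delete_)?delta_\d+_\d+")


def classify_acid_dirs(paths):
    """Group directory paths into base, delta and other by their last component."""
    groups = {"base": [], "delta": [], "other": []}
    for path in paths:
        name = path.rstrip("/").rsplit("/", 1)[-1]
        if BASE_RE.match(name):
            groups["base"].append(path)
        elif DELTA_RE.match(name):
            groups["delta"].append(path)
        else:
            groups["other"].append(path)
    return groups


def has_uncompacted_deltas(paths):
    """True if any delta directory is present, i.e. data Spark would not see."""
    return bool(classify_acid_dirs(paths)["delta"])


# The base directory below is hypothetical; the deltas come from the question.
listing = [
    "/mydim/base_0117201",
    "/mydim/delta_0117202_0117202",
    "/mydim/delta_0117203_0117203",
]
print(classify_acid_dirs(listing))
print(has_uncompacted_deltas(listing))
```

The path list would normally come from an HDFS listing of the table directory, such as the output of `hdfs dfs -ls`.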

There are some workarounds for reading ACID tables, using the approach described in this link.

I think reading Hive ACID tables is supported starting from HDP-3.X.

I have tested this. Starting from CDP/HDP-3.0, using the Hive Warehouse Connector library/plugin with Spark lets Spark work with ACID-compliant Hive tables (ORC-format tables).
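A rough sketch of what reading through the Hive Warehouse Connector can look like from PySpark is shown below. It assumes the connector jar and its Python package (`pyspark_llap`) are on the Spark classpath and that the HiveServer2 connection settings are configured for the cluster; none of that setup appears in the original answer. The table name `mydb.mydim` and the helper names are hypothetical.

```python
def build_hwc_session(spark):
    """Create a HiveWarehouseSession from an existing SparkSession."""
    # Imported lazily: pyspark_llap ships with the Hive Warehouse Connector,
    # not with PySpark itself.
    from pyspark_llap import HiveWarehouseSession

    return HiveWarehouseSession.session(spark).build()


def read_acid_table(hive, table, limit=None):
    """Run a SELECT through the connector session and return its result."""
    query = "SELECT * FROM {}".format(table)
    if limit is not None:
        query += " LIMIT {}".format(int(limit))
    return hive.executeQuery(query)
```

On a cluster with the connector installed, usage would be along the lines of `hive = build_hwc_session(spark)` followed by `read_acid_table(hive, "mydb.mydim", limit=10).show()`.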
