How to use Hadoop from the Spark Thrift Server?

hadoop apache-spark

Consider the following setup:

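Hadoop version 2.6.4

Spark version 2.1.0

OS: CentOS Linux release 7.2.1511 (Core)

All the software is installed on one machine as a single-node cluster, and Spark is installed in standalone mode. I am trying to use the Spark Thrift Server. To start the Spark Thrift Server, I run the shell script

start-thriftserver.sh

Once the Thrift server is running, I can start the beeline command-line tool and issue the following commands, all of which run successfully: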

!connect jdbc:hive2://localhost:10000 user_name '' org.apache.hive.jdbc.HiveDriver
create database testdb;
use testdb;
create table names_tab(a int, name string) row format delimited fields terminated by ' ';

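My first question: where on Hadoop are the underlying files and folders for the tables and databases created this way? The problem is that the create table and create database commands still succeed even after Hadoop has been stopped with stop-all.sh, which makes me think the table is not being created on Hadoop at all.

My second question: how do I tell Spark where Hadoop is installed, and ask Spark to use Hadoop as the underlying data store for all queries run from beeline?

Should I install Spark in some other mode?

Thanks in advance.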

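My goal was to get the beeline command-line utility working through the Spark Thrift Server with Hadoop as the underlying data store, and I got it working. My setup looks like this: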

Hadoop  <-->  Spark  <-->  SparkThriftServer  <--> beeline

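By default, Spark uses Derby for the metadata and keeps the data itself (called the warehouse in Spark) on the local filesystem rather than in Hadoop, which is why the commands succeed even when Hadoop is stopped. To make Spark use Hadoop as the warehouse, I had to add a warehouse property to Spark's configuration.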
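The property itself is not shown here. A minimal sketch, assuming it is Spark's `spark.sql.warehouse.dir` setting (available since Spark 2.0) placed in `conf/spark-defaults.conf`: the warehouse path matches the `hadoop fs -ls` listing in this answer, but the NameNode address `hdfs://localhost:9000` is an assumption and should be replaced with the `fs.defaultFS` value from your own `core-site.xml`.

```properties
# conf/spark-defaults.conf
# Assumption: the NameNode listens on hdfs://localhost:9000 (check fs.defaultFS in core-site.xml).
# Points Spark SQL's warehouse at HDFS instead of the local spark-warehouse directory.
spark.sql.warehouse.dir    hdfs://localhost:9000/user/hive/warehouse
```

The Thrift server has to be restarted for the change to take effect. Since start-thriftserver.sh accepts the usual spark-submit options, the same setting could presumably also be passed at startup with `--conf spark.sql.warehouse.dir=...` instead of editing the file.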

Here is some sample output:

./beeline
Beeline version 1.0.1 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000 abbasbutt '' org.apache.hive.jdbc.HiveDriver
Connecting to jdbc:hive2://localhost:10000
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/abbasbutt/Projects/hadoop_fdw/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/abbasbutt/Projects/hadoop_fdw/apache-hive-1.0.1-bin/lib/hive-jdbc-1.0.1-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Connected to: Spark SQL (version 2.1.0)
Driver: Hive JDBC (version 1.0.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:10000>
0: jdbc:hive2://localhost:10000>
0: jdbc:hive2://localhost:10000>
0: jdbc:hive2://localhost:10000> create database my_test_db;
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.379 seconds)
0: jdbc:hive2://localhost:10000> use my_test_db;
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.03 seconds)
0: jdbc:hive2://localhost:10000> create table my_names_tab(a int, b string) row format delimited fields terminated by ' ';
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.11 seconds)
0: jdbc:hive2://localhost:10000>

And here are the corresponding files in Hadoop:

[abbasbutt@localhost test]$ hadoop fs -ls /user/hive/warehouse/
17/01/19 10:48:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 4 items
drwxrwxr-x   - abbasbutt supergroup          0 2017-01-18 23:45 /user/hive/warehouse/fdw_db.db
drwxrwxr-x   - abbasbutt supergroup          0 2017-01-18 23:23 /user/hive/warehouse/my_spark_db.db
drwxrwxr-x   - abbasbutt supergroup          0 2017-01-19 10:47 /user/hive/warehouse/my_test_db.db
drwxrwxr-x   - abbasbutt supergroup          0 2017-01-18 23:45 /user/hive/warehouse/testdb.db

[abbasbutt@localhost test]$ hadoop fs -ls /user/hive/warehouse/my_test_db.db/
17/01/19 10:50:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxrwxr-x   - abbasbutt supergroup          0 2017-01-19 10:50 /user/hive/warehouse/my_test_db.db/my_names_tab
[abbasbutt@localhost test]$
