Creating a view on a Hive table: the comment for each column is lost

I created a Hive table in which we added a description for each column in the "comment" field, like this:

spark.sql("create table test_comment (col string comment 'col comment') comment 'hello world table comment ' ")
spark.sql("describe test_comment").show()
+--------+---------+-----------+
|col_name|data_type|    comment|
+--------+---------+-----------+
|     col|   string|col comment|
+--------+---------+-----------+

All good: the comment "col comment" shows up in the comment field of the column "col".

Now, when I create a view on this table, the "comment" field is not propagated to the view, and the "comment" column is empty:

spark.sql("""create view test_comment_view as select * from test_comment""")
spark.sql("describe test_comment_view").show()

+--------+---------+-------+
|col_name|data_type|comment|
+--------+---------+-------+
|     col|   string|   null|
+--------+---------+-------+

Is there a way to keep the values of the comment field when creating a view? And what is the reason for this "feature"?

I am using:

Hadoop 2.6.0-cdh5.8.0

Hive 1.1.0-cdh5.8.0

Spark 2.1.0.cloudera1

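What I observed is that comments are not inherited even when a table is created from another table with CREATE TABLE ... AS SELECT, whereas CREATE TABLE ... LIKE keeps them. This looks like the default behavior.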

create table t1 like another_table;
desc t1;  -- includes comments
+-----------+------------+------------------+--+
| col_name  | data_type  |     comment      |
+-----------+------------+------------------+--+
| id        | int        | new employee id  |
| name      | string     | employee name    |
+-----------+------------+------------------+--+

create table t2 as select * from another_table;
desc t2;  -- excludes comments
+-----------+------------+----------+--+
| col_name  | data_type  | comment  |
+-----------+------------+----------+--+
| id        | int        |          |
| name      | string     |          |
+-----------+------------+----------+--+

But there is a workaround. When creating the view, you can list the individual columns together with their comments:

create view v2(id2 comment 'vemp id', name2 comment 'vemp name') as select * from another_table;

+-----------+------------+------------+--+
| col_name  | data_type  |  comment   |
+-----------+------------+------------+--+
| id2       | int        | vemp id    |
| name2     | string     | vemp name  |
+-----------+------------+------------+--+

You are right about the inheritance. I am not sure I understand the reason, but as you said, we can use your trick. Since I have tables with 200 comments and more than 10 tables, I automated it with a few lines in PySpark. Thanks.
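
For anyone who wants to automate the same trick, here is a minimal sketch of what those few lines might look like. It is not the code actually used above. The helper names `build_view_ddl` and `create_view_with_comments` are hypothetical, and the sketch assumes that `describe` returns rows with `col_name` and `comment` fields (as in the output shown in the question) and that backslash-escaping single quotes inside comments is accepted by your Hive/Spark version.

```python
def build_view_ddl(view_name, table_name, columns):
    """Build a CREATE VIEW statement that repeats each column's comment.

    columns: list of (column_name, comment_or_None) pairs.
    """
    parts = []
    for name, comment in columns:
        col = "`{}`".format(name)
        if comment:
            # Assumption: backslash-escaping is enough for quotes in comments.
            escaped = comment.replace("\\", "\\\\").replace("'", "\\'")
            col += " comment '{}'".format(escaped)
        parts.append(col)
    return "create view {} ({}) as select * from {}".format(
        view_name, ", ".join(parts), table_name)


def create_view_with_comments(spark, view_name, table_name):
    """Read column comments via DESCRIBE and create the view with them."""
    columns = []
    for row in spark.sql("describe {}".format(table_name)).collect():
        # Stop at the partition-info section, whose rows start with '#'.
        if not row["col_name"] or row["col_name"].startswith("#"):
            break
        columns.append((row["col_name"], row["comment"]))
    spark.sql(build_view_ddl(view_name, table_name, columns))
```

With an existing session this would be called as create_view_with_comments(spark, "test_comment_view", "test_comment"), and looping over a list of table names covers the 10+ tables case.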