Hadoop: Parsing an Amazon Review file with Hive


I want to create a table from an Amazon review file that has the following format:

product/productId: B00006HAXW
review/userId: A1RSDE90N6RSZF
review/profileName: Joseph M. Kotow
review/helpfulness: 9/9
review/score: 5.0
review/time: 1042502400
review/summary: Pittsburgh - Home of the OLDIES
review/text: I have all of the doo wop DVD's and this one is as good or better than the
1st ones. Remember once these performers are gone, we'll never get to see them again.
Rhino did an excellent job and if you like or love doo wop and Rock n Roll you'll LOVE
this DVD !!
My SQL:

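CREATE EXTERNAL TABLE reviews (
  id string,              -- product/productId, e.g. B00006HAXW (not numeric)
  user_id string,         -- review/userId, e.g. A1RSDE90N6RSZF (not numeric)
  profile_name string,    -- review/profileName
  helpfulness string,     -- review/helpfulness, e.g. 9/9
  review_score float,
  review_time int,        -- Unix timestamp in seconds
  review_summary string,
  review_text string)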
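I know Hive can load data using row and field delimiters. Here, however, the lines do not all share the same format: each record spans several "key: value" lines, and the review text wraps across multiple lines. Can someone help me parse this file format with Hive so that I can load it into my file system?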


Thanks, everyone!
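
One possible approach, sketched here under assumptions rather than taken from the thread, is to flatten each multi-line review into a single tab-separated line before Hive sees the file. The sketch assumes every review starts with a `product/productId` line, that any line not beginning with one of the eight known keys continues the previous value, and that the file can be read as UTF-8 with undecodable bytes replaced. The names `parse_reviews` and `to_tsv` are hypothetical helpers, not part of any library.

```python
import sys

# Field order matches the keys in the sample record from the question.
FIELDS = [
    "product/productId",
    "review/userId",
    "review/profileName",
    "review/helpfulness",
    "review/score",
    "review/time",
    "review/summary",
    "review/text",
]


def parse_reviews(lines):
    """Yield one dict per review from an iterable of raw text lines."""
    record, last_key = {}, None
    for raw in lines:
        line = raw.rstrip("\r\n")
        key, sep, value = line.partition(":")
        if sep and key in FIELDS:
            # Assumption: a productId key marks the start of the next review.
            if key == FIELDS[0] and record:
                yield record
                record = {}
            record[key] = value.strip()
            last_key = key
        elif line.strip() and last_key is not None:
            # Continuation of a wrapped value (e.g. a multi-line review/text).
            record[last_key] += " " + line.strip()
    if record:
        yield record


def to_tsv(record):
    """Flatten a record into one tab-separated line, in FIELDS order."""
    return "\t".join(record.get(f, "").replace("\t", " ") for f in FIELDS)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python flatten_reviews.py input.txt > reviews.tsv
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for rec in parse_reviews(fh):
            print(to_tsv(rec))
```

The resulting file has one review per line, so it should be loadable into a table declared with `ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'`. The script name `flatten_reviews.py` is only a placeholder.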

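Is that format accurate? I saw somewhere that the Amazon Review data is in JSON format.
Possibly. The problem is that I need to parse this particular file, which is just a .txt file. You can view it here: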