Hive on Docker: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask


I recently started using Hive on Docker. I have two tables with the following structure:

0: jdbc:hive2://localhost:10000> describe users;
+-------------+------------+----------+
|  col_name   | data_type  | comment  |
+-------------+------------+----------+
| userid      | int        |          |
| gender      | string     |          |
| age         | int        |          |
| occupation  | int        |          |
| zipcode     | string     |          |
+-------------+------------+----------+

0: jdbc:hive2://localhost:10000> describe users_2;
+-------------+------------+----------+
|  col_name   | data_type  | comment  |
+-------------+------------+----------+
| userid      | int        |          |
| gender      | string     |          |
| age         | int        |          |
| occupation  | string     |          |
| zipcode     | string     |          |
+-------------+------------+----------+
What I want to do is copy the contents of users into users_2, mapping each occupation INT from the first table to its corresponding string. To do this, I wrote the following Python script:

import sys

occupation_dict = {
    0: "other or not specified",
    1: "academic/educator",
    2: "artist",
    3: "clerical/admin",
    4: "college/grad student",
    5: "customer service",
    6: "doctor/health care",
    7: "executive/managerial",
    8: "farmer",
    9: "homemaker",
    10: "K-12 student",
    11: "lawyer",
    12: "programmer",
    13: "retired",
    14: "sales/marketing",
    15: "scientist",
    16: "self-employed",
    17: "technician/engineer",
    18: "tradesman/craftsman",
    19: "unemployed",
    20: "writer"
}

for line in sys.stdin:
    line = line.strip()
    userid, gender, age, occupation, zipcode = line.split('#')
    occupation_str = occupation_dict[occupation]
    print '#'.join([userid, gender, age, occupation_str, zipcode])
So, in Docker, I run the following commands:

0: jdbc:hive2://localhost:10000> add FILE /data/ml-1m/map_fun.py;
No rows affected (0.007 seconds)
0: jdbc:hive2://localhost:10000> INSERT OVERWRITE TABLE users_2
. . . . . . . . . . . . . . . .> SELECT
. . . . . . . . . . . . . . . .> TRANSFORM (userid, gender, age, occupation, zipcode)
. . . . . . . . . . . . . . . .> USING 'python map_fun.py'
. . . . . . . . . . . . . . . .> AS (userid, gender, age, occupation_str, zipcode)
. . . . . . . . . . . . . . . .> FROM users;
But I get the following error, which I can't get past:

WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Error: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
        at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
        at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:257)
        at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
        at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:348)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
        at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:362)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748) (state=08S01,code=2)
Sorry if this has run long; I hope I have included everything you need.
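For context, a few likely culprits in the script above (these are assumptions about the failure, not a confirmed diagnosis): Hive's TRANSFORM streams rows to the child process with tab-delimited fields rather than '#', every field arrives as a string (so `occupation` never matches the script's int keys), and the Python 2 `print` statement raises a SyntaxError if `python` inside the container resolves to Python 3. A sketch of the script adjusted under those assumptions, with the mapping pulled into a testable function:

```python
import sys

# Same occupation table as in the question's script.
occupation_dict = {
    0: "other or not specified",
    1: "academic/educator",
    2: "artist",
    3: "clerical/admin",
    4: "college/grad student",
    5: "customer service",
    6: "doctor/health care",
    7: "executive/managerial",
    8: "farmer",
    9: "homemaker",
    10: "K-12 student",
    11: "lawyer",
    12: "programmer",
    13: "retired",
    14: "sales/marketing",
    15: "scientist",
    16: "self-employed",
    17: "technician/engineer",
    18: "tradesman/craftsman",
    19: "unemployed",
    20: "writer"
}

def map_line(line, sep='\t'):
    """Replace the occupation code in one TRANSFORM input row with its label.

    Hive streams columns tab-delimited; every field is a string, so the
    occupation code must be cast to int before the dict lookup.
    """
    userid, gender, age, occupation, zipcode = line.strip().split(sep)
    occupation_str = occupation_dict[int(occupation)]
    return sep.join([userid, gender, age, occupation_str, zipcode])

if __name__ == '__main__':
    for line in sys.stdin:
        # print() works under both Python 2.7 and 3.
        print(map_line(line))
```

To confirm where the real failure is, the detailed task log (via the JobTracker/ResourceManager UI or `yarn logs`) would show the script's stderr, which return code 2 by itself does not.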