Scheduling and automating Sqoop import/export tasks

I have a Sqoop job that needs to import data from Oracle into HDFS.

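The Sqoop command I use is:

sqoop import --connect jdbc:oracle:thin:@hostname:port/service --username sqoop --password sqoop --query "SELECT * FROM ORDERS WHERE orderdate = To_date('10/08/2013', 'mm/dd/yyyy') AND partitionid = '1' AND rownum < 10001 AND \$CONDITIONS" --target-dir /test1 --fields-terminated-by '\t'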

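I rerun the same query again and again, changing the partition ID from 1 to 96, so I have to execute the sqoop import command manually 96 times. The ORDERS table contains millions of rows, and each row has a partitionid between 1 and 96. For each partition ID I need to import the rows selected by rownum < 10001 (at most 10,000 rows) into HDFS.

Is there a way to do this? How can I automate the Sqoop job?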

Use crontab for scheduling. The crontab documentation is available online, or you can run man crontab in a terminal.


Put the sqoop import command in a shell script and run that script from crontab.
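As a rough sketch of what the crontab side could look like, the snippet below builds a single crontab entry and prints it for review. The script path, the log path, and the 02:00 daily schedule are assumptions chosen for illustration, not details from the question.

```shell
#!/bin/bash
# Hypothetical locations: adjust to where the script and its log actually live.
SCRIPT_PATH="$HOME/ramu/script_for_all.sh"
LOG_PATH="$HOME/ramu/sqoop_import.log"

# Build a crontab entry that runs the script every day at 02:00 (assumed schedule)
# and appends both stdout and stderr to the log file.
cron_line() {
  printf '0 2 * * * /bin/bash %s >> %s 2>&1\n' "$SCRIPT_PATH" "$LOG_PATH"
}

# Print the entry so it can be checked before installing it.
# To install, append it to the current user's crontab:
#   ( crontab -l 2>/dev/null; cron_line ) | crontab -
cron_line
```

The five leading fields are minute, hour, day of month, month, and day of week. Redirecting output to a log file is optional, but cron jobs have no terminal, so it is usually the easiest way to see why an import failed.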

Run the script: $ ./script.sh 20    (for partition ID 20)

ramisetty@HadoopVMbox:~/ramu$ cat script.sh
#!/bin/bash

# The partition ID is passed as the first argument; it is also used as the target directory name.
PART_ID="$1"
TARGET_DIR_ID="$PART_ID"
echo "PART_ID: $PART_ID  TARGET_DIR_ID: $TARGET_DIR_ID"
# \$CONDITIONS is escaped so that the shell passes the literal token through to Sqoop.
sqoop import --connect jdbc:oracle:thin:@hostname:port/service --username sqoop --password sqoop --query "SELECT * FROM ORDERS WHERE orderdate = To_date('10/08/2013', 'mm/dd/yyyy') AND partitionid = '$PART_ID' AND rownum < 10001 AND \$CONDITIONS" --target-dir "/test/$TARGET_DIR_ID" --fields-terminated-by '\t'
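
Because the script takes the partition ID straight from the command line, a small guard can reject a missing or out-of-range argument before any import starts. This is only a sketch: the function name is_valid_part_id is made up here, and the 1 to 96 range is taken from the question.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if the argument is an integer between 1 and 96.
is_valid_part_id() {
  local id="$1"
  # Digits only, then a numeric range check (10# forces base 10 so "08" is not read as octal).
  [[ "$id" =~ ^[0-9]+$ ]] && ((10#$id >= 1 && 10#$id <= 96))
}

# Possible use at the top of script.sh:
#   is_valid_part_id "$1" || { echo "usage: $0 <partition id 1-96>" >&2; exit 1; }
```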
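
Comment: Thanks Rajesh, but how do I change the partitionid variable in the shell script? I need to execute the sqoop command 96 times, changing both partitionid and --target-dir each time. Should I write a shell script containing 96 sqoop commands and execute that?

To import all partition IDs from 1 to 96 in a single shot, loop over them:
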
ramisetty@HadoopVMbox:~/ramu$ cat script_for_all.sh
#!/bin/bash

# Loop over every partition ID; each one is imported into its own target directory.
for part_id in {1..96}
do
  PART_ID="$part_id"
  TARGET_DIR_ID="$PART_ID"
  echo "PART_ID: $PART_ID  TARGET_DIR_ID: $TARGET_DIR_ID"
  sqoop import --connect jdbc:oracle:thin:@hostname:port/service --username sqoop --password sqoop --query "SELECT * FROM ORDERS WHERE orderdate = To_date('10/08/2013', 'mm/dd/yyyy') AND partitionid = '$PART_ID' AND rownum < 10001 AND \$CONDITIONS" --target-dir "/test/$TARGET_DIR_ID" --fields-terminated-by '\t'
done
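
When this loop runs unattended, one failed import is easy to miss among 96. A possible variation, sketched below, checks the exit status of each sqoop call and reports the partition IDs that failed. The sqoop command is copied from the answer unchanged. The function names and the SQOOP_BIN variable are assumptions introduced here, the latter so the loop can be dry-run with echo in place of sqoop.

```shell
#!/bin/bash
# SQOOP_BIN is a hypothetical override; set SQOOP_BIN=echo to print the commands instead of running them.
SQOOP_BIN="${SQOOP_BIN:-sqoop}"

# Import one partition into /test/<partition id>; returns the exit status of the sqoop command.
import_partition() {
  local part_id="$1"
  "$SQOOP_BIN" import \
    --connect jdbc:oracle:thin:@hostname:port/service \
    --username sqoop --password sqoop \
    --query "SELECT * FROM ORDERS WHERE orderdate = To_date('10/08/2013', 'mm/dd/yyyy') AND partitionid = '$part_id' AND rownum < 10001 AND \$CONDITIONS" \
    --target-dir "/test/$part_id" \
    --fields-terminated-by '\t'
}

# Import partitions first..last, keep going after a failure, and list the failed IDs at the end.
run_all() {
  local first="$1" last="$2" part_id
  local failed=()
  for ((part_id = first; part_id <= last; part_id++)); do
    if ! import_partition "$part_id"; then
      failed+=("$part_id")
    fi
  done
  if ((${#failed[@]} > 0)); then
    echo "Failed partitions: ${failed[*]}" >&2
    return 1
  fi
}
```

Calling run_all 1 96 at the end of the script would process every partition. The non-zero return value when anything failed lets cron or a wrapper script detect the problem.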