CPU time taken by an Apache Pig query (in Pig Latin)

How much time does an Apache Pig query take to execute? The query below, written in Pig Latin, reads up to 4 million tuples (rows), each with 43 fields.

A = LOAD '/user/PigTest/year_14/mon_nov/6_sms_03_01.csv' USING PigStorage(',');
bt = foreach A generate $0 as id,$3;
dump bt;
ct = filter bt by id == 3981042 ;
dump ct;
dump MinutesBetween(CurrentTime(),$ti);

I invoke the script as: pig -param ti='date' try.pig

My system environment is Linux.

The error is: ERROR 1200: mismatched input expecting RIGHT_PAREN

   Two problems here:
    1. A DUMP statement can only print a relation, but you are trying to dump the function call MinutesBetween().
       If you remove the last line, the error goes away.
    2. On the command line you are passing 'date' as the parameter. 'date' is not a built-in command in Pig, so you need to supply the date in at least one of the formats that Pig supports.

    Example:
       I am using the date format '2014-11-06T06:01:13'; more date formats are listed in the Pig docs.

    On the command line:
    >>pig -param ti='2014-11-06T06:01:13' -f try.pig 

    Change the last line of the Pig script like this:
    test = FOREACH ct GENERATE MinutesBetween(CurrentTime(),ToDate('$ti'));
    DUMP test;
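
If the intent behind passing 'date' was to hand the current time to the script, the timestamp can be built in the shell first. The following is only a sketch: the helper name `iso_now` is made up for illustration, and it assumes that the local time printed by `date` matches the time zone Pig uses when `ToDate` parses a string without a zone.

```shell
#!/bin/bash
# Hypothetical helper (not from the original answer): prints the current
# local time in the same shape as the example value 2014-11-06T06:01:13.
iso_now() {
    date +%Y-%m-%dT%H:%M:%S
}

# Possible use, assuming pig is on the PATH and try.pig reads $ti:
#   pig -param ti="$(iso_now)" -f try.pig
```

Double quotes around `$(iso_now)` let the shell substitute the timestamp, whereas the single-quoted 'date' in the question is passed to Pig as the literal word.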

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. mismatched input expecting RIGHT_PAREN
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1725)
    at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1420)
    at org.apache.pig.PigServer.parseAndBuild(PigServer.java:364)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:389)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:375)
    at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:170)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:232)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
    at org.apache.pig.Main.run(Main.java:608)
    at org.apache.pig.Main.main(Main.java:156)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: Failed to parse: mismatched input expecting RIGHT_PAREN

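Update:

Create a shell script, say test.sh, that does the following:
1. Get the current time, i.e. the start time.
2. Call the Pig script try.pig.
3. Get the current time again, i.e. the end time.
4. Compute the time difference and print it.

This gives you the time the Pig script actually took. You can also modify the script to include hours and milliseconds.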

test.sh
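
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the script from the original answer: the function names `format_elapsed` and `time_command` are hypothetical, and it assumes that `pig` is on the PATH and that the command to time is passed as arguments to the script.

```shell
#!/bin/bash
# Hypothetical sketch of test.sh: times a command with whole-second resolution.

# Turn a number of seconds into "M minutes and S seconds."
format_elapsed() {
    local total=$1
    echo "$((total / 60)) minutes and $((total % 60)) seconds."
}

# Run the given command and report how long it took.
time_command() {
    local start end
    start=$(date +%s)   # step 1: start time, seconds since the epoch
    "$@"                # step 2: run the command, e.g. pig -f try.pig
    end=$(date +%s)     # step 3: end time
    format_elapsed "$((end - start))"   # step 4: print the difference
}

# Only run when a command was supplied on the command line.
if [ "$#" -gt 0 ]; then
    time_command "$@"
fi
```

It could be called as `./test.sh pig -param ti='2014-11-06T06:01:13' -f try.pig`. For finer resolution on Linux, GNU `date +%s%N` returns nanoseconds since the epoch, which could be divided down to milliseconds.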

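Sample output:

0 minutes and 2 seconds.

I have used both of your suggestions. They removed the error and printed 0 seconds, but that does not solve the problem, which is to find out from the query how long it takes to execute (in milliseconds).

I have updated the answer with another solution. Could you try it and let me know whether it works for you?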