Connecting to Cassandra from Java using Pig Latin

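Instead of using grunt, I want to write a Java program that uses PigServer to connect to Cassandra. It cannot find the environment variables I set for Pig. Any help is appreciated. Alternatively, is there a better option than Pig and Java MapReduce?

Here are the environment variables: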

export PATH=/Users/rachana/software/pig-0.11.1/bin:$PATH
export PIG_HOME=/Users/rachana/software/pig-0.11.1
export PIG_CONF_DIR=/Users/rachana/software/hadoop-1.1.2/conf
export PIG_INITIAL_ADDRESS=localhost
export PIG_RPC_PORT=9160
export PIG_PARTITIONER=org.apache.cassandra.dht.Murmur3Partitioner
export PIG_OPTS=-Dudf.import.list=org.apache.cassandra.hadoop.pig:$PIG_OPTS

The code is:

package com.chegg.hwh.tracking.dao;

import java.util.Properties;

import org.apache.cassandra.hadoop.pig.CassandraStorage;
import org.apache.pig.ExecType;
import org.apache.pig.PigServer;
import org.apache.pig.impl.PigContext;

public class HWHDataPigMapReduce {

public static void main(String args[]) throws Exception {
    Properties properties = new Properties();
    properties.put("PIG_HOME", "/Users/rachana/software/pig-0.11.1");
    properties.put("PIG_CONF_DIR", "/Users/rachana/software/hadoop-1.1.2/conf");
    properties.put("PIG_INITIAL_ADDRESS", "localhost");
    properties.put("PIG_RPC_PORT", "9160");
    properties.put("PIG_PARTITIONER","org.apache.cassandra.dht.Murmur3Partitioner");
    PigContext pigContext = new PigContext(ExecType.LOCAL,properties);

    CassandraStorage cassandraStorage = new CassandraStorage();

    PigServer pigServer = new PigServer(pigContext);


    // Assign the loaded relation to the alias "rows", which the next statement groups.
    pigServer.registerQuery("rows = LOAD 'cassandra://hwh_tracking/users' USING org.apache.cassandra.hadoop.pig.CassandraStorage();");
    pigServer.registerQuery("emailgroup = group rows by email;");
    pigServer.dumpSchema("emailgroup");


}
}

The error is:

13/07/05 16:56:19 INFO executionengine.HExecutionEngine: Connecting to hadoop file system at: file:///
2013-07-05 16:56:19.117 java[3413:1c03] Unable to load realm mapping info from SCDynamicStore
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/rachana/astyanax_lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/rachana/astyanax_lib/pig-0.11.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
Exception in thread "main" org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Cannot get schema from loadFunc org.apache.cassandra.hadoop.pig.CassandraStorage
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1607)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1546)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:516)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:529)
    at com.chegg.hwh.tracking.dao.HWHDataPigMapReduce.main(HWHDataPigMapReduce.java:21)
Caused by: Failed to parse: Can not retrieve schema from loader org.apache.cassandra.hadoop.pig.CassandraStorage@beeb7e9
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:193)
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1599)
    ... 4 more
Caused by: java.lang.RuntimeException: Can not retrieve schema from loader org.apache.cassandra.hadoop.pig.CassandraStorage@beeb7e9
    at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:90)
    at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:839)
    at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3236)
    at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1315)
    at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:799)
    at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:517)
    at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:392)
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:184)
    ... 5 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2245: Cannot get schema from loadFunc org.apache.cassandra.hadoop.pig.CassandraStorage
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:178)
    at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:88)
    ... 12 more
Caused by: java.io.IOException: PIG_INPUT_INITIAL_ADDRESS or PIG_INITIAL_ADDRESS environment variable not set
    at org.apache.cassandra.hadoop.pig.CassandraStorage.setLocation(CassandraStorage.java:404)
    at org.apache.cassandra.hadoop.pig.CassandraStorage.getSchema(CassandraStorage.java:414)
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:174)
    ... 13 more

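When Pig connects to Cassandra, it does so by calling Cassandra's ColumnFamilyInputFormat and ColumnFamilyOutputFormat classes, which in turn call the Hector API that connects to Cassandra. If you want to write a Java program that accesses Cassandra, it is much easier to code against the Hector API directly. There are other Cassandra APIs at this level as well, such as CQL over JDBC.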

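Thanks Chris, I got it running by setting the env variables in Eclipse -> Run Configurations.

@plzDontKill, could you tell me how it was solved? I am using the grunt shell and I get the same exception.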
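
The innermost exception in the trace says an environment variable is not set. That suggests CassandraStorage looks for PIG_INITIAL_ADDRESS in the process environment rather than in the Properties object passed to PigContext, which would be consistent with the fix of setting the variables in the Eclipse run configuration. Below is a minimal sketch of a pre-flight check that could run before creating the PigServer. The class name PigEnvCheck and the method missing are hypothetical, and the list of required names is simply taken from the exports in the question; it is not a confirmed list of what CassandraStorage needs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical helper: reports which Pig/Cassandra variables the JVM cannot see.
public class PigEnvCheck {

    // Assumption: names copied from the export list in the question.
    static final List<String> REQUIRED = Arrays.asList(
            "PIG_INITIAL_ADDRESS", "PIG_RPC_PORT", "PIG_PARTITIONER");

    // Returns the required names that are absent or blank in the given environment map.
    static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // System.getenv() reflects the environment the JVM was launched with,
        // not values placed in a java.util.Properties object.
        List<String> missing = missing(System.getenv());
        if (!missing.isEmpty()) {
            System.err.println("Not visible to this JVM: " + missing);
            System.exit(1);
        }
        System.out.println("All listed Pig/Cassandra variables are visible to this JVM.");
    }
}
```

Variables exported in a shell profile are only inherited by processes started from that shell, so an IDE launched from the desktop may not see them. Running this check from the same launcher as the Pig program shows quickly whether that is the case.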