Hadoop 通过java代码从本地计算机连接AWS EMR上的HDFS

Hadoop 通过java代码从本地计算机连接AWS EMR上的HDFS,hadoop,hdfs,emr,Hadoop,Hdfs,Emr,我试图了解如何从本地机器连接到hdfs(在aws EMR上) 我的示例程序 public class EMRConnection { public static void main(String[] args) throws IOException, URISyntaxException { Configuration config = new Configuration(); FileSystem hdfs = FileSystem.get(new URI("hdfs:

我试图了解如何从本地机器连接到hdfs(在aws EMR上)

我的示例程序

public class EMRConnection {


public static void main(String[] args) throws IOException, URISyntaxException {
     Configuration config = new Configuration();
     FileSystem hdfs = FileSystem.get(new URI("hdfs://***-**-**-***-***.compute-1.amazonaws.com:50070"), config);
     hdfs.mkdirs(new Path("/user/test/"));

}
}

我已验证并允许EMR接受来自我的ip的连接。我越来越不正常了

Exception in thread "main" java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.; Host Details : local host is: "{xyz}"; destination host is: "ec2-(xyz...).amazonaws.com":50070; 
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
at org.apache.hadoop.ipc.Client.call(Client.java:1351)
at org.apache.hadoop.ipc.Client.call(Client.java:1300)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy7.mkdirs(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy7.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:467)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2394)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2365)
at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:817)
at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:813)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:813)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:806)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1933)
at EMRConnection.main(EMRConnection.java:16)
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.
    at com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:94)
    at com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
    at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:202)
    at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
    at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcHeaderProtos.java:2364)
    at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:996)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:891)
could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
有人能告诉我我该怎么连接吗?。我是否使用了错误的IP或端口

我发现端口应该是8020。在此之后,我可以创建文件夹,但当我试图写一个文件 它正在抛出异常

Exception in thread "main" java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.; Host Details : local host is: "{xyz}"; destination host is: "ec2-(xyz...).amazonaws.com":50070; 
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
at org.apache.hadoop.ipc.Client.call(Client.java:1351)
at org.apache.hadoop.ipc.Client.call(Client.java:1300)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy7.mkdirs(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy7.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:467)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2394)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2365)
at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:817)
at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:813)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:813)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:806)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1933)
at EMRConnection.main(EMRConnection.java:16)
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.
    at com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:94)
    at com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
    at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:202)
    at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
    at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcHeaderProtos.java:2364)
    at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:996)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:891)
could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s) are excluded in this operation.

当我使用端口as 8020时。它起作用了。和其他人一样,共享也会面临同样的问题。

EMR连接现在起作用了,但如果我试图写一个文件,它会说