Flume HDFS sink problem with a Twitter source


I currently have the following configuration in Flume:

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
# The configuration file needs to define the sources,
# the channels and the sinks.
# Sources, channels and sinks are defined per agent,
# in this case called 'TwitterAgent'
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS

TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = YPTxqtRamIZ1bnJXYwGW
TwitterAgent.sources.Twitter.consumerSecret = Wjyw9714OBzao7dktH0csuTByk4iLG9Zu4ddtI6s0ho
TwitterAgent.sources.Twitter.accessToken = 2340010790-KhWiNLt63GuZ6QZNYuPMJtaMVjLFpiMP4A2v
TwitterAgent.sources.Twitter.accessTokenSecret = x1pVVuyxfvaTbPoKvXqh2r5xUA6tf9einoByLIL8rar
TwitterAgent.sources.Twitter.keywords = hadoop, big data, analytics, bigdata, cloudera, data science, data scientiest, business intelligence, mapreduce, data warehouse, data warehousing, mahout, hbase, nosql, newsql, businessintelligence, cloudcomputing
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://hadoop1:8020/user/flume/tweets/%Y/%m/%d/%H/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100
The Twitter application credentials are correct. I keep seeing this error in the Flume log file:

ERROR   org.apache.flume.SinkRunner     

Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: java.lang.IllegalArgumentException: java.net.UnknownHostException: hadoop1
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:446)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: hadoop1
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:414)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:164)
    at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:448)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:410)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:128)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2310)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2344)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2326)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:353)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
    at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:227)
    at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:221)
    at org.apache.flume.sink.hdfs.BucketWriter$8$1.run(BucketWriter.java:589)
    at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:161)
    at org.apache.flume.sink.hdfs.BucketWriter.access$800(BucketWriter.java:57)
    at org.apache.flume.sink.hdfs.BucketWriter$8.call(BucketWriter.java:586)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    ... 1 more
Caused by: java.net.UnknownHostException: hadoop1
    ... 23 more
Does anyone here know the cause and can explain it to me?
Thanks in advance.

According to the exception, the problem is that the host hadoop1 is unknown.
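One way to make the hostname resolvable, assuming the Flume agent runs on a Linux machine, is to add an entry for hadoop1 to the agent host's /etc/hosts file (the IP 192.168.1.10 below is a placeholder for your NameNode's actual address):

```
# /etc/hosts on the machine running the Flume agent
# 192.168.1.10 is a placeholder -- substitute your NameNode's real IP
192.168.1.10    hadoop1
```

After this change, the hdfs.path in the Flume configuration can keep using the hadoop1 hostname.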

According to the Flume configuration file, the path you provided is

hdfs://hadoop1:8020/user/flume/tweets/%Y/%m/%d/%H/

This host must be reachable from the machine running the Flume agent. Since the hostname cannot be used to access HDFS when the machines are not in the same domain, you need to use the IP address configured in core-site.xml to access HDFS.

Great, it works. The file is core-site.xml. Thanks. Where in the core-site.xml file is the IP address set?
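As a sketch, the NameNode address lives in the fs.defaultFS property of core-site.xml (named fs.default.name in older Hadoop releases); the IP 192.168.1.10 below is a placeholder for your NameNode's actual address:

```xml
<!-- core-site.xml on the Hadoop cluster; 192.168.1.10 is a placeholder IP -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.1.10:8020</value>
  </property>
</configuration>
```

The Flume sink path can then point at the same IP instead of the unresolvable hostname:

```
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://192.168.1.10:8020/user/flume/tweets/%Y/%m/%d/%H/
```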