Streaming Twitter data into Hadoop's HDFS sink using Flume

I installed Flume to run Cloudera's Twitter sentiment analysis.

When I run twitter.conf with this command:

 bin/flume-ng agent start --conf conf/ -f conf/twitter.conf -Dflume.root.logger=DEBUG,console -n TwitterAgent

I tried changing the command and tried importing the JARs from Hadoop into Flume, but neither worked.

This is exactly where the problem occurs:

2014-10-13 02:40:16,511 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:119)] Monitored counter group for type: SINK, name: HDFS: Successfully registered new MBean.
2014-10-13 02:40:16,511 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:95)] Component type: SINK, name: HDFS started
2014-10-13 02:40:16,514 (SinkRunner-PollingRunner-DefaultSinkProcessor) [DEBUG - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:143)] Polling sink runner starting

After this, the following line keeps repeating until the user interrupts the agent:

2014-10-13 02:40:46,509 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:41:16,510 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes

I am posting the output log (without the loaded JARs).

Nothing changes in HDFS.

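Could you provide the contents of the twitter.conf file and the complete log generated after running the command above? There is no error here; these are just normal informational messages from the logger.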
Info: Sourcing environment configuration script /home/gautham/Downloads/apache-flume-1.5.0.1-bin/conf/flume-env.sh
Info: Including Hadoop libraries found via (/usr/local/hadoop-2.4.1/bin/hadoop) for HDFS access
Info: Excluding /usr/local/hadoop-2.4.1/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/local/hadoop-2.4.1/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
2014-10-13 02:40:15,948 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start(PollingPropertiesFileConfigurationProvider.java:61)] Configuration provider starting
2014-10-13 02:40:15,955 (lifecycleSupervisor-1-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start(PollingPropertiesFileConfigurationProvider.java:78)] Configuration provider started
2014-10-13 02:40:15,958 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:40:15,960 (conf-file-poller-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:133)] Reloading configuration file:conf/twitter.conf
2014-10-13 02:40:15,971 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:HDFS
2014-10-13 02:40:15,971 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1020)] Created context for HDFS: hdfs.rollCount
2014-10-13 02:40:15,972 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:HDFS
2014-10-13 02:40:15,972 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:HDFS
2014-10-13 02:40:15,972 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:HDFS
2014-10-13 02:40:15,972 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:HDFS
2014-10-13 02:40:15,973 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:930)] Added sinks: HDFS Agent: TwitterAgent
2014-10-13 02:40:15,973 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:HDFS
2014-10-13 02:40:15,973 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:HDFS
2014-10-13 02:40:15,973 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1016)] Processing:HDFS
2014-10-13 02:40:15,974 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:313)] Starting validation of configuration for agent: TwitterAgent, initial-configuration: AgentConfiguration[TwitterAgent]SOURCES: {Twitter={ parameters:{consumerSecret=bVlUbZwHzCnpOfWc8MrWStzV7Mj4GUtAHex2pfLKOsgGJ3CA6T, keywords=kathi, channels=MemChannel, accessToken=1954292516-So7GAid1x2NzxQXauP6qkQ0Ha7wzyMOPXwoeNqt, consumerKey=GSmUZJz8XQsMM89d3gpJ1sdW1, type=com.cloudera.flume.source.TwitterSource, accessTokenSecret=uo126JopSBYQVBf3PaWBaMYdEiVxCONJnaTBu4tOaiMmB} }CHANNELS: {MemChannel={ parameters:{type=memory, transactionCapacity=100, capacity=10000} }}
SINKS: {HDFS={ parameters:{hdfs.batchSize=10, hdfs.path=hdfs://gautham-Lenovo-IdeaPad-Z500:54310/home/kathireal/tweets/%Y/%m/%d/%H, hdfs.writeFormat=Text, hdfs.rollSize=0, hdfs.rollCount=10000, channel=MemChannel, hdfs.fileType=DataStream, type=hdfs} }}

2014-10-13 02:40:15,984 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateChannels(FlumeConfiguration.java:468)] Created channel MemChannel
2014-10-13 02:40:15,990 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSources(FlumeConfiguration.java:596)] Configuration empty for: Twitternf.Removed.
2014-10-13 02:40:15,992 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSinks(FlumeConfiguration.java:674)] Creating sink: HDFS using HDFS
2014-10-13 02:40:15,996 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:371)] Post validation configuration for TwitterAgent AgentConfiguration created without Configuration stubs for which only basic syntactical validation was performed[TwitterAgent] CHANNELS: {MemChannel={ parameters:{type=memory, transactionCapacity=100, capacity=10000}}}

SINKS: {HDFS={ parameters:{hdfs.batchSize=10, hdfs.path=hdfs://gautham-Lenovo-IdeaPad-Z500:54310/home/kathireal/tweets/%Y/%m/%d/%H, hdfs.writeFormat=Text, hdfs.rollSize=0, hdfs.rollCount=10000, channel=MemChannel, hdfs.fileType=DataStream, type=hdfs} }}

2014-10-13 02:40:15,996 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:135)] Channels:MemChannel

2014-10-13 02:40:15,996 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:136)] Sinks HDFS

2014-10-13 02:40:15,997 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:137)] Sources null

2014-10-13 02:40:15,997 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:140)] Post-validation flume configuration contains configuration for agents: [TwitterAgent]
2014-10-13 02:40:15,997 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:150)] Creating channels
2014-10-13 02:40:16,009 (conf-file-poller-0) [INFO - org.apache.flume.channel.DefaultChannelFactory.create(DefaultChannelFactory.java:40)] Creating instance of channel MemChannel type memory
2014-10-13 02:40:16,017 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:205)] Created channel MemChannel
2014-10-13 02:40:16,019 (conf-file-poller-0) [INFO - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:40)] Creating instance of sink: HDFS, type: hdfs
2014-10-13 02:40:16,331 (conf-file-poller-0) [INFO - org.apache.flume.sink.hdfs.HDFSEventSink.authenticate(HDFSEventSink.java:555)] Hadoop Security enabled: false
2014-10-13 02:40:16,335 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:119)] Channel MemChannel connected to [HDFS]
2014-10-13 02:40:16,349 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:138)] Starting new configuration:{ sourceRunners:{} sinkRunners:{HDFS=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@32f5a9 counterGroup:{ name:null counters:{} } }} channels:{MemChannel=org.apache.flume.channel.MemoryChannel{name: MemChannel}} }
2014-10-13 02:40:16,375 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:145)] Starting Channel MemChannel
2014-10-13 02:40:16,505 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:119)] Monitored counter group for type: CHANNEL, name: MemChannel: Successfully registered new MBean.
2014-10-13 02:40:16,506 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:95)] Component type: CHANNEL, name: MemChannel started
2014-10-13 02:40:16,507 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:173)] Starting Sink HDFS
2014-10-13 02:40:16,511 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:119)] Monitored counter group for type: SINK, name: HDFS: Successfully registered new MBean.
2014-10-13 02:40:16,511 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:95)] Component type: SINK, name: HDFS started
2014-10-13 02:40:16,514 (SinkRunner-PollingRunner-DefaultSinkProcessor) [DEBUG - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:143)] Polling sink runner starting
2014-10-13 02:40:46,509 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:41:16,510 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes


2014-10-13 02:41:46,510 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:42:16,511 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:42:46,512 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:43:16,512 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:43:46,513 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:44:16,514 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:44:46,514 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:45:16,515 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:126)] Checking file:conf/twitter.conf for changes
2014-10-13 02:45:40,220 (agent-shutdown-hook) [INFO - org.apache.flume.lifecycle.LifecycleSupervisor.stop(LifecycleSupervisor.java:79)] Stopping lifecycle supervisor 12
2014-10-13 02:45:40,224 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:149)] Component type: CHANNEL, name: MemChannel stopped
2014-10-13 02:45:40,225 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:155)] Shutdown Metric for type: CHANNEL, name: MemChannel. channel.start.time == 1413148216506
2014-10-13 02:45:40,225 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:161)] Shutdown Metric for type: CHANNEL, name: MemChannel. channel.stop.time == 1413148540224
2014-10-13 02:45:40,225 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: CHANNEL, name: MemChannel. channel.capacity == 10000
2014-10-13 02:45:40,225 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: CHANNEL, name: MemChannel. channel.current.size == 0
2014-10-13 02:45:40,225 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.put.attempt == 0
2014-10-13 02:45:40,226 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.put.success == 0
2014-10-13 02:45:40,226 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.take.attempt == 42
2014-10-13 02:45:40,226 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.take.success == 0
2014-10-13 02:45:40,226 (agent-shutdown-hook) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.stop(PollingPropertiesFileConfigurationProvider.java:83)] Configuration provider stopping
2014-10-13 02:45:40,226 (agent-shutdown-hook) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.stop(PollingPropertiesFileConfigurationProvider.java:95)] Configuration provider stopped
2014-10-13 02:45:40,227 (agent-shutdown-hook) [DEBUG - org.apache.flume.SinkRunner.stop(SinkRunner.java:104)] Waiting for runner thread to exit
2014-10-13 02:45:40,227 (SinkRunner-PollingRunner-DefaultSinkProcessor) [DEBUG - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:157)] Interrupted while processing an event. Exiting.
2014-10-13 02:45:40,227 (SinkRunner-PollingRunner-DefaultSinkProcessor) [DEBUG - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:173)] Polling runner exiting. Metrics:{ name:null counters:{runner.interruptions=1, runner.backoffs.consecutive=42, runner.backoffs=42} }
2014-10-13 02:45:40,228 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:149)] Component type: SINK, name: HDFS stopped
2014-10-13 02:45:40,228 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:155)] Shutdown Metric for type: SINK, name: HDFS. sink.start.time == 1413148216511
2014-10-13 02:45:40,228 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:161)] Shutdown Metric for type: SINK, name: HDFS. sink.stop.time == 1413148540228
2014-10-13 02:45:40,228 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: SINK, name: HDFS. sink.batch.complete == 0
2014-10-13 02:45:40,228 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: SINK, name: HDFS. sink.batch.empty == 42
2014-10-13 02:45:40,229 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: SINK, name: HDFS. sink.batch.underflow == 0
2014-10-13 02:45:40,229 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: SINK, name: HDFS. sink.connection.closed.count == 0
2014-10-13 02:45:40,229 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: SINK, name: HDFS. sink.connection.creation.count == 0
2014-10-13 02:45:40,229 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: SINK, name: HDFS. sink.connection.failed.count == 0
2014-10-13 02:45:40,229 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: SINK, name: HDFS. sink.event.drain.attempt == 0
2014-10-13 02:45:40,229 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: SINK, name: HDFS. sink.event.drain.sucess == 0
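
Because the log above is long, a small script can help pick out the lines worth reading first. The following is only a minimal sketch, assuming the log line layout shown above (timestamp, thread in parentheses, then level and location in square brackets). The names `scan`, `LINE_RE`, `METRIC_RE` and `SAMPLE_LOG` are hypothetical helpers for this illustration and are not part of Flume. The sample lines are copied from the log above.

```python
import re

# Four lines copied verbatim from the log above.
SAMPLE_LOG = """\
2014-10-13 02:40:15,990 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSources(FlumeConfiguration.java:596)] Configuration empty for: Twitternf.Removed.
2014-10-13 02:40:15,997 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:137)] Sources null
2014-10-13 02:45:40,225 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.put.attempt == 0
2014-10-13 02:45:40,226 (agent-shutdown-hook) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.stop(MonitoredCounterGroup.java:177)] Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.take.attempt == 42
"""

# Assumed layout: "<date> <time> (<thread>) [<LEVEL> - <location>] <message>"
LINE_RE = re.compile(
    r"^\S+ \S+ \((?P<thread>[^)]*)\) \[(?P<level>[A-Z]+) - [^\]]*\]\s*(?P<msg>.*)$"
)
# Shutdown metrics as printed in the log: "... type: T, name: N. key == value"
METRIC_RE = re.compile(
    r"Shutdown Metric for type: (\w+), name: (\w+)\. ([\w.]+) == (\d+)"
)


def scan(log_text):
    """Collect WARN/ERROR messages, the 'Sources null' marker and shutdown metrics."""
    warnings = []
    metrics = {}
    no_sources = False
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip blank lines and wrapped continuation lines
        msg = match.group("msg").strip()
        if match.group("level") in ("WARN", "ERROR"):
            warnings.append(msg)
        if msg == "Sources null":
            no_sources = True
        metric = METRIC_RE.search(msg)
        if metric:
            kind, name, key, value = metric.groups()
            metrics[(kind, name, key)] = int(value)
    return warnings, no_sources, metrics


warnings, no_sources, metrics = scan(SAMPLE_LOG)

for message in warnings:
    print("warning:", message)
print("validated configuration has no sources:", no_sources)
for (kind, name, key), value in sorted(metrics.items()):
    print(kind, name, key, "=", value)
```

On the log posted here, this would surface the WARN from `validateSources`, the `Sources null` line, and a `channel.event.put.attempt` count of 0. Taken together with the empty `sourceRunners:{}` in the "Starting new configuration" line, these suggest that no source was started and so no events ever reached the channel. That is an interpretation of the log, not a confirmed diagnosis; the contents of twitter.conf would be needed to verify it.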