Linux Hadoop: JobTracker is eating ... 300% CPU?
Recently Ganglia has been showing our Hadoop server almost entirely in red. Looking at the top of htop ... I see the jobtracker and namenode processes consuming ~300% CPU:
30858 adtech adtech 42 15.54s 13.17s 0K 0K 0K 0K -- - S 6 332% java
31066 adtech adtech 24 6.18s 8.86s 0K 0K 0K 0K -- - S 7 174% java
31164 adtech adtech 44 5.66s 8.38s 0K 0K 0K 4K -- - S 5 162% java
Here are the relevant configuration files:
core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://hmaster90:9000</value>
<description>
The name of the default file system. Either the literal string
"local" or a host:port for NDFS.
</description>
</property>
</configuration>
hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/data/hdfs11/dfs/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/data/hdfs11/dfs/data</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/data/hdfs11/dfs/tmp</value>
</property>
</configuration>
mapred-site.xml
<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>hmaster90:9001</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>/data/hdfs11/mapreduce/system</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/data/hdfs11/mapreduce/local</value>
</property>
<property>
<name>mapred.temp.dir</name>
<value>/data/hdfs11/mapreduce/temp</value>
</property>
<property>
<name>mapred.tasktracker.map.tasks.maximum</name>
<value>7</value>
<description>
The maximum number of map tasks that will be run simultaneously by a task tracker.
</description>
</property>
<property>
<name>mapred.tasktracker.reduce.tasks.maximum</name>
<value>7</value>
<description>
The maximum number of reduce tasks that will be run simultaneously by a task tracker.
</description>
</property>
<property>
<name>mapred.map.tasks</name>
<value>56</value>
<description>
This should be a prime number larger than multiple number of slave hosts,
e.g. for 3 nodes set this to 17
</description>
</property>
<property>
<name>mapred.reduce.tasks</name>
<value>21</value>
<description>
This should be a prime number close to a low multiple of slave hosts,
e.g. for 3 nodes set this to 7
</description>
</property>
</configuration>
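As a side note (my observation, not something established in the question): the slot settings above allow up to 7 map + 7 reduce child JVMs per TaskTracker, so sustained multi-hundred-percent CPU on worker nodes can be perfectly normal during jobs — though that would not explain load on the master daemons themselves. A quick check of configured slots versus available cores:

```shell
# Slot values taken from mapred-site.xml above; core count is whatever
# this node reports.
map_slots=7
reduce_slots=7
cores=$(getconf _NPROCESSORS_ONLN)
echo "up to $((map_slots + reduce_slots)) concurrent child JVMs on $cores cores"
```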
jobtracker.log:
Where should I start looking?
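One way to narrow this down (a sketch, not from the question itself — the PID is taken from the htop output above, and the thread id used in the conversion is hypothetical) is to list the process's threads by CPU with `top -H`, then convert the hottest thread's decimal TID to hex, since jstack labels threads with a hex `nid=0x...` field:

```shell
# Per-thread CPU view of the jobtracker process (run on the live box):
#   top -b -H -n 1 -p 30858 | head -20
# top prints decimal thread ids; jstack prints them as hex "nid=0x...".
# Convert a hot thread id (hypothetical value) so it can be grepped in the dump:
tid=30912
printf 'nid=0x%x\n' "$tid"
```

Grepping a thread dump for that `nid` shows which Java thread is burning the CPU (a GC thread, an RPC handler, and so on).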
Reply to @J-16SDiZ:
# su - adtech
$ jstack -J-d64 -m 3135
Attaching to process ID 3135, please wait...
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at sun.tools.jstack.JStack.runJStackTool(JStack.java:136)
at sun.tools.jstack.JStack.main(JStack.java:102)
Caused by: sun.jvm.hotspot.runtime.VMVersionMismatchException: Supported versions are 14.0-b16. Target VM is 20.5-b03
at sun.jvm.hotspot.runtime.VM.checkVMVersion(VM.java:223)
at sun.jvm.hotspot.runtime.VM.<init>(VM.java:286)
at sun.jvm.hotspot.runtime.VM.initialize(VM.java:344)
at sun.jvm.hotspot.bugspot.BugSpotAgent.setupVM(BugSpotAgent.java:594)
at sun.jvm.hotspot.bugspot.BugSpotAgent.go(BugSpotAgent.java:494)
at sun.jvm.hotspot.bugspot.BugSpotAgent.attach(BugSpotAgent.java:332)
at sun.jvm.hotspot.tools.Tool.start(Tool.java:163)
at sun.jvm.hotspot.tools.JStack.main(JStack.java:86)
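The `VMVersionMismatchException` above means the `jstack` on the PATH was built against HotSpot 14.0-b16, while the daemon runs HotSpot 20.5-b03; the serviceability agent used by `-m` requires matching versions, so the `jstack` from the same JDK as the daemon must be used. A sketch of locating it via `/proc` (the pid and paths are illustrative, and the command is only echoed here rather than executed):

```shell
pid=3135                                      # hypothetical daemon pid
# /proc/<pid>/exe points at the exact java binary the process is running
java_bin=$(readlink -f "/proc/$pid/exe" 2>/dev/null)
if [ -n "$java_bin" ]; then
  jdk_home=${java_bin%/jre/bin/java}          # strip jre/bin/java -> JDK root
  echo "use: $jdk_home/bin/jstack -J-d64 $pid"
else
  echo "process $pid is not running on this host"
fi
```

Alternatively, `kill -3 <pid>` makes the JVM print its own thread dump to its stdout log, which sidesteps the version mismatch entirely.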
I know that. What I'm asking is what has caused the jobtracker to max out the CPU recently, when it previously sat at only ~20% (and there are no new jobs).

Dump the stack first, with kill -3 or jstack.

Could this be the leap second's revenge? @J-16SDiZ:

jstack -J-d64 -m 1741
Attaching to process ID 1741, please wait...
Error attaching to process: sun.jvm.hotspot.debugger.DebuggerException: Can't attach to the process