Java: Why do my threads take different amounts of time to do the same amount of work?
I have a simple multithreaded Java application that looks like this:
class MyThreads extends Thread {
    public void run() {
        {
            // some thread initializations
            // every thread reads 2 files (its own files,
            // so node 0 will read A0.txt and B0.txt
            // and node 1 will read A1.txt and B1.txt)
            // The files have sizes between 10-20MB.
            // A's files contain different information for different nodes (A0.txt != A1.txt),
            // but B's files are the same (B0.txt has
            // the same info as B1.txt). This is just a scenario.
            // It stores the data that was
            // read before in the memory.
            // Again, I know B can be shared since
            // it has the same info in both threads, but it's not.
        }
        {
            // simple computation on the data retrieved
            // (addition, multiplication, etc)
            // I assume there is no need to synchronize
            // the threads since they apply operations on their own data.
            // Here, every thread executes the same number of operations
        }
        {
            // writing the results on different files. This phase is unimportant.
        }
    }

    public static void main(String[] args) {
        // start 4 threads
    }
}
When I measured the performance of the initialization and computation parts, I got the following strange results:
2016-03-11-NodeThread:1 time[2318] tag[initialization]
2016-03-11-NodeThread:0 time[2379] tag[initialization]
2016-03-11-NodeThread:2 time[2474] tag[initialization]
2016-03-11-NodeThread:3 time[2481] tag[initialization]
2016-03-11-NodeThread:2 time[30ms] tag[computation]
2016-03-11-NodeThread:1 time[6ms] tag[computation]
2016-03-11-NodeThread:3 time[7ms] tag[computation]
2016-03-11-NodeThread:0 time[6ms] tag[computation]
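As you can see, the computation in NodeThread:2 took 30 ms, while the other nodes finished theirs in under 10 ms.
However, after inserting a barrier between initialization and computation, I got consistent results: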
2016-03-11-NodeThread:1 time[2318] tag[initialization]
2016-03-11-NodeThread:0 time[2379] tag[initialization]
2016-03-11-NodeThread:2 time[2474] tag[initialization]
2016-03-11-NodeThread:3 time[2481] tag[initialization]
2016-03-11-NodeThread:2 time[30ms] tag[computation]
2016-03-11-NodeThread:1 time[33ms] tag[computation]
2016-03-11-NodeThread:3 time[29ms] tag[computation]
2016-03-11-NodeThread:0 time[31ms] tag[computation]
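My question is: if the threads do not communicate at all, read their data from different parts of the disk, and perform the same amount of computation, why do they need to be synchronized before the computation? My guess is that caching may be involved, but I cannot explain why.
Note: the machine I tested on has more than 4 cores, and no other CPU-intensive processes were running. To measure the times, I used perf4j like this: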
import org.perf4j.StopWatch;
import org.perf4j.log4j.Log4JStopWatch;

class MyThreads extends Thread {
    public void run() {
        {
            // Starts timing on construction; stop(tag) logs the elapsed time.
            StopWatch stopWatch = new Log4JStopWatch();
            // some thread initializations
            // every thread reads 2 files (its own files,
            // so node 0 will read A0.txt and B0.txt
            // and node 1 will read A1.txt and B1.txt)
            // The files have sizes between 10-20MB.
            // A's files contain different information for different nodes (A0.txt != A1.txt),
            // but B's files are the same (B0.txt has
            // the same info as B1.txt). This is just a scenario.
            // It stores the data that was
            // read before in the memory.
            // Again, I know B can be shared since
            // it has the same info in both threads, but it's not.
            stopWatch.stop("initialization");
            // barrier
        }
        {
            StopWatch stopWatch = new Log4JStopWatch();
            // simple computation on the data retrieved
            // (addition, multiplication, etc)
            // I assume there is no need to synchronize
            // the threads since they apply operations on their own data.
            // Here, every thread executes the same number of operations
            stopWatch.stop("computation");
        }
        {
            // writing the results on different files. This phase is unimportant.
        }
    }

    public static void main(String[] args) {
        // start 4 threads
    }
}
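The question does not show how the barrier was implemented. As a rough sketch only, one common way to make every thread wait between the two phases is `java.util.concurrent.CyclicBarrier`. The class name `BarrierDemo`, the methods `compute` and `runAll`, and the in-memory array that stands in for the file contents are all hypothetical placeholders, not the original code.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

class BarrierDemo {
    static final int THREADS = 4;

    // Hypothetical stand-in for the "simple computation" phase.
    static long compute(int[] data) {
        long sum = 0;
        for (int v : data) {
            sum += (long) v * 3 + 1;
        }
        return sum;
    }

    // Starts THREADS workers; each one initializes, waits at the barrier, then computes.
    static long[] runAll() {
        CyclicBarrier barrier = new CyclicBarrier(THREADS);
        long[] results = new long[THREADS];
        long[] nanos = new long[THREADS];
        Thread[] workers = new Thread[THREADS];
        for (int i = 0; i < THREADS; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                // Initialization phase: placeholder for reading A<id>.txt and B<id>.txt.
                int[] data = new int[1_000_000];
                for (int k = 0; k < data.length; k++) {
                    data[k] = k % 100;
                }
                try {
                    // No thread starts computing until all of them have finished initializing.
                    barrier.await();
                } catch (InterruptedException | BrokenBarrierException e) {
                    throw new IllegalStateException(e);
                }
                long start = System.nanoTime();
                results[id] = compute(data);
                nanos[id] = System.nanoTime() - start;
            }, "NodeThread:" + i);
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join(); // join() also makes the workers' array writes visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        for (int i = 0; i < THREADS; i++) {
            System.out.printf("%s time[%d us] tag[computation]%n",
                    workers[i].getName(), nanos[i] / 1_000);
        }
        return results;
    }

    public static void main(String[] args) {
        runAll();
    }
}
```

Removing the `barrier.await()` call gives the unsynchronized variant, so the two timing patterns can be compared on the same machine. The absolute numbers will differ from the logs above.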
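I can only guess, because we would need more details to be really sure, but what is probably happening is that your first thread executes some code often enough that it gets compiled and optimized by the HotSpot compiler and the other magic built into the JVM.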
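To check whether JIT warm-up is plausible on a given machine, one option is to time the same computation several times in a single thread and compare the rounds. This is only a minimal sketch: `WarmupDemo`, `compute`, and `timeRounds` are hypothetical names, the workload is a placeholder, and a hand-rolled timing loop like this is a rough indicator rather than a rigorous benchmark.

```java
class WarmupDemo {
    // Written on every round so the JIT cannot discard the computation as dead code.
    static volatile long sink;

    // Hypothetical stand-in for the "simple computation" phase.
    static long compute(int[] data) {
        long sum = 0;
        for (int v : data) {
            sum += (long) v * 3 + 1;
        }
        return sum;
    }

    // Times `rounds` consecutive runs of the same computation on the same data.
    static long[] timeRounds(int[] data, int rounds) {
        long[] nanos = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            sink = compute(data);
            nanos[r] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        int[] data = new int[2_000_000];
        for (int k = 0; k < data.length; k++) {
            data[k] = k % 100;
        }
        long[] nanos = timeRounds(data, 10);
        for (int r = 0; r < nanos.length; r++) {
            System.out.printf("round %d: %d us%n", r, nanos[r] / 1_000);
        }
    }
}
```

If the first rounds are clearly slower than the later ones, that is consistent with the code being interpreted first and compiled afterwards, although it does not prove that this is what happens in the original program.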
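Your synchronization attempt probably prevents this optimization from kicking in, possibly because the threads now start at almost the same time and so finish their computation before the compilation is done.

Comments:

Why do you think synchronization is needed? What you have shown is that synchronization makes things slower, which is not surprising in itself.

Are you running the threads with an ExecutorService?

File size can affect thread execution time. From what I have observed, the number of threads should equal the number of cores (or cores + 1, because of context switching).

Are the threads synchronized on a common monitor object? Please post more code instead of comments.

Which one is your question? The one in the title or the one in the body? They have almost nothing in common.

The one in the title, sorry.

The number of threads to use is a science in itself. For a start, different versions of Java may report a different "number of cores" for the same hardware.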
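As an illustration of the comments about ExecutorService and thread count, the sketch below sizes a fixed thread pool from `Runtime.getRuntime().availableProcessors()`, which returns whatever processor count the running JVM reports. `PoolSizingDemo`, `runTasks`, and the toy task are hypothetical, and one thread per reported core is only a starting point, not a rule.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PoolSizingDemo {
    // Runs `tasks` small jobs on a pool sized to the processor count the JVM reports.
    static List<Long> runTasks(int tasks) {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final int id = i;
                // Toy workload: sum 0..1000, then add the task id.
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (int k = 0; k <= 1000; k++) {
                        sum += k;
                    }
                    return sum + id;
                };
                futures.add(pool.submit(task));
            }
            List<Long> results = new ArrayList<>();
            for (Future<Long> f : futures) {
                results.add(f.get()); // blocks until that task has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("reported cores: " + Runtime.getRuntime().availableProcessors());
        System.out.println(runTasks(4));
    }
}
```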