Java ExecutorService performance

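Hi, I have implemented a method that computes the mode of an array of millions of elements (integers). I am now comparing a sequential version with a (supposedly) improved version that uses an ExecutorService. Unfortunately, the performance is not as good as expected: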

Sequentially iterating hashMap (version 0)

#size       #time       #memory
10000000    13772ms     565mb
20000000    35355ms     1135mb
30000000    45879ms     1633mb

Assigning jobs to a Service Executor (version 2)

#size       #time       #memory
10000000    16186ms     573mb
20000000    34561ms     1147mb
30000000    54792ms     1719mb
The code for the ExecutorService version is as follows:

 /* Optimised-Threaded Method to calculate the Mode */
    private int getModeOptimisedThread(int[] mybigarray){
        System.out.println("calculating mode (optimised w/ ExecutorService)... ");

        int mode = -1;

        //create a TreeMap to count the frequencies
        TreeMap<Integer, Integer> treemap = new TreeMap<Integer, Integer>();

        //for each integer in the array, we put an entry into the map with the array value as key and its frequency as value.
        for (int i : mybigarray) {
            //we check if that element already exists in the map, by getting the element with key 'i'
            // if the element exists, we increment the frequency, otherwise we insert it with frequency = 1;
            Integer frequency = treemap.get(i);
            int value = 0;
            if (frequency == null){ //element not found
                value = 1;
            }
            else{                   //element found
                value = frequency + 1;
            }

            //insert the element into the map
            treemap.put(i, value);
        }



        //look for the most frequent element in the map
        int maxCount = 0;

        int n_threads = Runtime.getRuntime().availableProcessors();
        ExecutorService es = Executors.newFixedThreadPool(n_threads);


        //create a common variable to store maxCount and mode values
        Result r = new Result(mode, maxCount);

        //set the number of jobs
        int num_jobs = 10;
        int job_size = treemap.size()/num_jobs;        

        System.out.println("Map size "+treemap.size());
        System.out.println("Job size "+job_size);

        //new MapWorker(map, 0, halfmapsize, r);
        int start_index, finish_index;

        List<Callable<Object>> todolist = new ArrayList<Callable<Object>>(num_jobs);

        //assign threads to pool

            for (int i=0; i<num_jobs; i++)
            {   
                    start_index=i*job_size;
                    finish_index = start_index+job_size;

                    System.out.println("start index: "+start_index+". Finish index: "+finish_index);
                    todolist.add(Executors.callable(new MapWorker(treemap.subMap(start_index, finish_index), r)));

            }        
       try{
           //invoke all will not return until all the tasks are completed
           es.invokeAll(todolist);
        } catch (Exception e) { 
            System.out.println("Error in the Service executor "+e);
        } finally {
           //finally the result
            mode = r.getMode(); 
        }

        //return the result
        return mode;
    }
  • Use a separate Result object for each job (no synchronization). When all jobs have finished, pick the result with the largest count.

  • int num_jobs = n_threads
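
The two suggestions above could be combined roughly as follows. This is only a sketch under assumptions: it needs Java 8 or later for the lambda, `PerJobResultSketch`, `LocalResult` and `modeOf` are hypothetical names that do not appear in the question, and it takes an already computed frequency map as input. Each job scans its own slice of the entries and returns its own result, so nothing is shared between threads until the final sequential comparison.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PerJobResultSketch {

    /** Hypothetical per-job result: the best key seen in one slice and its count. */
    static final class LocalResult {
        final int mode;
        final int count;
        LocalResult(int mode, int count) { this.mode = mode; this.count = count; }
    }

    /** Returns the key with the highest frequency, or -1 for an empty map. */
    static int modeOf(Map<Integer, Integer> frequencies) {
        // Copy the entries into a list so that each job can take an index range.
        final List<Map.Entry<Integer, Integer>> entries = new ArrayList<>(frequencies.entrySet());
        int nThreads = Runtime.getRuntime().availableProcessors();
        int numJobs = nThreads;                                  // one job per thread
        int jobSize = (entries.size() + numJobs - 1) / numJobs;  // round up, so no entry is skipped

        List<Callable<LocalResult>> jobs = new ArrayList<>(numJobs);
        for (int j = 0; j < numJobs; j++) {
            final int from = Math.min(j * jobSize, entries.size());
            final int to = Math.min(from + jobSize, entries.size());
            jobs.add(() -> {
                // Only local variables are touched here, so no synchronization is needed.
                int mode = -1;
                int count = 0;
                for (Map.Entry<Integer, Integer> e : entries.subList(from, to)) {
                    if (e.getValue() > count) {
                        count = e.getValue();
                        mode = e.getKey();
                    }
                }
                return new LocalResult(mode, count);
            });
        }

        ExecutorService es = Executors.newFixedThreadPool(nThreads);
        try {
            // invokeAll blocks until every job is done; then pick the largest count.
            LocalResult best = new LocalResult(-1, 0);
            for (Future<LocalResult> f : es.invokeAll(jobs)) {
                LocalResult r = f.get();
                if (r.count > best.count) {
                    best = r;
                }
            }
            return best.mode;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("mode computation failed", e);
        } finally {
            es.shutdown();
        }
    }
}
```

Note that this only parallelizes the scan over the frequency map; building the map itself is still sequential in the question's code.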


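  • Most of the work is done while computing the frequencies. That cost significantly outweighs any benefit of parallelism you would get from parallelizing only the update of the result. You need to parallelize the mode computation by having each worker compute frequencies locally and update the global frequencies only at the end. Consider storing the global counts in a thread-safe way (for example AtomicInteger values in a ConcurrentHashMap). Once the frequencies have been computed, you can compute the mode sequentially at the end, because traversing the map sequentially is computationally much cheaper.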

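    Something like the following should be more efficient. Edit: modified the updateScore() method to fix a data race.

    // imports needed by the snippet below
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.atomic.AtomicInteger;

    private static class ResultStore {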

        private Map<Integer, AtomicInteger> store = new ConcurrentHashMap<Integer, AtomicInteger>();
    
        public int size() {
            return store.size();
        }
    
        public int updateScore(int key, int freq) {
            AtomicInteger value = store.get(key);
            if (value == null) {
                store.putIfAbsent(key, new AtomicInteger(0));
                value = store.get(key);
            }
            return value.addAndGet(freq);
        }
    
        public int getMode() {
            int mode = 0;
            int modeFreq = 0;
            for (Integer key : store.keySet()) {
                int value = store.get(key).intValue();
                if (modeFreq < value) {
                    modeFreq = value;
                    mode = key;
                }
            }
            return mode;
        }
    }
    
    private static int computeMode(final int[] mybigarray) {
    
        int n_threads = Runtime.getRuntime().availableProcessors();
        ExecutorService es = Executors.newFixedThreadPool(n_threads);
        final ResultStore rs = new ResultStore();
    
        //set the number of jobs
        int num_jobs = 10;
        int job_size = mybigarray.length / num_jobs;
    
        System.out.println("Array size " + mybigarray.length);
        System.out.println("Job size " + job_size);
    
        List<Callable<Object>> todolist = new ArrayList<Callable<Object>>(num_jobs);
        for (int i = 0; i < num_jobs; i++) {
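            final int start_index = i * job_size;
            //the last job also takes the remainder, so no trailing elements are skipped
            final int finish_index = (i == num_jobs - 1) ? mybigarray.length : start_index + job_size;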
    
            System.out.println("Start index: " + start_index + ". Finish index: " + finish_index);
    
            todolist.add(Executors.callable(new Runnable() {
                @Override
                public void run() {
                    final Map<Integer, Integer> localStore = new HashMap<Integer, Integer>();
                    for (int i = start_index; i < finish_index; i++) {
                        final Integer loopKey = mybigarray[i];
                        Integer loopValue = localStore.get(loopKey);
                        if (loopValue == null) {
                            localStore.put(loopKey, 1);
                        } else {
                            localStore.put(loopKey, loopValue + 1);
                        }
                    }
                    for (Integer loopKey : localStore.keySet()) {
                        final Integer loopValue = localStore.get(loopKey);
                        rs.updateScore(loopKey, loopValue);
                    }
                }
            }));
    
        }
        try {
            //invokeAll will not return until all the tasks are completed
            es.invokeAll(todolist);
        } catch (Exception e) {
            System.out.println("Error in the Service executor " + e);
        } finally {
            //release the pool threads so the JVM can exit
            es.shutdown();
        }
    
        return rs.getMode();
    }
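
As a possible simplification of the ResultStore above, and assuming Java 8 or later, the get / putIfAbsent / addAndGet sequence in updateScore() can be replaced by a single `ConcurrentHashMap.merge` call, which performs the insert-or-add atomically. `MergeResultStore` is a hypothetical name used for illustration; this is a sketch, not the answer's original code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class MergeResultStore {

    // Global frequency table shared by all workers.
    private final ConcurrentHashMap<Integer, Integer> store = new ConcurrentHashMap<>();

    /** Adds freq to the count stored for key and returns the new total. */
    int updateScore(int key, int freq) {
        // merge() inserts freq if the key is absent, otherwise stores old + freq, atomically.
        return store.merge(key, freq, Integer::sum);
    }

    /** Sequential scan for the key with the highest count; call it after all workers finish. */
    int getMode() {
        int mode = 0;
        int modeFreq = 0;
        for (Map.Entry<Integer, Integer> e : store.entrySet()) {
            if (e.getValue() > modeFreq) {
                modeFreq = e.getValue();
                mode = e.getKey();
            }
        }
        return mode;
    }
}
```

Each worker would still build its local HashMap first and call updateScore() once per distinct key, so the shared map is touched far less often than once per array element.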
    
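    It is impossible to answer without knowing what MapWorker does. Two potential issues I can think of: (i) each task does too little, so context switching offsets the gains from parallelizing the tasks; (ii) there is some form of synchronization between the threads, which creates contention (i.e. a bottleneck).

    Thanks for your comment. I have also updated the question in case you want to take another look. Thanks

    The synchronized setNewMode and getCount methods will definitely reduce your performance.

    I really appreciate your help. Your solution looks robust; unfortunately it is slower, but I will follow your advice to compute the frequencies locally and use a "result collector". Thanks

    @RNO With an array size of 21474830 and uniformly distributed values in a range of 100000, this version runs in 1.5 seconds, whereas the other snippet takes 9 seconds on a quad-core Intel i7.
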
    public class Result {
        private int mode;
        private int maxCount;
    
        Result(int _mode, int _maxcount){
            mode = _mode;
            maxCount = _maxcount;
        }
    
        public synchronized void setNewMode(int _newmode, int _maxcount) {
            this.mode = _newmode;
            this.maxCount = _maxcount;
        }
    
        public int getMode() {
            return mode;
        }
    
        public synchronized int getCount() {
            return maxCount;
        }
    
    }
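
Since MapWorker is not shown, the following rests on an assumption about how it uses the Result class above: if a worker calls getCount() and then setNewMode() as two separate steps, another thread can update the result in between, and every entry that is checked pays for a lock. A hypothetical variant (`BestResult` and `offer` are illustrative names) does the comparison and the update inside one synchronized method, which a worker would call once with its local best instead of once per entry.

```java
final class BestResult {

    private int mode;
    private int maxCount;

    BestResult(int mode, int maxCount) {
        this.mode = mode;
        this.maxCount = maxCount;
    }

    /** Compares and updates under a single lock, so no other thread can interleave. */
    synchronized void offer(int candidateMode, int candidateCount) {
        if (candidateCount > maxCount) {
            mode = candidateMode;
            maxCount = candidateCount;
        }
    }

    // Reads are synchronized as well, so they see the latest write.
    synchronized int getMode() {
        return mode;
    }

    synchronized int getCount() {
        return maxCount;
    }
}
```

With one offer() call per job, the lock is taken only as many times as there are jobs, so contention on it should be negligible.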
    