Java: parallelizing quicksort makes it slower

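I am quicksorting a large amount of data, and for fun I tried to parallelize it to speed up the sort. In its current form, however, the multithreaded version is slower than the single-threaded one because of synchronization choke points.
Every time I spawn a thread, I take a lock on an int and increment it. Every time a thread finishes, I take the lock again and decrement it, and I also check whether any threads are still running (int > 0). If none are, I wake up my main thread and work with the sorted data.

I'm sure there is a better way. I just don't know what it is. Any help is much appreciated.

Edit: I guess I didn't provide enough information.
This is Java code on an octo-core Opteron. I can't switch languages.
The amount I'm sorting fits in memory, and it is already in memory when quicksort is called, so there is no reason to write it to disk only to read it back into memory.

By "take a lock" I mean having a synchronized block on the integer.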

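Threads are expensive. Don't use threads unless you have a lot of data to sort. Alternatively, you could use a language with a better concurrency design. Erlang, for example, has very lightweight threads that could be used for sorting.

"By 'take a lock' I mean having a synchronized block on the integer." If I understand correctly, you are taking a lock for every element you actually sort, which sounds very slow.

It sounds like you have too many threads... You haven't told us how many threads you actually spawn, but if you spawn one thread per integer, it will almost certainly be slower (and "almost certainly" is an understatement). What you want to do is spawn 8 threads, since you have 8 cores, "divide" your array into 8 parts, quicksort each part separately, and then join them as in the original algorithm.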
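One way to keep the "how many tasks are still running" bookkeeping from the question without a synchronized block is an atomic counter plus a latch. This is only a minimal sketch under assumptions: the class `PendingCounter` and its methods are hypothetical names, and the trivial tasks stand in for sorting a sub-range. It is not the asker's code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: counts unfinished tasks and wakes the waiter at zero.
public class PendingCounter {
    // Tasks that have been registered but have not finished yet.
    private final AtomicInteger pending = new AtomicInteger(0);
    // Opened once, when the count drops back to zero.
    private final CountDownLatch done = new CountDownLatch(1);

    // Call from the spawning thread BEFORE the task starts.
    public void taskStarted() {
        pending.incrementAndGet();
    }

    // Call as the last step of a task.
    public void taskFinished() {
        if (pending.decrementAndGet() == 0) {
            done.countDown();
        }
    }

    public void awaitAll() throws InterruptedException {
        done.await();
    }

    // Demo: runs n trivial tasks and returns how many of them completed.
    public static int runTasks(int n) {
        PendingCounter counter = new PendingCounter();
        AtomicInteger completed = new AtomicInteger();
        // The spawning thread holds one registration of its own, so the
        // count cannot reach zero while tasks are still being created.
        counter.taskStarted();
        for (int i = 0; i < n; i++) {
            counter.taskStarted();
            new Thread(() -> {
                completed.incrementAndGet(); // stand-in for sorting a sub-range
                counter.taskFinished();
            }).start();
        }
        counter.taskFinished(); // release the spawning thread's registration
        try {
            counter.awaitAll();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("tasks completed: " + runTasks(8));
    }
}
```

The counter is touched once per task, not once per element, so it should not become a choke point as long as tasks are reasonably large.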
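The "one thread per core, one chunk per thread" idea can be sketched as follows. Note the assumptions: the chunks here are contiguous slices, not value-partitioned ranges, so the final step is a merge, which is a different technique from a pure quicksort. `Arrays.sort` stands in for the per-chunk quicksort, and `ChunkSort`, `chunkSort` and `merge` are hypothetical names.

```java
import java.util.Arrays;
import java.util.Random;

public class ChunkSort {
    // Sorts arr with `parts` worker threads, one per contiguous chunk,
    // then merges the sorted chunks on the calling thread.
    static void chunkSort(int[] arr, int parts) {
        if (parts < 1) {
            throw new IllegalArgumentException("parts must be >= 1");
        }
        int n = arr.length;
        int[] bounds = new int[parts + 1];
        for (int p = 0; p <= parts; p++) {
            bounds[p] = (int) ((long) n * p / parts);
        }
        Thread[] workers = new Thread[parts];
        for (int p = 0; p < parts; p++) {
            final int from = bounds[p];
            final int to = bounds[p + 1];
            // Each worker touches only its own slice, so no locking is needed.
            workers[p] = new Thread(() -> Arrays.sort(arr, from, to));
            workers[p].start();
        }
        try {
            for (Thread t : workers) {
                t.join(); // join also makes the workers' writes visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sorting", e);
        }
        // Simple left-to-right merge: fold each sorted chunk into the prefix.
        int[] tmp = new int[n];
        for (int p = 1; p < parts; p++) {
            merge(arr, 0, bounds[p], bounds[p + 1], tmp);
        }
    }

    // Merges the sorted ranges [lo, mid) and [mid, hi) of arr via tmp.
    static void merge(int[] arr, int lo, int mid, int hi, int[] tmp) {
        int i = lo, j = mid, k = lo;
        while (i < mid && j < hi) {
            tmp[k++] = (arr[i] <= arr[j]) ? arr[i++] : arr[j++];
        }
        while (i < mid) tmp[k++] = arr[i++];
        while (j < hi) tmp[k++] = arr[j++];
        System.arraycopy(tmp, lo, arr, lo, hi - lo);
    }

    public static void main(String[] args) {
        int[] data = new Random(42).ints(1_000_000).toArray();
        int[] expected = data.clone();
        Arrays.sort(expected);
        chunkSort(data, 8);
        System.out.println("sorted correctly: " + Arrays.equals(data, expected));
    }
}
```

The sequential merge costs extra passes over the data, so whether this beats a single-threaded sort depends on the data size and the machine; it is worth measuring before relying on it.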


Here are some examples of how this can be done:

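Without knowing more about the implementation, here are my suggestions and/or comments:

  • Limit the number of threads that can run at any given time. Perhaps 8 or 10 (perhaps to give the scheduler more leeway, although one per core/hardware thread is best). There is really no point in running more threads for "throughput" on a CPU-bound problem if affinity isn't supported.

  • Don't thread near the leaves. Only thread on the larger branches. There is no point in spawning a thread to sort a relatively small number of items, and at that level there are many small branches! Threads add more relative overhead there. (This is similar to switching to a "simple sort" at the leaves.)

  • Make sure each thread can work in isolation; it should not stomp on another thread during its work -> no locks, just wait for the join. Divide and conquer.

  • Consider a "breadth-first" approach for spawning the threads.

  • Consider merge sort over quicksort (I am biased towards merge sort :-). Remember there are many different kinds of merge sort, including bottom-up.

  • Edit

  • Make sure it actually works. Remember to use memory barriers correctly between threads. This is needed to ensure proper visibility even when no two threads modify the same data at the same time.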
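  • Edit (proof of concept): I put together this simple demo. On my Intel Core2 Duo @ 2 GHz it runs in about 2/3 to 3/4 of the time, which is definitely some improvement :) (Settings: DATA_SIZE = 3000000, MAX_THREADS = 4, MIN_PARALLEL = 1000). It is the basic in-place quicksort code taken from Wikipedia, and it does not use any other basic optimizations.

    The way it decides whether a new thread can/should be started is also very primitive: if no new thread is available, it just chugs along on the current one (because, you know, why wait?).

    This code should also (hopefully) fan out breadth-wise with threads. That may be less efficient for data locality than going depth-wise, but the model seemed simple enough in my head.

    The executor service is also used to simplify the design and to be able to reuse the same threads (instead of spawning new ones). MIN_PARALLEL can become quite small (say, around 20) before the executor overhead starts to show; the cap on the number of threads, and using a new thread only when one is available, probably also keep this in check.

    qsort average seconds: 0.6290541056
    pqsort average seconds: 0.4513915392

    I make absolutely no guarantee about the usefulness or correctness of this code, but it "seems to work" here. Note the warning next to the ThreadPoolExecutor, as it clearly shows I am not entirely sure what is going on :-) I am fairly sure the design is somewhat flawed in that it under-utilizes threads.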

    package psq;
    
    import java.util.Arrays;
    import java.util.Random;
    import java.util.concurrent.*;
    
    public class Main {
    
        int[] genData (int len) {
            Random r = new Random();
            int[] newData = new int[len];
            for (int i = 0; i < newData.length; i++) {
                newData[i] = r.nextInt();
            }
            return newData;
        }      
    
        boolean check (int[] arr) {
            if (arr.length == 0) {
                return true;
            }
            int lastValue = arr[0];
            for (int i = 1; i < arr.length; i++) {
                //System.out.println(arr[i]);
                if (arr[i] < lastValue) {
                    return false;
                }
                lastValue = arr[i];
            }
            return true;
        }
    
        int partition (int[] arr, int left, int right, int pivotIndex) {
            // pivotValue := array[pivotIndex]
            int pivotValue = arr[pivotIndex];
            {
                // swap array[pivotIndex] and array[right] // Move pivot to end
                int t = arr[pivotIndex];
                arr[pivotIndex] = arr[right];
                arr[right] = t;
            }
            // storeIndex := left
            int storeIndex = left;
            // for i  from  left to right - 1 // left ≤ i < right
            for (int i = left; i < right; i++) {
                //if array[i] ≤ pivotValue
                if (arr[i] <= pivotValue) {
                    //swap array[i] and array[storeIndex]
                    //storeIndex := storeIndex + 1            
                    int t = arr[i];
                    arr[i] = arr[storeIndex];
                    arr[storeIndex] = t;
                    storeIndex++;                   
                }
            }
            {
                // swap array[storeIndex] and array[right] // Move pivot to its final place
                int t = arr[storeIndex];
                arr[storeIndex] = arr[right];
                arr[right] = t;
            }
            // return storeIndex
            return storeIndex;
        }
    
        void quicksort (int[] arr, int left, int right) {
            // if right > left
            if (right > left) {            
                // select a pivot index //(e.g. pivotIndex := left + (right - left)/2)
                int pivotIndex = left + (right - left) / 2;
                // pivotNewIndex := partition(array, left, right, pivotIndex)
                int pivotNewIndex = partition(arr, left, right, pivotIndex);
                // quicksort(array, left, pivotNewIndex - 1)
                // quicksort(array, pivotNewIndex + 1, right)
                quicksort(arr, left, pivotNewIndex - 1);
                quicksort(arr, pivotNewIndex + 1, right);
            }
        }
    
        static int DATA_SIZE = 3000000;
        static int MAX_THREADS = 4;
        static int MIN_PARALLEL = 1000;
    
        // NOTE THAT THE THREAD POOL EXECUTOR USES A LINKEDBLOCKINGQUEUE
        // That is because it's possible to OVER SUBMIT with this code,
        // even with the semaphores!
        ThreadPoolExecutor tp = new ThreadPoolExecutor(
                MAX_THREADS,
                MAX_THREADS,
                Long.MAX_VALUE,
                TimeUnit.NANOSECONDS,
                new LinkedBlockingQueue<Runnable>());
        // if no semaphore permit is available then we just continue
        // processing on the same thread and "deal with it"
        Semaphore sem = new Semaphore(MAX_THREADS, false); 
    
        class QuickSortAction implements Runnable {
            int[] arr;
            int left;
            int right;
    
            public QuickSortAction (int[] arr, int left, int right) {
                this.arr = arr;
                this.left = left;
                this.right = right;
            }
    
            public void run () {
                try {
                    //System.out.println(">>[" + left + "|" + right + "]");
                    pquicksort(arr, left, right);
                    //System.out.println("<<[" + left + "|" + right + "]");
                } catch (Exception ex) {
                    // I got nothing for this
                    throw new RuntimeException(ex); 
                }
            }
    
        }
    
        // pquicksort
        // threads will [hopefully] fan out "breadth-wise"
        // this is because it's likely that the 2nd task (if needed)
        // will be submitted before the 1st one runs and submits its own tasks
        // of course this behavior is not terribly well-defined
        void pquicksort (int[] arr, int left, int right) throws ExecutionException, InterruptedException {
            if (right > left) {
                // memory barrier -- pquicksort is called from different threads
                synchronized (arr) {}
    
                int pivotIndex = left + (right - left) / 2;
                int pivotNewIndex = partition(arr, left, right, pivotIndex);
    
                Future<?> f1 = null;
                Future<?> f2 = null;
    
                if ((pivotNewIndex - 1) - left > MIN_PARALLEL) {
                    if (sem.tryAcquire()) {
                        f1 = tp.submit(new QuickSortAction(arr, left, pivotNewIndex - 1));
                    } else {
                        pquicksort(arr, left, pivotNewIndex - 1);
                    }
                } else {
                    quicksort(arr, left, pivotNewIndex - 1);
                }
                if (right - (pivotNewIndex + 1) > MIN_PARALLEL) {
                    if (sem.tryAcquire()) {
                        f2 = tp.submit(new QuickSortAction(arr, pivotNewIndex + 1, right));
                    } else {
                        pquicksort(arr, pivotNewIndex + 1, right);
                    }
                } else {
                    quicksort(arr, pivotNewIndex + 1, right);
                }
    
                // join back up
                if (f1 != null) {
                    f1.get();
                    sem.release();
                }
                if (f2 != null) {
                    f2.get();
                    sem.release();
                }
            }        
        }
    
        long qsort_call (int[] origData) throws Exception {
            int[] data = Arrays.copyOf(origData, origData.length);
            long start = System.nanoTime();
            quicksort(data, 0, data.length - 1);
            long duration = System.nanoTime() - start;
            if (!check(data)) {
                throw new Exception("qsort not sorted!");
            }
            return duration;
        }
    
        long pqsort_call (int[] origData) throws Exception {
            int[] data = Arrays.copyOf(origData, origData.length);
            long start = System.nanoTime();
            pquicksort(data, 0, data.length - 1);
            long duration = System.nanoTime() - start;
            if (!check(data)) {
                throw new Exception("pqsort not sorted!");
            }
            return duration;
        }
    
        public Main () throws Exception {
            long qsort_duration = 0;
            long pqsort_duration = 0;
            int ITERATIONS = 10;
            for (int i = 0; i < ITERATIONS; i++) {
                System.out.println("Iteration# " + i);
                int[] data = genData(DATA_SIZE);
                if ((i & 1) == 0) {
                    qsort_duration += qsort_call(data);
                    pqsort_duration += pqsort_call(data);
                } else {
                    pqsort_duration += pqsort_call(data);
                    qsort_duration += qsort_call(data);
                }
            }
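            System.out.println("====");
            System.out.println("qsort average seconds: " + (float)qsort_duration / (ITERATIONS * 1E9));
            System.out.println("pqsort average seconds: " + (float)pqsort_duration / (ITERATIONS * 1E9));
            // the pool's core threads are non-daemon and never time out,
            // so shut the pool down or the JVM will not exit
            tp.shutdown();
        }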
    
        public static void main(String[] args) throws Exception {
            new Main();
        }
    
    }
    
    A second version of the same program, which tracks completion with a countdown semaphore instead of futures:

    package psq;
    
    import java.util.Arrays;
    import java.util.Random;
    import java.util.concurrent.*;
    
    public class Main {
    
        int[] genData (int len) {
            Random r = new Random();
            int[] newData = new int[len];
            for (int i = 0; i < newData.length; i++) {
                newData[i] = r.nextInt();
            }
            return newData;
        }      
    
        boolean check (int[] arr) {
            if (arr.length == 0) {
                return true;
            }
            int lastValue = arr[0];
            for (int i = 1; i < arr.length; i++) {
                //System.out.println(arr[i]);
                if (arr[i] < lastValue) {
                    return false;
                }
                lastValue = arr[i];
            }
            return true;
        }
    
        int partition (int[] arr, int left, int right, int pivotIndex) {
            // pivotValue := array[pivotIndex]
            int pivotValue = arr[pivotIndex];
            {
                // swap array[pivotIndex] and array[right] // Move pivot to end
                int t = arr[pivotIndex];
                arr[pivotIndex] = arr[right];
                arr[right] = t;
            }
            // storeIndex := left
            int storeIndex = left;
            // for i  from  left to right - 1 // left ≤ i < right
            for (int i = left; i < right; i++) {
                //if array[i] ≤ pivotValue
                if (arr[i] <= pivotValue) {
                    //swap array[i] and array[storeIndex]
                    //storeIndex := storeIndex + 1            
                    int t = arr[i];
                    arr[i] = arr[storeIndex];
                    arr[storeIndex] = t;
                    storeIndex++;                   
                }
            }
            {
                // swap array[storeIndex] and array[right] // Move pivot to its final place
                int t = arr[storeIndex];
                arr[storeIndex] = arr[right];
                arr[right] = t;
            }
            // return storeIndex
            return storeIndex;
        }
    
        void quicksort (int[] arr, int left, int right) {
            // if right > left
            if (right > left) {            
                // select a pivot index //(e.g. pivotIndex := left + (right - left)/2)
                int pivotIndex = left + (right - left) / 2;
                // pivotNewIndex := partition(array, left, right, pivotIndex)
                int pivotNewIndex = partition(arr, left, right, pivotIndex);
                // quicksort(array, left, pivotNewIndex - 1)
                // quicksort(array, pivotNewIndex + 1, right)
                quicksort(arr, left, pivotNewIndex - 1);
                quicksort(arr, pivotNewIndex + 1, right);
            }
        }
    
        static int DATA_SIZE = 3000000;
        static int MAX_EXTRA_THREADS = 7;
        static int MIN_PARALLEL = 500;
    
        // To get to reducePermits
        @SuppressWarnings("serial")
        class Semaphore2 extends Semaphore {
            public Semaphore2(int permits, boolean fair) {
                super(permits, fair);
            }
            public void removePermit() {
                super.reducePermits(1);
            }
        }
    
        class QuickSortAction implements Runnable {
            final int[] arr;
            final int left;
            final int right;
            final SortState ss;
    
            public QuickSortAction (int[] arr, int left, int right, SortState ss) {
                this.arr = arr;
                this.left = left;
                this.right = right;
                this.ss = ss;
            }
    
            public void run () {
                try {
                    //System.out.println(">>[" + left + "|" + right + "]");
                    pquicksort(arr, left, right, ss);
                    //System.out.println("<<[" + left + "|" + right + "]");
                } catch (Exception ex) {
                    // I got nothing for this
                    throw new RuntimeException(ex); 
                } finally {
                    // always give the permits back, otherwise a failed task
                    // would leave the caller waiting on countdown forever
                    ss.limit.release();
                    ss.countdown.release();
                }
            }
    
        }
    
        class SortState {
            final public ThreadPoolExecutor pool = new ThreadPoolExecutor(
                MAX_EXTRA_THREADS,
                MAX_EXTRA_THREADS,
                Long.MAX_VALUE,
                TimeUnit.NANOSECONDS,
                new LinkedBlockingQueue<Runnable>());
            // actual limit: executor may actually still have "active" things to process
            final public Semaphore limit = new Semaphore(MAX_EXTRA_THREADS, false); 
            final public Semaphore2 countdown = new Semaphore2(1, false); 
        }
    
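        void pquicksort (int[] arr) throws Exception {
            SortState ss = new SortState();
            try {
                pquicksort(arr, 0, arr.length - 1, ss);
                // blocks until every submitted task has released its permit
                ss.countdown.acquire();
            } finally {
                // each call creates its own pool; shut it down so its
                // non-daemon threads do not leak and keep the JVM alive
                ss.pool.shutdown();
            }
        }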
    
        // pquicksort
        // threads "fork" if available.
        void pquicksort (int[] arr, int left, int right, SortState ss) throws ExecutionException, InterruptedException {
            if (right > left) {
                // memory barrier -- pquicksort is called from different threads
                // and those threads may be created because they are in an executor
                synchronized (arr) {}
    
                int pivotIndex = left + (right - left) / 2;
                int pivotNewIndex = partition(arr, left, right, pivotIndex);
    
                {
                    int newRight = pivotNewIndex - 1;
                    if (newRight - left > MIN_PARALLEL) {
                        if (ss.limit.tryAcquire()) {
                            ss.countdown.removePermit();
                            ss.pool.submit(new QuickSortAction(arr, left, newRight, ss));
                        } else {
                            pquicksort(arr, left, newRight, ss);
                        }
                    } else {
                        quicksort(arr, left, newRight);
                    }
                }
    
                {
                    int newLeft = pivotNewIndex + 1;
                    if (right - newLeft > MIN_PARALLEL) {
                        if (ss.limit.tryAcquire()) {
                            ss.countdown.removePermit();
                            ss.pool.submit(new QuickSortAction(arr, newLeft, right, ss));
                        } else {
                            pquicksort(arr, newLeft, right, ss);
                        }
                    } else {
                        quicksort(arr, newLeft, right);
                    }
                }
    
            }        
        }
    
        long qsort_call (int[] origData) throws Exception {
            int[] data = Arrays.copyOf(origData, origData.length);
            long start = System.nanoTime();
            quicksort(data, 0, data.length - 1);
            long duration = System.nanoTime() - start;
            if (!check(data)) {
                throw new Exception("qsort not sorted!");
            }
            return duration;
        }
    
        long pqsort_call (int[] origData) throws Exception {
            int[] data = Arrays.copyOf(origData, origData.length);
            long start = System.nanoTime();
            pquicksort(data);
            long duration = System.nanoTime() - start;
            if (!check(data)) {            
                throw new Exception("pqsort not sorted!");
            }
            return duration;
        }
    
        public Main () throws Exception {
            long qsort_duration = 0;
            long pqsort_duration = 0;
            int ITERATIONS = 10;
            for (int i = 0; i < ITERATIONS; i++) {
                System.out.println("Iteration# " + i);
                int[] data = genData(DATA_SIZE);
                if ((i & 1) == 0) {
                    qsort_duration += qsort_call(data);
                    pqsort_duration += pqsort_call(data);
                } else {
                    pqsort_duration += pqsort_call(data);
                    qsort_duration += qsort_call(data);
                }
            }
            System.out.println("====");
            System.out.println("qsort average seconds: " + (float)qsort_duration / (ITERATIONS * 1E9));
            System.out.println("pqsort average seconds: " + (float)pqsort_duration / (ITERATIONS * 1E9));
        }
    
        public static void main(String[] args) throws Exception {
            new Main();
        }
    
    }
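The suggestions in the list above (cap the number of threads, stop forking near the leaves, have each task work on its own range and join instead of locking) can also be sketched with the fork/join framework from java.util.concurrent. This is an assumption on my part and not what the demo above uses; `ForkJoinQuicksort`, `SortTask` and `parallelSort` are hypothetical names, and the threshold value is only a guess to be tuned. Joining a fork/join task also provides the memory visibility mentioned in the list, so no empty synchronized block is needed.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class ForkJoinQuicksort {
    // Ranges smaller than this are sorted sequentially (hypothetical tuning value).
    static final int MIN_PARALLEL = 1000;

    static class SortTask extends RecursiveAction {
        final int[] arr;
        final int left;   // inclusive
        final int right;  // inclusive

        SortTask(int[] arr, int left, int right) {
            this.arr = arr;
            this.left = left;
            this.right = right;
        }

        @Override
        protected void compute() {
            if (right - left < MIN_PARALLEL) {
                // Near the leaves: no forking, just sort this range in place.
                Arrays.sort(arr, left, right + 1);
                return;
            }
            int p = partition(arr, left, right);
            // The two halves are disjoint, so the subtasks need no locks.
            invokeAll(new SortTask(arr, left, p - 1), new SortTask(arr, p + 1, right));
        }
    }

    // Same partition scheme as the demo above: middle pivot, moved to the end.
    static int partition(int[] arr, int left, int right) {
        int pivotIndex = left + (right - left) / 2;
        int pivotValue = arr[pivotIndex];
        swap(arr, pivotIndex, right);
        int storeIndex = left;
        for (int i = left; i < right; i++) {
            if (arr[i] <= pivotValue) {
                swap(arr, i, storeIndex);
                storeIndex++;
            }
        }
        swap(arr, storeIndex, right);
        return storeIndex;
    }

    static void swap(int[] arr, int a, int b) {
        int t = arr[a];
        arr[a] = arr[b];
        arr[b] = t;
    }

    // Sorts arr using a pool with the given parallelism.
    static void parallelSort(int[] arr, int threads) {
        ForkJoinPool pool = new ForkJoinPool(threads);
        try {
            pool.invoke(new SortTask(arr, 0, arr.length - 1));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new Random(42).ints(3_000_000).toArray();
        int[] expected = data.clone();
        Arrays.sort(expected);
        parallelSort(data, Runtime.getRuntime().availableProcessors());
        System.out.println("sorted correctly: " + Arrays.equals(data, expected));
    }
}
```

As with the demo, this partition scheme degrades badly on inputs with many equal keys, so treat it as an illustration of the task structure and not as a tuned sort.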