Java流并行化的可视化

Java流并行化的可视化,java,parallel-processing,java-8,java-stream,Java,Parallel Processing,Java 8,Java Stream,通常不太清楚并行流是如何将输入分割成块的,以及块的连接顺序。是否有任何方法可以可视化任何流源的整个过程,以便更好地了解发生了什么?假设我创建了这样一个流: Stream<Integer> stream = IntStream.range(0, 100).boxed().parallel(); 这意味着整个输入范围[0..99]被拆分为[0..49]和[50..99]范围,而这两个范围又被进一步拆分。当然,这样的图应该反映流API的实际工作,因此,如果我对这样的流执行一些实际操作,则

通常不太清楚并行流是如何将输入分割成块的,以及块的连接顺序。是否有任何方法可以可视化任何流源的整个过程,以便更好地了解发生了什么?假设我创建了这样一个流:

Stream<Integer> stream = IntStream.range(0, 100).boxed().parallel();

这意味着整个输入范围
[0..99]
被拆分为
[0..49]
[50..99]
范围,而这两个范围又被进一步拆分。当然,这样的图应该反映流API的实际工作,因此,如果我对这样的流执行一些实际操作,则拆分应该以相同的方式执行。

当前流API实现使用收集器合并器以与先前拆分完全相同的方式合并中间结果。此外,拆分策略取决于源池和公共池的并行度级别,但不取决于所使用的精确缩减操作(与
reduce
collect
forEach
count
等操作相同)。基于此,创建可视化收集器并不十分困难:

public static Collector<Object, ?, List<String>> parallelVisualize() {
    class Range {
        private String first, last;
        private Range left, right;

        void accept(Object obj) {
            if (first == null)
                first = obj.toString();
            else
                last = obj.toString();
        }

        Range combine(Range that) {
            Range p = new Range();
            p.first = first == null ? that.first : first;
            p.last = Stream
                    .of(that.last, that.first, this.last, this.first)
                    .filter(Objects::nonNull).findFirst().orElse(null);
            p.left = this;
            p.right = that;
            return p;
        }

        String pad(String s, int left, int len) {
            if (len == s.length())
                return s;
            char[] result = new char[len];
            Arrays.fill(result, ' ');
            s.getChars(0, s.length(), result, left);
            return new String(result);
        }

        public List<String> finish() {
            String cur = toString();
            if (left == null) {
                return Collections.singletonList(cur);
            }
            List<String> l = left.finish();
            List<String> r = right.finish();
            int len1 = l.get(0).length();
            int len2 = r.get(0).length();
            int totalLen = len1 + len2 + 1;
            int leftAdd = 0;
            if (cur.length() < totalLen) {
                cur = pad(cur, (totalLen - cur.length()) / 2, totalLen);
            } else {
                leftAdd = (cur.length() - totalLen) / 2;
                totalLen = cur.length();
            }
            List<String> result = new ArrayList<>();
            result.add(cur);

            char[] dashes = new char[totalLen];
            Arrays.fill(dashes, ' ');
            Arrays.fill(dashes, len1 / 2 + leftAdd + 1, len1 + len2 / 2 + 1
                    + leftAdd, '_');
            int mid = totalLen / 2;
            dashes[mid] = '/';
            dashes[mid + 1] = '\\';
            result.add(new String(dashes));

            Arrays.fill(dashes, ' ');
            dashes[len1 / 2 + leftAdd] = '|';
            dashes[len1 + len2 / 2 + 1 + leftAdd] = '|';
            result.add(new String(dashes));

            int maxSize = Math.max(l.size(), r.size());
            for (int i = 0; i < maxSize; i++) {
                String lstr = l.size() > i ? l.get(i) : String.format("%"
                        + len1 + "s", "");
                String rstr = r.size() > i ? r.get(i) : String.format("%"
                        + len2 + "s", "");
                result.add(pad(lstr + " " + rstr, leftAdd, totalLen));
            }
            return result;
        }

        public String toString() {
            if (first == null)
                return "(empty)";
            else if (last == null)
                return "[" + first + "]";
            return "[" + first + ".." + last + "]";
        }
    }
    return Collector.of(Range::new, Range::accept, Range::combine,
            Range::finish);
}
甚至分为16项任务:

                                                                  [0..99]                                                                   
                                   ___________________________________/\________________________________                                    
                                  |                                                                     |                                   
                              [0..49]                                                               [50..99]                                
                 _________________/\______________                                     _________________/\________________                  
                |                                 |                                   |                                   |                 
            [0..24]                           [25..49]                            [50..74]                            [75..99]              
        ________/\_____                   ________/\_______                   ________/\_______                   ________/\_______         
       |               |                 |                 |                 |                 |                 |                 |        
   [0..11]         [12..24]          [25..36]          [37..49]          [50..61]          [62..74]          [75..86]          [87..99]     
    ___/\_          ___/\___          ___/\___          ___/\___          ___/\___          ___/\___          ___/\___          ___/\___    
   |      |        |        |        |        |        |        |        |        |        |        |        |        |        |        |   
[0..5] [6..11] [12..17] [18..24] [25..30] [31..36] [37..42] [43..49] [50..55] [56..61] [62..67] [68..74] [75..80] [81..86] [87..92] [93..99]
两个流的拆分连接

IntStream.range(0, 100)
        .boxed().parallel().collect(parallelVisualize())
        .forEach(System.out::println);
IntStream
        .concat(IntStream.range(0, 10), IntStream.range(10, 100))
        .boxed().parallel().collect(parallelVisualize())
        .forEach(System.out::println);
Stream.concat(IntStream.range(0, 50).boxed().parallel(), IntStream.range(50, 100).boxed())
        .collect(parallelVisualize())
        .forEach(System.out::println);
Stream.of(0, 50)
        .flatMap(start -> IntStream.range(start, start+50).boxed().parallel())
        .parallel().collect(parallelVisualize())
        .forEach(System.out::println);
StreamSupport
        .stream(Spliterators.spliterator(IntStream.range(0, 7000)
                .iterator(), 7000, Spliterator.ORDERED), true)
        .collect(parallelVisualize()).forEach(System.out::println);
如您所见,first split取消连接流:

                                                                           [0..99]                                                                           
       _______________________________________________________________________/\_____                                                                        
      |                                                                              |                                                                       
   [0..9]                                                                        [10..99]                                                                    
    __/\__                                        ___________________________________/\__________________________________                                    
   |      |                                      |                                                                       |                                   
[0..4] [5..9]                                [10..54]                                                                [55..99]                                
                                _________________/\________________                                     _________________/\________________                  
                               |                                   |                                   |                                   |                 
                           [10..31]                            [32..54]                            [55..76]                            [77..99]              
                       ________/\_______                   ________/\_______                   ________/\_______                   ________/\_______         
                      |                 |                 |                 |                 |                 |                 |                 |        
                  [10..20]          [21..31]          [32..42]          [43..54]          [55..65]          [66..76]          [77..87]          [88..99]     
                   ___/\___          ___/\___          ___/\___          ___/\___          ___/\___          ___/\___          ___/\___          ___/\___    
                  |        |        |        |        |        |        |        |        |        |        |        |        |        |        |        |   
              [10..14] [15..20] [21..25] [26..31] [32..36] [37..42] [43..48] [49..54] [55..59] [60..65] [66..70] [71..76] [77..81] [82..87] [88..93] [94..99]
两个流连接的拆分,其中在连接之前执行了中间操作(boxed())

IntStream.range(0, 100)
        .boxed().parallel().collect(parallelVisualize())
        .forEach(System.out::println);
IntStream
        .concat(IntStream.range(0, 10), IntStream.range(10, 100))
        .boxed().parallel().collect(parallelVisualize())
        .forEach(System.out::println);
Stream.concat(IntStream.range(0, 50).boxed().parallel(), IntStream.range(50, 100).boxed())
        .collect(parallelVisualize())
        .forEach(System.out::println);
Stream.of(0, 50)
        .flatMap(start -> IntStream.range(start, start+50).boxed().parallel())
        .parallel().collect(parallelVisualize())
        .forEach(System.out::println);
StreamSupport
        .stream(Spliterators.spliterator(IntStream.range(0, 7000)
                .iterator(), 7000, Spliterator.ORDERED), true)
        .collect(parallelVisualize()).forEach(System.out::println);
如果其中一个输入流在连接之前未转换为并行模式,则它将拒绝拆分:

                                   [0..99]                                   
                                   ___/\_________________________________    
                                  |                                      |   
                              [0..49]                                [50..99]
                 _________________/\______________                           
                |                                 |                          
            [0..24]                           [25..49]                       
        ________/\_____                   ________/\_______                  
       |               |                 |                 |                 
   [0..11]         [12..24]          [25..36]          [37..49]              
    ___/\_          ___/\___          ___/\___          ___/\___             
   |      |        |        |        |        |        |        |            
[0..5] [6..11] [12..17] [18..24] [25..30] [31..36] [37..42] [43..49]         
平面映射的拆分

IntStream.range(0, 100)
        .boxed().parallel().collect(parallelVisualize())
        .forEach(System.out::println);
IntStream
        .concat(IntStream.range(0, 10), IntStream.range(10, 100))
        .boxed().parallel().collect(parallelVisualize())
        .forEach(System.out::println);
Stream.concat(IntStream.range(0, 50).boxed().parallel(), IntStream.range(50, 100).boxed())
        .collect(parallelVisualize())
        .forEach(System.out::println);
Stream.of(0, 50)
        .flatMap(start -> IntStream.range(start, start+50).boxed().parallel())
        .parallel().collect(parallelVisualize())
        .forEach(System.out::println);
StreamSupport
        .stream(Spliterators.spliterator(IntStream.range(0, 7000)
                .iterator(), 7000, Spliterator.ORDERED), true)
        .collect(parallelVisualize()).forEach(System.out::println);
平面贴图从不在嵌套流内并行:

    [0..99]     
    ____/\__    
   |        |   
[0..49] [50..99]
来自7000个元素的未知大小迭代器的流(请参阅上下文):

分裂真的很糟糕,大家都在等待最重要的部分[3072..6143]:

                       [0..6999]                        
     _______________________/\___                       
    |                            |                      
[0..1023]                  [1024..6999]                 
                 ________________/\____                 
                |                      |                
          [1024..3071]           [3072..6999]           
                              _________/\_____          
                             |                |         
                       [3072..6143]     [6144..6999]    
                                           ___/\____    
                                          |         |   
                                    [6144..6999] (empty)
已知大小的迭代器源

IntStream.range(0, 100)
        .boxed().parallel().collect(parallelVisualize())
        .forEach(System.out::println);
IntStream
        .concat(IntStream.range(0, 10), IntStream.range(10, 100))
        .boxed().parallel().collect(parallelVisualize())
        .forEach(System.out::println);
Stream.concat(IntStream.range(0, 50).boxed().parallel(), IntStream.range(50, 100).boxed())
        .collect(parallelVisualize())
        .forEach(System.out::println);
Stream.of(0, 50)
        .flatMap(start -> IntStream.range(start, start+50).boxed().parallel())
        .parallel().collect(parallelVisualize())
        .forEach(System.out::println);
StreamSupport
        .stream(Spliterators.spliterator(IntStream.range(0, 7000)
                .iterator(), 7000, Spliterator.ORDERED), true)
        .collect(parallelVisualize()).forEach(System.out::println);
提供大小可以更好地解锁进一步拆分:

                                                                                                    [0..6999]                                                                                                     
           ______________________________________________________________________________________________/\________                                                                                               
          |                                                                                                        |                                                                                              
     [0..1023]                                                                                               [1024..6999]                                                                                         
     _____/\__                                 ____________________________________________________________________/\________________________                                                                     
    |         |                               |                                                                                              |                                                                    
[0..511] [512..1023]                    [1024..3071]                                                                                   [3072..6999]                                                               
                                  ____________/\___________                                                                  ________________/\__________________________________________________                 
                                 |                         |                                                                |                                                                    |                
                           [1024..2047]              [2048..3071]                                                     [3072..6143]                                                         [6144..6999]           
                            _____/\_____              _____/\_____                                 _________________________/\________________________                                        ___/\___________    
                           |            |            |            |                               |                                                   |                                      |                |   
                     [1024..1535] [1536..2047] [2048..2559] [2560..3071]                    [3072..4607]                                        [4608..6143]                           [6144..6999]        (empty)
                                                                                      ____________/\___________                           ____________/\___________                     _____/\_____              
                                                                                     |                         |                         |                         |                   |            |             
                                                                               [3072..3839]              [3840..4607]              [4608..5375]              [5376..6143]        [6144..6571] [6572..6999]        
                                                                                _____/\_____              _____/\_____              _____/\_____              _____/\_____                                        
                                                                               |            |            |            |            |            |            |            |                                       
                                                                         [3072..3455] [3456..3839] [3840..4223] [4224..4607] [4608..4991] [4992..5375] [5376..5759] [5760..6143]                                  
这种收集器的进一步改进可以生成图形图像(如svg)、跟踪处理每个节点的线程、显示每个组的元素数等等。如果您愿意,可以使用它。

我想增加一个解决方案,以监控源端甚至中间操作的拆分(当前的流API实现施加了一些限制):

将打印

[0..99]
___________________________________/\________________________________                                    
|                                                                     |                                   
[0..49]                                                               [50..99]                                
_________________/\______________                                     _________________/\________________                  
|                                 |                                   |                                   |                 
[0..24]                           [25..49]                            [50..74]                            [75..99]              
________/\_____                   ________/\_______                   ________/\_______                   ________/\_______         
|               |                 |                 |                 |                 |                 |                 |        
[0..11]         [12..24]          [25..36]          [37..49]          [50..61]          [62..74]          [75..86]          [87..99]     
___/\_          ___/\___          ___/\___          ___/\___          ___/\___          ___/\___          ___/\___          ___/\___    
|      |        |        |        |        |        |        |        |        |        |        |        |        |        |        |   
[0..5] [6..11] [12..17] [18..24] [25..30] [31..36] [37..42] [43..49] [50..55] [56..61] [62..67] [68..74] [75..80] [81..86] [87..92] [93..99]
鉴于

try(Stream<String> s=proxy(IntStream.range(0, 100).parallel().filter(i -> i/20%2==0)
      .mapToObj(ix->"\""+ix+'"'))) {
    s.forEach(str->{});
}
正如我们在这里所看到的,我们正在监视
.filter(…).mapToObj(…)
的结果,但是块显然是由源确定的,可能会根据过滤器的条件在下游生成空块

请注意,我们可以将源监控与Tagir的收集器监控结合起来:

try(IntStream s=proxy(IntStream.range(0, 100))) {
    s.parallel().filter(i -> i/20%2==0)
     .boxed().collect(parallelVisualize())
     .forEach(System.out::println);
}
这将打印(请注意,
collect
输出首先打印):

[0..99]
________________________________/\_______________________________                                 
|                                                                 |                                
[0..49]                                                           [50..99]                             
________________/\______________                                   _______________/\_______________                
|                                |                                 |                                |               
[0..19]                          [40..49]                          [50..59]                         [80..99]            
________/\_____                  ________/\______                   _______/\_______                ________/\_____         
|               |                |                |                 |                |              |               |        
[0..11][12..19](空)[40..49][50..59](空)[80..86][87..99