Why is Java's DirectoryStream so slow?


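I have run some tests with streams, in particular with the DirectoryStreams of the nio package. I am simply trying to get a list of all files in a directory, sorted by last-modified date and size.

The JavaDoc of the old File.listFiles() contains a note about the method in Files:

Note that the Files class defines the newDirectoryStream method to open a directory and iterate over the names of the files in the directory. This may use less resources when working with very large directories.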
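For reference, the method that note refers to can be used roughly as follows. This is only a minimal sketch: the class name DirectoryStreamDemo, the helper listNames and the default directory "." are assumptions made for illustration and are not part of the code under test.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class DirectoryStreamDemo {

    // Collects the names of all entries in one directory by iterating lazily over it.
    static List<String> listNames(Path dir) throws IOException {
        List<String> names = new ArrayList<>();
        // try-with-resources closes the underlying directory handle
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                names.add(entry.getFileName().toString());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default: the current working directory
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        listNames(dir).forEach(System.out::println);
    }
}
```

The entries come back in no particular order, so any sorting still has to happen afterwards.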

I ran the code below many times (the first three runs are shown below):

First run:

Run time of Arrays.sort: 1516
Run time of Stream.sorted as Array: 2912
Run time of Stream.sorted as List: 2875
Second run:

Run time of Arrays.sort: 1557
Run time of Stream.sorted as Array: 2978
Run time of Stream.sorted as List: 2937
Third run:

Run time of Arrays.sort: 1563
Run time of Stream.sorted as Array: 2919
Run time of Stream.sorted as List: 2896
My question is: why do the streams perform so badly?

import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.FileTime;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FileSorter {

  // This sorts from old to young and from big to small
  Comparator<Path> timeSizeComparator = (Path o1, Path o2) -> {
    int sorter = 0;
    try {
      FileTime lm1 = Files.getLastModifiedTime(o1);
      FileTime lm2 = Files.getLastModifiedTime(o2);
      if (lm2.compareTo(lm1) == 0) {
        Long s1 = Files.size(o1);
        Long s2 = Files.size(o2);
        sorter = s2.compareTo(s1);
      } else {
        sorter = lm1.compareTo(lm2);
      }
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
    return sorter;
  };

  public String[] getSortedFileListAsArray(Path dir) throws IOException {
    Stream<Path> stream = Files.list(dir);
    return stream.sorted(timeSizeComparator).
            map(Path::getFileName).map(Path::toString).toArray(String[]::new);
  }

  public List<String> getSortedFileListAsList(Path dir) throws IOException {
    Stream<Path> stream = Files.list(dir);
    return stream.sorted(timeSizeComparator).
            map(Path::getFileName).map(Path::toString).collect(Collectors.toList());
  }

  public String[] sortByDateAndSize(File[] fileList) {
    Arrays.sort(fileList, (File o1, File o2) -> {
      int r = Long.compare(o1.lastModified(), o2.lastModified());
      if (r != 0) {
        return r;
      }
      return Long.compare(o1.length(), o2.length());
    });
    String[] fileNames = new String[fileList.length];
    for (int i = 0; i < fileNames.length; i++) {
      fileNames[i] = fileList[i].getName();
    }
    return fileNames;
  }

  public static void main(String[] args) throws IOException {
    // File (io package)
    File f = new File("C:\\Windows\\system32");
    // Path (nio package)
    Path dir = Paths.get("C:\\Windows\\system32");

    FileSorter fs = new FileSorter();

    long before = System.currentTimeMillis();
    String[] names = fs.sortByDateAndSize(f.listFiles());
    long after = System.currentTimeMillis();
    System.out.println("Run time of Arrays.sort: " + ((after - before)));

    long before2 = System.currentTimeMillis();
    String[] names2 = fs.getSortedFileListAsArray(dir);
    long after2 = System.currentTimeMillis();
    System.out.println("Run time of Stream.sorted as Array: " + ((after2 - before2)));

    long before3 = System.currentTimeMillis();
    List<String> names3 = fs.getSortedFileListAsList(dir);
    long after3 = System.currentTimeMillis();
    System.out.println("Run time of Stream.sorted as List: " + ((after3 - before3)));
  }
}
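Update 2: After some research on Peter's solution, I can say that reading file attributes with, for example, Files.getLastModifiedTime must be an expensive operation. Changing only that part of the comparator to: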

  Comparator<Path> timeSizeComparator = (Path o1, Path o2) -> {
    File f1 = o1.toFile();
    File f2 = o2.toFile();
    long lm1 = f1.lastModified();
    long lm2 = f2.lastModified();
    int cmp = Long.compare(lm2, lm1);
    if (cmp == 0) {
      cmp = Long.compare(f2.length(), f1.length());
    }
    return cmp;
  };
But as you can see, caching the objects is the best approach. And as jtahlborn mentioned, it gives a kind of stable sort.

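Update 3 (the best solution I have found): After a bit more research, I saw that Files.getLastModifiedTime and Files.size both do a lot of work on the same thing: the attributes. So I made three versions of the PathInfo class to test: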

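  • Peter's version as described below
  • An old-style File version, where I call Path.toFile() once in the constructor and get all values from that File with f.lastModified() and f.length()
  • A version of Peter's solution, but now I read an attributes object once with Files.readAttributes(path, BasicFileAttributes.class) and take the values from it

Putting all of these in a loop and running each one 100 times, I came up with these results: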

    After doing all hundred times
    Mean performance of Peters solution: 432.26
    Mean performance of old File solution: 343.11
    Mean performance of read attribute object once solution: 255.66
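
The loop used to produce these means is not shown. A minimal sketch of how such a mean could be measured is given here; the names MeanTiming and meanMillis, the warm-up of 10 runs and the placeholder workload are all assumptions for illustration, not the code that produced the figures above.

```java
class MeanTiming {

    // Runs the task `runs` times and returns the mean wall-clock time in milliseconds.
    static double meanMillis(int runs, Runnable task) {
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / runs;
    }

    public static void main(String[] args) {
        // Placeholder workload (hypothetical); replace it with the directory sort under test.
        Runnable work = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            // Use the result so the loop is not trivially dead code.
            if (sum < 0) {
                throw new IllegalStateException();
            }
        };
        meanMillis(10, work); // warm-up runs, result ignored
        System.out.printf("Mean: %.2f ms%n", meanMillis(100, work));
    }
}
```

For file-system benchmarks the operating system's cache usually has a large effect as well, so the first pass over a directory tends to be slower than later ones.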
    
The code in the PathInfo constructor of the best solution:

    public PathInfo(Path path) {
      try {
        // read the whole attributes once
        BasicFileAttributes bfa = Files.readAttributes(path, BasicFileAttributes.class);
        fileName = path.getFileName().toString();
        modified = bfa.lastModifiedTime().toMillis();
        size = bfa.size();
      } catch (IOException ex) {
        throw new UncheckedIOException(ex);
      }
    }
    
My conclusion: never read the attributes twice, and caching them in an object boosts performance.
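
Put together, the read-once approach might look like the following self-contained sketch. The names CachedAttributeSort, Entry, BY_TIME_THEN_SIZE and sortedNames are made up for this illustration, and the ordering (oldest first, then smallest first) follows PathInfo.compareTo from the answer rather than the original comparator.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CachedAttributeSort {

    // Snapshot of the values needed for sorting, read from the file system once per file.
    static final class Entry {
        final String fileName;
        final long modified;
        final long size;

        Entry(Path path) {
            try {
                BasicFileAttributes bfa = Files.readAttributes(path, BasicFileAttributes.class);
                fileName = path.getFileName().toString();
                modified = bfa.lastModifiedTime().toMillis();
                size = bfa.size();
            } catch (IOException ex) {
                throw new UncheckedIOException(ex);
            }
        }
    }

    // Oldest first; ties are broken by size, smallest first.
    static final Comparator<Entry> BY_TIME_THEN_SIZE =
            Comparator.comparingLong((Entry e) -> e.modified).thenComparingLong(e -> e.size);

    static List<String> sortedNames(Path dir) throws IOException {
        // Files.list keeps a directory handle open until the stream is closed.
        try (Stream<Path> paths = Files.list(dir)) {
            return paths.map(Entry::new)
                    .sorted(BY_TIME_THEN_SIZE)
                    .map(e -> e.fileName)
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default: the current working directory
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        sortedNames(dir).forEach(System.out::println);
    }
}
```

Because the comparator only touches the cached long fields, the sort itself no longer performs any I/O.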

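Files.list() is an O(N) operation, whereas sorting is O(N log N). It is far more likely that the operations inside the sort are what matter. Given that the comparators are not the same, this is the most likely explanation. There are many files with the same modification date under C:/Windows/System32, which means the file size has to be checked quite often.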
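
To get a feel for how often a comparator runs during a sort, the calls can simply be counted. The sketch below sorts shuffled integers instead of files; the class name ComparisonCount, the seed and the directory size of 4000 are arbitrary assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;

class ComparisonCount {

    // Sorts n shuffled integers and returns how often the comparator was invoked.
    static long comparisonsForSort(int n, long seed) {
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            values.add(i);
        }
        Collections.shuffle(values, new Random(seed));

        AtomicLong calls = new AtomicLong();
        values.sort((a, b) -> {
            calls.incrementAndGet(); // stands in for the file-system lookups
            return Integer.compare(a, b);
        });
        return calls.get();
    }

    public static void main(String[] args) {
        int n = 4000; // hypothetical number of files in the directory
        long calls = comparisonsForSort(n, 42L);
        System.out.println("elements: " + n + ", comparator calls: " + calls);
        // The Path-based comparator reads two modification times per call,
        // plus two sizes whenever the times are equal.
        System.out.println("attribute reads at 2 per call: " + 2 * calls);
    }
}
```

Reading the attributes up front costs a fixed number of reads per file, while reading them inside the comparator multiplies the cost by the number of comparisons.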

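To show that most of the time is not spent in the Files.list(dir) stream, I optimised the comparison so that the data about a file is fetched only once per file.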

    import java.io.File;
    import java.io.IOException;
    import java.io.UncheckedIOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.attribute.FileTime;
    import java.util.Arrays;
    import java.util.Comparator;
    import java.util.List;
    import java.util.stream.Collectors;
    import java.util.stream.Stream;
    
    public class FileSorter {
    
        // This sorts from old to young and from big to small
        Comparator<Path> timeSizeComparator = (Path o1, Path o2) -> {
            int sorter = 0;
            try {
                FileTime lm1 = Files.getLastModifiedTime(o1);
                FileTime lm2 = Files.getLastModifiedTime(o2);
                if (lm2.compareTo(lm1) == 0) {
                    Long s1 = Files.size(o1);
                    Long s2 = Files.size(o2);
                    sorter = s2.compareTo(s1);
                } else {
                    sorter = lm1.compareTo(lm2);
                }
            } catch (IOException ex) {
                throw new UncheckedIOException(ex);
            }
            return sorter;
        };
    
        public String[] getSortedFileListAsArray(Path dir) throws IOException {
            Stream<Path> stream = Files.list(dir);
            return stream.sorted(timeSizeComparator).
                    map(Path::getFileName).map(Path::toString).toArray(String[]::new);
        }
    
        public List<String> getSortedFileListAsList(Path dir) throws IOException {
            Stream<Path> stream = Files.list(dir);
            return stream.sorted(timeSizeComparator).
                    map(Path::getFileName).map(Path::toString).collect(Collectors.
                    toList());
        }
    
        public String[] sortByDateAndSize(File[] fileList) {
            Arrays.sort(fileList, (File o1, File o2) -> {
                int r = Long.compare(o1.lastModified(), o2.lastModified());
                if (r != 0) {
                    return r;
                }
                return Long.compare(o1.length(), o2.length());
            });
            String[] fileNames = new String[fileList.length];
            for (int i = 0; i < fileNames.length; i++) {
                fileNames[i] = fileList[i].getName();
            }
            return fileNames;
        }
    
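        public List<String> getSortedFile(Path dir) throws IOException {
            // Files.list keeps a directory handle open, so close the stream when done
            try (Stream<Path> paths = Files.list(dir)) {
                return paths.map(PathInfo::new).sorted().map(PathInfo::getFileName).collect(Collectors.toList());
            }
        }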
    
        static class PathInfo implements Comparable<PathInfo> {
            private final String fileName;
            private final long modified;
            private final long size;
    
            public PathInfo(Path path) {
                try {
                    fileName = path.getFileName().toString();
                    modified = Files.getLastModifiedTime(path).toMillis();
                    size = Files.size(path);
                } catch (IOException ex) {
                    throw new UncheckedIOException(ex);
                }
            }
    
            @Override
            public int compareTo(PathInfo o) {
                int cmp = Long.compare(modified, o.modified);
                if (cmp == 0)
                    cmp = Long.compare(size, o.size);
                return cmp;
            }
    
            public String getFileName() {
                return fileName;
            }
        }
    
        public static void main(String[] args) throws IOException {
            // File (io package)
            File f = new File("C:\\Windows\\system32");
            // Path (nio package)
            Path dir = Paths.get("C:\\Windows\\system32");
    
            FileSorter fs = new FileSorter();
    
            long before = System.currentTimeMillis();
            String[] names = fs.sortByDateAndSize(f.listFiles());
            long after = System.currentTimeMillis();
            System.out.println("Run time of Arrays.sort: " + ((after - before)));
    
            long before2 = System.currentTimeMillis();
            String[] names2 = fs.getSortedFileListAsArray(dir);
            long after2 = System.currentTimeMillis();
            System.out.println("Run time of Stream.sorted as Array: " + ((after2 - before2)));
    
            long before3 = System.currentTimeMillis();
            List<String> names3 = fs.getSortedFileListAsList(dir);
            long after3 = System.currentTimeMillis();
            System.out.println("Run time of Stream.sorted as List: " + ((after3 - before3)));
            long before4 = System.currentTimeMillis();
            List<String> names4 = fs.getSortedFile(dir);
            long after4 = System.currentTimeMillis();
            System.out.println("Run time of Stream.sorted as List with caching: " + ((after4 - before4)));
        }
    }
    

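About 85% of the time is spent repeatedly fetching the modification date and size of the files: the List variant takes 1228 ms without caching and 185 ms with caching.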

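  • You run the lm2/lm1 compareTo twice in the Path-based comparator. Not the end of the world, but it may affect the timings. Also, why use Long for the file sizes (instead of long)?
  • @jtahlborn You are right, that may be part of the problem.
  • @jtahlborn I have updated my answer/solution, so this may be the "fastest" solution to my problem without parallel streams?!
  • Not to mention that this gives a stable sort if files are touched while sorting!
  • Nice explanation! Thank you, I thought it was my Comparator.
  • @jtahlborn As the Javadoc explained.
  • @NwDx Streams give a nice way to generate cached objects to be sorted. This reduces repeated work.
  • @PeterLawrey I have updated my answer/solution, so this may be the "fastest" solution to my problem without parallel streams?!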
Output of the answer's program (times in milliseconds):

    Run time of Arrays.sort: 1980
    Run time of Stream.sorted as Array: 1295
    Run time of Stream.sorted as List: 1228
    Run time of Stream.sorted as List with caching: 185