How to sort quoted strings from a text file in Java

java performance sorting

I am trying to read a list of quoted strings such as

"GJKFMN","OUYTV","VFRN","APLUI","DCFUYT","DXSER","JHGF","PIUYT","XSQ"

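from a text file and sort the words alphabetically. I also want to score the words using, say,

A=1, B=2, ...

and summing the letter values of each word.

I tried the code below for sorting, but it does not sort the words for me: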

public static void main(String[] args){
    String filePath = null;
    if (args[0] == null || args[0].isEmpty()) {
        System.out.println("Please Enter the Names File Path Enclosed in Double Quotes");
    }
    else {
        filePath = args[0];
    }
    List<String> bufferList = loadDataUsingBufferReader(filePath);
    List<String> listWithoutQuotes = removeQuotes(bufferList);
    listWithoutQuotes.parallelStream().map(String::toUpperCase).sorted().forEach(System.out::println);
}
public static List<String> removeQuotes(List<String> listWithQoutes) {
    listWithQoutes = listWithQoutes.stream().map(s -> s.replaceAll("\"", "")).collect(Collectors.toList());
    return listWithQoutes;
}
public static List<String> loadDataUsingBufferReader(String filePath) {
    final Charset ENCODING = StandardCharsets.UTF_8;
    List<String> lines = new LinkedList<>();
    try {
        final BufferedReader in = new BufferedReader(
                new InputStreamReader(new FileInputStream(filePath), ENCODING));
        String line;
        while ((line = in.readLine()) != null) {
            lines.add(line);
        }
        in.close();
    } catch (final IOException e) {
        e.printStackTrace();
    }
    return lines;
}

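In the code, I read the file path from the command line. When I hard-code the input, it gets sorted, but when I read it from the file, it does not. Performance is a key factor, because the file may be large enough to contain millions of words.

Thanks in advance for your help...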

With the following test data, which you can simply copy and paste into a text file and use as a sample file,

"DSRD","KJHT","BFXXX","OUYTP"
"ABCD","XSHTKK","RTZI","HKLOPQ"
"BGTSZ","ASY","LOMCV","DESRAW"
"VMWEE","ERTZU","GSDFX","BHGFD"
"CD","FRTZU","JUHL","RETZ"

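something like the following should work. I hope the method names are self-explanatory and that it is clear what happens at each step. I included some println statements as a debugging aid. If the original file you use may be very large, you should remove them.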

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class Example {

    public static void main(String args[]) throws IOException {
        // args[0] is never null; a missing argument shows up as an empty args array.
        if (args.length == 0 || args[0].isEmpty()) {
            System.out.println("Please Enter the Names File Path Enclosed in Double Quotes");
            return; // nothing to read without a path
        }
        String filePath = args[0];

        List<String> allLines = readAllLinesFromFile(filePath);
        allLines.forEach(System.out::println);
        System.out.println("**********************");

        List<String> listWithoutQuotes = removeQuotes(allLines);
        listWithoutQuotes.forEach(System.out::println);
        System.out.println("*****************");

        List<String> allWords = getAllWordsFromEachLineSorted(listWithoutQuotes);
        System.out.println(allWords);
        System.out.println("****************");

        List<Integer> scores = calculateStoreForAList(allWords);
        System.out.println(scores);
    }
    static List<String> readAllLinesFromFile(String fileName) throws IOException{
        return Files.readAllLines(Paths.get(fileName));
    }
    public static List<String> removeQuotes(List<String> listWithQoutes) {
        return listWithQoutes.stream()
                .map(s -> s.replaceAll("\"", ""))
                .collect(Collectors.toList());
    }
    public static List<String> getAllWordsFromEachLineSorted(List<String> lines) {
        return lines.stream()
                .map(s -> s.split("\\s*,\\s*"))
                .flatMap(Arrays::stream)
                .sorted()
                .collect(Collectors.toList());
    }

    static int calculateScore(String word){
        return word.chars()
                .map(i -> i-64)
                .sum();
    }
    static List<Integer> calculateStoreForAList(List<String> allWords){
        return allWords.stream()
                .map(str -> calculateScore(str))
                .collect(Collectors.toList());
    }
}
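The code in the question sorts whole lines: each element of the list is an entire line of the file, so a file with a single line comes out unchanged. Splitting each line on commas before sorting is what fixes it. Because the question mentions files with millions of words, here is a minimal sketch of the same idea that reads the file lazily with `Files.lines` instead of loading every line first. The class and method names (`QuotedWordSorter`, `wordsOf`, `sortWords`, `sortWordsInFile`) are hypothetical, and the sketch assumes the words themselves contain no commas or quotes.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class for illustration; not part of the answer above.
class QuotedWordSorter {

    // Splits one line such as "DSRD","KJHT" into the bare words DSRD and KJHT.
    static Stream<String> wordsOf(String line) {
        return Arrays.stream(line.split(","))
                .map(token -> token.replace("\"", "").trim())
                .filter(word -> !word.isEmpty());
    }

    // Flattens all lines into words and returns them in natural (alphabetical) order.
    static List<String> sortWords(Stream<String> lines) {
        return lines.flatMap(QuotedWordSorter::wordsOf)
                .sorted()
                .collect(Collectors.toList());
    }

    // Streams the file line by line; the try-with-resources block closes the file handle.
    static List<String> sortWordsInFile(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return sortWords(lines);
        }
    }
}
```

Calling `QuotedWordSorter.sortWordsInFile(Paths.get(args[0]))` would replace the read, remove-quotes and split steps above. Sorting still needs all words in memory, so this only avoids holding the raw lines alongside them.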

After removing the double quotes from the text file, I would take the following steps.

Read the whole file as a single string:

Path path = FileSystems.getDefault().getPath(directory, filename);
String fileContent = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);

Split the content into words, since you have a standard delimiter, the comma:

String[] words = fileContent.split(",");

Then sort them using the built-in method of the Arrays class:

Arrays.sort(words);
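The three steps above can be combined into one method, sketched below. One caveat: splitting on commas alone works for a single-line file, but with several lines the last word of one line and the first word of the next end up in the same token. The sketch therefore also treats whitespace (including line breaks) as a delimiter, and it strips the quotes in code rather than assuming they were removed beforehand. `WholeFileSort` and `sortedWords` are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; it takes the file content as a string
// (for example the result of the Files.readAllBytes call shown above).
class WholeFileSort {

    static String[] sortedWords(String fileContent) {
        // Drop the quotes and any leading or trailing whitespace.
        String cleaned = fileContent.replace("\"", "").trim();
        if (cleaned.isEmpty()) {
            return new String[0];
        }
        // Commas and whitespace (spaces, \r, \n) all separate words.
        String[] words = cleaned.split("[,\\s]+");
        Arrays.sort(words);
        return words;
    }
}
```

For the two-line input `"DSRD","KJHT"` followed by `"ABCD","CD"`, this returns the array `ABCD, CD, DSRD, KJHT`.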

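To calculate the score of each word: the ASCII decimal value of the uppercase letter 'A' is 65, so if you subtract 64 from the ASCII decimal value of each letter, you get its score. For example: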

String abc = "ABC";
int sum = 0;

for (int i = 0; i < abc.length(); ++i){
    sum += (int) abc.charAt(i) - 64;
}

Here the value of sum is 6.
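The same loop can be wrapped in a reusable method. The version below is a sketch with two assumptions that go beyond the answer: lowercase letters are scored like their uppercase counterparts, and any character outside A to Z (such as a leftover quote) is ignored rather than counted. `WordScore` and `score` are hypothetical names.

```java
// Hypothetical helper for illustration.
class WordScore {

    // Sums the letter values of a word with A=1, B=2, ..., Z=26.
    static int score(String word) {
        int sum = 0;
        for (int i = 0; i < word.length(); i++) {
            char c = Character.toUpperCase(word.charAt(i));
            if (c >= 'A' && c <= 'Z') {   // skip quotes, digits, spaces, etc.
                sum += c - 'A' + 1;       // same as subtracting 64 from the ASCII value
            }
        }
        return sum;
    }
}
```

With this, `WordScore.score("ABC")` gives 6, matching the loop above, and `WordScore.score("XSQ")` gives 24 + 19 + 17 = 60.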
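I wonder whether your Java 8 streams sorting approach is any more efficient than a simple Collections.sort(lst, new SortIgnoreCase()); where the class "SortIgnoreCase" does a "toLowerCase().compareTo()".

It is not correct to talk about the "most efficient" algorithm if the algorithm you implemented does not work at all. I have adjusted the title so that it reflects more accurately what you are actually asking.

Solution: I suggest you do not use streams until you have gained more programming experience. Why? Because debugging streams can be very hard compared with plain loops. If you rewrite the code using plain loops and look at the values of the objects in a debugger, you will find the cause of the problem.

@paulsm4 I see your point, but the file is large. Is it wrong to say that the stream API will process large files faster once parallelism is applied?

Q: Is it wrong to say that the stream API will process large files faster once parallelism is applied? A: Yes, that is a wrong statement. Java 8 streams are nice and powerful, and they have many important uses. But making file loading "faster" just because of the word "parallel"? No! It is like the "classic" Java claim: "the more threads, the faster the program runs." That is simply not true. It also conflates "speed" with "responsiveness" (although your example probably improves neither).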
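The `SortIgnoreCase` class mentioned in the first comment is not shown anywhere in the thread. The sketch below is a guess at what such a comparator might look like, based only on the description "toLowerCase().compareTo()", next to the JDK's built-in `String.CASE_INSENSITIVE_ORDER`. The wrapper class `CaseInsensitiveSortDemo` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class CaseInsensitiveSortDemo {

    // Hypothetical reconstruction of the comparator described in the comment.
    static class SortIgnoreCase implements Comparator<String> {
        @Override
        public int compare(String a, String b) {
            return a.toLowerCase().compareTo(b.toLowerCase());
        }
    }

    // Sorts a copy of the list with the hand-written comparator.
    static List<String> sortWithCustomComparator(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy, new SortIgnoreCase());
        return copy;
    }

    // Sorts a copy of the list with the comparator that ships with the JDK.
    static List<String> sortWithBuiltIn(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy, String.CASE_INSENSITIVE_ORDER);
        return copy;
    }
}
```

Both methods order plain ASCII words the same way. The built-in comparator compares character by character and so avoids creating lower-cased copies of both strings on every comparison; whether that matters for a given file is something to measure rather than assume.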