How to remove all line breaks and paragraph breaks from a given text file using Java?

Tags: java, regex, file

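I have a huge text file. I want to remove all the line breaks, and I also want the paragraph breaks removed so that each paragraph is appended to the previous one. How can I do this in Java? I have already tried replaceAll(), but I am stuck on appending each paragraph to the previous one.

Expected output: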

The Project Gutenberg EBook of The Complete Works of William Shakespeare, by
William Shakespeare sn This eBook is for the use of anyone anywhere at no cost and with almost no restrictions whatsoever.  You may copy it, give it away or re-use it under the terms of the Project Gutenberg License included
with this eBook or online at www.gutenberg.org ** This is a COPYRIGHTED Project Gutenberg eBook, Details Below Please follow the copyright guidelines in this file.Title: The Complete Works of William Shakespeare Author: William Shakespeare Posting Date: September 1, 2011 [EBook #100]
Release Date: January, 1994 Language: English START OF THIS PROJECT GUTENBERG EBOOK COMPLETE WORKS--WILLIAM SHAKESPEARE Produced by World Library, Inc., from their Library of the Future This is the 100th Etext file presented by Project Gutenberg, and is presented in cooperation with World Library, Inc., from their Library of the Future and Shakespeare CDROMS.  Project Gutenberg often releases Etexts that are NOT placed in the Public Domain!! Shakespeare *This Etext has certain copyright implications you should read!*

If you only need the words, you can search for them with \w and join the matches:

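import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        final String input = "hello, how are you today how was school today, what did you have for food? this star needs to be removed ****";
        // \w+ matches each run of letters, digits and underscores; punctuation and whitespace are skipped.
        final String regex = "\\w+";
        final Matcher m = Pattern.compile(regex).matcher(input);

        // Append every match followed by a space (StringBuilder avoids repeated String concatenation).
        final StringBuilder output = new StringBuilder();
        while (m.find()) {
            output.append(m.group(0)).append(' ');
        }
        System.out.println(output);
    }
}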
Result:

hello how are you today how was school today what did you have for food this star needs to be removed 
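
The same idea can be written without the trailing space. The following is only a sketch, assuming Java 9 or later (for Matcher.results()); the class name WordJoiner and the method joinWords are hypothetical names chosen for this illustration:

```java
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class WordJoiner {
    // One or more word characters: letters, digits or underscore.
    private static final Pattern WORD = Pattern.compile("\\w+");

    // Extracts every \w+ run and joins the runs with single spaces.
    // Line breaks and punctuation are never matched, so they disappear.
    static String joinWords(String input) {
        return WORD.matcher(input)
                .results()                      // Stream<MatchResult>, available since Java 9
                .map(MatchResult::group)
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // prints: hello how are you today fine
        System.out.println(joinWords("hello, how are you today?\nfine ****"));
    }
}
```

Note that by default \w only covers ASCII letters, digits and the underscore, so a word such as "don't" is split into "don" and "t", and accented letters are dropped unless the pattern is compiled with Pattern.UNICODE_CHARACTER_CLASS.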

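String text = value.toString()
    .replaceAll("(\r?\n){2}", "§") // Two line breaks will become a real line break.
    .replaceAll("[\t\r\n]+", " ") // White space will become a real space.
    .replace("§", "\n"); // The real line breaks.

For actual tabs and line breaks, use the string literal escapes. Don't forget the carriage return (on Windows).

You could use some esoteric character such as \uFEFF instead of §.

This turns

Good Morning,

How are you?
I am fine.

into

Good Morning,
How are you? I am fine.

Post an example of the input and the expected output. Also, don't post text or code as images or links. Use the edit option to correct your post.

@Pshemo I need to remove all the line breaks and punctuation and append each paragraph to the previous one, so that the result is a single paragraph.

That is still not very clear. You say "all the line breaks", but that would leave us with a single line, which is not the case here, since your expected output has four lines. How did you decide which line separators should be kept? You also wrote that all punctuation should be removed, but punctuation is still visible before [linebreak]William Shakespeare, not to mention in "Release Date: January, 1994".

Sorry for not providing the information earlier. Suppose the input is: hello, how are you today how was school today, what did you have for food? this star needs to be removed **** The output I need is: hello how are you today how was school today what did you have for food this star needs to be removed. Please help, @Pshemo.

replaceAll("\\s*\\R\\s*", " ")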
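
The placeholder replacement and the single \R replacement shown above can be put side by side in a small runnable sketch. This is an illustration under stated assumptions, not the posters' exact code: the class and method names are hypothetical, the placeholder character is assumed not to occur in the input, the first pattern uses {2,} instead of {2} so that several blank lines count as one paragraph break, and \R requires Java 8 or later.

```java
class LineBreakRemover {
    // Placeholder for paragraph breaks; assumed (not guaranteed) to be absent from the input.
    private static final String MARK = "\uFEFF";

    // Joins the lines inside each paragraph but keeps one line break between paragraphs.
    static String joinWithinParagraphs(String text) {
        return text
                .replaceAll("(\r?\n){2,}", MARK) // two or more consecutive line breaks mark a paragraph break
                .replaceAll("[\t\r\n]+", " ")    // remaining tabs and line breaks become one space
                .replace(MARK, "\n");            // restore the paragraph breaks
    }

    // Collapses every line break, paragraph breaks included, into a single space.
    static String joinEverything(String text) {
        return text.replaceAll("\\s*\\R\\s*", " ").trim();
    }

    public static void main(String[] args) {
        String sample = "Good Morning,\r\n\r\nHow are you?\r\nI am fine.";
        System.out.println(joinWithinParagraphs(sample));
        System.out.println(joinEverything(sample));
    }
}
```

To apply either method to a file, the content could be loaded with Files.readString(path) (Java 11+) or new String(Files.readAllBytes(path), charset). For a truly huge file, reading everything into memory may not be practical, and processing it line by line would be the safer choice.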