Java BufferedReader - removing punctuation


I need help with a reader that removes punctuation and digits and builds a String array from the input.

For example, the input is a file "example.txt" with contents like this:

Hello 123 , I'am new example ... text file!"
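I need my reader to produce an array with the following contents:

String[] example = {"Hello", "I", "am", "new", "example", "text", "file"};
Is there a way to remove punctuation and digits and create a String array using a BufferedReader?

Thanks in advance, everyone. Fipkus.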

Use String.split(regex), and put the characters that must be removed into the regex string:

String regex = ",0123456789\\"
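The regex string in the answer above looks cut off, so here is a minimal sketch of the same idea under one assumption: instead of listing every unwanted character, the pattern `[^\\p{L}]+` treats any run of non-letters (digits, punctuation, whitespace) as a separator. The class name `SplitSketch` and the method `words` are hypothetical names used only for illustration.

```java
import java.util.Arrays;

class SplitSketch {
    // Splits on runs of non-letter characters, so digits, punctuation
    // and whitespace all act as separators. \p{L} matches any Unicode
    // letter, which covers accented Czech letters as well.
    static String[] words(String line) {
        // Drop leading separators first; otherwise split() would return
        // an empty string as the first element.
        String trimmed = line.replaceFirst("^[^\\p{L}]+", "");
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        // Trailing empty strings are discarded by split() itself.
        return trimmed.split("[^\\p{L}]+");
    }

    public static void main(String[] args) {
        String test = "Hello 123 , I'am new example ... text file!";
        // prints [Hello, I, am, new, example, text, file]
        System.out.println(Arrays.toString(words(test)));
    }
}
```

Because the apostrophe is not a letter, "I'am" is split into "I" and "am", which matches the array asked for in the question.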

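Another way is to use StringTokenizer. It is a bit restrictive, but I prefer it because you only list the delimiter characters instead of writing a regex, which is easier to read.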

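import java.util.ArrayList;
import java.util.StringTokenizer;

String test = "Hello 123 , I'am new example ... text file!";
ArrayList<String> exampleTemp = new ArrayList<>();

// Every character listed here is a delimiter. The apostrophe is included
// so that "I'am" yields the two tokens "I" and "am".
StringTokenizer st = new StringTokenizer(test, " ,.'1234567890!");
while (st.hasMoreTokens())
{
    exampleTemp.add(st.nextToken());
}
// Size the array from the list instead of hard-coding the token count.
String[] example = exampleTemp.toArray(new String[0]);

for (String word : example)
{
    System.out.println(word);
}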

Edit: I modified it to fill a String array. I am not sure whether there is still a whitespace problem.

In the end, I solved it like this:

// Example input line: Hello 123 , I'am new example ... text file!
char[] alphabet = {'a','á','b','c','č','d','ď','e','é','ě','f','g','h',
        'i','í','j','k','l','m','n','ň','o','ó','p','q','r','ř','s','š','t','ť',
        'u','ú','ů','v','w','x','y','ý','z','ž','A','Á','B','C','Č','D','Ď','E','É',
        'Ě','F','G','H','I','Í','J','K','L','M','N','Ň','O','Ó','P','Q','R','Ř','S','Š','T',
        'Ť','U','Ú','Ů','V','W','X','Y','Ý','Z','Ž',' '};

String vlozena = userInputScanner.nextLine();
String fileContentsSingle = "";
int length = vlozena.length();

/*
 * Checks whether the character is a space or a letter of the Czech alphabet
 * and, if it is, appends it to the sentence.
 */
for (int j = 0; j < length; j++) {
    char cha = vlozena.charAt(j);
    for (char z : alphabet) {
        if (cha == z) {
            fileContentsSingle = fileContentsSingle + cha;
        }
    }
}

// Collapse runs of whitespace and trim the ends so that split() does not
// return empty strings at the start or end of the array.
fileContentsSingle = fileContentsSingle.replaceAll("\\s+", " ").trim();
fileContentsSingle = fileContentsSingle.toLowerCase();
String[] vetaNaArraySingle = fileContentsSingle.split("\\s+", -1);
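Since the question asks about a BufferedReader specifically, here is a rough sketch of how the same filtering might look when reading line by line. It rests on two assumptions that are not in the code above: the hand-written alphabet array is replaced by the regex class `\\p{L}` (any Unicode letter), and unwanted characters are replaced by a space instead of being deleted. `ReaderSketch` and `readWords` are hypothetical names, and a StringReader stands in for the real file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;

class ReaderSketch {
    // Reads all lines, blanks out everything that is neither a letter nor
    // whitespace, and collects the remaining words into one array.
    static String[] readWords(BufferedReader reader) {
        return reader.lines()
                .map(line -> line.replaceAll("[^\\p{L}\\s]", " ").trim())
                .filter(line -> !line.isEmpty())
                .flatMap(line -> Arrays.stream(line.split("\\s+")))
                .toArray(String[]::new);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: for a real file, the reader could instead come from
        // Files.newBufferedReader(Paths.get("example.txt"), StandardCharsets.UTF_8).
        String content = "Hello 123 , I'am new example ... text file!";
        try (BufferedReader reader = new BufferedReader(new StringReader(content))) {
            // prints Hello,I,am,new,example,text,file
            System.out.println(String.join(",", readWords(reader)));
        }
    }
}
```

Replacing with a space rather than deleting is what turns "I'am" into the two words "I" and "am"; deleting the apostrophe, as the character loop above does, gives "Iam". Call toLowerCase() on each word if lowercase output is wanted.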
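Yes, there are several ways to remove punctuation and digits and create a String array using a BufferedReader.

Hello, this creates unwanted whitespace in the string, and then I cannot create a String array from it (by splitting on spaces).

I modified it to return an array.

Please provide more details so that we can help you... post some code :)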