Java: how do I clean \n, \t, etc. out of a string?

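This is my sample string:

"Hello\n I am\t \n \n Marco\t\n"


I want to remove all of these whitespace characters. Is there a general solution that works for more than just \n and \t?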

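This replaces each run of non-word characters with a single space. You don't have to know which characters you don't want; you just say which ones you do want: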

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Test {
    public static void main(String[]args) {

        String data = "Hello\n I am\t \n \n Marco\t\n";

        data = data.replaceAll("[^\\w]+", " ");

        System.out.println(data);
    }
}
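Result:

Hello I am Marco

The regular expression "[^\\w]+" matches runs of characters that are not word characters. Word characters are A-Z, a-z, 0-9 and "_". The call to replaceAll replaces each of these runs with a single space character, so the trailing \t\n also becomes one space at the end of the result.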

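If this isn't what you want, you can get other behaviour by adjusting the regular expression and the replacement string. For example, you can preserve spaces by writing the expression as "[^\\w ]+" and changing the replacement string to "", but you will then get multiple spaces between some words.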


Any other character you add inside the brackets of the "[^\\w]+" expression joins the list of characters that are not removed.
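As a rough illustration of the adjustment described above, the sketch below compares the original pattern with a variant that also keeps '.' and ','. The class name KeepListDemo, the helper clean and the sample string are made up for this example and are not part of the answer.

```java
class KeepListDemo {
    // Replaces each run of characters outside the keep-list with one space.
    // Assumption: keepList is already valid inside a [...] character class.
    static String clean(String input, String keepList) {
        return input.replaceAll("[^" + keepList + "]+", " ");
    }

    public static void main(String[] args) {
        String data = "Hello,\n I am\t Marco.\n";

        // Word characters only: punctuation is replaced along with whitespace.
        System.out.println("[" + clean(data, "\\w") + "]");    // [Hello I am Marco ]

        // Also keep '.' and ',': only the whitespace runs are replaced.
        System.out.println("[" + clean(data, "\\w.,") + "]");  // [Hello, I am Marco. ]
    }
}
```

Note that \w is ASCII-only by default, so accented or non-Latin letters are treated as non-word characters unless they are added to the keep-list.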

Simply replace all whitespace (i.e. \s+) with an empty string.

Output:

HelloIamMarco
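
A minimal sketch of that replacement, reusing the sample string from the question. The class and method names are placeholders rather than anything taken from the answer, and collapse is an extra variant added for comparison.

```java
class RemoveWhitespace {
    // Deletes every run of regex whitespace (\s: space, \t, \n, \u000B, \f, \r).
    static String removeAll(String input) {
        return input.replaceAll("\\s+", "");
    }

    // Variant: collapse each run to one space and trim the ends instead.
    static String collapse(String input) {
        return input.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String data = "Hello\n I am\t \n \n Marco\t\n";
        System.out.println(removeAll(data)); // HelloIamMarco
        System.out.println(collapse(data));  // Hello I am Marco
    }
}
```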

You can also use Java streams, which I find more readable:

String noWhitespace = "Hello\n I am\t \n \n Marco\t\n".chars()
    .filter(c -> !Character.isWhitespace(c))
    .collect(StringBuilder::new, StringBuilder::appendCodePoint, StringBuilder::append)
    .toString();

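I haven't had much luck handling whitespace with regular expressions in Java (I don't agree with Java's definition of whitespace, and it gets strange once you start dealing with Unicode characters). For fine-grained control, I use the following method: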

public static String strip(final String text)
{
    if ((text == null) || (text.length() == 0))
    {
        return text; // nothing to do
    }

    final StringBuilder str = new StringBuilder();

    for (char c : text.toCharArray())
    {
        switch (c)
        {
            // https://stackoverflow.com/a/4731164/2074605
            case ' ':  // '\u0020' SPACE
            case '\t': // '\u0009' CHARACTER TABULATION
            case '\n':
            case '\r':
            case '\f': // '\u000c'
            case '\u00a0': // NO-BREAK SPACE
            case '\u2002': // EN SPACE
            case '\u2003': // EM SPACE
            case '\u2009': // THIN SPACE
            case '\u200a': // HAIR SPACE
            case '\u000b': // vertical tab
            {
                break;
            }
            default:
            {
                str.append(c);
                break;
            }
        }
    }

    return str.toString();
}
This approach also makes it easy to build other core string utilities (trim, normalize, etc.).

For example:

/**
 * Normalizes text. This replaces multiple white spaces with a single character.
 * This preserves the first whitespace character but ignores following whitespace until a non-whitespace character is encountered.
 *
 * @param text The text to normalize.
 * @return The normalized text.
 */
public static String normalize(final String text)
{
    if (text == null)
    {
        return null;
    }

    final StringBuilder strbuf = new StringBuilder();

    boolean previousSpace = false;
    for (char c : text.toCharArray())
    {
        switch (c)
        {
            // https://stackoverflow.com/a/4731164/2074605
            case ' ':  // '\u0020' SPACE
            case '\t': // '\u0009' CHARACTER TABULATION
            case '\n':
            case '\r':
            case '\f': // '\u000c'
            case '\u00a0': // NO-BREAK SPACE
            case '\u2002': // EN SPACE
            case '\u2003': // EM SPACE
            case '\u2009': // THIN SPACE
            case '\u200a': // HAIR SPACE
            case '\u000b': // vertical tab
            {
                if (!previousSpace)
                {
                    strbuf.append(c);
                }
                previousSpace = true;
                break;
            }
            default:
            {
                strbuf.append(c);
                previousSpace = false;
                break;
            }
        }
    }

    return strbuf.toString();
}
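The trim and whitespaceToSingleSpace methods further down this page are built the same way as strip and normalize.
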
I have this in my toolbox class:

/**
     * This method formats a String. <br>
     * <br>
     * It places the first non-white space character at the left, and removes all extra spaces. <br>
     * So "&nbsp;a&nbsp;bc&nbsp;&nbsp;&nbsp;cd" will be returned as "a&nbsp;bc&nbsp;cd"
     * @param format
     */
    public static String stringLeftJustify( String theValue, JustifyFormat format )
    {
        char charArray[];

        try
        {
            charArray = theValue.toCharArray();
        }
        catch (NullPointerException e)
        {
            return "";
        }

        StringBuilder out = new StringBuilder( charArray.length + 1 );

        // remove any leading whitespace
        boolean isSpace = true;

        for (int c = 0; c < charArray.length; c++)
        {
            if (format == JustifyFormat.MULTI_LINE)
            {
                // leave CRLF for multi-line inputs
                if (!(charArray[c] == '\n' || charArray[c] == '\r') && Character.isWhitespace( charArray[c] ))
                {
                    if (!isSpace)
                        out.append( ' ' );

                    isSpace = true;
                }
                else
                {
                    out.append( charArray[c] );
                    isSpace = false;
                }
            }
            else
            {
                if (Character.isWhitespace( charArray[c] ))
                {
                    if (!isSpace)
                        out.append( ' ' );

                    isSpace = true;
                }
                else
                {
                    out.append( charArray[c] );
                    isSpace = false;
                }
            }
        }

        // remove trailing space
        if (isSpace && out.length() > 0)
        {
            String justified = out.toString();

            return justified.substring( 0, justified.length() - 1 );
        }

        return out.toString();
    }
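Use a regular expression with String.replaceAll().

Yes, but with a regex I need to know all of the characters, and I'm not sure what all of them are.

The first answer in my link works for your example.

@pawel033 - Have you had any problem with the \s+ that was mentioned?

@ArvindKumarAvinash Yes, I have. I think it may be related to the input, but I don't know what the cause is.

You can simplify the regex in your pattern to \\W+ (\\W = [^\\w]).

You can improve your answer by explaining what you are doing and why. For example, what does "\w" mean, why do you replace each match with " ", and so on.

What if "\b" appears in the text as context?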
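
One possible reason \s+ can seem to miss characters on some inputs is that Java's default \s covers only the ASCII whitespace characters, so a NO-BREAK SPACE or EM SPACE passes through untouched. The sketch below shows the difference. The class and method names are invented for illustration, and whether this explains the problem reported in the comments is an assumption.

```java
import java.util.regex.Pattern;

class WhitespaceDefinitions {
    // Default \s: space, \t, \n, \u000B, \f and \r only.
    static final Pattern ASCII_WS = Pattern.compile("\\s");

    // With UNICODE_CHARACTER_CLASS, \s follows the Unicode White_Space property.
    static final Pattern UNICODE_WS =
            Pattern.compile("\\s", Pattern.UNICODE_CHARACTER_CLASS);

    static String stripUnicodeWhitespace(String input) {
        return UNICODE_WS.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String nbsp = "\u00a0"; // NO-BREAK SPACE

        System.out.println(ASCII_WS.matcher(nbsp).matches());   // false
        System.out.println(UNICODE_WS.matcher(nbsp).matches()); // true
        System.out.println(Character.isWhitespace('\u00a0'));   // false
        System.out.println(Character.isSpaceChar('\u00a0'));    // true

        // Removes ASCII and Unicode whitespace alike.
        System.out.println(stripUnicodeWhitespace("Hello\u00a0 I\u2003am\tMarco\n")); // HelloIamMarco
    }
}
```

The same flag can be written inline as (?U), for example replaceAll("(?U)\\s+", "").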
/**
 * Trims leading and trailing whitespace.
 * This method understands more forms of white space than String.trim().
 *
 * @param text The text to trim.
 * @return The trimmed text.
 */
public static String trim(final String text)
{
    if ((text == null) || (text.length() == 0))
    {
        return text; // nothing to do
    }

    // Find the first and last non-space characters in the text.
    Integer firstNonSpaceIdx = null;
    Integer lastNonSpaceIdx = null;

    int currentIdx = 0;

    for (char c : text.toCharArray())
    {
        switch (c)
        {
            // https://stackoverflow.com/a/4731164/2074605
            case ' ':  // '\u0020' SPACE
            case '\t': // '\u0009' CHARACTER TABULATION
            case '\n':
            case '\r':
            case '\f': // '\u000c'
            case '\u00a0': // NO-BREAK SPACE
            case '\u2002': // EN SPACE
            case '\u2003': // EM SPACE
            case '\u2009': // THIN SPACE
            case '\u200a': // HAIR SPACE
            case '\u000b': // vertical tab
            {
                break;
            }
            default:
            {
                if (firstNonSpaceIdx == null)
                {
                    firstNonSpaceIdx = currentIdx;
                }

                lastNonSpaceIdx = currentIdx;
                break;
            }
        }

        ++currentIdx;
    }

    if (firstNonSpaceIdx == null)
    {
        return ""; // the text consists only of whitespace
    }

    return text.substring(firstNonSpaceIdx, lastNonSpaceIdx + 1);
}
/**
 * Normalizes text. This replaces multiple white spaces with a single space character.
 * It also trims any whitespace from the beginning and end of the string.
 *
 * @param text The text to normalize.
 * @return The normalized text.
 */
public static String whitespaceToSingleSpace(final String text)
{
    if (text == null)
    {
        return null;
    }

    final StringBuilder strbuf = new StringBuilder();

    boolean previousSpace = false;
    for (char c : text.toCharArray())
    {
        switch (c)
        {
            // https://stackoverflow.com/a/4731164/2074605
            case ' ':  // '\u0020' SPACE
            case '\t': // '\u0009' CHARACTER TABULATION
            case '\n':
            case '\r':
            case '\f': // '\u000c'
            case '\u00a0': // NO-BREAK SPACE
            case '\u2002': // EN SPACE
            case '\u2003': // EM SPACE
            case '\u2009': // THIN SPACE
            case '\u200a': // HAIR SPACE
            case '\u000b': // vertical tab
            {
                if (!previousSpace)
                {
                    strbuf.append(' ');
                }
                previousSpace = true;
                break;
            }
            default:
            {
                strbuf.append(c);
                previousSpace = false;
                break;
            }
        }
    }

    return trim(strbuf.toString());
}
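
Because strip, normalize, trim and whitespaceToSingleSpace all repeat the same case list, one option is to move that list into a single predicate. The sketch below only illustrates that idea: the class name SpaceUtil and the helper isSpace are hypothetical and not part of the original answer, and whitespaceToSingleSpace is rewritten here so that it does not need a separate trim step.

```java
class SpaceUtil {
    // Same character list as the switch statements in the methods above.
    static boolean isSpace(char c) {
        switch (c) {
            case ' ': case '\t': case '\n': case '\r': case '\f':
            case '\u00a0': case '\u2002': case '\u2003':
            case '\u2009': case '\u200a': case '\u000b':
                return true;
            default:
                return false;
        }
    }

    // Removes every character that isSpace accepts.
    static String strip(String text) {
        if (text == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder(text.length());
        for (char c : text.toCharArray()) {
            if (!isSpace(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Collapses each whitespace run to one ' ' and drops leading/trailing runs.
    static String whitespaceToSingleSpace(String text) {
        if (text == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder(text.length());
        boolean pendingSpace = false;
        for (char c : text.toCharArray()) {
            if (isSpace(c)) {
                // Only remember a space once some real text has been written.
                pendingSpace = sb.length() > 0;
            } else {
                if (pendingSpace) {
                    sb.append(' ');
                }
                sb.append(c);
                pendingSpace = false;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String data = " \u00a0Hello\n I am\t \n \n Marco\t\n";
        System.out.println(strip(data));                   // HelloIamMarco
        System.out.println(whitespaceToSingleSpace(data)); // Hello I am Marco
    }
}
```

Keeping the list in one place means a newly discovered space character only has to be added once.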