Python: split lines/sentences with more than 10 words at the first comma

python bash text

I have the following code, which splits a line every 10 words:

    #!/bin/bash

    while read line
    do
        counter=1;
        for word in $line
        do
            echo -n $word" ";
            if (($counter % 10 == 0))
            then
                echo "";
            fi
            let counter=counter+1;
        done
    done < input.txt

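The problem is that the split point is the 10th word. Instead, I want the split point to be the first comma character, and only for sentences with more than 10 words.

For example:

Line 1: phrase from a test line, which I want to split, and I dont know how

to

Line 1: phrase from a test line,
Line 2: which I want to split, and I dont know how

If no comma character is found, just return the line as it is.

Thanks

Edit: either a Python or a Bash solution is fine.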

Here is a simple solution that checks the number of words in the string. If the string has more than 10 words, it is split at the first comma:

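    output = []
    s = 'phrase from a test line, which I want to split, and I dont know how'
    # Split at the first comma while the remaining text has more than 10 words.
    # The "',' in s" check avoids a ValueError when no comma is left.
    while len(s.split()) > 10 and ',' in s:
        first_sent, s = s.split(',', 1)
        output.append(first_sent + ',')  # keep the comma on the first part
        s = s.lstrip()                   # drop the space that followed the comma
    output.append(s)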
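
The question asks for a single split at the first comma, and for lines without a comma to be returned unchanged. As a minimal sketch of that reading (the function name `split_at_first_comma` and the parameter `max_words` are hypothetical, chosen only for illustration), the same idea can be wrapped in a function:

```python
def split_at_first_comma(line, max_words=10):
    """Return the line as a list of one or two pieces.

    The line is split once, right after its first comma, but only when it
    has more than max_words whitespace-separated words.
    """
    if len(line.split()) <= max_words:
        return [line]
    head, comma, tail = line.partition(',')
    # No comma, or nothing after the comma: return the line unchanged.
    if not comma or not tail.strip():
        return [line]
    return [head + comma, tail.lstrip()]


if __name__ == '__main__':
    text = 'phrase from a test line, which I want to split, and I dont know how'
    for piece in split_at_first_comma(text):
        print(piece)
```

Using `str.partition` rather than `str.split` means a missing comma does not need special unpacking logic: the separator slot simply comes back empty.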

I'm not sure whether you want to split at 10 words or at 15.

If you are dealing with 15 words, just replace 10 with 15.

Or, to put it more clearly:

    #!/bin/bash

    awk 'NF > 10 {
        # enter this block only if the line has more than 10 words

        # replace the first occurrence of a comma, and any spaces
        # after it, with a comma followed by a newline
        sub(/, */, ",\n")
    }
    # print every line, whether or not it was modified
    { print }' input.txt

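A better approach is to use awk to test for 15 or more words and, if so, replace the first ", " with ",\n".

Example Use/Output

With your input in file, you would have:

    $ awk 'NF >= 15 {sub (", ", ",\n")}1' file
    phrase from a test line,
    which I want to split, and I don't know how.

If you have a large number of lines, awk will be orders of magnitude faster than a shell loop.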
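
Since a Python solution is also acceptable, the awk one-liner above can be approximated in Python. This is only a sketch: the generator name `split_long_lines` and its `min_words` parameter are hypothetical, and the threshold of 15 simply mirrors the `NF >= 15` test.

```python
import io


def split_long_lines(lines, min_words=15):
    """Yield each input line, breaking after the first ', ' when the line
    has at least min_words whitespace-separated words."""
    for line in lines:
        line = line.rstrip('\n')
        if len(line.split()) >= min_words:
            # count=1 replaces only the first ', ', like awk's sub()
            line = line.replace(', ', ',\n', 1)
        yield line


if __name__ == '__main__':
    # io.StringIO stands in for an open file such as input.txt
    sample = io.StringIO(
        "phrase from a test line, which I want to split, and I don't know how.\n"
        "short line, left alone\n"
    )
    print('\n'.join(split_long_lines(sample)))
```

Because the function accepts any iterable of lines, an open file object can be passed in directly, and lines are processed one at a time without loading the whole file.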

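This is a simple take on the problem.

The simple version can be done with sed:

    # For each line with more than 10 words, insert a newline after the first comma.
    # [^ ]+ is used instead of \w+ so that words ending in punctuation (e.g. "line,") still count.
    # The \n in the replacement assumes GNU sed.
    sed -E '/([^ ]+ +){10}[^ ]/s/, */,\n/' input.txt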

Are you looking for a python solution or a bash solution? I have removed the python tag until you respond as to the relevance of that tag here, or whether a python solution is acceptable.

Actually, either a bash or a python solution is acceptable.

10/15 is close enough; the title and the text make it a bit confusing.

Great minds think alike. I think our answers were posted less than 5 seconds apart.

@DavidC.Rankin, haha, true, I was thinking the same thing after seeing your answer, though to be fair, your answer is more precise and shorter.

Both deserve a UV.