Capitalize the first word of each sentence in R (regex, gsub, gregexpr)

Suppose I have the following text:

txt <- as.character("this is just a test! i'm not sure if this is O.K. or if it will work? who knows. regex is sorta new to me..  There are certain cases that I may not figure out??  sad!  ^_^")

which matches the correct substring indices.

But how do I implement this so that the desired characters are actually capitalized? I assume I would have to strsplit and then...

Your regex did not seem to work on your example, so I borrowed a different one.

Using the rex package may make this kind of task a little simpler. This implements the same regex that merlin2011 used.

library(rex)
txt <- as.character("this is just a test! i'm not sure if this is O.K. or if it will work? who knows. regex is sorta new to me..  There are certain cases that I may not figure out??  sad!  ^_^")

re <- rex(
  capture(name = 'first_letter', alnum),
  capture(name = 'sentence',
    any_non_puncts,
    zero_or_more(
      group(
        punct %if_next_isnt% space,
        any_non_puncts
        )
      ),
    maybe(punct)
    )
  )

re_substitutes(txt, re, "\\U\\1\\E\\2", global = TRUE)
#>[1] "This is just a test! I'm not sure if this is O.K. Or if it will work? Who knows. Regex is sorta new to me..  There are certain cases that I may not figure out??  Sad!  ^_^"
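
Since the title mentions gregexpr, here is a base-R sketch that needs no extra package: locate the letters with gregexpr, then write their upper-case forms back with `regmatches<-`. It assumes a much simpler notion of "sentence start" than the rex pattern above (the start of the string, or a lowercase ASCII letter directly preceded by `.`, `!`, or `?` and one whitespace character), so it is an illustration rather than a drop-in replacement. The names `demo_txt` and `starts` are made up for this example.

```r
demo_txt <- "hello there. how are you? fine!"

# Locate lowercase letters at the start of the string, or right after
# sentence-ending punctuation followed by one whitespace character.
# The lookbehind (?<=...) checks the context without consuming it.
starts <- gregexpr("(?:^|(?<=[.!?]\\s))[a-z]", demo_txt, perl = TRUE)

# Replace every matched letter with its upper-case form, in place.
regmatches(demo_txt, starts) <- lapply(regmatches(demo_txt, starts), toupper)
print(demo_txt)
#>[1] "Hello there. How are you? Fine!"
```

Because only the single letter is matched, nothing else in the string is rewritten, which avoids the strsplit-and-paste round trip the question worries about.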

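I don't know R at all, sorry, but in general you would take the first character, uppercase it, and concatenate it with [1:] (the substring holding the rest of the string)... the first related question should give you R-specific information.

This is really helpful!! I didn't know this existed.

A gsub version using a Perl-compatible regex: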
txt <- as.character("this is just a test! i'm not sure if this is O.K. or if it will work? who knows. regex is sorta new to me..  There are certain cases that I may not figure out??  sad!  ^_^")
print(txt)

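# Group 1 is the first character of a sentence, group 2 the rest of it;
# \\U...\\E upper-cases group 1 only (this requires perl = TRUE).
gsub("([^.!?\\s])([^.!?]*(?:[.!?](?!['\"]?\\s|$)[^.!?]*)*[.!?]?['\"]?)(?=\\s|$)", "\\U\\1\\E\\2", txt, perl = TRUE, useBytes = FALSE)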
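
The long pattern above is easier to follow after looking at a reduced version of the same replacement trick. The sketch below is a simplification under stated assumptions, not the regex from the answer: a sentence start is taken to be either the beginning of the string or a lowercase ASCII letter that follows `.`, `!`, or `?` plus whitespace, so it also capitalizes after abbreviations such as "O.K.". The names `simple_txt` and `capitalized` are illustrative.

```r
simple_txt <- "this is a test! is it working? who knows. done"

# Group 1: start of string, or sentence-ending punctuation plus whitespace.
# Group 2: the lowercase letter to capitalize.
# With perl = TRUE, \\U upper-cases whatever follows it in the replacement,
# which here is only group 2; group 1 is copied back unchanged.
capitalized <- gsub("(^|[.!?]\\s+)([a-z])", "\\1\\U\\2", simple_txt, perl = TRUE)
print(capitalized)
#>[1] "This is a test! Is it working? Who knows. Done"
```

The case-conversion escapes `\\U`, `\\L`, and `\\E` are only honored when `perl = TRUE`; without that argument they are not treated as case operators.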
