Lua: Acronyms


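I'm looking for code that returns the acronym of its input, i.e. "Federal bureau of Investigation" should return FBI (preferably without the o). It should also work for all-lowercase input such as "federal bureau of investigation". How can I do this?
Thanks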

Something like this might work:

("Federal bureau of Investigation")
  :gsub("of","") -- remove "of"
  :gsub("(%w)%S+%s*","%1") -- leave first character of a word
  :upper() -- convert to uppercase
This returns "FBI".
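
One caveat with the snippet above: gsub("of","") removes the letters "of" wherever they occur, so a lowercase word such as "official" loses its first two letters and "bureau of official statistics" comes out as "BFS". Below is a minimal sketch of a whole-word variant. It assumes frontier patterns (%f, documented since Lua 5.2) are available, and the function name initials is made up for illustration:

```lua
-- Hypothetical helper, not part of the original answer.
local function initials(phrase)
  return phrase
    :lower()                           -- normalise case so "Of" and "of" match
    :gsub("%f[%a]of%f[%A]", "")        -- remove "of" only as a whole word
    :gsub("(%a)%a*", "%1")             -- keep the first letter of each run of letters
    :gsub("%A", "")                    -- drop spaces, hyphens, digits, etc.
    :upper()                           -- upper() returns a single value
end

print(initials("Federal bureau of Investigation"))  --> FBI
print(initials("bureau of official statistics"))    --> BOS
```

This sketch treats any non-letter as a separator, so words containing apostrophes or digits are not handled specially.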

How about this:

do
   -- default list of words to exclude or reshape
   local stopwords = { }
   for w in ("a an and for of the to"):gmatch "%w+" do  stopwords[w] = ""  end
   -- abbreviating a phrase:
   function TLA( phrase, subst )
      subst = subst or stopwords
      -- first replace each word (incl. "'") by its abbreviation...
      -- (will leave spaces etc. in the string)
      phrase = phrase:gsub( "[%w']+", function( word )
         if not word:find "%U" then  return word  end -- OPTIONAL keep abbrevs
         word = word:lower()
         if subst[word] then  return subst[word]  end -- from substitution list
         return word:sub( 1, 1 ):upper( )             -- others: to first letter
      end )
      -- ...then remove all non-word characters
      return (phrase:gsub( "%W", "" ))
   end
end
It handles the simple cases:

TLA "Ministry Of Information"  --> "MI"
TLA "floating-point exception" --> "FPE"
It can also handle some special cases, such as words that are already abbreviations:

TLA "augmented BNF" --> "ABNF"
Adjusting the substitution list, or putting non-empty strings into it, can also be useful:

TLA "one way or the other" --> "OWOO"
TLA( "one way or the other", {} ) --> "OWOTO"
TLA( "Ministry Of Information", { of = "of" } ) --> "MofI"

local custom_subst = {
   ["for"] = "4", to = "2", ["and"] = "", one = "1", two = "2", -- ...
}
TLA "Ministry for Fear, Uncertainty and Doubt" --> "MFUD"
TLA( "Ministry for Fear, Uncertainty and Doubt", custom_subst ) --> "M4FUD"
TLA( "Two-factor authentication", custom_subst ) --> "2FA"
And, as always:

TLA( "there ain't no such thing as a free lunch", {} ) --> "TANSTAAFL"


So apart from the substitution list, there is quite a lot in the code that you may need to adjust.
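
One such adjustment: because of `subst = subst or stopwords`, a table passed as the second argument replaces the default stop words rather than extending them, so with custom_subst above, words like "of" and "the" are abbreviated again. A minimal sketch of a helper that layers overrides on top of the defaults follows. The name make_subst is hypothetical and not part of the answer:

```lua
-- Hypothetical helper: default stop words plus caller-supplied overrides.
local function make_subst(overrides)
  local subst = {}
  for w in ("a an and for of the to"):gmatch("%w+") do
    subst[w] = ""                      -- default: drop the word entirely
  end
  for word, replacement in pairs(overrides or {}) do
    subst[word:lower()] = replacement  -- TLA looks words up in lowercase
  end
  return subst
end

local subst = make_subst{ ["for"] = "4", to = "2", one = "1", two = "2" }
print(subst["for"])    --> 4
print(subst.of == "")  --> true
```

The resulting table can be passed as the second argument of TLA in place of custom_subst.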

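Obviously, unless you use a dedicated dictionary, there is no perfect general solution, because many acronyms do not follow consistent rules.

For example, consider BASIC = Beginner's All-purpose Symbolic Instruction Code

(by the first-letter rule it should be BAPSIC, not BASIC)

So, with those limitations in mind, here is another possible acronym generator that seems to work for most "normal" cases: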

function acronym(s,ignore)
  ignore = ignore or
  {                                     --default list of words to ignore
  ['a'] = true, ['an'] = true, ['and'] = true, ['in'] = true, ['for'] = true,
  ['of'] = true, ['the'] = true, ['to'] = true, ['or'] = true,
  }

  local ans = {}
  for w in s:gmatch '[%w\']+' do
    if not ignore[w:lower()] then ans[#ans+1] = w:sub(1,1):upper() end
  end
  return table.concat(ans)
end
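
The dictionary idea mentioned above can be layered on top of a rule-based generator. Below is a minimal sketch, assuming a hand-maintained exception table keyed by the lowercased phrase. The names exceptions and acronym_with_exceptions, and the shortened ignore list, are made up for illustration:

```lua
-- Hypothetical exception dictionary: phrases whose accepted acronym
-- does not follow the first-letter rule.
local exceptions = {
  ["beginner's all-purpose symbolic instruction code"] = "BASIC",
}

-- Words skipped by the rule-based fallback (a short list for illustration).
local ignore = { a = true, an = true, ["and"] = true, of = true, the = true }

local function acronym_with_exceptions(phrase)
  local known = exceptions[phrase:lower()]
  if known then return known end          -- a dictionary hit wins
  local letters = {}
  for word in phrase:gmatch("[%w']+") do  -- words may contain apostrophes
    if not ignore[word:lower()] then
      letters[#letters + 1] = word:sub(1, 1):upper()
    end
  end
  return table.concat(letters)
end

print(acronym_with_exceptions("Beginner's All-purpose Symbolic Instruction Code"))  --> BASIC
print(acronym_with_exceptions("Federal bureau of Investigation"))                   --> FBI
```

Phrases not found in the table fall through to the first-letter rule, so the table only needs to list the irregular cases.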

Assertion failed:
assert( TLA "WINE Is Not an Emulator" == "WINE" )
:-)