Algorithm: Does anyone have a good Proper Case algorithm?
Tags: algorithm, string
Does anyone have a trusted Proper Case or PCase algorithm (similar to UCase or Upper)? I'm looking for something that takes a value such as "GEORGE BURDELL" or "george burdell" and turns it into "George Burdell".
I have a simple one that handles the simple cases. The ideal would be something that can handle things such as "O'REILLY" and turn it into "O'Reilly", but I know that is tougher.
I am mainly focused on the English language, if that simplifies things.
UPDATE: I'm using C# as the language, but I can convert from almost anything (assuming similar functionality exists).
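I agree that the McDonald's scenario is a tough one. I meant to mention it along with my O'Reilly example, but did not in the original post.
What programming language do you use? Many languages allow callback functions for regular expression matches. These can be used to propercase the match easily. The regular expression is quite simple; you just have to match all word characters, like so: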
/\w+/
Alternatively, you can extract the first character as a separate match:
/(\w)(\w*)/
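Now you can access the first character and the following characters of the match separately. The callback function can then simply return a concatenation of the hits, uppercasing the first. In pseudo-Python (I don't actually know Python), this is a one-line callback that returns the first group uppercased followed by the second group.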
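A runnable take on that callback could look like the sketch below. It assumes Python's `re.sub`, which accepts a function as the replacement; `make_proper` and `proper_case` are hypothetical names, and lowercasing the tail of each word is an added assumption so that all-caps input is handled as well.

```python
import re

def make_proper(match):
    # group(1) is the first word character, group(2) is the rest of the word.
    # Lowercasing group(2) is an assumption added here for ALL-CAPS input.
    return match.group(1).upper() + match.group(2).lower()

def proper_case(text):
    # re.sub calls make_proper once for every run of word characters.
    return re.sub(r"(\w)(\w*)", make_proper, text)

print(proper_case("GEORGE BURDELL"))  # George Burdell
print(proper_case("o'reilly"))        # O'Reilly
print(proper_case("mcdonald's"))      # Mcdonald'S
```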
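Incidentally, this also handles the "O'Reilly" case, because "O" and "Reilly" are matched separately and both propercased. There are, however, other special cases that the algorithm does not handle well, e.g. "McDonald's" or generally any word with an apostrophe. The algorithm would produce "Mcdonald'S" for the latter. Special handling for the apostrophe could be implemented, but that would interfere with the first case. Finding a theoretically perfect solution isn't possible. In practice, it might help to consider the length of the part after the apostrophe.
A simple way to capitalise the first letter of each word (separated by a space):
$words = explode(" ", $string);
for ($i = 0; $i < count($words); $i++) {
    // The loop body is cut off in the source; this line is reconstructed from the description above.
    $words[$i] = ucfirst(strtolower($words[$i]));
}
$string = implode(" ", $words);
You didn't mention which language you would like the solution in, so here is some pseudocode: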
Loop through each character
    If the previous character was an alphabet letter
        Make the character lower case
    Otherwise
        Make the character upper case
End loop
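The pseudocode above translates almost line for line. A minimal Python sketch, where `proper_case_by_char` is a hypothetical name and "alphabet letter" is assumed to mean whatever `str.isalpha()` accepts:

```python
def proper_case_by_char(text):
    result = []
    previous_was_letter = False  # the start of the string counts as a word start
    for ch in text:
        # Inside a word: lower case. At the start of a word: upper case.
        result.append(ch.lower() if previous_was_letter else ch.upper())
        previous_was_letter = ch.isalpha()
    return "".join(result)

print(proper_case_by_char("GEORGE BURDELL"))  # George Burdell
print(proper_case_by_char("o'REILLY"))        # O'Reilly
print(proper_case_by_char("mcdonald's"))      # Mcdonald'S
```

Like the regular-expression approach, it capitalises the letter after every apostrophe, so it gets "O'Reilly" right and "McDonald's" wrong.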
There's also this neat Perl script for title-casing text:
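#!/usr/bin/perl
# This filter changes all words to Title Caps, and attempts to be clever
# about *un*capitalizing small words like a/an/the in the input.
#
# The list of "small words" which are not capped comes from
# the New York Times Manual of Style, plus 'vs' and 'v'.
#
# 10 May 2008
# Original version by John Gruber:
# http://daringfireball.net/2008/05/title_case
#
# 28 July 2008
# Re-written and much improved by Aristotle Pagaltzis:
# http://plasmasturm.org/code/titlecase/
#
# Full change log at __END__.
#
# License: http://www.opensource.org/licenses/mit-license.php
#
use strict;
use warnings;
use utf8;
use open qw( :encoding(UTF-8) :std );
my @small_words = qw( (?
# [the rest of the script is cut off at this point; the full version is at the URLs above]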
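But it sounds like you mean... for people's names only.
Unless I've misunderstood your question, I don't think you need to roll your own; the TextInfo class can do it for you.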
using System.Globalization;
CultureInfo.InvariantCulture.TextInfo.ToTitleCase("GeOrGE bUrdEll")
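This will return "George Burdell". You can use your own culture if there are special rules involved.
Update: it was pointed out (in a comment on this answer) that this will not work if the input is all caps, since the method will assume that it is an acronym. The naive workaround is to call .ToLower() on the text before passing it to ToTitleCase.
Here's a perhaps naive C# implementation: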
public class ProperCaseHelper {
    public string ToProperCase(string input) {
        string ret = string.Empty;
        var words = input.Split(' ');
        for (int i = 0; i < words.Length; ++i) {
            ret += wordToProperCase(words[i]);
            if (i < words.Length - 1) ret += " ";
        }
        return ret;
    }

    private string wordToProperCase(string word) {
        if (string.IsNullOrEmpty(word)) return word;
        // Standard case
        string ret = capitaliseFirstLetter(word);
        // Special cases:
        ret = properSuffix(ret, "'");
        ret = properSuffix(ret, ".");
        ret = properSuffix(ret, "Mc");
        ret = properSuffix(ret, "Mac");
        return ret;
    }

    private string properSuffix(string word, string prefix) {
        if (string.IsNullOrEmpty(word)) return word;
        string lowerWord = word.ToLower(), lowerPrefix = prefix.ToLower();
        if (!lowerWord.Contains(lowerPrefix)) return word;
        int index = lowerWord.IndexOf(lowerPrefix);
        // If the search string is at the end of the word ignore.
        if (index + prefix.Length == word.Length) return word;
        return word.Substring(0, index) + prefix +
            capitaliseFirstLetter(word.Substring(index + prefix.Length));
    }

    private string capitaliseFirstLetter(string word) {
        return char.ToUpper(word[0]) + word.Substring(1).ToLower();
    }
}
Kronoz, thank you. I found that in your function the line:
if (!lowerWord.Contains(lowerPrefix)) return word;
must say:
if (!lowerWord.StartsWith(lowerPrefix)) return word;
so that "información" is not changed to "InforMacIón".
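The difference is easy to reproduce. Below is a minimal Python sketch of the two checks, not the C# code itself; `capitalise_first`, `fix_prefix_anywhere` and `fix_prefix_at_start` are hypothetical names that mirror `properSuffix` with `Contains` and with `StartsWith` respectively.

```python
def capitalise_first(word):
    # Upper-case the first character, lower-case the rest.
    return word[:1].upper() + word[1:].lower()

def fix_prefix_anywhere(word, prefix):
    # Mirrors the Contains/IndexOf check: the "prefix" may match mid-word.
    index = word.lower().find(prefix.lower())
    if index < 0 or index + len(prefix) == len(word):
        return word
    return word[:index] + prefix + capitalise_first(word[index + len(prefix):])

def fix_prefix_at_start(word, prefix):
    # Mirrors the StartsWith check: only a leading match is rewritten.
    if not word.lower().startswith(prefix.lower()) or len(word) == len(prefix):
        return word
    return prefix + capitalise_first(word[len(prefix):])

print(fix_prefix_anywhere("Información", "Mac"))  # InforMacIón
print(fix_prefix_at_start("Información", "Mac"))  # Información
print(fix_prefix_at_start("Macdonald", "Mac"))    # MacDonald
```

One caveat, assuming the same helper is reused for every rule: a start-only check no longer fires for the apostrophe rule on "O'reilly", so it fits the "Mc" and "Mac" rules better than the punctuation ones.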
Best,
Enrique
I use this (the VB.NET function DoProperCaseConvert) as the TextChanged event handler of text boxes. It supports entry of "McDonald".
@Zack: I'll post it as a separate reply.
Here's an example based on Kronoz's post:
void Main()
{
    List<string> names = new List<string>() {
        "bill o'reilly",
        "johannes diderik van der waals",
        "mr. moseley-williams",
        "Joe VanWyck",
        "mcdonald's",
        "william the third",
        "hrh prince charles",
        "h.r.m. queen elizabeth the third",
        "william gates, iii",
        "pope leo xii",
        "a.k. jennings"
    };
    names.Select(name => name.ToProperCase()).Dump();
}
// http://stackoverflow.com/questions/32149/does-anyone-have-a-good-proper-case-algorithm
public static class ProperCaseHelper
{
    public static string ToProperCase(this string input)
    {
        if (IsAllUpperOrAllLower(input))
        {
            // fix the ALL UPPERCASE or all lowercase names
            return string.Join(" ", input.Split(' ').Select(word => wordToProperCase(word)));
        }
        else
        {
            // leave the CamelCase or Propercase names alone
            return input;
        }
    }

    public static bool IsAllUpperOrAllLower(this string input)
    {
        return (input.ToLower().Equals(input) || input.ToUpper().Equals(input));
    }

    private static string wordToProperCase(string word)
    {
        if (string.IsNullOrEmpty(word)) return word;
        // Standard case
        string ret = capitaliseFirstLetter(word);
        // Special cases:
        ret = properSuffix(ret, "'"); // D'Artagnon, D'Silva
        ret = properSuffix(ret, "."); // ???
        ret = properSuffix(ret, "-"); // Oscar-Meyer-Weiner
        ret = properSuffix(ret, "Mc", t => t.Length > 4); // Scots
        ret = properSuffix(ret, "Mac", t => t.Length > 5); // Scots except Macey
        // Special words:
        ret = specialWords(ret, "van"); // Dick van Dyke
        ret = specialWords(ret, "von"); // Baron von Bruin-Valt
        ret = specialWords(ret, "de");
        ret = specialWords(ret, "di");
        ret = specialWords(ret, "da"); // Leonardo da Vinci, Eduardo da Silva
        ret = specialWords(ret, "of"); // The Grand Old Duke of York
        ret = specialWords(ret, "the"); // William the Conqueror
        ret = specialWords(ret, "HRH"); // His/Her Royal Highness
        ret = specialWords(ret, "HRM"); // His/Her Royal Majesty
        ret = specialWords(ret, "H.R.H."); // His/Her Royal Highness
        ret = specialWords(ret, "H.R.M."); // His/Her Royal Majesty
        ret = dealWithRomanNumerals(ret); // William Gates, III
        return ret;
    }

    private static string properSuffix(string word, string prefix, Func<string, bool> condition = null)
    {
        if (string.IsNullOrEmpty(word)) return word;
        if (condition != null && !condition(word)) return word;
        string lowerWord = word.ToLower();
        string lowerPrefix = prefix.ToLower();
        if (!lowerWord.Contains(lowerPrefix)) return word;
        int index = lowerWord.IndexOf(lowerPrefix);
        // If the search string is at the end of the word ignore.
        if (index + prefix.Length == word.Length) return word;
        return word.Substring(0, index) + prefix +
            capitaliseFirstLetter(word.Substring(index + prefix.Length));
    }

    private static string specialWords(string word, string specialWord)
    {
        if (word.Equals(specialWord, StringComparison.InvariantCultureIgnoreCase))
        {
            return specialWord;
        }
        else
        {
            return word;
        }
    }

    private static string dealWithRomanNumerals(string word)
    {
        // Roman Numeral parser thanks to [Hannobo](https://stackoverflow.com/users/785111/hannobo)
        // Note that it excludes the Chinese last name Xi
        return new Regex(@"\b(?!Xi\b)(X|XX|XXX|XL|L|LX|LXX|LXXX|XC|C)?(I|II|III|IV|V|VI|VII|VIII|IX)?\b", RegexOptions.IgnoreCase).Replace(word, match => match.Value.ToUpperInvariant());
    }

    private static string capitaliseFirstLetter(string word)
    {
        return char.ToUpper(word[0]) + word.Substring(1).ToLower();
    }
}
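The Roman-numeral step is probably the least obvious part of the listing above. Here is a minimal Python sketch of the same idea, reusing the pattern from `dealWithRomanNumerals` unchanged; `ROMAN` and `fix_roman_numerals` are hypothetical names, and Python's `re` module is assumed to treat this pattern the same way as .NET for inputs like these.

```python
import re

# Same pattern as in dealWithRomanNumerals: optional tens, optional units,
# anchored on word boundaries, with "Xi" excluded by a negative lookahead.
ROMAN = re.compile(
    r"\b(?!Xi\b)(X|XX|XXX|XL|L|LX|LXX|LXXX|XC|C)?(I|II|III|IV|V|VI|VII|VIII|IX)?\b",
    re.IGNORECASE,
)

def fix_roman_numerals(word):
    # Upper-case whatever the pattern matched; empty matches change nothing.
    return ROMAN.sub(lambda m: m.group(0).upper(), word)

for word in ["Iii", "Xii", "Xi", "Leo"]:
    print(fix_roman_numerals(word))  # III, XII, Xi, Leo
```

The `Xi` exclusion is case-insensitive, so the numeral eleven is left as "Xi" too; that appears to be the price of protecting the surname. Short names that happen to spell a numeral, such as "Liv", would also come out fully upper-cased.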
Public Shared Function DoProperCaseConvert(ByVal str As String, Optional ByVal allowCapital As Boolean = True) As String
    Dim strCon As String = ""
    Dim wordbreak As String = " ,.1234567890;/\-()#$%^&*€!~+=@"
    Dim nextShouldBeCapital As Boolean = True
    'Improve to recognize all caps input
    'If str.Equals(str.ToUpper) Then
    '    str = str.ToLower
    'End If
    For Each s As Char In str.ToCharArray
        If allowCapital Then
            'Keep capitals the user typed mid-word, e.g. "McDonald"
            strCon = strCon & If(nextShouldBeCapital, s.ToString.ToUpper, s)
        Else
            'Force everything after the first letter of a word to lower case
            strCon = strCon & If(nextShouldBeCapital, s.ToString.ToUpper, s.ToString.ToLower)
        End If
        If wordbreak.Contains(s.ToString) Then
            nextShouldBeCapital = True
        Else
            nextShouldBeCapital = False
        End If
    Next
    Return strCon
End Function
function name_title_case($str)
{
    // name parts that should be lowercase in most cases
    $ok_to_be_lower = array('av','af','da','dal','de','del','der','di','la','le','van','der','den','vel','von');
    // name parts that should be lower even if at the beginning of a name
    $always_lower = array('van', 'der');
    // Create an array from the parts of the string passed in
    $parts = explode(" ", mb_strtolower($str));
    foreach ($parts as $part)
    {
        (in_array($part, $ok_to_be_lower)) ? $rules[$part] = 'nocaps' : $rules[$part] = 'caps';
    }
    // Determine the first part in the string
    reset($rules);
    $first_part = key($rules);
    // Loop through and cap-or-dont-cap
    foreach ($rules as $part => $rule)
    {
        if ($rule == 'caps')
        {
            // ucfirst() words and also takes into account apostrophes and hyphens like this:
            // O'brien -> O'Brien || mary-kaye -> Mary-Kaye
            $part = str_replace('- ','-',ucwords(str_replace('-','- ', $part)));
            $c13n[] = str_replace('\' ', '\'', ucwords(str_replace('\'', '\' ', $part)));
        }
        else if ($part == $first_part && !in_array($part, $always_lower))
        {
            // If the first part of the string is ok_to_be_lower, cap it anyway
            $c13n[] = ucfirst($part);
        }
        else
        {
            $c13n[] = $part;
        }
    }
    $titleized = implode(' ', $c13n);
    return trim($titleized);
}
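For comparison, the same rules can be sketched in Python. This is an illustration rather than a port: `OK_TO_BE_LOWER`, `ALWAYS_LOWER` and `cap_segments` are hypothetical names, and the parts are walked by position instead of being stored in an array keyed by the part itself, so a name that repeats a part keeps both copies.

```python
import re

# Particles that stay lower case unless they start the name.
OK_TO_BE_LOWER = {"av", "af", "da", "dal", "de", "del", "der", "di", "la",
                  "le", "van", "den", "vel", "von"}
# Particles that stay lower case even at the start of the name.
ALWAYS_LOWER = {"van", "der"}

def cap_segments(part):
    # Capitalise the letter at the start and after each hyphen or apostrophe:
    # o'brien -> O'Brien, mary-kaye -> Mary-Kaye
    return re.sub(r"(^|[-'])([^\W\d_])",
                  lambda m: m.group(1) + m.group(2).upper(), part)

def name_title_case(name):
    out = []
    for i, part in enumerate(name.lower().split()):
        if part not in OK_TO_BE_LOWER:
            out.append(cap_segments(part))
        elif i == 0 and part not in ALWAYS_LOWER:
            out.append(part.capitalize())  # leading particle is capped anyway
        else:
            out.append(part)
    return " ".join(out)

print(name_title_case("LEONARDO DA VINCI"))  # Leonardo da Vinci
print(name_title_case("mary-kaye o'brien"))  # Mary-Kaye O'Brien
print(name_title_case("de la cruz"))         # De la Cruz
print(name_title_case("van der waals"))      # van der Waals
```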
public static String toProperName(String name)
{
    if (name != null)
    {
        if (name.Length >= 2 && name.ToLower().Substring(0, 2) == "mc") // Changes mcdonald to "McDonald"
            return "Mc" + Regex.Replace(name.ToLower().Substring(2), @"\b[a-z]", m => m.Value.ToUpper());
        if (name.Length >= 3 && name.ToLower().Substring(0, 3) == "van") // Changes vanwinkle to "VanWinkle"
            return "Van" + Regex.Replace(name.ToLower().Substring(3), @"\b[a-z]", m => m.Value.ToUpper());
        // Changes to title case but also fixes apostrophes, like O'HARE or o'hare to O'Hare
        return Regex.Replace(name.ToLower(), @"\b[a-z]", m => m.Value.ToUpper());
    }
    return "";
}