JavaScript: efficiently replace all accented characters in a string?
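For a poor man's implementation of near-collation-correct sorting on the client side, I need a JavaScript function that does efficient single-character replacement in a string.
Here is what I mean (note that this applies to German text; other languages sort differently):
First, I don't like that the regular expression is rebuilt every time I call the function. I guess a closure could help here, but for some reason I don't seem to get the hang of it.
Can someone think of something more efficient?
Tags: javascript, sorting, string, collation
The answers below fall into two groups:
- String replacement functions of varying degrees of completeness and efficiency (what I was originally asking for)
- A mention of String#localeCompare, which is now widely supported among JS engines (not so much at the time of the question) and can solve this category of problem much more elegantly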
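For the second group of answers (built-in, locale-aware comparison), here is a minimal sketch. It assumes the runtime ships `Intl.Collator` with German locale data; the helper name `sortGerman` is made up for this illustration.

```javascript
// Assumption: the engine provides Intl.Collator (ECMAScript Internationalization API).
const germanCollator = new Intl.Collator('de');

// Hypothetical helper: returns a sorted copy, leaving the input untouched.
function sortGerman(words) {
  return words.slice().sort(germanCollator.compare);
}

console.log(sortGerman(['Zebra', 'Äpfel', 'Apfel', 'Öl', 'Ozean']));
```

With this approach no character table is needed at all, at the cost of depending on the locale data available in the engine.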
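I can't speak to what you are trying to do specifically with the function itself, but if you don't like the regex being built every time, here are two solutions and some caveats about each.
Here is one way to do this: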
function makeSortString(s) {
if(!makeSortString.translate_re) makeSortString.translate_re = /[öäüÖÄÜ]/g;
var translate = {
"ä": "a", "ö": "o", "ü": "u",
"Ä": "A", "Ö": "O", "Ü": "U" // probably more to come
};
return ( s.replace(makeSortString.translate_re, function(match) {
return translate[match];
}) );
}
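This will obviously make the regex a property of the function itself. The only thing you may not like about this (or you may, I guess it depends) is that the regex can now be modified from outside the function body. So someone could do this to change the internally used regex: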
makeSortString.translate_re = /[a-z]/g;
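So, there is that option.
One way to get a closure, and thus prevent someone from modifying the regex, is to define the function as the result of an immediately invoked anonymous function, like this: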
var makeSortString = (function() {
var translate_re = /[öäüÖÄÜ]/g;
return function(s) {
var translate = {
"ä": "a", "ö": "o", "ü": "u",
"Ä": "A", "Ö": "O", "Ü": "U" // probably more to come
};
return ( s.replace(translate_re, function(match) {
return translate[match];
}) );
}
})();
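Hope this is useful to you.
UPDATE: It's early and I don't know why I didn't see the obvious before, but it might also be useful to put your translate object in the closure as well: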
var makeSortString = (function() {
var translate_re = /[öäüÖÄÜ]/g;
var translate = {
"ä": "a", "ö": "o", "ü": "u",
"Ä": "A", "Ö": "O", "Ü": "U" // probably more to come
};
return function(s) {
return ( s.replace(translate_re, function(match) {
return translate[match];
}) );
}
})();
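A short sketch of how such a key function might be used for sorting. The names `toSortKey` and `sortByKey` are invented for this example; the closure simply repeats the technique shown above so the snippet runs on its own.

```javascript
// Same closure technique as above, repeated so the example is self-contained.
var toSortKey = (function () {
  var translateRe = /[öäüÖÄÜ]/g;
  var translate = { "ä": "a", "ö": "o", "ü": "u", "Ä": "A", "Ö": "O", "Ü": "U" };
  return function (s) {
    return s.replace(translateRe, function (match) { return translate[match]; });
  };
})();

// Hypothetical helper: compute each key once, sort by key, return the original words.
function sortByKey(words) {
  return words
    .map(function (w) { return { word: w, key: toSortKey(w).toLowerCase() }; })
    .sort(function (x, y) { return x.key < y.key ? -1 : (x.key > y.key ? 1 : 0); })
    .map(function (entry) { return entry.word; });
}

console.log(sortByKey(["Zug", "Ärger", "Apfel", "Öl", "Ofen"]));
// ["Apfel", "Ärger", "Ofen", "Öl", "Zug"]
```

Computing the key once per element, rather than inside the comparator, keeps the number of `replace` calls linear in the size of the list.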
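Based on the solution by Jason Bunting, here is what I use now.
The whole thing is for the tablesorter plugin: for (nearly correct) sorting of non-English tables with tablesorter, it is necessary to use a custom textExtraction function.
This one:
- translates the most common accented letters to unaccented ones (the list of supported letters is easily expandable)
- changes dates in German format ('dd.mm.yyyy') to a recognized format ('yyyy-mm-dd')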
// file encoding must be UTF-8!
function getTextExtractor()
{
return (function() {
var patternLetters = /[öäüÖÄÜáàâéèêúùûóòôÁÀÂÉÈÊÚÙÛÓÒÔß]/g;
var patternDateDmy = /^(?:\D+)?(\d{1,2})\.(\d{1,2})\.(\d{2,4})$/;
var lookupLetters = {
"ä": "a", "ö": "o", "ü": "u",
"Ä": "A", "Ö": "O", "Ü": "U",
"á": "a", "à": "a", "â": "a",
"é": "e", "è": "e", "ê": "e",
"ú": "u", "ù": "u", "û": "u",
"ó": "o", "ò": "o", "ô": "o",
"Á": "A", "À": "A", "Â": "A",
"É": "E", "È": "E", "Ê": "E",
"Ú": "U", "Ù": "U", "Û": "U",
"Ó": "O", "Ò": "O", "Ô": "O",
"ß": "s"
};
var letterTranslator = function(match) {
return lookupLetters[match] || match;
}
return function(node) {
var text = $.trim($(node).text());
var date = text.match(patternDateDmy);
if (date)
return [date[3], date[2], date[1]].join("-");
else
return text.replace(patternLetters, letterTranslator);
}
})();
}
You can use it like this:
$("table.sortable").tablesorter({
textExtraction: getTextExtractor()
});
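One detail worth noting: the extractor joins the captured groups as they are, so '1.2.2009' becomes '2009-2-1', which does not compare correctly as plain text against, say, '2009-10-1'. The following is a rough sketch of a zero-padded variant, assuming four-digit years and an engine with `String.prototype.padStart`; `germanDateToSortKey` is a hypothetical helper and not part of tablesorter.

```javascript
// Hypothetical helper: turns 'd.m.yyyy' into 'yyyy-mm-dd', or returns null.
function germanDateToSortKey(text) {
  var m = /^(\d{1,2})\.(\d{1,2})\.(\d{4})$/.exec(text.trim());
  if (!m) return null; // not a German-style date
  var day = m[1].padStart(2, "0");
  var month = m[2].padStart(2, "0");
  return [m[3], month, day].join("-");
}

console.log(germanDateToSortKey("1.2.2009"));   // "2009-02-01"
console.log(germanDateToSortKey("24.12.2008")); // "2008-12-24"
console.log(germanDateToSortKey("not a date")); // null
```

Because every field has a fixed width, the resulting strings sort chronologically even under a plain text comparison.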
I made a prototype version of this:
String.prototype.strip = function() {
var translate_re = /[öäüÖÄÜß ]/g;
var translate = {
"ä":"a", "ö":"o", "ü":"u",
"Ä":"A", "Ö":"O", "Ü":"U",
" ":"_", "ß":"ss" // probably more to come
};
return (this.replace(translate_re, function(match){
return translate[match];})
);
};
Use it like this:
var teststring = 'ä ö ü Ä Ö Ü ß';
teststring.strip();
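This will change the string to a_o_u_A_O_U_ss
I think this might work a little cleaner/better (though I haven't tested its performance):
Or, if you are still too worried about performance, let's get the best of both worlds: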
String.prototype.stripAccents = function() {
var in_chrs = 'àáâãäçèéêëìíîïñòóôõöùúûüýÿÀÁÂÃÄÇÈÉÊËÌÍÎÏÑÒÓÔÕÖÙÚÛÜÝ',
out_chrs = 'aaaaaceeeeiiiinooooouuuuyyAAAAACEEEEIIIINOOOOOUUUUY',
transl = {};
eval('var chars_rgx = /['+in_chrs+']/g');
for(var i = 0; i < in_chrs.length; i++){ transl[in_chrs.charAt(i)] = out_chrs.charAt(i); }
return this.replace(chars_rgx, function(match){
return transl[match]; });
};
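EDIT (by @Tomalak)
I appreciate the idea. However, there are several things wrong with the implementation, as outlined in the comment below.
Here is how I would implement it: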
var stripAccents = (function () {
var in_chrs = 'àáâãäçèéêëìíîïñòóôõöùúûüýÿÀÁÂÃÄÇÈÉÊËÌÍÎÏÑÒÓÔÕÖÙÚÛÜÝ',
out_chrs = 'aaaaaceeeeiiiinooooouuuuyyAAAAACEEEEIIIINOOOOOUUUUY',
chars_rgx = new RegExp('[' + in_chrs + ']', 'g'),
transl = {}, i,
lookup = function (m) { return transl[m] || m; };
for (i=0; i<in_chrs.length; i++) {
transl[ in_chrs[i] ] = out_chrs[i];
}
return function (s) { return s.replace(chars_rgx, lookup); }
})();
Here is a more complete version based on the Unicode standard, taken from here:
Some examples:
> "Piqué".latinize();
"Pique"
> "Piqué".isLatin();
false
> "Pique".isLatin();
true
> "Piqué".latinise().isLatin();
true
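If you want a sort order in which "ä" comes after "a" and is not treated as the same letter, you can use a function like mine.
You can always change the alphabet to get different or even weird orderings. However, if you want some letters to be equivalent, you have to manipulate the strings, e.g. a = a.replace(/ä/, 'a') or similar, as many have already replied above. I've included the uppercase letters in case someone wants all uppercase words before all lowercase words (in that case you have to omit .toLowerCase()).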
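To make the idea concrete, here is a minimal sketch of an alphabet-driven comparator. The alphabet string and the name `compareByAlphabet` are assumptions chosen for illustration, not the answer's original values.

```javascript
// Assumed alphabet for illustration only; reorder it to change the sort order.
var ALPHABET = "0123456789aäbcdefghijklmnoöpqrsßtuüvwxyz";

function compareByAlphabet(a, b) {
  a = a.toLowerCase();
  b = b.toLowerCase();
  var len = Math.min(a.length, b.length);
  for (var i = 0; i < len; i++) {
    // Characters missing from ALPHABET get index -1 and therefore sort first.
    var diff = ALPHABET.indexOf(a.charAt(i)) - ALPHABET.indexOf(b.charAt(i));
    if (diff !== 0) return diff;
  }
  return a.length - b.length; // on a shared prefix, the shorter string sorts first
}

console.log(["ast", "Äste", "apfel", "äpfel"].sort(compareByAlphabet));
// ["apfel", "ast", "äpfel", "Äste"]
```

Here "ä" is its own letter placed right after "a", so every word starting with "a" precedes every word starting with "ä".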
function sortbyalphabet(a, b) {
alphabet;
a = a.toLowerCase();
b = b.toLowerCase();
shorterone = (a.length > b.length ? a : b);
for (i = 0; i
A direct port to JavaScript of Kieron's solution:
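And a slightly modified version, using a character map instead of two arrays:
To compare these two methods, I made a simple benchmark:
A complete solution to your request is: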
function convert_accented_characters(str){
var conversions = new Object();
conversions['ae'] = 'ä|æ|ǽ';
conversions['oe'] = 'ö|œ';
conversions['ue'] = 'ü';
conversions['Ae'] = 'Ä';
conversions['Ue'] = 'Ü';
conversions['Oe'] = 'Ö';
conversions['A'] = 'À|Á|Â|Ã|Ä|Å|Ǻ|Ā|Ă|Ą|Ǎ';
conversions['a'] = 'à|á|â|ã|å|ǻ|ā|ă|ą|ǎ|ª';
conversions['C'] = 'Ç|Ć|Ĉ|Ċ|Č';
conversions['c'] = 'ç|ć|ĉ|ċ|č';
conversions['D'] = 'Ð|Ď|Đ';
conversions['d'] = 'ð|ď|đ';
conversions['E'] = 'È|É|Ê|Ë|Ē|Ĕ|Ė|Ę|Ě';
conversions['e'] = 'è|é|ê|ë|ē|ĕ|ė|ę|ě';
conversions['G'] = 'Ĝ|Ğ|Ġ|Ģ';
conversions['g'] = 'ĝ|ğ|ġ|ģ';
conversions['H'] = 'Ĥ|Ħ';
conversions['h'] = 'ĥ|ħ';
conversions['I'] = 'Ì|Í|Î|Ï|Ĩ|Ī|Ĭ|Ǐ|Į|İ';
conversions['i'] = 'ì|í|î|ï|ĩ|ī|ĭ|ǐ|į|ı';
conversions['J'] = 'Ĵ';
conversions['j'] = 'ĵ';
conversions['K'] = 'Ķ';
conversions['k'] = 'ķ';
conversions['L'] = 'Ĺ|Ļ|Ľ|Ŀ|Ł';
conversions['l'] = 'ĺ|ļ|ľ|ŀ|ł';
conversions['N'] = 'Ñ|Ń|Ņ|Ň';
conversions['n'] = 'ñ|ń|ņ|ň|ŉ';
conversions['O'] = 'Ò|Ó|Ô|Õ|Ō|Ŏ|Ǒ|Ő|Ơ|Ø|Ǿ';
conversions['o'] = 'ò|ó|ô|õ|ō|ŏ|ǒ|ő|ơ|ø|ǿ|º';
conversions['R'] = 'Ŕ|Ŗ|Ř';
conversions['r'] = 'ŕ|ŗ|ř';
conversions['S'] = 'Ś|Ŝ|Ş|Š';
conversions['s'] = 'ś|ŝ|ş|š|ſ';
conversions['T'] = 'Ţ|Ť|Ŧ';
conversions['t'] = 'ţ|ť|ŧ';
conversions['U'] = 'Ù|Ú|Û|Ũ|Ū|Ŭ|Ů|Ű|Ų|Ư|Ǔ|Ǖ|Ǘ|Ǚ|Ǜ';
conversions['u'] = 'ù|ú|û|ũ|ū|ŭ|ů|ű|ų|ư|ǔ|ǖ|ǘ|ǚ|ǜ';
conversions['Y'] = 'Ý|Ÿ|Ŷ';
conversions['y'] = 'ý|ÿ|ŷ';
conversions['W'] = 'Ŵ';
conversions['w'] = 'ŵ';
conversions['Z'] = 'Ź|Ż|Ž';
conversions['z'] = 'ź|ż|ž';
conversions['AE'] = 'Æ|Ǽ';
conversions['ss'] = 'ß';
conversions['IJ'] = 'Ĳ';
conversions['ij'] = 'ĳ';
conversions['OE'] = 'Œ';
conversions['f'] = 'ƒ';
for(var i in conversions){
var re = new RegExp(conversions[i],"g");
str = str.replace(re,i);
}
return str;
}
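The loop above builds and applies one regular expression per target, so the string is scanned once per table entry, and because 'Ä' is listed under both 'Ae' and 'A' the result depends on property order. A possible alternative, sketched here with a small assumed subset of the table and a made-up function name `convertOnce`, inverts the table once and replaces everything in a single pass:

```javascript
// Illustrative subset of a conversions table (target -> source characters).
var conversions = { "ae": "äæ", "oe": "öœ", "ue": "ü", "ss": "ß", "e": "èéêë", "a": "àáâ" };

// Invert the table once: one entry per source character.
var charMap = {};
Object.keys(conversions).forEach(function (target) {
  conversions[target].split("").forEach(function (ch) {
    if (!(ch in charMap)) charMap[ch] = target; // first mapping wins
  });
});

// One character class covering every source character.
var sourceChars = new RegExp("[" + Object.keys(charMap).join("") + "]", "g");

function convertOnce(str) {
  return str.replace(sourceChars, function (ch) { return charMap[ch]; });
}

console.log(convertOnce("Käse, Größe, café")); // "Kaese, Groesse, cafe"
```

The sketch assumes precomposed characters; input containing combining marks would need to be normalized first.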
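The correct terminology for such accents is diacritics. After googling this term, I found this function, which is part of backbone.paginator. It has a very complete collection of diacritics and replaces them with their most intuitive ASCII character. I found this to be the most complete JavaScript solution available today.
The full function for future reference: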
function removeDiacritics (str) {
var defaultDiacriticsRemovalMap = [
{'base':'A', 'letters':/[\u0041\u24B6\uFF21\u00C0\u00C1\u00C2\u1EA6\u1EA4\u1EAA\u1EA8\u00C3\u0100\u0102\u1EB0\u1EAE\u1EB4\u1EB2\u0226\u01E0\u00C4\u01DE\u1EA2\u00C5\u01FA\u01CD\u0200\u0202\u1EA0\u1EAC\u1EB6\u1E00\u0104\u023A\u2C6F]/g},
{'base':'AA','letters':/[\uA732]/g},
{'base':'AE','letters':/[\u00C6\u01FC\u01E2]/g},
{'base':'AO','letters':/[\uA734]/g},
{'base':'AU','letters':/[\uA736]/g},
{'base':'AV','letters':/[\uA738\uA73A]/g},
{'base':'AY','letters':/[\uA73C]/g},
{'base':'B', 'letters':/[\u0042\u24B7\uFF22\u1E02\u1E04\u1E06\u0243\u0182\u0181]/g},
{'base':'C', 'letters':/[\u0043\u24B8\uFF23\u0106\u0108\u010A\u010C\u00C7\u1E08\u0187\u023B\uA73E]/g},
{'base':'D', 'letters':/[\u0044\u24B9\uFF24\u1E0A\u010E\u1E0C\u1E10\u1E12\u1E0E\u0110\u018B\u018A\u0189\uA779]/g},
{'base':'DZ','letters':/[\u01F1\u01C4]/g},
{'base':'Dz','letters':/[\u01F2\u01C5]/g},
{'base':'E', 'letters':/[\u0045\u24BA\uFF25\u00C8\u00C9\u00CA\u1EC0\u1EBE\u1EC4\u1EC2\u1EBC\u0112\u1E14\u1E16\u0114\u0116\u00CB\u1EBA\u011A\u0204\u0206\u1EB8\u1EC6\u0228\u1E1C\u0118\u1E18\u1E1A\u0190\u018E]/g},
{'base':'F', 'letters':/[\u0046\u24BB\uFF26\u1E1E\u0191\uA77B]/g},
{'base':'G', 'letters':/[\u0047\u24BC\uFF27\u01F4\u011C\u1E20\u011E\u0120\u01E6\u0122\u01E4\u0193\uA7A0\uA77D\uA77E]/g},
{'base':'H', 'letters':/[\u0048\u24BD\uFF28\u0124\u1E22\u1E26\u021E\u1E24\u1E28\u1E2A\u0126\u2C67\u2C75\uA78D]/g},
{'base':'I', 'letters':/[\u0049\u24BE\uFF29\u00CC\u00CD\u00CE\u0128\u012A\u012C\u0130\u00CF\u1E2E\u1EC8\u01CF\u0208\u020A\u1ECA\u012E\u1E2C\u0197]/g},
{'base':'J', 'letters':/[\u004A\u24BF\uFF2A\u0134\u0248]/g},
{'base':'K', 'letters':/[\u004B\u24C0\uFF2B\u1E30\u01E8\u1E32\u0136\u1E34\u0198\u2C69\uA740\uA742\uA744\uA7A2]/g},
{'base':'L', 'letters':/[\u004C\u24C1\uFF2C\u013F\u0139\u013D\u1E36\u1E38\u013B\u1E3C\u1E3A\u0141\u023D\u2C62\u2C60\uA748\uA746\uA780]/g},
{'base':'LJ','letters':/[\u01C7]/g},
{'base':'Lj','letters':/[\u01C8]/g},
{'base':'M', 'letters':/[\u004D\u24C2\uFF2D\u1E3E\u1E40\u1E42\u2C6E\u019C]/g},
{'base':'N', 'letters':/[\u004E\u24C3\uFF2E\u01F8\u0143\u00D1\u1E44\u0147\u1E46\u0145\u1E4A\u1E48\u0220\u019D\uA790\uA7A4]/g},
{'base':'NJ','letters':/[\u01CA]/g},
{'base':'Nj','letters':/[\u01CB]/g},
{'base':'O', 'letters':/[\u004F\u24C4\uFF2F\u00D2\u00D3\u00D4\u1ED2\u1ED0\u1ED6\u1ED4\u00D5\u1E4C\u022C\u1E4E\u014C\u1E50\u1E52\u014E\u022E\u0230\u00D6\u022A\u1ECE\u0150\u01D1\u020C\u020E\u01A0\u1EDC\u1EDA\u1EE0\u1EDE\u1EE2\u1ECC\u1ED8\u01EA\u01EC\u00D8\u01FE\u0186\u019F\uA74A\uA74C]/g},
{'base':'OI','letters':/[\u01A2]/g},
{'base':'OO','letters':/[\uA74E]/g},
{'base':'OU','letters':/[\u0222]/g},
{'base':'P', 'letters':/[\u0050\u24C5\uFF30\u1E54\u1E56\u01A4\u2C63\uA750\uA752\uA754]/g},
{'base':'Q', 'letters':/[\u0051\u24C6\uFF31\uA756\uA758\u024A]/g},
{'base':'R', 'letters':/[\u0052\u24C7\uFF32\u0154\u1E58\u0158\u0210\u0212\u1E5A\u1E5C\u0156\u1E5E\u024C\u2C64\uA75A\uA7A6\uA782]/g},
{'base':'S', 'letters':/[\u0053\u24C8\uFF33\u1E9E\u015A\u1E64\u015C\u1E60\u0160\u1E66\u1E62\u1E68\u0218\u015E\u2C7E\uA7A8\uA784]/g},
{'base':'T', 'letters':/[\u0054\u24C9\uFF34\u1E6A\u0164\u1E6C\u021A\u0162\u1E70\u1E6E\u0166\u01AC\u01AE\u023E\uA786]/g},
{'base':'TZ','letters':/[\uA728]/g},
{'base':'U', 'letters':/[\u0055\u24CA\uFF35\u00D9\u00DA\u00DB\u0168\u1E78\u016A\u1E7A\u016C\u00DC\u01DB\u01D7\u01D5\u01D9\u1EE6\u016E\u0170\u01D3\u0214\u0216\u01AF\u1EEA\u1EE8\u1EEE\u1EEC\u1EF0\u1EE4\u1E72\u0172\u1E76\u1E74\u0244]/g},
{'base':'V', 'letters':/[\u0056\u24CB\uFF36\u1E7C\u1E7E\u01B2\uA75E\u0245]/g},
{'base':'VY','letters':/[\uA760]/g},
{'base':'W', 'letters':/[\u0057\u24CC\uFF37\u1E80\u1E82\u0174\u1E86\u1E84\u1E88\u2C72]/g},
{'base':'X', 'letters':/[\u0058\u24CD\uFF38\u1E8A\u1E8C]/g},
{'base':'Y', 'letters':/[\u0059\u24CE\uFF39\u1EF2\u00DD\u0176\u1EF8\u0232\u1E8E\u0178\u1EF6\u1EF4\u01B3\u024E\u1EFE]/g},
{'base':'Z', 'letters':/[\u005A\u24CF\uFF3A\u0179\u1E90\u017B\u017D\u1E92\u1E94\u01B5\u0224\u2C7F\u2C6B\uA762]/g},
{'base':'a', 'letters':/[\u0061\u24D0\uFF41\u1E9A\u00E0\u00E1\u00E2\u1EA7\u1EA5\u1EAB\u1EA9\u00E3\u0101\u0103\u1EB1\u1EAF\u1EB5\u1EB3\u0227\u01E1\u00E4\u01DF\u1EA3\u00E5\u01FB\u01CE\u0201\u0203\u1EA1\u1EAD\u1EB7\u1E01\u0105\u2C65\u0250]/g},
{'base':'aa','letters':/[\uA733]/g},
{'base':'ae','letters':/[\u00E6\u01FD\u01E3]/g},
{'base':'ao','letters':/[\uA735]/g},
{'base':'au','letters':/[\uA737]/g},
{'base':'av','letters':/[\uA739\uA73B]/g},
{'base':'ay','letters':/[\uA73D]/g},
{'base':'b', 'letters':/[\u0062\u24D1\uFF42\u1E03\u1E05\u1E07\u0180\u0183\u0253]/g},
{'base':'c', 'letters':/[\u0063\u24D2\uFF43\u0107\u0109\u010B\u010D\u00E7\u1E09\u0188\u023C\uA73F\u2184]/g},
{'base':'d', 'letters':/[\u0064\u24D3\uFF44\u1E0B\u010F\u1E0D\u1E11\u1E13\u1E0F\u0111\u018C\u0256\u0257\uA77A]/g},
{'base':'dz','letters':/[\u01F3\u01C6]/g},
{'base':'e', 'letters':/[\u0065\u24D4\uFF45\u00E8\u00E9\u00EA\u1EC1\u1EBF\u1EC5\u1EC3\u1EBD\u0113\u1E15\u1E17\u0115\u0117\u00EB\u1EBB\u011B\u0205\u0207\u1EB9\u1EC7\u0229\u1E1D\u0119\u1E19\u1E1B\u0247\u025B\u01DD]/g},
{'base':'f', 'letters':/[\u0066\u24D5\uFF46\u1E1F\u0192\uA77C]/g},
{'base':'g', 'letters':/[\u0067\u24D6\uFF47\u01F5\u011D\u1E21\u011F\u0121\u01E7\u0123\u01E5\u0260\uA7A1\u1D79\uA77F]/g},
{'base':'h', 'letters':/[\u0068\u24D7\uFF48\u0125\u1E23\u1E27\u021F\u1E25\u1E29\u1E2B\u1E96\u0127\u2C68\u2C76\u0265]/g},
{'base':'hv','letters':/[\u0195]/g},
{'base':'i', 'letters':/[\u0069\u24D8\uFF49\u00EC\u00ED\u00EE\u0129\u012B\u012D\u00EF\u1E2F\u1EC9\u01D0\u0209\u020B\u1ECB\u012F\u1E2D\u0268\u0131]/g},
{'base':'j', 'letters':/[\u006A\u24D9\uFF4A\u0135\u01F0\u0249]/g},
{'base':'k', 'letters':/[\u006B\u24DA\uFF4B\u1E31\u01E9\u1E33\u0137\u1E35\u0199\u2C6A\uA741\uA743\uA745\uA7A3]/g},
{'base':'l', 'letters':/[\u006C\u24DB\uFF4C\u0140\u013A\u013E\u1E37\u1E39\u013C\u1E3D\u1E3B\u017F\u0142\u019A\u026B\u2C61\uA749\uA781\uA747]/g},
{'base':'lj','letters':/[\u01C9]/g},
{'base':'m', 'letters':/[\u006D\u24DC\uFF4D\u1E3F\u1E41\u1E43\u0271\u026F]/g},
{'base':'n', 'letters':/[\u006E\u24DD\uFF4E\u01F9\u0144\u00F1\u1E45\u0148\u1E47\u0146\u1E4B\u1E49\u019E\u0272\u0149\uA791\uA7A5]/g},
{'base':'nj','letters':/[\u01CC]/g},
{'base':'o', 'letters':/[\u006F\u24DE\uFF4F\u00F2\u00F3\u00F4\u1ED3\u1ED1\u1ED7\u1ED5\u00F5\u1E4D\u022D\u1E4F\u014D\u1E51\u1E53\u014F\u022F\u0231\u00F6\u022B\u1ECF\u0151\u01D2\u020D\u020F\u01A1\u1EDD\u1EDB\u1EE1\u1EDF\u1EE3\u1ECD\u1ED9\u01EB\u01ED\u00F8\u01FF\u0254\uA74B\uA74D\u0275]/g},
{'base':'oi','letters':/[\u01A3]/g},
{'base':'ou','letters':/[\u0223]/g},
{'base':'oo','letters':/[\uA74F]/g},
{'base':'p','letters':/[\u0070\u24DF\uFF50\u1E55\u1E57\u01A5\u1D7D\uA751\uA753\uA755]/g},
{'base':'q','letters':/[\u0071\u24E0\uFF51\u024B\uA757\uA759]/g},
{'base':'r','letters':/[\u0072\u24E1\uFF52\u0155\u1E59\u0159\u0211\u0213\u1E5B\u1E5D\u0157\u1E5F\u024D\u027D\uA75B\uA7A7\uA783]/g},
{'base':'s','letters':/[\u0073\u24E2\uFF53\u00DF\u015B\u1E65\u015D\u1E61\u0161\u1E67\u1E63\u1E69\u0219\u015F\u023F\uA7A9\uA785\u1E9B]/g},
{'base':'t','letters':/[\u0074\u24E3\uFF54\u1E6B\u1E97\u0165\u1E6D\u021B\u0163\u1E71\u1E6F\u0167\u01AD\u0288\u2C66\uA787]/g},
{'base':'tz','letters':/[\uA729]/g},
{'base':'u','letters':/[\u0075\u24E4\uFF55\u00F9\u00FA\u00FB\u0169\u1E79\u016B\u1E7B\u016D\u00FC\u01DC\u01D8\u01D6\u01DA\u1EE7\u016F\u0171\u01D4\u0215\u0217\u01B0\u1EEB\u1EE9\u1EEF\u1EED\u1EF1\u1EE5\u1E73\u0173\u1E77\u1E75\u0289]/g},
{'base':'v','letters':/[\u0076\u24E5\uFF56\u1E7D\u1E7F\u028B\uA75F\u028C]/g},
{'base':'vy','letters':/[\uA761]/g},
{'base':'w','letters':/[\u0077\u24E6\uFF57\u1E81\u1E83\u0175\u1E87\u1E85\u1E98\u1E89\u2C73]/g},
{'base':'x','letters':/[\u0078\u24E7\uFF58\u1E8B\u1E8D]/g},
{'base':'y','letters':/[\u0079\u24E8\uFF59\u1EF3\u00FD\u0177\u1EF9\u0233\u1E8F\u00FF\u1EF7\u1E99\u1EF5\u01B4\u024F\u1EFF]/g},
{'base':'z','letters':/[\u007A\u24E9\uFF5A\u017A\u1E91\u017C\u017E\u1E93\u1E95\u01B6\u0225\u0240\u2C6C\uA763]/g}
];
for(var i=0; i<defaultDiacriticsRemovalMap.length; i++) {
str = str.replace(defaultDiacriticsRemovalMap[i].letters, defaultDiacriticsRemovalMap[i].base);
}
return str;
}
/**
* Normalise a string replacing foreign characters
*
* @param {String} str
* @return {String} str
*/
var normalize = (function () {
var a = ['À', 'Á', 'Â', 'Ã', 'Ä', 'Å', 'Æ', 'Ç', 'È', 'É', 'Ê', 'Ë', 'Ì', 'Í', 'Î', 'Ï', 'Ð', 'Ñ', 'Ò', 'Ó', 'Ô', 'Õ', 'Ö', 'Ø', 'Ù', 'Ú', 'Û', 'Ü', 'Ý', 'ß', 'à', 'á', 'â', 'ã', 'ä', 'å', 'æ', 'ç', 'è', 'é', 'ê', 'ë', 'ì', 'í', 'î', 'ï', 'ñ', 'ò', 'ó', 'ô', 'õ', 'ö', 'ø', 'ù', 'ú', 'û', 'ü', 'ý', 'ÿ', 'Ā', 'ā', 'Ă', 'ă', 'Ą', 'ą', 'Ć', 'ć', 'Ĉ', 'ĉ', 'Ċ', 'ċ', 'Č', 'č', 'Ď', 'ď', 'Đ', 'đ', 'Ē', 'ē', 'Ĕ', 'ĕ', 'Ė', 'ė', 'Ę', 'ę', 'Ě', 'ě', 'Ĝ', 'ĝ', 'Ğ', 'ğ', 'Ġ', 'ġ', 'Ģ', 'ģ', 'Ĥ', 'ĥ', 'Ħ', 'ħ', 'Ĩ', 'ĩ', 'Ī', 'ī', 'Ĭ', 'ĭ', 'Į', 'į', 'İ', 'ı', 'IJ', 'ij', 'Ĵ', 'ĵ', 'Ķ', 'ķ', 'Ĺ', 'ĺ', 'Ļ', 'ļ', 'Ľ', 'ľ', 'Ŀ', 'ŀ', 'Ł', 'ł', 'Ń', 'ń', 'Ņ', 'ņ', 'Ň', 'ň', 'ʼn', 'Ō', 'ō', 'Ŏ', 'ŏ', 'Ő', 'ő', 'Œ', 'œ', 'Ŕ', 'ŕ', 'Ŗ', 'ŗ', 'Ř', 'ř', 'Ś', 'ś', 'Ŝ', 'ŝ', 'Ş', 'ş', 'Š', 'š', 'Ţ', 'ţ', 'Ť', 'ť', 'Ŧ', 'ŧ', 'Ũ', 'ũ', 'Ū', 'ū', 'Ŭ', 'ŭ', 'Ů', 'ů', 'Ű', 'ű', 'Ų', 'ų', 'Ŵ', 'ŵ', 'Ŷ', 'ŷ', 'Ÿ', 'Ź', 'ź', 'Ż', 'ż', 'Ž', 'ž', 'ſ', 'ƒ', 'Ơ', 'ơ', 'Ư', 'ư', 'Ǎ', 'ǎ', 'Ǐ', 'ǐ', 'Ǒ', 'ǒ', 'Ǔ', 'ǔ', 'Ǖ', 'ǖ', 'Ǘ', 'ǘ', 'Ǚ', 'ǚ', 'Ǜ', 'ǜ', 'Ǻ', 'ǻ', 'Ǽ', 'ǽ', 'Ǿ', 'ǿ'];
var b = ['A', 'A', 'A', 'A', 'A', 'A', 'AE', 'C', 'E', 'E', 'E', 'E', 'I', 'I', 'I', 'I', 'D', 'N', 'O', 'O', 'O', 'O', 'O', 'O', 'U', 'U', 'U', 'U', 'Y', 's', 'a', 'a', 'a', 'a', 'a', 'a', 'ae', 'c', 'e', 'e', 'e', 'e', 'i', 'i', 'i', 'i', 'n', 'o', 'o', 'o', 'o', 'o', 'o', 'u', 'u', 'u', 'u', 'y', 'y', 'A', 'a', 'A', 'a', 'A', 'a', 'C', 'c', 'C', 'c', 'C', 'c', 'C', 'c', 'D', 'd', 'D', 'd', 'E', 'e', 'E', 'e', 'E', 'e', 'E', 'e', 'E', 'e', 'G', 'g', 'G', 'g', 'G', 'g', 'G', 'g', 'H', 'h', 'H', 'h', 'I', 'i', 'I', 'i', 'I', 'i', 'I', 'i', 'I', 'i', 'IJ', 'ij', 'J', 'j', 'K', 'k', 'L', 'l', 'L', 'l', 'L', 'l', 'L', 'l', 'l', 'l', 'N', 'n', 'N', 'n', 'N', 'n', 'n', 'O', 'o', 'O', 'o', 'O', 'o', 'OE', 'oe', 'R', 'r', 'R', 'r', 'R', 'r', 'S', 's', 'S', 's', 'S', 's', 'S', 's', 'T', 't', 'T', 't', 'T', 't', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'W', 'w', 'Y', 'y', 'Y', 'Z', 'z', 'Z', 'z', 'Z', 'z', 's', 'f', 'O', 'o', 'U', 'u', 'A', 'a', 'I', 'i', 'O', 'o', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'A', 'a', 'AE', 'ae', 'O', 'o'];
return function (str) {
var i = a.length;
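// split/join replaces every occurrence; replace() with a string pattern only replaces the first one
while (i--) str = str.split(a[i]).join(b[i]);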
return str;
};
}());
/**
* Normalise a string replacing foreign characters
*
* @param {String} str
* @return {String}
*/
var normalize = (function () {
var map = {
"À": "A",
"Á": "A",
"Â": "A",
"Ã": "A",
"Ä": "A",
"Å": "A",
"Æ": "AE",
"Ç": "C",
"È": "E",
"É": "E",
"Ê": "E",
"Ë": "E",
"Ì": "I",
"Í": "I",
"Î": "I",
"Ï": "I",
"Ð": "D",
"Ñ": "N",
"Ò": "O",
"Ó": "O",
"Ô": "O",
"Õ": "O",
"Ö": "O",
"Ø": "O",
"Ù": "U",
"Ú": "U",
"Û": "U",
"Ü": "U",
"Ý": "Y",
"ß": "s",
"à": "a",
"á": "a",
"â": "a",
"ã": "a",
"ä": "a",
"å": "a",
"æ": "ae",
"ç": "c",
"è": "e",
"é": "e",
"ê": "e",
"ë": "e",
"ì": "i",
"í": "i",
"î": "i",
"ï": "i",
"ñ": "n",
"ò": "o",
"ó": "o",
"ô": "o",
"õ": "o",
"ö": "o",
"ø": "o",
"ù": "u",
"ú": "u",
"û": "u",
"ü": "u",
"ý": "y",
"ÿ": "y",
"Ā": "A",
"ā": "a",
"Ă": "A",
"ă": "a",
"Ą": "A",
"ą": "a",
"Ć": "C",
"ć": "c",
"Ĉ": "C",
"ĉ": "c",
"Ċ": "C",
"ċ": "c",
"Č": "C",
"č": "c",
"Ď": "D",
"ď": "d",
"Đ": "D",
"đ": "d",
"Ē": "E",
"ē": "e",
"Ĕ": "E",
"ĕ": "e",
"Ė": "E",
"ė": "e",
"Ę": "E",
"ę": "e",
"Ě": "E",
"ě": "e",
"Ĝ": "G",
"ĝ": "g",
"Ğ": "G",
"ğ": "g",
"Ġ": "G",
"ġ": "g",
"Ģ": "G",
"ģ": "g",
"Ĥ": "H",
"ĥ": "h",
"Ħ": "H",
"ħ": "h",
"Ĩ": "I",
"ĩ": "i",
"Ī": "I",
"ī": "i",
"Ĭ": "I",
"ĭ": "i",
"Į": "I",
"į": "i",
"İ": "I",
"ı": "i",
"IJ": "IJ",
"ij": "ij",
"Ĵ": "J",
"ĵ": "j",
"Ķ": "K",
"ķ": "k",
"Ĺ": "L",
"ĺ": "l",
"Ļ": "L",
"ļ": "l",
"Ľ": "L",
"ľ": "l",
"Ŀ": "L",
"ŀ": "l",
"Ł": "l",
"ł": "l",
"Ń": "N",
"ń": "n",
"Ņ": "N",
"ņ": "n",
"Ň": "N",
"ň": "n",
"ʼn": "n",
"Ō": "O",
"ō": "o",
"Ŏ": "O",
"ŏ": "o",
"Ő": "O",
"ő": "o",
"Œ": "OE",
"œ": "oe",
"Ŕ": "R",
"ŕ": "r",
"Ŗ": "R",
"ŗ": "r",
"Ř": "R",
"ř": "r",
"Ś": "S",
"ś": "s",
"Ŝ": "S",
"ŝ": "s",
"Ş": "S",
"ş": "s",
"Š": "S",
"š": "s",
"Ţ": "T",
"ţ": "t",
"Ť": "T",
"ť": "t",
"Ŧ": "T",
"ŧ": "t",
"Ũ": "U",
"ũ": "u",
"Ū": "U",
"ū": "u",
"Ŭ": "U",
"ŭ": "u",
"Ů": "U",
"ů": "u",
"Ű": "U",
"ű": "u",
"Ų": "U",
"ų": "u",
"Ŵ": "W",
"ŵ": "w",
"Ŷ": "Y",
"ŷ": "y",
"Ÿ": "Y",
"Ź": "Z",
"ź": "z",
"Ż": "Z",
"ż": "z",
"Ž": "Z",
"ž": "z",
"ſ": "s",
"ƒ": "f",
"Ơ": "O",
"ơ": "o",
"Ư": "U",
"ư": "u",
"Ǎ": "A",
"ǎ": "a",
"Ǐ": "I",
"ǐ": "i",
"Ǒ": "O",
"ǒ": "o",
"Ǔ": "U",
"ǔ": "u",
"Ǖ": "U",
"ǖ": "u",
"Ǘ": "U",
"ǘ": "u",
"Ǚ": "U",
"ǚ": "u",
"Ǜ": "U",
"ǜ": "u",
"Ǻ": "A",
"ǻ": "a",
"Ǽ": "AE",
"ǽ": "ae",
"Ǿ": "O",
"ǿ": "o"
},
nonWord = /\W/g,
mapping = function (c) {
return map[c] || c;
};
return function (str) {
return str.replace(nonWord, mapping);
};
}());
var str = "Letras Á É Í Ó Ú Ñ - á é í ó ú ñ...";
console.log (str.normalize ("NFKD").replace (/[\u0300-\u036F]/g, ""));
// Letras A E I O U N - a e i o u n...
function noTilde (s) {
if (s.normalize != undefined) {
s = s.normalize ("NFKD");
}
return s.replace (/[\u0300-\u036F]/g, "");
}
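A variant of the same idea, sketched under the assumption that the engine supports Unicode property escapes (ES2018): `\p{M}` matches any combining mark, not only those in the U+0300 to U+036F block. The function name `stripCombiningMarks` is invented for this example. The sample calls also show the limits of normalization: letters such as Ł or ß have no canonical decomposition and pass through unchanged, so they still need a lookup table.

```javascript
function stripCombiningMarks(s) {
  // NFD splits "é" into "e" + U+0301; \p{M} then removes every combining mark.
  return s.normalize("NFD").replace(/\p{M}/gu, "");
}

console.log(stripCombiningMarks("Crème Brûlée")); // "Creme Brulee"
console.log(stripCombiningMarks("Łódź"));          // "Łodz" (Ł has no decomposition)
console.log(stripCombiningMarks("Straße"));        // "Straße" (ß is left alone)
```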
var TAB_00C0 = "AAAAAAACEEEEIIII" +
"DNOOOOO*OUUUUYIs" +
"aaaaaaaceeeeiiii" +
"?nooooo/ouuuuy?y" +
"AaAaAaCcCcCcCcDd" +
"DdEeEeEeEeEeGgGg" +
"GgGgHhHhIiIiIiIi" +
"IiJjJjKkkLlLlLlL" +
"lLlNnNnNnnNnOoOo" +
"OoOoRrRrRrSsSsSs" +
"SsTtTtTtUuUuUuUu" +
"UuUuWwYyYZzZzZzF";
function stripDiacritics(source) {
var result = source.split('');
for (var i = 0; i < result.length; i++) {
var c = source.charCodeAt(i);
if (c >= 0x00c0 && c <= 0x017f) {
result[i] = String.fromCharCode(TAB_00C0.charCodeAt(c - 0x00c0));
} else if (c > 127) {
result[i] = '?';
}
}
return result.join('');
}
stripDiacritics("Šupa, čo? ľšťčžýæøåℌð")
String.prototype.removeAccents = function() {
var removalMap = {
'A' : /[AⒶAÀÁÂẦẤẪẨÃĀĂẰẮẴẲȦǠÄǞẢÅǺǍȀȂẠẬẶḀĄ]/g,
'AA' : /[Ꜳ]/g,
'AE' : /[ÆǼǢ]/g,
'AO' : /[Ꜵ]/g,
'AU' : /[Ꜷ]/g,
'AV' : /[ꜸꜺ]/g,
'AY' : /[Ꜽ]/g,
'B' : /[BⒷBḂḄḆɃƂƁ]/g,
'C' : /[CⒸCĆĈĊČÇḈƇȻꜾ]/g,
'D' : /[DⒹDḊĎḌḐḒḎĐƋƊƉꝹ]/g,
'DZ' : /[\u01F1\u01C4]/g,
'Dz' : /[\u01F2\u01C5]/g,
'E' : /[EⒺEÈÉÊỀẾỄỂẼĒḔḖĔĖËẺĚȄȆẸỆȨḜĘḘḚƐƎ]/g,
'F' : /[FⒻFḞƑꝻ]/g,
'G' : /[GⒼGǴĜḠĞĠǦĢǤƓꞠꝽꝾ]/g,
'H' : /[HⒽHĤḢḦȞḤḨḪĦⱧⱵꞍ]/g,
'I' : /[IⒾIÌÍÎĨĪĬİÏḮỈǏȈȊỊĮḬƗ]/g,
'J' : /[JⒿJĴɈ]/g,
'K' : /[KⓀKḰǨḲĶḴƘⱩꝀꝂꝄꞢ]/g,
'L' : /[LⓁLĿĹĽḶḸĻḼḺŁȽⱢⱠꝈꝆꞀ]/g,
'LJ' : /[\u01C7]/g,
'Lj' : /[\u01C8]/g,
'M' : /[MⓂMḾṀṂⱮƜ]/g,
'N' : /[NⓃNǸŃÑṄŇṆŅṊṈȠƝꞐꞤ]/g,
'NJ' : /[\u01CA]/g,
'Nj' : /[\u01CB]/g,
'O' : /[OⓄOÒÓÔỒỐỖỔÕṌȬṎŌṐṒŎȮȰÖȪỎŐǑȌȎƠỜỚỠỞỢỌỘǪǬØǾƆƟꝊꝌ]/g,
'OI' : /[Ƣ]/g,
'OO' : /[Ꝏ]/g,
'OU' : /[Ȣ]/g,
'P' : /[PⓅPṔṖƤⱣꝐꝒꝔ]/g,
'Q' : /[QⓆQꝖꝘɊ]/g,
'R' : /[RⓇRŔṘŘȐȒṚṜŖṞɌⱤꝚꞦꞂ]/g,
'S' : /[SⓈSẞŚṤŜṠŠṦṢṨȘŞⱾꞨꞄ]/g,
'T' : /[TⓉTṪŤṬȚŢṰṮŦƬƮȾꞆ]/g,
'TZ' : /[Ꜩ]/g,
'U' : /[UⓊUÙÚÛŨṸŪṺŬÜǛǗǕǙỦŮŰǓȔȖƯỪỨỮỬỰỤṲŲṶṴɄ]/g,
'V' : /[VⓋVṼṾƲꝞɅ]/g,
'VY' : /[Ꝡ]/g,
'W' : /[WⓌWẀẂŴẆẄẈⱲ]/g,
'X' : /[XⓍXẊẌ]/g,
'Y' : /[YⓎYỲÝŶỸȲẎŸỶỴƳɎỾ]/g,
'Z' : /[ZⓏZŹẐŻŽẒẔƵȤⱿⱫꝢ]/g,
'a' : /[aⓐaẚàáâầấẫẩãāăằắẵẳȧǡäǟảåǻǎȁȃạậặḁąⱥɐ]/g,
'aa' : /[ꜳ]/g,
'ae' : /[æǽǣ]/g,
'ao' : /[ꜵ]/g,
'au' : /[ꜷ]/g,
'av' : /[ꜹꜻ]/g,
'ay' : /[ꜽ]/g,
'b' : /[bⓑbḃḅḇƀƃɓ]/g,
'c' : /[cⓒcćĉċčçḉƈȼꜿↄ]/g,
'd' : /[dⓓdḋďḍḑḓḏđƌɖɗꝺ]/g,
'dz' : /[\u01F3\u01C6]/g,
'e' : /[eⓔeèéêềếễểẽēḕḗĕėëẻěȅȇẹệȩḝęḙḛɇɛǝ]/g,
'f' : /[fⓕfḟƒꝼ]/g,
'g' : /[gⓖgǵĝḡğġǧģǥɠꞡᵹꝿ]/g,
'h' : /[hⓗhĥḣḧȟḥḩḫẖħⱨⱶɥ]/g,
'hv' : /[ƕ]/g,
'i' : /[iⓘiìíîĩīĭïḯỉǐȉȋịįḭɨı]/g,
'j' : /[jⓙjĵǰɉ]/g,
'k' : /[kⓚkḱǩḳķḵƙⱪꝁꝃꝅꞣ]/g,
'l' : /[lⓛlŀĺľḷḹļḽḻſłƚɫⱡꝉꞁꝇ]/g,
'lj' : /[\u01C9]/g,
'm' : /[mⓜmḿṁṃɱɯ]/g,
'n' : /[nⓝnǹńñṅňṇņṋṉƞɲʼnꞑꞥ]/g,
'nj' : /[\u01CC]/g,
'o' : /[oⓞoòóôồốỗổõṍȭṏōṑṓŏȯȱöȫỏőǒȍȏơờớỡởợọộǫǭøǿɔꝋꝍɵ]/g,
'oi' : /[ƣ]/g,
'ou' : /[ȣ]/g,
'oo' : /[ꝏ]/g,
'p' : /[pⓟpṕṗƥᵽꝑꝓꝕ]/g,
'q' : /[qⓠqɋꝗꝙ]/g,
'r' : /[rⓡrŕṙřȑȓṛṝŗṟɍɽꝛꞧꞃ]/g,
's' : /[sⓢsßśṥŝṡšṧṣṩșşȿꞩꞅẛ]/g,
't' : /[tⓣtṫẗťṭțţṱṯŧƭʈⱦꞇ]/g,
'tz' : /[ꜩ]/g,
'u' : /[uⓤuùúûũṹūṻŭüǜǘǖǚủůűǔȕȗưừứữửựụṳųṷṵʉ]/g,
'v' : /[vⓥvṽṿʋꝟʌ]/g,
'vy' : /[ꝡ]/g,
'w' : /[wⓦwẁẃŵẇẅẘẉⱳ]/g,
'x' : /[xⓧxẋẍ]/g,
'y' : /[yⓨyỳýŷỹȳẏÿỷẙỵƴɏỿ]/g,
'z' : /[zⓩzźẑżžẓẕƶȥɀⱬꝣ]/g,
};
var str = this;
for(var latin in removalMap) {
var nonLatin = removalMap[latin];
str = str.replace(nonLatin , latin);
}
return str;
}
"ąąą".removeAccents(); // returns "aaa"
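function removeAccents(p) {
  // Parallel lookup strings: the character at index k of `accented` maps to index k of `plain`.
  var accented = 'áàãâäéèêëíìîïóòõôöúùûüçÁÀÃÂÄÉÈÊËÍÌÎÏÓÒÕÖÔÚÙÛÜÇ';
  var plain = 'aaaaaeeeeiiiiooooouuuucAAAAAEEEEIIIIOOOOOUUUUC';
  var result = '';
  for (var i = 0; i < p.length; i++) {
    var ch = p.charAt(i);
    // indexOf compares literally; search() would interpret ch as a regular expression
    // (so "." would match at index 0 and "(" would throw).
    var pos = accented.indexOf(ch);
    result += pos >= 0 ? plain.charAt(pos) : ch;
  }
  return result;
}
removeAccents("Thís ís ân accêntéd phráse");
// "This is an accented phrase"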
var list = ['a', 'b', 'c', 'o', 'u', 'z', 'ä', 'ö', 'ü'];
list.sort((a, b) => a.localeCompare(b));
console.log(list);
//Outputs ['a', 'ä', 'b', 'c', 'o', 'ö', 'u', 'ü', 'z']
const str = "Crème Brulée"
str.normalize('NFD').replace(/[\u0300-\u036f]/g, "")
// > 'Creme Brulee'
const c = new Intl.Collator();
['creme brulee', 'crème brulée', 'crame brulai', 'crome brouillé',
'creme brulay', 'creme brulfé', 'creme bruléa'].sort(c.compare)
// [ 'crame brulai','creme brulay','creme bruléa','creme brulee',
//   'crème brulée','creme brulfé','crome brouillé' ]
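// Naive code-unit comparison, for contrast. The comparator must return a number, not a boolean.
['creme brulee', 'crème brulée', 'crame brulai', 'crome brouillé'].sort((a, b) => a < b ? -1 : a > b ? 1 : 0)
// ["crame brulai", "creme brulee", "crome brouillé", "crème brulée"]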
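Since the question is about German text, one option is to give `Intl.Collator` an explicit locale instead of relying on the runtime default. A minimal sketch, assuming the runtime ships locale data for `de`; the variable names are illustrative only:

```javascript
// Illustrative names; assumes the runtime has locale data for "de".
const germanCollator = new Intl.Collator("de");
const words = ["Zebra", "Äpfel", "Apfel", "Österreich", "Ofen"];

// slice() keeps the original array untouched; compare is already bound to the collator.
const sorted = words.slice().sort(germanCollator.compare);
console.log(sorted);

// With sensitivity "base", letters that differ only by accent or case compare as equal.
const baseCollator = new Intl.Collator("de", { sensitivity: "base" });
console.log(baseCollator.compare("a", "ä")); // 0
```

Creating the collator once and reusing its `compare` function also addresses the original concern about rebuilding helper objects on every call.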
var normalizeConversions = [
{ regex: new RegExp('ä|æ|ǽ', 'g'), clean: 'ae' },
{ regex: new RegExp('ö|œ', 'g'), clean: 'oe' },
{ regex: new RegExp('ü', 'g'), clean: 'ue' },
{ regex: new RegExp('Ä', 'g'), clean: 'Ae' },
{ regex: new RegExp('Ü', 'g'), clean: 'Ue' },
{ regex: new RegExp('Ö', 'g'), clean: 'Oe' },
{ regex: new RegExp('À|Á|Â|Ã|Ä|Å|Ǻ|Ā|Ă|Ą|Ǎ', 'g'), clean: 'A' },
{ regex: new RegExp('à|á|â|ã|å|ǻ|ā|ă|ą|ǎ|ª', 'g'), clean: 'a' },
{ regex: new RegExp('Ç|Ć|Ĉ|Ċ|Č', 'g'), clean: 'C' },
{ regex: new RegExp('ç|ć|ĉ|ċ|č', 'g'), clean: 'c' },
{ regex: new RegExp('Ð|Ď|Đ', 'g'), clean: 'D' },
{ regex: new RegExp('ð|ď|đ', 'g'), clean: 'd' },
{ regex: new RegExp('È|É|Ê|Ë|Ē|Ĕ|Ė|Ę|Ě', 'g'), clean: 'E' },
{ regex: new RegExp('è|é|ê|ë|ē|ĕ|ė|ę|ě', 'g'), clean: 'e' },
{ regex: new RegExp('Ĝ|Ğ|Ġ|Ģ', 'g'), clean: 'G' },
{ regex: new RegExp('ĝ|ğ|ġ|ģ', 'g'), clean: 'g' },
{ regex: new RegExp('Ĥ|Ħ', 'g'), clean: 'H' },
{ regex: new RegExp('ĥ|ħ', 'g'), clean: 'h' },
{ regex: new RegExp('Ì|Í|Î|Ï|Ĩ|Ī|Ĭ|Ǐ|Į|İ', 'g'), clean: 'I' },
{ regex: new RegExp('ì|í|î|ï|ĩ|ī|ĭ|ǐ|į|ı', 'g'), clean: 'i' },
{ regex: new RegExp('Ĵ', 'g'), clean: 'J' },
{ regex: new RegExp('ĵ', 'g'), clean: 'j' },
{ regex: new RegExp('Ķ', 'g'), clean: 'K' },
{ regex: new RegExp('ķ', 'g'), clean: 'k' },
{ regex: new RegExp('Ĺ|Ļ|Ľ|Ŀ|Ł', 'g'), clean: 'L' },
{ regex: new RegExp('ĺ|ļ|ľ|ŀ|ł', 'g'), clean: 'l' },
{ regex: new RegExp('Ñ|Ń|Ņ|Ň', 'g'), clean: 'N' },
{ regex: new RegExp('ñ|ń|ņ|ň|ʼn', 'g'), clean: 'n' },
{ regex: new RegExp('Ò|Ó|Ô|Õ|Ō|Ŏ|Ǒ|Ő|Ơ|Ø|Ǿ', 'g'), clean: 'O' },
{ regex: new RegExp('ò|ó|ô|õ|ō|ŏ|ǒ|ő|ơ|ø|ǿ|º', 'g'), clean: 'o' },
{ regex: new RegExp('Ŕ|Ŗ|Ř', 'g'), clean: 'R' },
{ regex: new RegExp('ŕ|ŗ|ř', 'g'), clean: 'r' },
{ regex: new RegExp('Ś|Ŝ|Ş|Š', 'g'), clean: 'S' },
{ regex: new RegExp('ś|ŝ|ş|š|ſ', 'g'), clean: 's' },
{ regex: new RegExp('Ţ|Ť|Ŧ', 'g'), clean: 'T' },
{ regex: new RegExp('ţ|ť|ŧ', 'g'), clean: 't' },
{ regex: new RegExp('Ù|Ú|Û|Ũ|Ū|Ŭ|Ů|Ű|Ų|Ư|Ǔ|Ǖ|Ǘ|Ǚ|Ǜ', 'g'), clean: 'U' },
{ regex: new RegExp('ù|ú|û|ũ|ū|ŭ|ů|ű|ų|ư|ǔ|ǖ|ǘ|ǚ|ǜ', 'g'), clean: 'u' },
{ regex: new RegExp('Ý|Ÿ|Ŷ', 'g'), clean: 'Y' },
{ regex: new RegExp('ý|ÿ|ŷ', 'g'), clean: 'y' },
{ regex: new RegExp('Ŵ', 'g'), clean: 'W' },
{ regex: new RegExp('ŵ', 'g'), clean: 'w' },
{ regex: new RegExp('Ź|Ż|Ž', 'g'), clean: 'Z' },
{ regex: new RegExp('ź|ż|ž', 'g'), clean: 'z' },
{ regex: new RegExp('Æ|Ǽ', 'g'), clean: 'AE' },
{ regex: new RegExp('ß', 'g'), clean: 'ss' },
{ regex: new RegExp('\u0132', 'g'), clean: 'IJ' },
{ regex: new RegExp('\u0133', 'g'), clean: 'ij' },
{ regex: new RegExp('Œ', 'g'), clean: 'OE' },
{ regex: new RegExp('ƒ', 'g'), clean: 'f' }
];
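// The original snippet left this function unnamed, which is a syntax error as a statement;
// "normalizeString" is an illustrative name.
function normalizeString(str) {
normalizeConversions.forEach(function(normalizeEntry){
str = str.replace(normalizeEntry.regex, normalizeEntry.clean);
});
return str;
}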
// Usage example:
"Some string".replace(/[^a-zA-Z0-9-_]/g, char => ToLatinMap.get(char) || '')
// Map:
export let ToLatinMap: Map<string, string> = new Map<string, string>([
["Á", "A"],
["Ă", "A"],
["Ắ", "A"],
["Ặ", "A"],
["Ằ", "A"],
["Ẳ", "A"],
["Ẵ", "A"],
["Ǎ", "A"],
["Â", "A"],
["Ấ", "A"],
["Ậ", "A"],
["Ầ", "A"],
["Ẩ", "A"],
["Ẫ", "A"],
["Ä", "A"],
["Ǟ", "A"],
["Ȧ", "A"],
["Ǡ", "A"],
["Ạ", "A"],
["Ȁ", "A"],
["À", "A"],
["Ả", "A"],
["Ȃ", "A"],
["Ā", "A"],
["Ą", "A"],
["Å", "A"],
["Ǻ", "A"],
["Ḁ", "A"],
["Ⱥ", "A"],
["Ã", "A"],
["Ꜳ", "AA"],
["Æ", "AE"],
["Ǽ", "AE"],
["Ǣ", "AE"],
["Ꜵ", "AO"],
["Ꜷ", "AU"],
["Ꜹ", "AV"],
["Ꜻ", "AV"],
["Ꜽ", "AY"],
["Ḃ", "B"],
["Ḅ", "B"],
["Ɓ", "B"],
["Ḇ", "B"],
["Ƀ", "B"],
["Ƃ", "B"],
["Ć", "C"],
["Č", "C"],
["Ç", "C"],
["Ḉ", "C"],
["Ĉ", "C"],
["Ċ", "C"],
["Ƈ", "C"],
["Ȼ", "C"],
["Ď", "D"],
["Ḑ", "D"],
["Ḓ", "D"],
["Ḋ", "D"],
["Ḍ", "D"],
["Ɗ", "D"],
["Ḏ", "D"],
["\u01F2", "D"],
["\u01C5", "D"],
["Đ", "D"],
["Ƌ", "D"],
["\u01F1", "DZ"],
["\u01C4", "DZ"],
["É", "E"],
["Ĕ", "E"],
["Ě", "E"],
["Ȩ", "E"],
["Ḝ", "E"],
["Ê", "E"],
["Ế", "E"],
["Ệ", "E"],
["Ề", "E"],
["Ể", "E"],
["Ễ", "E"],
["Ḙ", "E"],
["Ë", "E"],
["Ė", "E"],
["Ẹ", "E"],
["Ȅ", "E"],
["È", "E"],
["Ẻ", "E"],
["Ȇ", "E"],
["Ē", "E"],
["Ḗ", "E"],
["Ḕ", "E"],
["Ę", "E"],
["Ɇ", "E"],
["Ẽ", "E"],
["Ḛ", "E"],
["Ꝫ", "ET"],
["Ḟ", "F"],
["Ƒ", "F"],
["Ǵ", "G"],
["Ğ", "G"],
["Ǧ", "G"],
["Ģ", "G"],
["Ĝ", "G"],
["Ġ", "G"],
["Ɠ", "G"],
["Ḡ", "G"],
["Ǥ", "G"],
["Ḫ", "H"],
["Ȟ", "H"],
["Ḩ", "H"],
["Ĥ", "H"],
["Ⱨ", "H"],
["Ḧ", "H"],
["Ḣ", "H"],
["Ḥ", "H"],
["Ħ", "H"],
["Í", "I"],
["Ĭ", "I"],
["Ǐ", "I"],
["Î", "I"],
["Ï", "I"],
["Ḯ", "I"],
["İ", "I"],
["Ị", "I"],
["Ȉ", "I"],
["Ì", "I"],
["Ỉ", "I"],
["Ȋ", "I"],
["Ī", "I"],
["Į", "I"],
["Ɨ", "I"],
["Ĩ", "I"],
["Ḭ", "I"],
["Ꝺ", "D"],
["Ꝼ", "F"],
["Ᵹ", "G"],
["Ꞃ", "R"],
["Ꞅ", "S"],
["Ꞇ", "T"],
["Ꝭ", "IS"],
["Ĵ", "J"],
["Ɉ", "J"],
["Ḱ", "K"],
["Ǩ", "K"],
["Ķ", "K"],
["Ⱪ", "K"],
["Ꝃ", "K"],
["Ḳ", "K"],
["Ƙ", "K"],
["Ḵ", "K"],
["Ꝁ", "K"],
["Ꝅ", "K"],
["Ĺ", "L"],
["Ƚ", "L"],
["Ľ", "L"],
["Ļ", "L"],
["Ḽ", "L"],
["Ḷ", "L"],
["Ḹ", "L"],
["Ⱡ", "L"],
["Ꝉ", "L"],
["Ḻ", "L"],
["Ŀ", "L"],
["Ɫ", "L"],
["\u01C8", "L"],
["Ł", "L"],
["\u01C7", "LJ"],
["Ḿ", "M"],
["Ṁ", "M"],
["Ṃ", "M"],
["Ɱ", "M"],
["Ń", "N"],
["Ň", "N"],
["Ņ", "N"],
["Ṋ", "N"],
["Ṅ", "N"],
["Ṇ", "N"],
["Ǹ", "N"],
["Ɲ", "N"],
["Ṉ", "N"],
["Ƞ", "N"],
["\u01CB", "N"],
["Ñ", "N"],
["\u01CA", "NJ"],
["Ó", "O"],
["Ŏ", "O"],
["Ǒ", "O"],
["Ô", "O"],
["Ố", "O"],
["Ộ", "O"],
["Ồ", "O"],
["Ổ", "O"],
["Ỗ", "O"],
["Ö", "O"],
["Ȫ", "O"],
["Ȯ", "O"],
["Ȱ", "O"],
["Ọ", "O"],
["Ő", "O"],
["Ȍ", "O"],
["Ò", "O"],
["Ỏ", "O"],
["Ơ", "O"],
["Ớ", "O"],
["Ợ", "O"],
["Ờ", "O"],
["Ở", "O"],
["Ỡ", "O"],
["Ȏ", "O"],
["Ꝋ", "O"],
["Ꝍ", "O"],
["Ō", "O"],
["Ṓ", "O"],
["Ṑ", "O"],
["Ɵ", "O"],
["Ǫ", "O"],
["Ǭ", "O"],
["Ø", "O"],
["Ǿ", "O"],
["Õ", "O"],
["Ṍ", "O"],
["Ṏ", "O"],
["Ȭ", "O"],
["Ƣ", "OI"],
["Ꝏ", "OO"],
["Ɛ", "E"],
["Ɔ", "O"],
["Ȣ", "OU"],
["Ṕ", "P"],
["Ṗ", "P"],
["Ꝓ", "P"],
["Ƥ", "P"],
["Ꝕ", "P"],
["Ᵽ", "P"],
["Ꝑ", "P"],
["Ꝙ", "Q"],
["Ꝗ", "Q"],
["Ŕ", "R"],
["Ř", "R"],
["Ŗ", "R"],
["Ṙ", "R"],
["Ṛ", "R"],
["Ṝ", "R"],
["Ȑ", "R"],
["Ȓ", "R"],
["Ṟ", "R"],
["Ɍ", "R"],
["Ɽ", "R"],
["Ꜿ", "C"],
["Ǝ", "E"],
["Ś", "S"],
["Ṥ", "S"],
["Š", "S"],
["Ṧ", "S"],
["Ş", "S"],
["Ŝ", "S"],
["Ș", "S"],
["Ṡ", "S"],
["Ṣ", "S"],
["Ṩ", "S"],
["Ť", "T"],
["Ţ", "T"],
["Ṱ", "T"],
["Ț", "T"],
["Ⱦ", "T"],
["Ṫ", "T"],
["Ṭ", "T"],
["Ƭ", "T"],
["Ṯ", "T"],
["Ʈ", "T"],
["Ŧ", "T"],
["Ɐ", "A"],
["Ꞁ", "L"],
["Ɯ", "M"],
["Ʌ", "V"],
["Ꜩ", "TZ"],
["Ú", "U"],
["Ŭ", "U"],
["Ǔ", "U"],
["Û", "U"],
["Ṷ", "U"],
["Ü", "U"],
["Ǘ", "U"],
["Ǚ", "U"],
["Ǜ", "U"],
["Ǖ", "U"],
["Ṳ", "U"],
["Ụ", "U"],
["Ű", "U"],
["Ȕ", "U"],
["Ù", "U"],
["Ủ", "U"],
["Ư", "U"],
["Ứ", "U"],
["Ự", "U"],
["Ừ", "U"],
["Ử", "U"],
["Ữ", "U"],
["Ȗ", "U"],
["Ū", "U"],
["Ṻ", "U"],
["Ų", "U"],
["Ů", "U"],
["Ũ", "U"],
["Ṹ", "U"],
["Ṵ", "U"],
["Ꝟ", "V"],
["Ṿ", "V"],
["Ʋ", "V"],
["Ṽ", "V"],
["Ꝡ", "VY"],
["Ẃ", "W"],
["Ŵ", "W"],
["Ẅ", "W"],
["Ẇ", "W"],
["Ẉ", "W"],
["Ẁ", "W"],
["Ⱳ", "W"],
["Ẍ", "X"],
["Ẋ", "X"],
["Ý", "Y"],
["Ŷ", "Y"],
["Ÿ", "Y"],
["Ẏ", "Y"],
["Ỵ", "Y"],
["Ỳ", "Y"],
["Ƴ", "Y"],
["Ỷ", "Y"],
["Ỿ", "Y"],
["Ȳ", "Y"],
["Ɏ", "Y"],
["Ỹ", "Y"],
["Ź", "Z"],
["Ž", "Z"],
["Ẑ", "Z"],
["Ⱬ", "Z"],
["Ż", "Z"],
["Ẓ", "Z"],
["Ȥ", "Z"],
["Ẕ", "Z"],
["Ƶ", "Z"],
["\u0132", "IJ"],
["Œ", "OE"],
["ᴀ", "A"],
["ᴁ", "AE"],
["ʙ", "B"],
["ᴃ", "B"],
["ᴄ", "C"],
["ᴅ", "D"],
["ᴇ", "E"],
["ꜰ", "F"],
["ɢ", "G"],
["ʛ", "G"],
["ʜ", "H"],
["ɪ", "I"],
["ʁ", "R"],
["ᴊ", "J"],
["ᴋ", "K"],
["ʟ", "L"],
["ᴌ", "L"],
["ᴍ", "M"],
["ɴ", "N"],
["ᴏ", "O"],
["ɶ", "OE"],
["ᴐ", "O"],
["ᴕ", "OU"],
["ᴘ", "P"],
["ʀ", "R"],
["ᴎ", "N"],
["ᴙ", "R"],
["ꜱ", "S"],
["ᴛ", "T"],
["ⱻ", "E"],
["ᴚ", "R"],
["ᴜ", "U"],
["ᴠ", "V"],
["ᴡ", "W"],
["ʏ", "Y"],
["ᴢ", "Z"],
["á", "a"],
["ă", "a"],
["ắ", "a"],
["ặ", "a"],
["ằ", "a"],
["ẳ", "a"],
["ẵ", "a"],
["ǎ", "a"],
["â", "a"],
["ấ", "a"],
["ậ", "a"],
["ầ", "a"],
["ẩ", "a"],
["ẫ", "a"],
["ä", "a"],
["ǟ", "a"],
["ȧ", "a"],
["ǡ", "a"],
["ạ", "a"],
["ȁ", "a"],
["à", "a"],
["ả", "a"],
["ȃ", "a"],
["ā", "a"],
["ą", "a"],
["ᶏ", "a"],
["ẚ", "a"],
["å", "a"],
["ǻ", "a"],
["ḁ", "a"],
["ⱥ", "a"],
["ã", "a"],
["ꜳ", "aa"],
["æ", "ae"],
["ǽ", "ae"],
["ǣ", "ae"],
["ꜵ", "ao"],
["ꜷ", "au"],
["ꜹ", "av"],
["ꜻ", "av"],
["ꜽ", "ay"],
["ḃ", "b"],
["ḅ", "b"],
["ɓ", "b"],
["ḇ", "b"],
["ᵬ", "b"],
["ᶀ", "b"],
["ƀ", "b"],
["ƃ", "b"],
["ɵ", "o"],
["ć", "c"],
["č", "c"],
["ç", "c"],
["ḉ", "c"],
["ĉ", "c"],
["ɕ", "c"],
["ċ", "c"],
["ƈ", "c"],
["ȼ", "c"],
["ď", "d"],
["ḑ", "d"],
["ḓ", "d"],
["ȡ", "d"],
["ḋ", "d"],
["ḍ", "d"],
["ɗ", "d"],
["ᶑ", "d"],
["ḏ", "d"],
["ᵭ", "d"],
["ᶁ", "d"],
["đ", "d"],
["ɖ", "d"],
["ƌ", "d"],
["ı", "i"],
["ȷ", "j"],
["ɟ", "j"],
["ʄ", "j"],
["\u01F3", "dz"],
["\u01C6", "dz"],
["é", "e"],
["ĕ", "e"],
["ě", "e"],
["ȩ", "e"],
["ḝ", "e"],
["ê", "e"],
["ế", "e"],
["ệ", "e"],
["ề", "e"],
["ể", "e"],
["ễ", "e"],
["ḙ", "e"],
["ë", "e"],
["ė", "e"],
["ẹ", "e"],
["ȅ", "e"],
["è", "e"],
["ẻ", "e"],
["ȇ", "e"],
["ē", "e"],
["ḗ", "e"],
["ḕ", "e"],
["ⱸ", "e"],
["ę", "e"],
["ᶒ", "e"],
["ɇ", "e"],
["ẽ", "e"],
["ḛ", "e"],
["ꝫ", "et"],
["ḟ", "f"],
["ƒ", "f"],
["ᵮ", "f"],
["ᶂ", "f"],
["ǵ", "g"],
["ğ", "g"],
["ǧ", "g"],
["ģ", "g"],
["ĝ", "g"],
["ġ", "g"],
["ɠ", "g"],
["ḡ", "g"],
["ᶃ", "g"],
["ǥ", "g"],
["ḫ", "h"],
["ȟ", "h"],
["ḩ", "h"],
["ĥ", "h"],
["ⱨ", "h"],
["ḧ", "h"],
["ḣ", "h"],
["ḥ", "h"],
["ɦ", "h"],
["ẖ", "h"],
["ħ", "h"],
["ƕ", "hv"],
["í", "i"],
["ĭ", "i"],
["ǐ", "i"],
["î", "i"],
["ï", "i"],
["ḯ", "i"],
["ị", "i"],
["ȉ", "i"],
["ì", "i"],
["ỉ", "i"],
["ȋ", "i"],
["ī", "i"],
["į", "i"],
["ᶖ", "i"],
["ɨ", "i"],
["ĩ", "i"],
["ḭ", "i"],
["ꝺ", "d"],
["ꝼ", "f"],
["ᵹ", "g"],
["ꞃ", "r"],
["ꞅ", "s"],
["ꞇ", "t"],
["ꝭ", "is"],
["ǰ", "j"],
["ĵ", "j"],
["ʝ", "j"],
["ɉ", "j"],
["ḱ", "k"],
["ǩ", "k"],
["ķ", "k"],
["ⱪ", "k"],
["ꝃ", "k"],
["ḳ", "k"],
["ƙ", "k"],
["ḵ", "k"],
["ᶄ", "k"],
["ꝁ", "k"],
["ꝅ", "k"],
["ĺ", "l"],
["ƚ", "l"],
["ɬ", "l"],
["ľ", "l"],
["ļ", "l"],
["ḽ", "l"],
["ȴ", "l"],
["ḷ", "l"],
["ḹ", "l"],
["ⱡ", "l"],
["ꝉ", "l"],
["ḻ", "l"],
["ŀ", "l"],
["ɫ", "l"],
["ᶅ", "l"],
["ɭ", "l"],
["ł", "l"],
["\u01C9", "lj"],
["ſ", "s"],
["ẜ", "s"],
["ẛ", "s"],
["ẝ", "s"],
["ḿ", "m"],
["ṁ", "m"],
["ṃ", "m"],
["ɱ", "m"],
["ᵯ", "m"],
["ᶆ", "m"],
["ń", "n"],
["ň", "n"],
["ņ", "n"],
["ṋ", "n"],
["ȵ", "n"],
["ṅ", "n"],
["ṇ", "n"],
["ǹ", "n"],
["ɲ", "n"],
["ṉ", "n"],
["ƞ", "n"],
["ᵰ", "n"],
["ᶇ", "n"],
["ɳ", "n"],
["ñ", "n"],
["\u01CC", "nj"],
["ó", "o"],
["ŏ", "o"],
["ǒ", "o"],
["ô", "o"],
["ố", "o"],
["ộ", "o"],
["ồ", "o"],
["ổ", "o"],
["ỗ", "o"],
["ö", "o"],
["ȫ", "o"],
["ȯ", "o"],
["ȱ", "o"],
["ọ", "o"],
["ő", "o"],
["ȍ", "o"],
["ò", "o"],
["ỏ", "o"],
["ơ", "o"],
["ớ", "o"],
["ợ", "o"],
["ờ", "o"],
["ở", "o"],
["ỡ", "o"],
["ȏ", "o"],
["ꝋ", "o"],
["ꝍ", "o"],
["ⱺ", "o"],
["ō", "o"],
["ṓ", "o"],
["ṑ", "o"],
["ǫ", "o"],
["ǭ", "o"],
["ø", "o"],
["ǿ", "o"],
["õ", "o"],
["ṍ", "o"],
["ṏ", "o"],
["ȭ", "o"],
["ƣ", "oi"],
["ꝏ", "oo"],
["ɛ", "e"],
["ᶓ", "e"],
["ɔ", "o"],
["ᶗ", "o"],
["ȣ", "ou"],
["ṕ", "p"],
["ṗ", "p"],
["ꝓ", "p"],
["ƥ", "p"],
["ᵱ", "p"],
["ᶈ", "p"],
["ꝕ", "p"],
["ᵽ", "p"],
["ꝑ", "p"],
["ꝙ", "q"],
["ʠ", "q"],
["ɋ", "q"],
["ꝗ", "q"],
["ŕ", "r"],
["ř", "r"],
["ŗ", "r"],
["ṙ", "r"],
["ṛ", "r"],
["ṝ", "r"],
["ȑ", "r"],
["ɾ", "r"],
["ᵳ", "r"],
["ȓ", "r"],
["ṟ", "r"],
["ɼ", "r"],
["ᵲ", "r"],
["ᶉ", "r"],
["ɍ", "r"],
["ɽ", "r"],
["ↄ", "c"],
["ꜿ", "c"],
["ɘ", "e"],
["ɿ", "r"],
["ś", "s"],
["ṥ", "s"],
["š", "s"],
["ṧ", "s"],
["ş", "s"],
["ŝ", "s"],
["ș", "s"],
["ṡ", "s"],
["ṣ", "s"],
["ṩ", "s"],
["ʂ", "s"],
["ᵴ", "s"],
["ᶊ", "s"],
["ȿ", "s"],
["ɡ", "g"],
["ᴑ", "o"],
["ᴓ", "o"],
["ᴝ", "u"],
["ť", "t"],
["ţ", "t"],
["ṱ", "t"],
["ț", "t"],
["ȶ", "t"],
["ẗ", "t"],
["ⱦ", "t"],
["ṫ", "t"],
["ṭ", "t"],
["ƭ", "t"],
["ṯ", "t"],
["ᵵ", "t"],
["ƫ", "t"],
["ʈ", "t"],
["ŧ", "t"],
["ᵺ", "th"],
["ɐ", "a"],
["ᴂ", "ae"],
["ǝ", "e"],
["ᵷ", "g"],
["ɥ", "h"],
["ʮ", "h"],
["ʯ", "h"],
["ᴉ", "i"],
["ʞ", "k"],
["ꞁ", "l"],
["ɯ", "m"],
["ɰ", "m"],
["ᴔ", "oe"],
["ɹ", "r"],
["ɻ", "r"],
["ɺ", "r"],
["ⱹ", "r"],
["ʇ", "t"],
["ʌ", "v"],
["ʍ", "w"],
["ʎ", "y"],
["ꜩ", "tz"],
["ú", "u"],
["ŭ", "u"],
["ǔ", "u"],
["û", "u"],
["ṷ", "u"],
["ü", "u"],
["ǘ", "u"],
["ǚ", "u"],
["ǜ", "u"],
["ǖ", "u"],
["ṳ", "u"],
["ụ", "u"],
["ű", "u"],
["ȕ", "u"],
["ù", "u"],
["ủ", "u"],
["ư", "u"],
["ứ", "u"],
["ự", "u"],
["ừ", "u"],
["ử", "u"],
["ữ", "u"],
["ȗ", "u"],
["ū", "u"],
["ṻ", "u"],
["ų", "u"],
["ᶙ", "u"],
["ů", "u"],
["ũ", "u"],
["ṹ", "u"],
["ṵ", "u"],
["ᵫ", "ue"],
["ꝸ", "um"],
["ⱴ", "v"],
["ꝟ", "v"],
["ṿ", "v"],
["ʋ", "v"],
["ᶌ", "v"],
["ⱱ", "v"],
["ṽ", "v"],
["ꝡ", "vy"],
["ẃ", "w"],
["ŵ", "w"],
["ẅ", "w"],
["ẇ", "w"],
["ẉ", "w"],
["ẁ", "w"],
["ⱳ", "w"],
["ẘ", "w"],
["ẍ", "x"],
["ẋ", "x"],
["ᶍ", "x"],
["ý", "y"],
["ŷ", "y"],
["ÿ", "y"],
["ẏ", "y"],
["ỵ", "y"],
["ỳ", "y"],
["ƴ", "y"],
["ỷ", "y"],
["ỿ", "y"],
["ȳ", "y"],
["ẙ", "y"],
["ɏ", "y"],
["ỹ", "y"],
["ź", "z"],
["ž", "z"],
["ẑ", "z"],
["ʑ", "z"],
["ⱬ", "z"],
["ż", "z"],
["ẓ", "z"],
["ȥ", "z"],
["ẕ", "z"],
["ᵶ", "z"],
["ᶎ", "z"],
["ʐ", "z"],
["ƶ", "z"],
["ɀ", "z"],
["\uFB00", "ff"],
["\uFB03", "ffi"],
["\uFB04", "ffl"],
["\uFB01", "fi"],
["\uFB02", "fl"],
["\u0133", "ij"],
["œ", "oe"],
["\uFB06", "st"],
["ₐ", "a"],
["ₑ", "e"],
["ᵢ", "i"],
["ⱼ", "j"],
["ₒ", "o"],
["ᵣ", "r"],
["ᵤ", "u"],
["ᵥ", "v"],
["ₓ", "x"],
]);
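Note that the usage example above deletes every matched character that has no entry in the map, and its pattern also matches spaces and punctuation, so those are removed as well. If that is not wanted, a variant can fall back to the original character. A minimal sketch in plain JavaScript; `sampleMap` and `latinize` are hypothetical names, and the small map only stands in for the full table:

```javascript
// Small stand-in for the full map above, so the sketch runs on its own.
const sampleMap = new Map([
  ["ä", "a"], ["ö", "o"], ["ü", "u"], ["ß", "ss"], ["Æ", "AE"]
]);

// Hypothetical helper: replaces mapped non-ASCII characters and keeps everything else.
function latinize(text, map) {
  return text.replace(/[^\x00-\x7F]/g, function (ch) {
    return map.has(ch) ? map.get(ch) : ch;
  });
}

console.log(latinize("Grüße, Ærø!", sampleMap)); // "Grusse, AErø!"
```

The regular expression is a literal, so it is compiled once rather than on every call, and ASCII input is never looked up in the map.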