Algorithm: Find the first un-repeated character in a string
What is the quickest way to find the first character that appears only once in a string?
I saw that people have posted some delightful answers below, so I'd like to offer something more in-depth.
An idiomatic solution in Ruby
We can find the first un-repeated character in a string like so:
def first_unrepeated_char(string)
  # &. returns nil instead of raising when every character repeats
  string.each_char.tally.find { |_, n| n == 1 }&.first
end
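How does Ruby accomplish this?
Reading Ruby's source
Let's break this solution down and consider the algorithm Ruby uses at each step. First, we call each_char on the string. This creates an enumerator that lets us visit the string one character at a time. This is complicated by the fact that Ruby handles Unicode characters, so each value we get from the enumerator can span a variable number of bytes. If we knew our input was ASCII or similar, we could use each_byte instead.
The each_char method is implemented as: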
rb_str_each_char(VALUE str)
{
    RETURN_SIZED_ENUMERATOR(str, 0, 0, rb_str_each_char_size);
    return rb_str_enumerate_chars(str, 0);
}
In turn, rb_str_enumerate_chars is implemented as:
rb_str_enumerate_chars(VALUE str, VALUE ary)
{
    VALUE orig = str;
    long i, len, n;
    const char *ptr;
    rb_encoding *enc;
    str = rb_str_new_frozen(str);
    ptr = RSTRING_PTR(str);
    len = RSTRING_LEN(str);
    enc = rb_enc_get(str);
    if (ENC_CODERANGE_CLEAN_P(ENC_CODERANGE(str))) {
        for (i = 0; i
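From this we can see that it calls rb_enc_mbclen (or its fast version) to get the length, in bytes, of the next character in the string so that it can advance to the next step. Because the string is iterated lazily, one character at a time, we end up making just one full pass over the input string as tally consumes the iterator.
tally is then implemented like so: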
static void
tally_up(VALUE hash, VALUE group)
{
    VALUE tally = rb_hash_aref(hash, group);
    if (NIL_P(tally)) {
        tally = INT2FIX(1);
    }
    else if (FIXNUM_P(tally) && tally
Here, tally_i uses RB_BLOCK_CALL_FUNC_ARGLIST to call tally_up repeatedly, which updates the tally hash on every iteration.
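The two steps described above (stepping through a string by each character's byte length, then incrementing a hash entry per character) can be sketched in Python. This is only an illustration, not Ruby's implementation: it assumes valid UTF-8 input, and the helper names utf8_char_len, each_char and tally are made up for this sketch.

```python
def utf8_char_len(lead_byte):
    """Byte length of a UTF-8 character, judged from its first byte."""
    if lead_byte < 0x80:
        return 1
    if lead_byte >> 5 == 0b110:
        return 2
    if lead_byte >> 4 == 0b1110:
        return 3
    if lead_byte >> 3 == 0b11110:
        return 4
    raise ValueError("not a valid UTF-8 lead byte")


def each_char(data):
    """Lazily yield one decoded character at a time from UTF-8 bytes."""
    i = 0
    while i < len(data):
        n = utf8_char_len(data[i])  # how far to advance for this character
        yield data[i:i + n].decode("utf-8")
        i += n


def tally(items):
    """Count occurrences, updating one hash entry per item consumed."""
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return counts
```

Because each_char is a generator, tally(each_char(...)) makes a single pass over the bytes and never builds an intermediate list of characters.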
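Rough time and memory analysis
The each_char method does not allocate an array to eagerly hold the string's characters, so it has a small constant memory overhead. When we tally the characters, we allocate a hash and put the tally data into it, which in the worst case can take up as much memory as the input string times some constant factor.
Time-wise, tally does a full scan of the string, and calling find to locate the first non-repeated character scans the hash again. Each of these passes is O(n) in the worst case.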
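However, tally also updates a hash on every iteration. Updating the hash for each character could itself be as slow as O(n), so the worst case for this Ruby solution is perhaps O(n^2).
However, under reasonable assumptions, updating a hash has O(1) complexity, so we can expect the amortized average case to look like O(n).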
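For comparison, the same tally-then-find shape can be sketched in Python. This is an analogue under stated assumptions, not a translation of Ruby's internals: it relies on dictionaries (and therefore collections.Counter) preserving insertion order, which holds for Python 3.7 and later, and the function name simply mirrors the Ruby one.

```python
from collections import Counter


def first_unrepeated_char(string):
    # First pass: one full scan of the string builds the tally.
    tally = Counter(string)
    # Second pass: scan the tally (at most one entry per distinct character)
    # in first-seen order and stop at the first count of exactly one.
    return next((char for char, n in tally.items() if n == 1), None)
```

As in the Ruby version, the second pass walks the hash rather than the string, so it touches each distinct character at most once.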
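My old accepted answer, in Python
You can't know that a character is un-repeated until you've processed the whole string, so my suggestion would be this: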
def first_non_repeated_character(string):
    chars = []
    repeated = []
    for character in string:
        if character in chars:
            chars.remove(character)
            repeated.append(character)
        elif character not in repeated:
            chars.append(character)
    if len(chars):
        return chars[0]
    else:
        return False
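Edit: the originally posted code was bad, but this latest snippet is Certified To Work On Ryan's Computer™.
It has to be at least O(n), because you don't know whether a character will be repeated until you've read all the characters.
So you can iterate over the characters and append each one to a list the first time you see it, while separately keeping count of how many times you've seen it (in fact, the only values that matter for the count are "0", "1" or "more than 1").
When you reach the end of the string, you just have to find the first character in the list that has a count of exactly one.
Example code in Python: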
from collections import defaultdict

def first_non_repeated_character(s):
    counts = defaultdict(int)
    l = []  # characters in order of first appearance
    for c in s:
        counts[c] += 1
        if counts[c] == 1:
            l.append(c)
    for c in l:
        if counts[c] == 1:
            return c
    return None
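The remark that only "0", "1" or "more than 1" matters can be taken literally. The sketch below is one possible variation, not the answer's own code: it keeps an insertion-ordered dict of characters seen exactly once and a set of characters seen more than once. The function name is hypothetical, and it assumes Python 3.7+ for ordered dicts.

```python
def first_non_repeated_saturating(s):
    once = {}     # characters seen exactly once, in first-seen order
    many = set()  # characters seen more than once
    for c in s:
        if c in many:
            continue          # already known to repeat; nothing to update
        if c in once:
            del once[c]       # second sighting: move from "1" to "more than 1"
            many.add(c)
        else:
            once[c] = None    # first sighting
    # The first surviving key is the first non-repeated character.
    return next(iter(once), None)
```

Each character costs a constant number of average-case hash operations, so this stays linear in the length of the string.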
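This runs in O(n).
Why not use a heap-based data structure such as a minimum priority queue? As you read each character from the string, add it to the queue with a priority based on its position in the string and the number of occurrences so far. You could modify the queue to add priorities on collision, so that a character's priority is the sum of the number of appearances of that character. At the end of the loop, the first element in the queue will be the least frequent character in the string, and if several characters have a count of 1, the first element is the first unique character that was added to the queue.
In C. This is close to a brute-force scan, with a worst case of O(n^2). For reasonably sized strings, though, it will outperform "better" algorithms because the constant factor is so small. It can also easily tell you the position of the first non-repeated character.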
char FirstNonRepeatedChar(const char * psz)
{
    for (int ii = 0; psz[ii] != 0; ++ii)
    {
        // compare against every other position, earlier ones included,
        // so a later copy of an already-seen character is not reported.
        for (int jj = 0; ; ++jj)
        {
            // if we hit the end of string, then we found a non-repeat character.
            //
            if (psz[jj] == 0)
                return psz[ii]; // this character doesn't repeat
            // if we found a repeat character, we can stop looking.
            //
            if (jj != ii && psz[ii] == psz[jj])
                break;
        }
    }
    return 0; // there were no non-repeating characters.
}
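The min-priority-queue suggestion made earlier in the thread comes without code, so here is a rough Python sketch of one way to read it. It makes a simplifying assumption: rather than adjusting priorities while reading, it counts first and then builds the heap once, because the standard heapq module has no operation for changing an entry's priority. The function name is hypothetical.

```python
import heapq


def first_non_repeated_heap(s):
    count = {}      # occurrences of each character
    first_pos = {}  # index at which each character first appeared
    for i, c in enumerate(s):
        count[c] = count.get(c, 0) + 1
        first_pos.setdefault(c, i)
    # Priority is (count, first position): the smallest entry is the least
    # frequent character, with ties broken by earliest appearance.
    heap = [(count[c], first_pos[c], c) for c in count]
    heapq.heapify(heap)
    if heap and heap[0][0] == 1:
        return heap[0][2]
    return None
```

Since only the minimum is ever inspected, the heap adds little over a plain scan of the counts; it mainly pays off if you also want the remaining characters in order of frequency.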
Edit: the C code above assumes you don't mean consecutive repeating characters.
Counter requires Python 2.7 or Python 3.1
>>> from collections import Counter
>>> def first_non_repeated_character(s):
...     counts = Counter(s)
...     for c in s:
...         if counts[c]==1:
...             return c
...     return None
...
>>> first_non_repeated_character("aaabbbcddd")
'c'
>>> first_non_repeated_character("aaaebbbcddd")
'e'
Here's another fun way to do it. Counter requires Python 2.7 or Python 3.1
>>> from collections import Counter
>>> def first_non_repeated_character(s):
...     return min((k for k,v in Counter(s).items() if v<2), key=s.index)
...
>>> first_non_repeated_character("aaabbbcddd")
'c'
>>> first_non_repeated_character("aaaebbbcddd")
'e'
use strict;
use warnings;

foreach my $word (@ARGV)
{
    my @distinct_chars;
    my %char_counts;
    my @chars = split(//, $word);
    foreach (@chars)
    {
        # Record each character the first time it is seen (a hash lookup
        # avoids the deprecated smartmatch operator), then count it.
        push @distinct_chars, $_ unless exists $char_counts{$_};
        $char_counts{$_}++;
    }
    my $first_non_repeated = "";
    foreach (@distinct_chars)
    {
        if ($char_counts{$_} == 1)
        {
            $first_non_repeated = $_;
            last;
        }
    }
    if (length($first_non_repeated))
    {
        print "For \"$word\", the first non-repeated character is '$first_non_repeated'.\n";
    }
    else
    {
        print "All characters in \"$word\" are repeated.\n";
    }
}
jmaney> perl non_repeated.pl aabccd "a huge string in which some characters repeat" abcabc
For "aabccd", the first non-repeated character is 'b'.
For "a huge string in which some characters repeat", the first non-repeated character is 'u'.
All characters in "abcabc" are repeated.
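#include <string.h>

unsigned char find_first_unique(const unsigned char *string)
{
    int chars[256];
    int i;
    memset(chars, 0, sizeof(chars));
    /* First pass: count occurrences of each byte value. */
    for (i = 0; string[i]; i++)
    {
        chars[string[i]]++;
    }
    /* Second pass: return the first byte whose count is exactly one. */
    for (i = 0; string[i]; i++)
    {
        if (chars[string[i]] == 1) return string[i];
    }
    return 0;
}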
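public static String findFirstUnique(String str)
{
    String seen = "";   // every character encountered so far
    String unique = ""; // characters encountered exactly once, in order
    foreach (char ch in str)
    {
        // Tracking seen characters stops a third occurrence from being re-added.
        if (seen.IndexOf(ch) >= 0) unique = unique.Replace(ch.ToString(), "");
        else { seen += ch; unique += ch; }
    }
    // Return an empty string when every character repeats.
    return unique.Length > 0 ? unique[0].ToString() : "";
}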
from collections import defaultdict

def first_non_repeated_character(s):
    counts = defaultdict(int)
    for c in s:
        counts[c] += 1
    for c in s:
        if counts[c] == 1:
            return c
    return None
string = "conservationist deliberately treasures analytical";
Cases[Gather @ Characters @ string, {_}, 1, 1][[1]]
{"v"}
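For readers who don't use Mathematica: Gather groups identical elements in order of first appearance, and the Cases call picks the first group that holds a single element. A rough Python equivalent, where the helper names gather and first_singleton are made up for this sketch and ordered dicts (Python 3.7+) are assumed:

```python
def gather(items):
    """Group identical items, keeping groups in order of first appearance."""
    groups = {}
    for item in items:
        groups.setdefault(item, []).append(item)
    return list(groups.values())


def first_singleton(string):
    """Return the element of the first one-element group, or None."""
    for group in gather(string):
        if len(group) == 1:
            return group[0]
    return None
```

Calling first_singleton on the same sentence used in the Mathematica example gives "v" as well.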
var string = "tooth";
var hash = {};
for (var i = 0, j = string.length; i < j; i++) {
    if (hash[string[i]] !== undefined) {
        hash[string[i]] = hash[string[i]] + 1;
    } else {
        hash[string[i]] = 1;
    }
}
for (i = 0; i < j; i++) {
    if (hash[string[i]] === 1) {
        console.info(string[i]);
        break; // stop at the first character that occurs once
    }
}
// prints "h"
C code
-----
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int count[256] = {0};
    const unsigned char *t_p;

    if (argc < 2)
        return 1;
    /* First pass: count every byte of the argument. */
    for (t_p = (const unsigned char *)argv[1]; *t_p != '\0'; ++t_p)
        count[*t_p]++;
    /* Second pass: print the first byte that occurred exactly once. */
    for (t_p = (const unsigned char *)argv[1]; *t_p != '\0'; ++t_p)
    {
        if (count[*t_p] == 1)
        {
            printf("Element is %c\n", *t_p);
            break;
        }
    }
    return 0;
}
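#include <iostream>
using namespace std;

// Note: this only compares neighbouring characters, so it finds the first
// character that is not part of a consecutive run.
char FindUniqueChar(char *a)
{
    int i = 0;
    bool repeat = false;
    while (a[i] != '\0')
    {
        if (a[i] == a[i+1])
        {
            repeat = true;
        }
        else
        {
            if (!repeat)
            {
                cout << a[i];
                return a[i];
            }
            repeat = false;
        }
        i++;
    }
    return a[i];
}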
using System;
using System.Linq;
using System.Text;

namespace SomethingDigital
{
    class FirstNonRepeatingChar
    {
        public static void Main()
        {
            String input = "geeksforgeeksandgeeksquizfor";
            char[] str = input.ToCharArray();
            bool[] b = new bool[256];
            String unique1 = "";
            String unique2 = "";
            foreach (char ch in str)
            {
                if (!unique1.Contains(ch))
                {
                    unique1 = unique1 + ch;
                    unique2 = unique2 + ch;
                }
                else
                {
                    unique2 = unique2.Replace(ch.ToString(), "");
                }
            }
            if (unique2 != "")
            {
                Console.WriteLine(unique2[0].ToString());
                Console.ReadLine();
            }
            else
            {
                Console.WriteLine("No non repeated string");
                Console.ReadLine();
            }
        }
    }
}
def first_non_repeated_character(string)
  string1 = string.split('')
  string2 = string.split('')
  string1.each do |let1|
    counter = 0
    string2.each do |let2|
      if let1 == let2
        counter += 1
      end
    end
    if counter == 1
      return let1
    end
  end
end
p first_non_repeated_character('dont doddle in the forest')
var first_non_repeated_character = function (string) {
    var string1 = string.split('');
    var string2 = string.split('');
    for (var i = 0; i < string1.length; i++) {
        var count = 0;
        for (var x = 0; x < string2.length; x++) {
            if (string1[i] == string2[x]) {
                count++;
            }
        }
        if (count == 1) {
            return string1[i];
        }
    }
};
console.log(first_non_repeated_character('dont doddle in the forest'));
console.log(first_non_repeated_character('how are you today really?'));
#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <climits>
using namespace std;
#define No_of_chars 256

//store the count and the index where the char first appear
typedef struct countarray
{
    int count;
    int index;
} countarray;

//returns the count array
countarray *getcountarray(char *str)
{
    countarray *count;
    count = new countarray[No_of_chars];
    for (int i = 0; i < No_of_chars; i++)
    {
        count[i].count = 0;
        count[i].index = -1;
    }
    for (int i = 0; *(str + i); i++)
    {
        unsigned char c = *(str + i); //unsigned so the value is a valid array index
        (count[c].count)++;
        if (count[c].count == 1) //if count==1 then update the index
            count[c].index = i;
    }
    return count;
}

char firstnonrepeatingchar(char *str)
{
    countarray *array;
    array = getcountarray(str);
    int result = INT_MAX;
    for (int i = 0; i < No_of_chars; i++)
    {
        if (array[i].count == 1 && result > array[i].index)
            result = array[i].index;
    }
    delete[] (array);
    //return '\0' when every character repeats
    return (result == INT_MAX) ? '\0' : str[result];
}

int main()
{
    char str[] = "geeksforgeeks";
    cout << "First non repeating character is " << firstnonrepeatingchar(str) << endl;
    return 0;
}
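// The function header is missing from the source; this name is a placeholder.
function firstNonRepeated(string) {
    var arr = string.split("");
    var occurences = {};
    var tmp;
    var lowestindex = string.length + 1;
    // Build a run of each character: "a" seen three times becomes "aaa".
    arr.forEach(function (c) {
        tmp = c;
        if (typeof occurences[tmp] == "undefined")
            occurences[tmp] = tmp;
        else
            occurences[tmp] += tmp;
    });
    // A run of length 1 means the character occurred exactly once.
    for (var p in occurences) {
        if (occurences[p].length == 1)
            lowestindex = Math.min(lowestindex, string.indexOf(p));
    }
    if (lowestindex > string.length)
        return null;
    return string[lowestindex];
}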
private static string FirstNoRepeatingCharacter(string aword)
{
    Dictionary<string, int> dic = new Dictionary<string, int>();
    for (int i = 0; i < aword.Length; i++)
    {
        if (!dic.ContainsKey(aword.Substring(i, 1)))
            dic.Add(aword.Substring(i, 1), 1);
        else
            dic[aword.Substring(i, 1)]++;
    }
    // Scan the word itself: Dictionary enumeration order is not guaranteed.
    for (int i = 0; i < aword.Length; i++)
    {
        if (dic[aword.Substring(i, 1)] == 1) return aword.Substring(i, 1);
    }
    return string.Empty;
}
public void firstUniqueChar(String str) {
    String unique = "";
    String repeated = "";
    str = str.toLowerCase();
    for (int i = 0; i < str.length(); i++) {
        char ch = str.charAt(i);
        if (!(repeated.contains(str.subSequence(i, i + 1)))) {
            if (unique.contains(str.subSequence(i, i + 1))) {
                // replace() is literal; replaceAll() would treat ch as a regex
                unique = unique.replace(Character.toString(ch), "");
                repeated = repeated + ch;
            } else {
                unique = unique + ch;
            }
        }
    }
    if (unique.isEmpty()) {
        System.out.println("No unique character found!");
    } else {
        System.out.println(unique.charAt(0));
    }
}
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Runs in O(N) time and uses lambdas and the stream API from Java 8
// Also, it is only three lines of code!
private static String findFirstUniqueCharacterPerformantWithLambda(String inputString) {
    // convert the input string into a list of characters
    final List<String> inputCharacters = Arrays.asList(inputString.split(""));
    // first, construct a map to count the number of occurrences of each character
    final Map<String, Long> characterCounts = inputCharacters
        .stream()
        .collect(groupingBy(s -> s, counting()));
    // then, find the first unique character by consulting the count map
    return inputCharacters
        .stream()
        .filter(s -> characterCounts.get(s) == 1)
        .findFirst()
        .orElse(null);
}
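// Requires java.util.ArrayList. Assumes ASCII input: a character code of 128
// or above would index past the end of chatArr.
public void findUnique(String string) {
    ArrayList<Character> uniqueList = new ArrayList<>();
    int[] chatArr = new int[128];
    for (int i = 0; i < string.length(); i++) {
        Character ch = string.charAt(i);
        if (chatArr[ch] != -1) {
            // first sighting: mark as seen and remember it
            chatArr[ch] = -1;
            uniqueList.add(ch);
        } else {
            // repeat: drop it from the candidates (no-op if already dropped)
            uniqueList.remove(ch);
        }
    }
    if (uniqueList.size() == 0) {
        System.out.println("No unique character found!");
    } else {
        System.out.println("First unique character is :" + uniqueList.get(0));
    }
}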
def first_unique(s):
    repeated = []
    while s:
        if s[0] not in s[1:] and s[0] not in repeated:
            return s[0]
        else:
            repeated.append(s[0])
            s = s[1:]
    return None
(first_unique('abdcab') == 'd', first_unique('aabbccdad') == None, first_unique('') == None, first_unique('a') == 'a')
public class Test4 {
    public static void main(String[] args) {
        String a = "GiniGinaProtijayi";
        firstUniqCharindex(a);
    }

    public static void firstUniqCharindex(String a) {
        int[] count = new int[256];
        for (int i = 0; i < a.length(); i++) {
            count[a.charAt(i)]++;
        }
        int index = -1;
        for (int i = 0; i < a.length(); i++) {
            if (count[a.charAt(i)] == 1) {
                index = i;
                break;
            } // if
        }
        System.out.println(index);// output => 8
        if (index >= 0) { // index stays -1 when every character repeats
            System.out.println(a.charAt(index)); //output => P
        }
    }// end1
}
def firstUniqChar(a):
    count = [0] * 256
    for i in a:
        count[ord(i)] += 1
    element = ""
    for items in a:
        if count[ord(items)] == 1:
            element = items
            break
    return element

a = "GiniGinaProtijayi"
print(firstUniqChar(a))  # output is P
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class Test2 {
    public static void main(String[] args) {
        String a = "GiniGinaProtijayi";
        // LinkedHashMap keeps the characters in order of first appearance
        Map<Character, Long> map = a.chars()
            .mapToObj(ch -> Character.valueOf((char) ch))
            .collect(
                Collectors.groupingBy(
                    Function.identity(),
                    LinkedHashMap::new,
                    Collectors.counting()));
        System.out.println("MAP => " + map);
        // {G=2, i=5, n=2, a=2, P=1, r=1, o=1, t=1, j=1, y=1}
        Character chh = map
            .entrySet()
            .stream()
            .filter(entry -> entry.getValue() == 1L)
            .map(entry -> entry.getKey())
            .findFirst()
            .get();
        System.out.println("First Non Repeating Character => " + chh);// P
    }// main
}