Print the most frequently used words (strings) in a file, Objective-C

objective-c algorithm sorting big-o

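I am new to Objective-C and need help solving this problem:

Write a function that takes two parameters:

1. a string representing a text document, and
2. an integer giving the number of items to return.

Implement the function so that it returns a list of strings sorted by word frequency, with the most frequently occurring word first. Use your best judgment to decide how words are separated. Your solution should run in O(n) time, where n is the number of characters in the document. Implement this function as you would for a production/commercial system. You may use any standard data structures.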

What I have tried so far (work in progress):

// Work in progress. Intended method signature:
// - (NSString *)wordFrequency:(int)itemsToReturn inDocument:(NSString *)textDocument;

#import <Foundation/Foundation.h>

int main(int argc, const char *argv[]) {
    @autoreleasepool {
        // Get the desktop directory (where the text document is)
        NSURL *desktopDirectory = [[NSFileManager defaultManager] URLForDirectory:NSDesktopDirectory inDomain:NSUserDomainMask appropriateForURL:nil create:NO error:nil];

        // Create the full path to the file
        NSURL *fullPath = [desktopDirectory URLByAppendingPathComponent:@"document.txt"];

        // Load the string
        NSString *content = [NSString stringWithContentsOfURL:fullPath encoding:NSUTF8StringEncoding error:nil];
        // Optional confirmation: check that the file is there and print its content to the console
        // NSLog(@"The string is: %@", content);

        // Create an array with the words contained in the string (split on single spaces only)
        NSArray *myWords = [content componentsSeparatedByString:@" "];

        // Optional confirmation: print the content of the array to the console
        // NSLog(@"array: %@", myWords);

        // Build an NSCountedSet from the array, then produce an array of
        // {word, count} dictionaries sorted in descending order by count.
        NSCountedSet *countedSet = [[NSCountedSet alloc] initWithArray:myWords];
        NSMutableArray *dictArray = [NSMutableArray array];
        [countedSet enumerateObjectsUsingBlock:^(id obj, BOOL *stop) {
            [dictArray addObject:@{@"word": obj,
                                   @"count": @([countedSet countForObject:obj])}];
        }];

        NSLog(@"Words sorted by count: %@", [dictArray sortedArrayUsingDescriptors:@[[NSSortDescriptor sortDescriptorWithKey:@"count" ascending:NO]]]);
    }
    return 0;
}

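This is a classic job for map-reduce. I am not very familiar with Objective-C, but as far as I know these concepts are easy to implement in it.
The first map-reduce counts the number of occurrences.
This step basically groups the elements by word and then counts them:

map(text):
    for each word in text:
        emit(word, '1')
reduce(word, list<number>):
    emit(word, sum(number))
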
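An alternative to map-reduce is an iterative computation with a hash map, where the hash map serves as a histogram that counts how many times each word occurs.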
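
The hash-map histogram described above could be sketched in Objective-C roughly as follows. This is only an illustration: `WordHistogram` is a hypothetical helper name, and it assumes that a word is a maximal run of letters or digits and that case should be ignored, which is one possible reading of "use your best judgment". `NSCountedSet` plays the role of the hash map.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns a histogram of lowercased words in `text`.
// Assumption: a "word" is a maximal run of letters or digits.
static NSCountedSet *WordHistogram(NSString *text) {
    NSCharacterSet *separators = [[NSCharacterSet alphanumericCharacterSet] invertedSet];
    NSArray<NSString *> *tokens =
        [[text lowercaseString] componentsSeparatedByCharactersInSet:separators];

    NSCountedSet *histogram = [NSCountedSet set];
    for (NSString *token in tokens) {
        // Adjacent separators (for example ". ") produce empty tokens; skip them.
        if (token.length > 0) {
            [histogram addObject:token];
        }
    }
    return histogram;
}
```

Calling `WordHistogram(@"The cat and the hat. The end!")` and then `[histogram countForObject:@"the"]` gives the number of times "the" occurred, regardless of case.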
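Once you have the list of words and their occurrence counts, all that remains is to pick the top k. This is explained well in a separate thread.
Here, the "comparator" is the number of occurrences of each word, as computed in the previous step.
The basic idea is to use a min-heap and store the first k elements in it.
Now iterate over the remaining elements: if a new element is greater than the top (the smallest element in the heap), remove the top and replace it with the new element.
At the end you have a heap containing the k largest elements. Because they are already in a heap, popping them one by one yields them in sorted order (in reverse, smallest first, but that is easy to handle).
The complexity is O(n log k).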
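
A minimal sketch of the min-heap step, assuming the counts are already available in an `NSCountedSet`. Foundation has no ready-made heap class, so this example keeps a small binary heap by hand in an `NSMutableArray`; the names `SiftUp`, `SiftDown` and `TopKWithMinHeap` are hypothetical and not part of any library.

```objective-c
#import <Foundation/Foundation.h>

// Moves the element at `index` up until its parent has a count that is not larger.
static void SiftUp(NSMutableArray<NSString *> *heap, NSUInteger index, NSCountedSet *counts) {
    while (index > 0) {
        NSUInteger parent = (index - 1) / 2;
        if ([counts countForObject:heap[index]] >= [counts countForObject:heap[parent]]) return;
        [heap exchangeObjectAtIndex:index withObjectAtIndex:parent];
        index = parent;
    }
}

// Moves the element at `index` down until both children have counts that are not smaller.
static void SiftDown(NSMutableArray<NSString *> *heap, NSUInteger index, NSCountedSet *counts) {
    NSUInteger n = heap.count;
    while (YES) {
        NSUInteger smallest = index;
        NSUInteger left = 2 * index + 1, right = 2 * index + 2;
        if (left < n && [counts countForObject:heap[left]] < [counts countForObject:heap[smallest]]) smallest = left;
        if (right < n && [counts countForObject:heap[right]] < [counts countForObject:heap[smallest]]) smallest = right;
        if (smallest == index) return;
        [heap exchangeObjectAtIndex:index withObjectAtIndex:smallest];
        index = smallest;
    }
}

// Hypothetical helper: returns up to k words with the highest counts, most frequent first.
static NSArray<NSString *> *TopKWithMinHeap(NSCountedSet *counts, NSUInteger k) {
    NSMutableArray<NSString *> *heap = [NSMutableArray array];
    if (k == 0) return heap;

    for (NSString *word in counts) {              // visits each distinct word once
        if (heap.count < k) {
            [heap addObject:word];                // fill the heap with the first k words
            SiftUp(heap, heap.count - 1, counts);
        } else if ([counts countForObject:word] > [counts countForObject:heap[0]]) {
            heap[0] = word;                       // replace the current minimum
            SiftDown(heap, 0, counts);
        }
    }

    // Popping the minimum repeatedly yields ascending order, so reverse at the end.
    NSMutableArray<NSString *> *ascending = [NSMutableArray arrayWithCapacity:heap.count];
    while (heap.count > 0) {
        [ascending addObject:heap[0]];
        heap[0] = [heap lastObject];
        [heap removeLastObject];
        SiftDown(heap, 0, counts);
    }
    return [[ascending reverseObjectEnumerator] allObjects];
}
```

Here n in O(n log k) is the number of distinct words, since each of them is compared against the heap top at most once and each heap operation costs O(log k). Words with equal counts come back in no particular order relative to each other.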

To achieve O(n + k log k), you can use a selection algorithm instead of the min-heap solution to get the top k, and then sort the retrieved elements.
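
One way to read the selection-based variant is quickselect followed by a sort of only the k selected words. The sketch below is an assumption about how that might look, not a definitive implementation: `TopKWithSelection` is a hypothetical name, the pivot is simply the middle element, and the linear bound for the selection step holds on average rather than in the worst case. A three-way partition is used because word counts contain many duplicates.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: selects the k most frequent words with quickselect,
// then sorts only those k words (most frequent first).
static NSArray<NSString *> *TopKWithSelection(NSCountedSet *counts, NSUInteger k) {
    NSMutableArray<NSString *> *words = [[counts allObjects] mutableCopy];
    k = MIN(k, words.count);
    if (k == 0) return @[];

    NSInteger target = (NSInteger)k - 1;          // last index of the top-k prefix
    NSInteger lo = 0, hi = (NSInteger)words.count - 1;
    while (lo < hi) {
        // Assumption: middle element as pivot (average-case linear time).
        NSUInteger pivot = [counts countForObject:words[(NSUInteger)(lo + (hi - lo) / 2)]];

        // Three-way partition in descending order:
        // [lo, gt) > pivot, [gt, i) == pivot, (lt, hi] < pivot.
        NSInteger gt = lo, i = lo, lt = hi;
        while (i <= lt) {
            NSUInteger c = [counts countForObject:words[(NSUInteger)i]];
            if (c > pivot) {
                [words exchangeObjectAtIndex:(NSUInteger)i withObjectAtIndex:(NSUInteger)gt];
                gt++; i++;
            } else if (c < pivot) {
                [words exchangeObjectAtIndex:(NSUInteger)i withObjectAtIndex:(NSUInteger)lt];
                lt--;
            } else {
                i++;
            }
        }

        if (target < gt) hi = gt - 1;             // boundary lies among the larger counts
        else if (target > lt) lo = lt + 1;        // boundary lies among the smaller counts
        else break;                               // boundary lies in the block equal to the pivot
    }

    // The first k positions now hold the k most frequent words in arbitrary order.
    NSArray<NSString *> *top = [words subarrayWithRange:NSMakeRange(0, k)];
    return [top sortedArrayUsingComparator:^NSComparisonResult(NSString *a, NSString *b) {
        NSUInteger ca = [counts countForObject:a], cb = [counts countForObject:b];
        if (ca != cb) return ca > cb ? NSOrderedAscending : NSOrderedDescending;
        return [a compare:b];                     // alphabetical tie-break for stable output
    }];
}
```

The final sort touches only k elements, which is where the k log k term comes from; the alphabetical tie-break is an extra choice made here so that the output is deterministic.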
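What is your question? Does your algorithm work? If not, what results do you get? If your question is about improving working code, it would be a better fit for a different site.

Hi Martin, I was hoping someone could help me with this coding challenge, since I am new to Objective-C. In other discussions I have seen code for the same problem written in Ruby, but I would like to see some code examples in Objective-C so that I can understand it better and get on the right track. The algorithm works; I get the following result: 2014-04-15 02:59:51.387[6666:303] Words sorted by count: ({count = 6; word = and;}, {count = 5; word = a;}, {count = 5; word = the;}, {count = 3; word = of; but I do not know how to integrate this code into a proper function.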
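
As for integrating the attempt into a function, one possible shape is sketched below. The name `MostFrequentWords` and its C-function signature are hypothetical, loosely adapted from the commented-out prototype in the question; the sketch takes the document text as a parameter instead of reading a file, and it assumes words are separated by whitespace and newlines. It keeps the question's sort-based approach, so the ordering step costs O(m log m) for m distinct words rather than meeting the O(n) requirement strictly.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical function: returns up to `itemsToReturn` words from `textDocument`,
// most frequent first. Assumption: words are separated by whitespace or newlines.
static NSArray<NSString *> *MostFrequentWords(NSString *textDocument, NSUInteger itemsToReturn) {
    NSArray<NSString *> *tokens = [textDocument
        componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];

    // Count each non-empty token.
    NSCountedSet *countedSet = [NSCountedSet set];
    for (NSString *token in tokens) {
        if (token.length > 0) {
            [countedSet addObject:token];
        }
    }

    // Sort the distinct words by descending count, breaking ties alphabetically.
    NSArray<NSString *> *sorted = [[countedSet allObjects]
        sortedArrayUsingComparator:^NSComparisonResult(NSString *a, NSString *b) {
            NSUInteger ca = [countedSet countForObject:a], cb = [countedSet countForObject:b];
            if (ca != cb) return ca > cb ? NSOrderedAscending : NSOrderedDescending;
            return [a compare:b];
        }];

    // Never ask for more items than there are distinct words.
    NSUInteger n = MIN(itemsToReturn, sorted.count);
    return [sorted subarrayWithRange:NSMakeRange(0, n)];
}
```

The file-loading code from the question can stay in `main` and pass `content` to this function, for example `NSLog(@"%@", MostFrequentWords(content, 5));`.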