How to split on ' in Rust
This program counts the occurrences of each word in a string. Every test passes except when
words = "Joe can't tell between 'large' and large."
or
words = "First: don't laugh. Then: don't cry."
If I get rid of !c.is_alphanumeric() in the split closure, then I have to write out every special character on which the words must be split.
This is a beginner exercise on Exercism, so I would like to avoid using a regex.
use std::collections::HashMap;

pub fn word_count(words: &str) -> HashMap<String, u32> {
    let mut indexes: HashMap<String, u32> = HashMap::new();
    let to_lowercase = words.to_lowercase();
    for c in to_lowercase.split(|c: char| !c.is_alphanumeric()).filter(|&x| x!="").collect::<Vec<&str>>(){
        let entry = indexes.entry(c.to_string()).or_insert(0);
        *entry += 1;
    };
    indexes
}
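To see why the tests containing don't and can't fail, it may help to look at what the split closure actually produces. The following is a minimal sketch; `naive_split` is a hypothetical helper name used only for illustration and is not part of the question's code.

```rust
// Hypothetical helper: applies the same split rule as the question's code.
fn naive_split(text: &str) -> Vec<&str> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|s| !s.is_empty())
        .collect()
}

fn main() {
    // The apostrophe is not alphanumeric, so "don't" is cut in two.
    println!("{:?}", naive_split("don't cry")); // ["don", "t", "cry"]
}
```

The expected map contains the key "don't", but this split only ever yields "don" and "t", so the assertion on that key fails.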
Some of the tests:
fn check_word_count(s: &str, pairs: &[(&str, u32)]) {
    // The reason for the awkward code in here is to ensure that the failure
    // message for assert_eq! is as informative as possible. A simpler
    // solution would simply check the length of the map, and then
    // check for the presence and value of each key in the given pairs vector.
    let mut m: HashMap<String, u32> = word_count(s);
    for &(k, v) in pairs.iter() {
        assert_eq!((k, m.remove(&k.to_string()).unwrap_or(0)), (k, v));
    }
    // may fail with a message that clearly shows all extra pairs in the map
    assert_eq!(m.iter().collect::<Vec<(&String, &u32)>>(), vec![]);
}

fn with_apostrophes() {
    check_word_count(
        "First: don't laugh. Then: don't cry.",
        &[
            ("first", 1),
            ("don't", 2),
            ("laugh", 1),
            ("then", 1),
            ("cry", 1),
        ],
    );
}

#[test]
#[ignore]
fn with_quotations() {
    check_word_count(
        "Joe can't tell between 'large' and large.",
        &[
            ("joe", 1),
            ("can't", 1),
            ("tell", 1),
            ("between", 1),
            ("large", 2),
            ("and", 1),
        ],
    );
}
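From a rules standpoint, I guess it depends on the definition of "word". If you simply treat the single quote ' as one of the characters that does not cause a word break, then you will include
- all contractions, and
- all quoted words (as distinct word types)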
let single_quote: char = '\'';
....
split( |c: char| !c.is_alphanumeric() && c != single_quote)
This will treat 'large' as a different word from large, which may not be what you want, but again, the rules are not clear.
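If the quoted and unquoted forms should be counted as the same word, one possible approach (an assumption on my part, not something the answer proposes) is to keep the apostrophe inside words but trim it from both ends of each piece with `str::trim_matches`. A minimal sketch; the function name `word_count_trimmed` is hypothetical:

```rust
use std::collections::HashMap;

// Hypothetical variant: apostrophes are kept inside a word ("can't") but
// stripped from its edges, so 'large' and large share one key.
pub fn word_count_trimmed(words: &str) -> HashMap<String, u32> {
    let mut counts: HashMap<String, u32> = HashMap::new();
    let lowercase = words.to_lowercase();
    for piece in lowercase.split(|c: char| !c.is_alphanumeric() && c != '\'') {
        // Remove leading and trailing apostrophes only.
        let word = piece.trim_matches('\'');
        if word.is_empty() {
            continue;
        }
        *counts.entry(word.to_string()).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = word_count_trimmed("Joe can't tell between 'large' and large.");
    println!("large: {}", counts["large"]); // large: 2
    println!("can't: {}", counts["can't"]); // can't: 1
}
```

This still avoids regular expressions. Note that it would also strip a genuine trailing apostrophe, as in a plural possessive, which may or may not matter for the exercise.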
Here is my whole program:
use std::collections::HashMap;

pub fn word_count(words: &str) -> HashMap<String, u32> {
    let mut indexes: HashMap<String, u32> = HashMap::new();
    let to_lowercase = words.to_lowercase();
    let single_quote: char = '\'';
    for c in to_lowercase.split
        ( |c: char| !c.is_alphanumeric() && c != single_quote)
        .filter(|x| !x.is_empty())
        .collect::<Vec<&str>>(){
        let entry = indexes.entry(c.to_string()).or_insert(0);
        *entry += 1;
    };
    indexes
}

fn main(){
    let phrase = "Joe can't tell between 'large' and large.";
    let indices = word_count(phrase);
    println!("Phrase: {}", phrase);
    for (word,index) in indices {
        println!("word: {}, count: {}", word, index);
    }
}
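My version of this exercise does not test words containing contractions. Is this phrase in your test suite? Do you want 'large' and large to be treated as different words?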
Phrase: Joe can't tell between 'large' and large.
word: joe, count: 1
word: can't, count: 1
word: 'large', count: 1
word: and, count: 1
word: between, count: 1
word: tell, count: 1
word: large, count: 1