Rust: how to split words around the ' character

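This program counts the occurrences of each word in a string. Every test passes except when words = "Joe can't tell between 'large' and large." or words = "First: don't laugh. Then: don't cry.". If I remove !c.is_alphanumeric() from the split closure, I have to write out every special character on which words must be split.

This is a beginner exercise on Exercism, so I would like to avoid regular expressions.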

use std::collections::HashMap;

pub fn word_count(words: &str) -> HashMap<String, u32> {
    let mut indexes: HashMap<String, u32> = HashMap::new();
    let to_lowercase = words.to_lowercase();

    for c in to_lowercase
        .split(|c: char| !c.is_alphanumeric())
        .filter(|x| !x.is_empty())
    {
        let entry = indexes.entry(c.to_string()).or_insert(0);
        *entry += 1;
    }

    indexes
}
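
To see why the two apostrophe tests fail, it can help to look at the pieces this split produces. The following is a minimal sketch, not part of the exercise; the helper name tokens is hypothetical and only mirrors the split and filter steps used above.

```rust
/// Hypothetical helper: splits the same way as `word_count` above and
/// returns the non-empty, lowercased pieces in their original order.
fn tokens(words: &str) -> Vec<String> {
    words
        .to_lowercase()
        // Any character that is not alphanumeric ends a word, and that
        // includes the apostrophe inside "don't".
        .split(|c: char| !c.is_alphanumeric())
        .filter(|piece| !piece.is_empty())
        .map(|piece| piece.to_string())
        .collect()
}
```

For example, tokens("First: don't laugh.") yields ["first", "don", "t", "laugh"], so "don't" is counted as the two words "don" and "t" instead of one.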

Some of the tests:

fn check_word_count(s: &str, pairs: &[(&str, u32)]) {
    // The reason for the awkward code in here is to ensure that the failure
    // message for assert_eq! is as informative as possible. A simpler
    // solution would simply check the length of the map, and then
    // check for the presence and value of each key in the given pairs vector.
    let mut m: HashMap<String, u32> = word_count(s);
    for &(k, v) in pairs.iter() {
        assert_eq!((k, m.remove(&k.to_string()).unwrap_or(0)), (k, v));
    }
    // may fail with a message that clearly shows all extra pairs in the map
    assert_eq!(m.iter().collect::<Vec<(&String, &u32)>>(), vec![]);
}


#[test]
fn with_apostrophes() {
    check_word_count(
        "First: don't laugh. Then: don't cry.",
        &[
            ("first", 1),
            ("don't", 2),
            ("laugh", 1),
            ("then", 1),
            ("cry", 1),
        ],
    );
}

#[test]
#[ignore]
fn with_quotations() {
    check_word_count(
        "Joe can't tell between 'large' and large.",
        &[
            ("joe", 1),
            ("can't", 1),
            ("tell", 1),
            ("between", 1),
            ("large", 2),
            ("and", 1),
        ],
    );
}

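From a rules standpoint, I suppose it depends on the definition of a "word". If you simply treat the single quote ' as one of the characters that does not split words, then you will include:

all contractions, and
all quoted words (as distinct words)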

The following code prevents splitting on a single quote:

let single_quote: char = '\'';
....
split( |c: char| !c.is_alphanumeric() && c != single_quote)

This treats 'large' as a different word from large, which may not be what you want, but again, the rules are not clear.

Here is my complete program:

use std::collections::HashMap;

pub fn word_count(words: &str) -> HashMap<String, u32> {
    let mut indexes: HashMap<String, u32> = HashMap::new();
    let to_lowercase = words.to_lowercase();
    let single_quote: char = '\'';

    for c in to_lowercase
        .split(|c: char| !c.is_alphanumeric() && c != single_quote)
        .filter(|x| !x.is_empty())
    {
        let entry = indexes.entry(c.to_string()).or_insert(0);
        *entry += 1;
    }

    indexes
}

fn main() {
    let phrase = "Joe can't tell between 'large' and large.";
    let counts = word_count(phrase);
    println!("Phrase: {}", phrase);
    for (word, count) in counts {
        println!("word: {}, count: {}", word, count);
    }
}

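My version of this exercise has no test for words containing contractions. Is this phrase in your test suite? Do you want 'large' and large to be counted as different words?

Output (HashMap iteration order is not fixed, so the order of the lines can vary):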

Phrase: Joe can't tell between 'large' and large.
word: joe, count: 1
word: can't, count: 1
word: 'large', count: 1
word: and, count: 1
word: between, count: 1
word: tell, count: 1
word: large, count: 1
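
If quoted words should count as the plain word (the with_quotations test in the question expects ("large", 2)), one option is to keep the apostrophe during the split and then strip apostrophes from both ends of each piece. The following is a sketch under that assumption, not the exercise's reference solution; the function name word_count_trimmed is hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical variant of `word_count`: apostrophes inside a word are kept
/// ("don't"), apostrophes at either end of a piece are dropped ('large').
pub fn word_count_trimmed(words: &str) -> HashMap<String, u32> {
    let mut counts: HashMap<String, u32> = HashMap::new();
    let lowercase = words.to_lowercase();

    for word in lowercase
        // Split on everything that is neither alphanumeric nor an apostrophe.
        .split(|c: char| !c.is_alphanumeric() && c != '\'')
        // Strip leading and trailing apostrophes only; inner ones survive.
        .map(|piece| piece.trim_matches('\''))
        // Drop empty pieces, including pieces that were only apostrophes.
        .filter(|piece| !piece.is_empty())
    {
        *counts.entry(word.to_string()).or_insert(0) += 1;
    }

    counts
}
```

With this variant, "Joe can't tell between 'large' and large." gives can't a count of 1 and large a count of 2. The trade-off is that a trailing possessive apostrophe, as in dogs', is trimmed as well, so whether this is acceptable again depends on how a "word" is defined.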