Extracting (multiple) emails from user input text into MailAddress format (.NET)


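The MailAddress class does not provide a method for parsing a string that contains multiple emails. The MailAddressCollection class does, but it only accepts CSV and does not allow commas inside quotes. I am looking for a text processor that builds a collection of emails from user input without these restrictions.

The processor should accept comma- or semicolon-separated values in any of the following formats: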

"First Middle Last" <fml@example.com>
First Middle Last <fml@example.com>
fml@example.com
"Last, First" <fml@example.com>

There is an (old) open-source library with an EmailAddress class that can parse almost all legal forms of email addresses, plus a related class alongside it. You could start from there.

After asking a question, I found a better approach:

/// <summary>
/// Extracts email addresses in the following formats:
/// "Tom W. Smith" &lt;tsmith@contoso.com&gt;
/// "Smith, Tom" &lt;tsmith@contoso.com&gt;
/// Tom W. Smith &lt;tsmith@contoso.com&gt;
/// tsmith@contoso.com
/// Multiple emails can be separated by a comma or semicolon.
/// Watch out for <see cref="FormatException"/>s when enumerating.
/// </summary>
/// <param name="value">Collection of emails in the accepted formats.</param>
/// <returns>
/// A collection of <see cref="System.Net.Mail.MailAddress"/>es.
/// </returns>
/// <exception cref="ArgumentException">Thrown if the value is null, empty, or just whitespace.</exception>
public static IEnumerable<MailAddress> ExtractEmailAddresses(this string value)
{
    if (string.IsNullOrWhiteSpace(value)) throw new ArgumentException("The arg cannot be null, empty, or just whitespace.", "value");

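    // Normalize semicolons to commas so a single split handles both delimiters.
    // Note: this also rewrites semicolons that appear inside quoted display names.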
    value = value.Replace(';', ',');
    var emails = value.SplitWhilePreservingQuotedValues(',');
    var mailAddresses = emails.Select(email => new MailAddress(email));
    return mailAddresses;
}

/// <summary>
/// Splits the string while preserving quoted values (i.e. instances of the delimiter character inside of quotes will not be split apart).
/// Trims leading and trailing whitespace from the individual string values.
/// Does not include empty values.
/// </summary>
/// <param name="value">The string to be split.</param>
/// <param name="delimiter">The delimiter to use to split the string, e.g. ',' for CSV.</param>
/// <returns>A collection of individual strings parsed from the original value.</returns>
public static IEnumerable<string> SplitWhilePreservingQuotedValues(this string value, char delimiter)
{
    Regex csvPreservingQuotedStrings = new Regex(string.Format("(\"[^\"]*\"|[^{0}])+", delimiter));
    var values =
        csvPreservingQuotedStrings.Matches(value)
        .Cast<Match>()
        .Select(m => m.Value.Trim())
        .Where(v => !string.IsNullOrWhiteSpace(v));
    return values;
}
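
To see what the quote-preserving pattern does in isolation, here is a minimal, self-contained sketch. The class name SplitDemo and the sample input are illustrative assumptions, not part of the answer above; the pattern is the same one with ',' substituted for the delimiter placeholder.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Either a whole quoted run ("...") or any single non-comma character, repeated.
    // A comma inside quotes is consumed by the first alternative, so it never ends a match.
    private static readonly Regex Pattern = new Regex("(\"[^\"]*\"|[^,])+");

    public static string[] Split(string value)
    {
        return Pattern.Matches(value)
            .Cast<Match>()
            .Select(m => m.Value.Trim())   // drop whitespace around each entry
            .Where(v => v.Length > 0)      // drop entries that were only whitespace
            .ToArray();
    }

    public static void Main()
    {
        string input = "\"Smith, Tom\" <tsmith@contoso.com>, tsmith@contoso.com,, ";
        foreach (string part in Split(input))
        {
            Console.WriteLine("[" + part + "]");
        }
        // Prints:
        // ["Smith, Tom" <tsmith@contoso.com>]
        // [tsmith@contoso.com]
    }
}
```

The empty entry between the two trailing commas never produces a match, and the whitespace-only tail is filtered out after trimming.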

The ExtractEmailAddresses method passes the following tests:

[TestMethod]
public void ExtractEmails_SingleEmail_Matches()
{
    string value = "a@a.a";
    var expected = new List<MailAddress>
        {
            new MailAddress("a@a.a"),
        };

    var actual = value.ExtractEmailAddresses();

    CollectionAssert.AreEqual(expected, actual.ToList());
}

[TestMethod()]
public void ExtractEmails_JustEmailCSV_Matches()
{
    string value = "a@a.a, a@a.a";
    var expected = new List<MailAddress>
        {
            new MailAddress("a@a.a"),
            new MailAddress("a@a.a"),
        };

    var actual = value.ExtractEmailAddresses();

    CollectionAssert.AreEqual(expected, actual.ToList());
}

[TestMethod]
public void ExtractEmails_MultipleWordNameThenEmailSemicolonSV_Matches()
{
    string value = "a a a <a@a.a>; a a a <a@a.a>";
    var expected = new List<MailAddress>
        {
            new MailAddress("a a a <a@a.a>"),
            new MailAddress("a a a <a@a.a>"),
        };

    var actual = value.ExtractEmailAddresses();

    CollectionAssert.AreEqual(expected, actual.ToList());
}

[TestMethod]
public void ExtractEmails_JustEmailsSemicolonSV_Matches()
{
    string value = "a@a.a; a@a.a";
    var expected = new List<MailAddress>
        {
            new MailAddress("a@a.a"),
            new MailAddress("a@a.a"),
        };

    var actual = value.ExtractEmailAddresses();

    CollectionAssert.AreEqual(expected, actual.ToList());
}

[TestMethod]
public void ExtractEmails_NameInQuotesWithCommaThenEmailsCSV_Matches()
{
    string value = "\"a, a\" <a@a.a>; \"a, a\" <a@a.a>";
    var expected = new List<MailAddress>
        {
            new MailAddress("\"a, a\" <a@a.a>"),
            new MailAddress("\"a, a\" <a@a.a>"),
        };

    var actual = value.ExtractEmailAddresses();

    CollectionAssert.AreEqual(expected, actual.ToList());
}

[TestMethod]
[ExpectedException(typeof(ArgumentException))]
public void ExtractEmails_EmptyString_Throws()
{
    string value = string.Empty;

    var actual = value.ExtractEmailAddresses();
}

[TestMethod]
[ExpectedException(typeof(FormatException))]
public void ExtractEmails_NonEmailValue_ThrowsOnEnumeration()
{
    string value = "a";

    var actual = value.ExtractEmailAddresses();

    actual.ToList();
}

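In fact, MailAddressCollection does support comma-separated addresses, even when there are commas inside quotes. The catch is that the CSV list must already be encoded into the ASCII character set, i.e. Unicode addresses must be Q-encoded or B-encoded.

The base class library has no function that performs this encoding, although I have published a B-encoding implementation. I also added an email parsing function that addresses the problem in this thread.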
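
As a rough illustration of what B-encoding means here, the sketch below wraps a display name in an RFC 2047 encoded-word of the form =?charset?B?base64?=. The class and method names (EncodedWordSketch, BEncode) are hypothetical and this is not the implementation referred to above. It also ignores the RFC 2047 limit of 75 characters per encoded-word, so long names would need to be split.

```csharp
using System;
using System.Text;

public static class EncodedWordSketch
{
    // Hypothetical helper: B-encode a display name as a single RFC 2047 encoded-word.
    // Assumes UTF-8 as the charset; does not split words longer than 75 characters.
    public static string BEncode(string text)
    {
        string base64 = Convert.ToBase64String(Encoding.UTF8.GetBytes(text));
        return "=?utf-8?B?" + base64 + "?=";
    }

    public static void Main()
    {
        // "é" is the two UTF-8 bytes C3 A9, which Base64-encode to "w6k=".
        Console.WriteLine(BEncode("é") + " <e@example.com>");
        // Prints: =?utf-8?B?w6k=?= <e@example.com>
    }
}
```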

The MailAddressCollection.Add() routine supports a comma-separated list of addresses:

Dim mc As New Net.Mail.MailAddressCollection()
mc.Add("Bob <bob@bobmail.com>, mary@marymail.com, ""John Doe"" <john.doe@myemail.com>")
For Each m As Net.Mail.MailAddress In mc
    Debug.Print("{0} ({1})", m.DisplayName, m.Address)
Next

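This prints:

Bob (bob@bobmail.com)
 (mary@marymail.com)
John Doe (john.doe@myemail.com)

Although, as naasking says, MailAddressCollection does support commas inside quoted strings, when it throws a FormatException it cannot tell you exactly where in the string the error is. So this answer is also a good one: by separating the emails first, you can identify where the problem occurs, because the exception is handled on each individual MailAddress constructor call.

You missed the point. It does not support commas inside quotes: "Doe, John" is treated as two inputs separated by a comma, rather than as part of a single input.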
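
The comment about locating the failing entry can be sketched as follows. This is a minimal illustration under stated assumptions: the names LenientExtractor and Extract are hypothetical, and the pattern excludes both ',' and ';' in its character class instead of replacing semicolons first, which is a variation on the answer above rather than its exact method.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Mail;
using System.Text.RegularExpressions;

public static class LenientExtractor
{
    // Quoted runs are kept whole; outside quotes, ',' and ';' both end an entry.
    private static readonly Regex Entry = new Regex("(\"[^\"]*\"|[^,;])+");

    // Returns the entries that parsed; entries that failed are reported via 'rejected'.
    public static List<MailAddress> Extract(string value, out List<string> rejected)
    {
        var accepted = new List<MailAddress>();
        rejected = new List<string>();

        foreach (Match match in Entry.Matches(value))
        {
            string candidate = match.Value.Trim();
            if (candidate.Length == 0) continue; // whitespace between delimiters

            try
            {
                accepted.Add(new MailAddress(candidate));
            }
            catch (FormatException)
            {
                // The failing entry is known exactly, unlike with one bulk parse.
                rejected.Add(candidate);
            }
        }
        return accepted;
    }

    public static void Main()
    {
        List<string> rejected;
        var ok = Extract("\"Doe, John\" <john.doe@myemail.com>; a; mary@marymail.com", out rejected);
        Console.WriteLine("{0} accepted, rejected: {1}", ok.Count, string.Join(" | ", rejected));
    }
}
```

Collecting rejected entries instead of throwing lets the caller show the user which piece of their input needs fixing.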