C++: Converting wchar_t to char - Fatal编程技术网


I would like to know whether it is safe to do this:

wchar_t wide = /* something */;
assert(wide >= 0 && wide < 256 &&);
char myChar = static_cast<char>(wide);


given that I am quite sure the wide character will fall within the ASCII range.

Why not just use a library routine?
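As one illustration of what a library routine for this job can look like, the sketch below uses std::wctomb from <cstdlib>, which converts a single wide character into the multibyte encoding of the current C locale. Picking std::wctomb is an assumption about which routine is meant, and the helper name narrow_with_wctomb is hypothetical.

```cpp
#include <climits>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper (not from the original thread): converts one wide
// character to its multibyte form in the current C locale.
// Returns an empty string when the character cannot be represented.
std::string narrow_with_wctomb(wchar_t wc) {
    char buffer[MB_LEN_MAX];           // large enough for any single character
    std::wctomb(nullptr, 0);           // reset any shift state first
    const int written = std::wctomb(buffer, wc);
    if (written < 0) {
        return std::string();          // not representable in this locale
    }
    return std::string(buffer, static_cast<std::size_t>(written));
}
```

The result is a string rather than a single char because one wide character may need several bytes in a multibyte locale; in the default "C" locale an ASCII letter comes back as one byte.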

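assert is for ensuring that something is true in debug mode, without it having any effect in a release build. It is better to use an if statement and have an alternate plan for characters that are out of range, unless the only way to get an out-of-range character is through a program bug.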

Also, depending on your character encoding, you might find a difference between the Unicode characters 0x80 through 0xff and their char version.
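The if-with-a-fallback idea above could be sketched as follows. The function name to_ascii_or and the choice to pass the fallback character as a parameter are assumptions made for illustration. The check is limited to 0..127 so that the 0x80 to 0xff range, where encodings disagree, never reaches the cast.

```cpp
// Hypothetical helper (not from the original thread): returns the wide
// character as a char when it lies in the 7-bit ASCII range, and the
// caller-supplied fallback otherwise. Unlike assert, this check is still
// active in release builds.
char to_ascii_or(wchar_t wide, char fallback) {
    if (wide >= 0 && wide < 128) {
        return static_cast<char>(wide);
    }
    return fallback;
}
```

A call such as to_ascii_or(wide, '?') then gives a defined result for every input, instead of relying on the assertion being compiled in.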

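Technically, char could have the same range as either signed char or unsigned char. For unsigned characters, your range is correct; in theory, for signed characters, your condition is wrong. In practice, very few compilers will object, and the result will be the same.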
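To make the signed-versus-unsigned point concrete, here is a minimal sketch of a range check that asks the platform what plain char can hold instead of hard-coding 256. The name fits_in_plain_char is hypothetical.

```cpp
#include <limits>

// Hypothetical helper (not from the original thread): true when the numeric
// value of wc lies within the range of plain char on this platform, whether
// char happens to be signed (typically -128..127) or unsigned (0..255).
bool fits_in_plain_char(wchar_t wc) {
    const long long value = static_cast<long long>(wc);
    return value >= std::numeric_limits<char>::min()
        && value <= std::numeric_limits<char>::max();
}
```

On a platform where char is signed, values 128 through 255 fail this test, which is exactly the case where the condition wide < 256 is too generous.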

Nitpick: the last && in the assert is a syntax error.

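Whether the assertion is appropriate depends on whether you can afford to crash once the code reaches the customer, and on what you could or should do if the assertion condition is violated but the assertion is not compiled into the code. For debug work it seems fine, but you may also want an active test for run-time checking.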

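In general, no. int(wchar_t(255)) == int(char(255)) holds, at least where plain char is unsigned, but that only means they have the same int value. They may not represent the same character.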

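You would see such a discrepancy even on the majority of Windows PCs. For instance, on Windows code page 1250, char(0xFF) is the same character as wchar_t(0x02D9) (dot above), not wchar_t(0x00FF) (small y with diaeresis).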

Note that this does not even hold for the ASCII range, because C++ does not require ASCII. On IBM systems in particular you may see that 'A' != 65.
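If code does rely on the ASCII assumption, one option is to state it explicitly so that a non-ASCII platform is noticed rather than silently mishandled. This is only a sketch: the function name is hypothetical, and checking a handful of characters is a spot check, not proof that the whole execution character set is ASCII.

```cpp
// Hypothetical check (not from the original thread): compares a few
// character literals against their ASCII codes. On an EBCDIC system these
// comparisons are false.
constexpr bool basic_charset_looks_like_ascii() {
    return 'A' == 65 && 'a' == 97 && '0' == 48 && ' ' == 32
        && L'A' == 65 && L'a' == 97;
}
```

Because the function is constexpr, it can also be used in a static_assert to stop compilation on a platform where the assumption does not hold.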

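What you are looking for is in the ANSI standard, so you can count on it. It works even when the wchar_t uses a code above 255. You almost certainly do not want to use it.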

wchar_t is an integral type, so your compiler won't complain if you actually do:

char x = (char)wc;

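But because it is an integral type, there is absolutely no reason to do this. If you accidentally read a book that recommends it, or any book based on one, then you have been completely misinformed. Characters should be of type int or better. That means you should be writing this: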

int x = getchar();

and not this:

char x = getchar(); /* <- WRONG! */

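This is absurdly wrong. It will not do what you want; it will break in subtle and serious ways, behave differently on different platforms, and you will most certainly confuse your users. If you see this, you are trying to reimplement a part of ANSI C, but it is still wrong.

What you really need is a conversion routine that turns a character string in one encoding (even if it is packed into a wchar_t array) into a character string in another encoding.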

Now go and read up on it, to learn what is wrong with iconv.
You could also convert wchar_t --> wstring --> string --> char:

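#include <string>

wchar_t wide = L'A'; // example value
std::wstring wstrValue(1, wide); // a wide string holding that one character
std::string strValue;
strValue.assign(wstrValue.begin(), wstrValue.end()); // convert wstring to string; each wchar_t is narrowed to char
char char_value = strValue[0];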

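Here is a short function I wrote a while ago to pack a wchar_t array into a char array. Characters that are not on the ANSI code page (0-127) are replaced by '?' characters, and surrogate pairs are handled correctly.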

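#include <cstddef>

size_t to_narrow(const wchar_t * src, char * dest, size_t dest_len){
  if (dest_len == 0)
    return 0; // no room even for the terminator
  size_t in = 0;  // index into src
  size_t out = 0; // index into dest
  while (src[in] != L'\0' && out < (dest_len - 1)){
    wchar_t code = src[in];
    if (code >= 0 && code < 128)
      dest[out] = char(code);
    else{
      dest[out] = '?';
      // lead surrogate (0xD800-0xDBFF): skip the next code unit, which is the trail
      if (code >= 0xD800 && code <= 0xDBFF && src[in + 1] != L'\0')
        in++;
    }
    in++;
    out++;
  }
  dest[out] = '\0';
  return out; // number of chars written, not counting the terminator
}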
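The same replace-with-'?' rule can also be written against std::wstring and std::string, which removes the need to manage buffer sizes by hand. This variant is a sketch and not the answerer's code; the name to_narrow_string is hypothetical, and it additionally checks that the unit after a lead surrogate really is a trail surrogate before skipping it.

```cpp
#include <cstddef>
#include <string>

// Hypothetical variant (not from the original thread): copies ASCII code
// units unchanged and replaces everything else with '?'. A lead surrogate
// followed by a trail surrogate is treated as one character and produces a
// single '?'.
std::string to_narrow_string(const std::wstring& src) {
    std::string out;
    out.reserve(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        const wchar_t code = src[i];
        if (code >= 0 && code < 128) {
            out.push_back(static_cast<char>(code));
            continue;
        }
        out.push_back('?');
        const bool is_lead = code >= 0xD800 && code <= 0xDBFF;
        const bool next_is_trail = i + 1 < src.size()
            && src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF;
        if (is_lead && next_is_trail) {
            ++i; // skip the trail half of the pair
        }
    }
    return out;
}
```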

Here is another way of doing it; remember to call free() on the result.

#include <cstdlib>

char* wchar_to_char(const wchar_t* pwchar)
{
    // get the number of characters in the string, not counting the terminator
    size_t length = 0;
    while (pwchar[length] != L'\0')
    {
        length++;
    }

    // allocate one char per wide character, plus one for the terminator
    char* filePathC = (char*)malloc(length + 1);
    if (filePathC == NULL)
    {
        return NULL;
    }

    for (size_t i = 0; i < length; i++)
    {
        // narrow to char; only meaningful for values that fit in a char
        filePathC[i] = (char)pwchar[i];
    }
    filePathC[length] = '\0';

    return filePathC;
}

A simple way is:

wstring your_wchar_in_ws(<your wchar>);
string your_wchar_in_str(your_wchar_in_ws.begin(), your_wchar_in_ws.end());
const char* your_wchar_in_char = your_wchar_in_str.c_str();

I have been using this method for years :)

This is for strings. I only want to convert a single character.

@Igor Zevaka, I just tested it and found that it is wrong. Have you corrected the mistake? Thanks.

I don't think the opening statement of this answer is justified. It seems to me he is asking about truncating a 16-bit value to an 8-bit value; he asked nothing about preserving semantics. Besides, the char values he is dealing with may well be char because they came from, say, cin.getline() operating on a char[].

Correct but futile; the claim about char is a very contentious statement.

What is w supposed to be? Shouldn't it be src?

It should be src. The code is not exactly the same as my original, and I missed that instance when refactoring.

'char' and 'signed char' are synonyms.

@cvanbrederode: That is not what the standard says. See §6.2.5 Types, paragraph 15.