C++: Converting ASCII to Unicode character codes (FreeType2)


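I'm using FreeType2 in one of my projects. To render a letter, I need to supply a double-byte Unicode character code. However, the character codes my program reads are in single-byte ASCII format. For character codes below 128 this is not a problem (the codes are identical), but the other 128 character codes don't match. For example:

'a' in ASCII is 0x61, 'a' in Unicode is 0x0061 - that's fine
'ą' in ASCII is 0xB9, 'ą' in Unicode is 0x0105 - completely different

I tried to use WinAPI functions for this, but I must be doing something wrong. Here is a sample: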

unsigned char szTest1[] = "ąółź"; //ASCII format
wchar_t* wszTest2;
int size = MultiByteToWideChar(CP_UTF8, 0, (char*)szTest1, 4, NULL, 0);
printf("size = %d\n", size);
wszTest2 = new wchar_t[size];
MultiByteToWideChar(CP_UTF8, 0, (char*)szTest1, 4, wszTest2, size);
printf("HEX: %x\n", wszTest2[0]);
delete[] wszTest2;

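I expected this to create a new wide string without a terminating NULL. However, the size variable is always 0. Any idea what I'm doing wrong? Or is there perhaps a simpler way to solve this problem?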

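Passing CP_UTF8 as the CodePage parameter of MultiByteToWideChar is wrong: UTF-8 is not the same as ASCII. You should use CP_ACP, which selects the current system code page (which is not the same as ASCII either).

The size is most likely zero because the test string is not a valid UTF-8 string.

For almost all Win32 functions you can call GetLastError() after the function fails to get a detailed error code, so calling it here would have given you more information as well.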
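To see why the test string is not valid UTF-8, it can help to check the bytes by hand. The following is a minimal, portable sketch (no Win32 involved) of a UTF-8 well-formedness check. `isWellFormedUtf8` is a name made up for this illustration, and the byte values in the usage note assume the source file was saved in Windows-1250.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper for illustration: returns true if `bytes` is
// well-formed UTF-8 (valid lead bytes, valid continuation bytes,
// no overlong forms, no surrogates, nothing above U+10FFFF).
bool isWellFormedUtf8(const std::string& bytes) {
    // Smallest code point that legitimately needs 0, 1, 2 or 3
    // continuation bytes; anything smaller is an overlong encoding.
    static const unsigned long minValue[] = {0x0, 0x80, 0x800, 0x10000};

    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;  // continuation bytes expected after `lead`
        unsigned long cp = 0;   // code point being decoded

        if (lead < 0x80)                { extra = 0; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else return false;  // 0x80..0xBF or 0xF8..0xFF cannot start a sequence

        if (n - i - 1 < extra) return false;  // sequence is truncated

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < minValue[extra]) return false;          // overlong form
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate
        if (cp > 0x10FFFF) return false;                 // out of range

        i += extra + 1;
    }
    return true;
}
```

Under that assumption "ąółź" is stored as the bytes B9 F3 B3 9F, and the check fails on the very first byte, because 0xB9 has the bit pattern of a continuation byte. The same text encoded as UTF-8 would be C4 85 C3 B3 C5 82 C5 BA, which passes.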

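The pure ASCII character set is limited to the range 0-127 (7 bits). 8-bit characters with the most significant bit set (i.e. those in the range 128-255) are not uniquely defined: their meaning depends on the code page. So your character ą (LATIN SMALL LETTER A WITH OGONEK) is represented by the value 0xB9 in a specific code page, which should be Windows-1250. In other code pages, the value 0xB9 is associated with a different character (e.g. in Windows-1252, 0xB9 is associated with the character ¹, i.e. superscript digit one).

To convert a character from a specific code page to Unicode UTF-16 using the Win32 API, you can use MultiByteToWideChar, specifying the correct code page (which is not CP_UTF8 as written in the question's code; in fact, CP_UTF8 identifies Unicode UTF-8). You may want to try specifying 1250 (ANSI Central European; Central European (Windows)) as the proper code page identifier.

If you have access to ATL in your code, you can use the convenience of ATL string conversion helper classes like CA2W, which wrap the MultiByteToWideChar() call and the memory allocation in a RAII class; e.g.: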

#include <atlconv.h> // ATL String Conversion Helpers

// 'test' is a Unicode UTF-16 string.
// Conversion is done from code-page 1250
// (ANSI Central European; Central European (Windows))
CA2W test("ąółź", 1250);
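The code-page dependence described above can be illustrated with a small portable lookup. This is only a sketch: `CodePage` and `toCodePoint` are hypothetical names, not Win32 identifiers, and the Windows-1250 table covers just the four letters from the question rather than the whole code page. In real code, MultiByteToWideChar (or another conversion library) does this for every byte.

```cpp
// Hypothetical identifiers for this sketch (these are not Win32 constants).
enum class CodePage { Windows1250, Windows1252 };

// Maps one byte of single-byte encoded text to a Unicode code point.
// Returns U+FFFD (REPLACEMENT CHARACTER) for bytes this partial
// table does not cover.
char32_t toCodePoint(unsigned char byte, CodePage page) {
    // The 7-bit ASCII range is identical in both code pages and in Unicode.
    if (byte < 0x80) return byte;

    if (page == CodePage::Windows1250) {
        switch (byte) {
            case 0xB9: return 0x0105;  // ą LATIN SMALL LETTER A WITH OGONEK
            case 0xF3: return 0x00F3;  // ó LATIN SMALL LETTER O WITH ACUTE
            case 0xB3: return 0x0142;  // ł LATIN SMALL LETTER L WITH STROKE
            case 0x9F: return 0x017A;  // ź LATIN SMALL LETTER Z WITH ACUTE
            default:   return 0xFFFD;  // not covered by this sketch
        }
    }

    // Windows-1252: bytes 0xA0..0xFF coincide with U+00A0..U+00FF.
    if (byte >= 0xA0) return byte;
    return 0xFFFD;  // 0x80..0x9F are not covered by this sketch
}
```

With this, byte 0xB9 yields U+0105 under Windows-1250 but U+00B9 under Windows-1252, while 0x61 yields U+0061 under both. The resulting code point is the kind of value FreeType expects as a character code, for example in FT_Get_Char_Index, assuming the face has a Unicode charmap selected.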

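There is no ASCII above 127! By definition! Sorry for shouting. Now that I have your attention: you need to find out what the actual encoding of your "ASCII text" is, and actually use that encoding to decode it.

Yes, the test string is definitely not valid UTF-8: 0xB9 is not a valid lead byte.

Yes, that was indeed the mistake; the wrong conversion was being used. After changing CP_UTF8 to CP_ACP, the size variable went up to 4, as suspected. The correct character codes were returned too, which is great! Thanks also for the list of differences between the encoding formats, it will come in handy. Cheers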

///////////////////////////////////////////////////////////////////////////////
//
// Modern STL-based C++ wrapper to Win32's MultiByteToWideChar() C API.
//
// (based on http://code.msdn.microsoft.com/windowsdesktop/C-UTF-8-Conversion-Helpers-22c0a664)
//
///////////////////////////////////////////////////////////////////////////////

#include <exception>    // for std::exception
#include <iostream>     // for std::cout
#include <ostream>      // for std::endl
#include <stdexcept>    // for std::runtime_error
#include <string>       // for std::string and std::wstring
#include <Windows.h>    // Win32 Platform SDK

//-----------------------------------------------------------------------------
// Define an exception class for string conversion error.
//-----------------------------------------------------------------------------
class StringConversionException
    : public std::runtime_error
{
public:
    // Creates exception with error message and error code.
    StringConversionException(const char* message, DWORD error)
        : std::runtime_error(message)
        , m_error(error)
    {}

    // Creates exception with error message and error code.
    StringConversionException(const std::string& message, DWORD error)
        : std::runtime_error(message)
        , m_error(error)
    {}

    // Windows error code.
    DWORD Error() const
    {
        return m_error;
    }

private:
    DWORD m_error;
};

//-----------------------------------------------------------------------------
// Converts an ANSI/MBCS string to Unicode UTF-16.
// Wraps MultiByteToWideChar() using modern C++ and STL.
// Throws a StringConversionException on error.
//-----------------------------------------------------------------------------
std::wstring ConvertToUTF16(const std::string & source, const UINT codePage)
{
    // Fail if an invalid input character is encountered
    static const DWORD conversionFlags = MB_ERR_INVALID_CHARS;

    // Require size for destination string
    const int utf16Length = ::MultiByteToWideChar(
        codePage,           // code page for the conversion
        conversionFlags,    // flags
        source.c_str(),     // source string
        source.length(),    // length (in chars) of source string
        NULL,               // unused - no conversion done in this step
        0                   // request size of destination buffer, in wchar_t's
    );
    if (utf16Length == 0)
    {
        const DWORD error = ::GetLastError();
        throw StringConversionException(
            "MultiByteToWideChar() failed: Can't get length of destination UTF-16 string.",
            error);
    }

    // Allocate room for destination string
    std::wstring utf16Text;
    utf16Text.resize(utf16Length);

    // Convert to Unicode UTF-16
    if ( ! ::MultiByteToWideChar(
        codePage,           // code page for conversion
        0,                  // validation was done in previous call
        source.c_str(),     // source string
        source.length(),    // length (in chars) of source string
        &utf16Text[0],      // destination buffer
        utf16Text.length()  // size of destination buffer, in wchar_t's
    ))
    {
        const DWORD error = ::GetLastError();
        throw StringConversionException(
            "MultiByteToWideChar() failed: Can't convert to UTF-16 string.",
            error);
    }

    return utf16Text;
}

//-----------------------------------------------------------------------------
// Test.
//-----------------------------------------------------------------------------
int main()
{
    // Error codes
    static const int exitOk = 0;
    static const int exitError = 1;

    try
    {
        // Test input string:
        //
        // ą - LATIN SMALL LETTER A WITH OGONEK
        std::string inText("x - LATIN SMALL LETTER A WITH OGONEK");
        inText[0] = 0xB9;

        // ANSI Central European; Central European (Windows) code page
        static const UINT codePage = 1250;

        // Convert to Unicode UTF-16
        const std::wstring utf16Text = ConvertToUTF16(inText, codePage);

        // Verify conversion.
        // ą - LATIN SMALL LETTER A WITH OGONEK
        // --> Unicode UTF-16 0x0105
        // http://www.fileformat.info/info/unicode/char/105/index.htm
        if (utf16Text[0] != 0x0105)
        {
            throw std::runtime_error("Wrong conversion.");
        }

        std::cout << "All right." << std::endl;
    }
    catch (const StringConversionException& e)
    {
        std::cerr << "*** ERROR:\n";
        std::cerr << e.what() << "\n";
        std::cerr << "Error code = " << e.Error();
        std::cerr << std::endl;
        return exitError;
    }
    catch (const std::exception& e)
    {
        std::cerr << "*** ERROR:\n";
        std::cerr << e.what();
        std::cerr << std::endl;
        return exitError;
    }

    return exitOk;
}

///////////////////////////////////////////////////////////////////////////////