C: fopen for everything - is it possible?
I am used to writing Windows programs, but I want to try making a cross-platform application. I have a few questions, if you don't mind:
Question 1
Is there some way to open a Unicode or ASCII file and detect its encoding automatically using bare ANSI C? MSDN says that fopen() can switch between various Unicode formats (UTF-8, UTF-16, Unicode big-endian and little-endian) if I use the "ccs=UNICODE" flag. Experimentally, I found that the switch from Unicode to ASCII does not happen, but while trying to solve this I found that Unicode text files have prefixes such as 0xFFFE, 0xFEFF or 0xFEBB:
FILE *file;
{
__int16 isUni;
file = _tfopen(filename, _T("rb"));
fread(&(isUni),1,2,file);
fclose(file);
if( isUni == (__int16)0xFFFE || isUni == (__int16)0xFEFF || isUni == (__int16)0xFEBB)
file = _tfopen(filename, _T("r,ccs=UNICODE"));
else
file = _tfopen(filename, _T("r"));
}
So, can I do something like this that is cross-platform and not so ugly?
Question 2
I can do something like this in Windows, but will it work in Linux?
file = fopen(filename, "r");
fwscanf(file,"%lf",buffer);
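If not, is there some ANSI C function that converts an ASCII string to Unicode? I want to work with Unicode strings in my program.
Question 3
I also need to output Unicode strings to the console. In Windows there is setlocale(*), but what should I do in Linux? The console there already seems to be Unicode.
Question 4
Generally speaking, I want to use Unicode in my program, but I have run into some strange problems: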
f = fopen("inc.txt","rt");
fwprintf(f,L"Текст"); // converted successfully
fclose(f);
f = fopen("inc_u8.txt","rt, ccs = UNICODE");
fprintf(f,"text"); // failed to convert
fclose(f);
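Also, are there any good books on cross-platform programming, ones that compare Windows and Linux program code? And some books on how to use Unicode, with a practical approach? I do not want to be immersed in the history of plain Unicode big-endian; I am interested in specific C/C++ libraries.
Question 1:
Yes, you can detect the byte order mark, which is the byte sequence you discovered, if your file has one. A search on Google and Stack Overflow will do the rest. As for "not so ugly": you can refactor/beautify your code, for example by writing a function that determines the BOM, calling it first, and then calling fopen or _tfopen as needed. Then you can refactor again and write your own fopen function. But it will still be ugly.
Question 2:
Yes, but the Unicode functions are not always called the same on Linux as they are on Windows. Use defines. Maybe write your own TCHAR.H.
Question 3: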
#include <locale.h>
setlocale(LC_ALL, "en_US.UTF-8"); /* locale names are system-specific; "" selects the locale from the user's environment */
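To make the suggestion under question 1 concrete, here is a minimal sketch of a BOM check in portable C. It reads the first bytes one at a time instead of as a 16-bit integer, so the result does not depend on the byte order of the machine. The names bom_encoding, detect_bom and open_skipping_bom are made up for this sketch and are not part of any standard API. It assumes the UTF-8 mark is the three bytes EF BB BF and the UTF-16 marks are FF FE (little-endian) and FE FF (big-endian). UTF-32 is ignored, so a UTF-32 little-endian file would be reported as UTF-16 little-endian.

```c
#include <stdio.h>

/* Hypothetical names for this sketch; not part of any standard API. */
typedef enum { ENC_NONE, ENC_UTF8, ENC_UTF16_LE, ENC_UTF16_BE } bom_encoding;

/* Classify the first bytes of a buffer by their byte order mark. */
static bom_encoding detect_bom(const unsigned char *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return ENC_UTF8;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return ENC_UTF16_LE;
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return ENC_UTF16_BE;
    return ENC_NONE;
}

/* Open a file for binary reading, report its BOM through *enc, and leave
   the stream positioned just after the BOM (or at the start if there is
   none). Returns NULL if the file cannot be opened or repositioned. */
static FILE *open_skipping_bom(const char *filename, bom_encoding *enc)
{
    /* BOM length in bytes, indexed by bom_encoding. */
    static const long bom_size[] = { 0, 3, 2, 2 };
    unsigned char head[3];
    size_t got;
    FILE *file = fopen(filename, "rb");

    if (file == NULL)
        return NULL;
    got = fread(head, 1, sizeof head, file);
    *enc = detect_bom(head, got);
    if (fseek(file, bom_size[*enc], SEEK_SET) != 0) {
        fclose(file);
        return NULL;
    }
    return file;
}
```

The file is opened only once and in binary mode, so decoding the bytes that follow is left to the caller. A file without a BOM comes back as ENC_NONE, which does not prove it is ASCII.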
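Edit:
Here is some insight into the Unicode functions available on Linux (wchar.h). The excerpt comes after two small helper functions that convert between narrow and wide strings: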
#include <cstdlib>   // wcstombs, mbstowcs, MB_CUR_MAX
#include <string>
#include <vector>

// Converts a wide string to a multibyte string in the current C locale.
// Despite the name, the result is plain ASCII only if the input is.
std::string Unicode2ASCII( const std::wstring& wstrStringToConvert )
{
    if( wstrStringToConvert.empty() )
        return "" ;
    // One wide character may need up to MB_CUR_MAX bytes.
    std::vector<char> chrarry_Buffer( wstrStringToConvert.length() * MB_CUR_MAX + 1 ) ;
    size_t sze_Written = wcstombs( &chrarry_Buffer[0], wstrStringToConvert.c_str(), chrarry_Buffer.size() - 1 ) ;
    if( sze_Written == (size_t)-1 )   // a character could not be converted
        return "" ;
    return std::string( &chrarry_Buffer[0], sze_Written ) ;
}

// Converts a multibyte string in the current C locale to a wide string.
std::wstring ASCII2Unicode( const std::string& strStringToConvert )
{
    if( strStringToConvert.empty() )
        return L"" ;
    // A multibyte string never yields more wide characters than it has bytes.
    std::vector<wchar_t> wchrarry_Buffer( strStringToConvert.length() + 1 ) ;
    size_t sze_Written = mbstowcs( &wchrarry_Buffer[0], strStringToConvert.c_str(), wchrarry_Buffer.size() - 1 ) ;
    if( sze_Written == (size_t)-1 )   // invalid multibyte sequence
        return L"" ;
    return std::wstring( &wchrarry_Buffer[0], sze_Written ) ;
}
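Question 2 asked for a plain C way to do this conversion, and the helpers above are C++. Here is a rough C sketch of the narrow-to-wide direction using the standard mbstowcs function. The helper name widen is made up for this sketch. The input is interpreted in the current LC_CTYPE locale, so for anything beyond ASCII the program would first have to select a suitable locale, for example with setlocale(LC_ALL, ""). Also keep in mind that wchar_t is 16 bits wide on Windows and 32 bits wide with glibc on Linux.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert a multibyte string, interpreted in the
   current LC_CTYPE locale, to a newly allocated wide string.
   Returns NULL on an invalid sequence or allocation failure.
   The caller releases the result with free(). */
static wchar_t *widen(const char *src)
{
    /* A multibyte string never yields more wide characters than it has
       bytes, so strlen(src) + 1 elements always leave room for the
       terminating L'\0' that mbstowcs writes. */
    size_t cap = strlen(src) + 1;
    wchar_t *dst = malloc(cap * sizeof *dst);
    size_t n;

    if (dst == NULL)
        return NULL;
    n = mbstowcs(dst, src, cap);
    if (n == (size_t)-1) {      /* invalid multibyte sequence */
        free(dst);
        return NULL;
    }
    return dst;
}
```

The opposite direction would use wcstombs in the same way, with a buffer of MB_CUR_MAX bytes per wide character plus one for the terminator.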
__BEGIN_NAMESPACE_STD
/* Copy SRC to DEST. */
extern wchar_t *wcscpy (wchar_t *__restrict __dest,
__const wchar_t *__restrict __src) __THROW;
/* Copy no more than N wide-characters of SRC to DEST. */
extern wchar_t *wcsncpy (wchar_t *__restrict __dest,
__const wchar_t *__restrict __src, size_t __n)
__THROW;
/* Append SRC onto DEST. */
extern wchar_t *wcscat (wchar_t *__restrict __dest,
__const wchar_t *__restrict __src) __THROW;
/* Append no more than N wide-characters of SRC onto DEST. */
extern wchar_t *wcsncat (wchar_t *__restrict __dest,
__const wchar_t *__restrict __src, size_t __n)
__THROW;
/* Compare S1 and S2. */
extern int wcscmp (__const wchar_t *__s1, __const wchar_t *__s2)
__THROW __attribute_pure__;
/* Compare N wide-characters of S1 and S2. */
extern int wcsncmp (__const wchar_t *__s1, __const wchar_t *__s2, size_t __n)
__THROW __attribute_pure__;
__END_NAMESPACE_STD
#ifdef __USE_XOPEN2K8
/* Compare S1 and S2, ignoring case. */
extern int wcscasecmp (__const wchar_t *__s1, __const wchar_t *__s2) __THROW;
/* Compare no more than N chars of S1 and S2, ignoring case. */
extern int wcsncasecmp (__const wchar_t *__s1, __const wchar_t *__s2,
size_t __n) __THROW;
/* Similar to the two functions above but take the information from
the provided locale and not the global locale. */
# include <xlocale.h>
extern int wcscasecmp_l (__const wchar_t *__s1, __const wchar_t *__s2,
__locale_t __loc) __THROW;
extern int wcsncasecmp_l (__const wchar_t *__s1, __const wchar_t *__s2,
size_t __n, __locale_t __loc) __THROW;
#endif
/* Special versions of the functions above which take the locale to
use as an additional parameter. */
extern long int wcstol_l (__const wchar_t *__restrict __nptr,
wchar_t **__restrict __endptr, int __base,
__locale_t __loc) __THROW;
extern unsigned long int wcstoul_l (__const wchar_t *__restrict __nptr,
wchar_t **__restrict __endptr,
int __base, __locale_t __loc) __THROW;
__extension__
extern long long int wcstoll_l (__const wchar_t *__restrict __nptr,
wchar_t **__restrict __endptr,
int __base, __locale_t __loc) __THROW;
__extension__
extern unsigned long long int wcstoull_l (__const wchar_t *__restrict __nptr,
wchar_t **__restrict __endptr,
int __base, __locale_t __loc)
__THROW;
extern double wcstod_l (__const wchar_t *__restrict __nptr,
wchar_t **__restrict __endptr, __locale_t __loc)
__THROW;
extern float wcstof_l (__const wchar_t *__restrict __nptr,
wchar_t **__restrict __endptr, __locale_t __loc)
__THROW;
extern long double wcstold_l (__const wchar_t *__restrict __nptr,
wchar_t **__restrict __endptr,
__locale_t __loc) __THROW;
/* Copy SRC to DEST, returning the address of the terminating L'\0' in
DEST. */
extern wchar_t *wcpcpy (wchar_t *__restrict __dest,
__const wchar_t *__restrict __src) __THROW;
/* Copy no more than N characters of SRC to DEST, returning the address of
the last character written into DEST. */
extern wchar_t *wcpncpy (wchar_t *__restrict __dest,
__const wchar_t *__restrict __src, size_t __n)
__THROW;
#endif /* use GNU */
/* Wide character I/O functions. */
#ifdef __USE_XOPEN2K8
/* Like OPEN_MEMSTREAM, but the stream is wide oriented and produces
a wide character string. */
extern __FILE *open_wmemstream (wchar_t **__bufloc, size_t *__sizeloc) __THROW;
#endif
#if defined __USE_ISOC95 || defined __USE_UNIX98
__BEGIN_NAMESPACE_STD
/* Select orientation for stream. */
extern int fwide (__FILE *__fp, int __mode) __THROW;
/* Write formatted output to STREAM.
This function is a possible cancellation point and therefore not
marked with __THROW. */
extern int fwprintf (__FILE *__restrict __stream,
__const wchar_t *__restrict __format, ...)
/* __attribute__ ((__format__ (__wprintf__, 2, 3))) */;
/* Write formatted output to stdout.
This function is a possible cancellation point and therefore not
marked with __THROW. */
extern int wprintf (__const wchar_t *__restrict __format, ...)
/* __attribute__ ((__format__ (__wprintf__, 1, 2))) */;
/* Write formatted output of at most N characters to S. */
extern int swprintf (wchar_t *__restrict __s, size_t __n,
__const wchar_t *__restrict __format, ...)
__THROW /* __attribute__ ((__format__ (__wprintf__, 3, 4))) */;
/* Write formatted output to S from argument list ARG.
This function is a possible cancellation point and therefore not
marked with __THROW. */
extern int vfwprintf (__FILE *__restrict __s,
__const wchar_t *__restrict __format,
__gnuc_va_list __arg)
/* __attribute__ ((__format__ (__wprintf__, 2, 0))) */;
/* Write formatted output to stdout from argument list ARG.
This function is a possible cancellation point and therefore not
marked with __THROW. */
extern int vwprintf (__const wchar_t *__restrict __format,
__gnuc_va_list __arg)
/* __attribute__ ((__format__ (__wprintf__, 1, 0))) */;
/* Write formatted output of at most N character to S from argument
list ARG. */
extern int vswprintf (wchar_t *__restrict __s, size_t __n,
__const wchar_t *__restrict __format,
__gnuc_va_list __arg)
__THROW /* __attribute__ ((__format__ (__wprintf__, 3, 0))) */;
/* Read formatted input from STREAM.
This function is a possible cancellation point and therefore not
marked with __THROW. */
extern int fwscanf (__FILE *__restrict __stream,
__const wchar_t *__restrict __format, ...)
/* __attribute__ ((__format__ (__wscanf__, 2, 3))) */;
/* Read formatted input from stdin.
This function is a possible cancellation point and therefore not
marked with __THROW. */
extern int wscanf (__const wchar_t *__restrict __format, ...)
/* __attribute__ ((__format__ (__wscanf__, 1, 2))) */;
/* Read formatted input from S. */
extern int swscanf (__const wchar_t *__restrict __s,
__const wchar_t *__restrict __format, ...)
__THROW /* __attribute__ ((__format__ (__wscanf__, 2, 3))) */;
# if defined __USE_ISOC99 && !defined __USE_GNU \
&& (!defined __LDBL_COMPAT || !defined __REDIRECT) \
&& (defined __STRICT_ANSI__ || defined __USE_XOPEN2K)
# ifdef __REDIRECT
/* For strict ISO C99 or POSIX compliance disallow %as, %aS and %a[
GNU extension which conflicts with valid %a followed by letter
s, S or [. */
extern int __REDIRECT (fwscanf, (__FILE *__restrict __stream,
__const wchar_t *__restrict __format, ...),
__isoc99_fwscanf)
/* __attribute__ ((__format__ (__wscanf__, 2, 3))) */;
extern int __REDIRECT (wscanf, (__const wchar_t *__restrict __format, ...),
__isoc99_wscanf)
/* __attribute__ ((__format__ (__wscanf__, 1, 2))) */;
extern int __REDIRECT_NTH (swscanf, (__const wchar_t *__restrict __s,
__const wchar_t *__restrict __format,
...), __isoc99_swscanf)
/* __attribute__ ((__format__ (__wscanf__, 2, 3))) */;
# else
extern int __isoc99_fwscanf (__FILE *__restrict __stream,
__const wchar_t *__restrict __format, ...);
extern int __isoc99_wscanf (__const wchar_t *__restrict __format, ...);
extern int __isoc99_swscanf (__const wchar_t *__restrict __s,
__const wchar_t *__restrict __format, ...)