Printing Unicode characters with wprintf from assembly on Linux x86-64

assembly unicode

I am on Linux and just experimenting with nasm and gas. I am able to print a Unicode character with wprintf from C/C++:

#include <wchar.h>
#include <locale.h>
#include <stdio.h>
int main() 
{
  //printf("helloworld"); // can't do this AND wprintf in same program
  setlocale(LC_ALL, "");
  wprintf(L"%lc",0x307E); //prints out japanese hiragana ma ま
}

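This results in five gray question marks followed by the hiragana ま (ma). You would think there should be a ,0 after '%','l','c', but that does not work: with it, only question marks are printed. The only way I could print the hiragana ma without the question marks was to skip the format string and load printwide into rdi.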

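Again, this is for educational purposes at the moment. So basically, how do you write the format string in AT&T syntax and in Intel syntax? In C++ you just put an L in front of it. (Yes, I suppose you could write %lc out in hex, but I don't want to do that.)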

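EDIT: This works (I changed $printwide to printwide, and changed printformat: to a series of .string directives like the ones in the gcc -S listing). But why does it work? Is there a better way to write the format than using so many .string statements? And how would you do it in Intel syntax?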

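.section .data
locale:
    .string ""              # empty locale name: use the environment's locale
printformat:                # L"%lc" built byte by byte from NUL-terminated strings
    .string "%"             # '%' plus its automatic NUL (2 bytes)
    .string ""              # 1 NUL byte
    .string ""              # 1 NUL byte, 4 bytes so far
    .string "l"             # 'l' plus its automatic NUL (2 bytes)
    .string ""              # 1 NUL byte
    .string ""              # 1 NUL byte, 8 bytes so far
    .string "c"             # 'c' plus its automatic NUL (2 bytes)
    .string ""              # 1 NUL byte
    .string ""              # 1 NUL byte, 12 bytes so far
    .string ""              # the last four empty strings form
    .string ""              # the 4-byte terminating zero
    .string ""
    .string ""
printwide:
    .word 0x307E            # U+307E, hiragana ma
.section .text
.global _start
_start:
    movq    $locale,%rsi        # arg 2: pointer to the locale name
    movq    $6,%rdi             # arg 1: 6 is LC_ALL on glibc
    call    setlocale
    movq    $printformat,%rdi   # arg 1: address of the wide format string
    movq    printwide,%rsi      # arg 2: loads 8 bytes, though .word defines only 2
    movq    $0,%rax             # variadic call: no vector registers used
    call    wprintf
    movq    $2,%rdi             # exit status 2
    call    exit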

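I was surprised by the answer. I guess wide characters are 32 bits on 64-bit Linux. I found this out by reading the NASM documentation. In Intel syntax (NASM) you can generate a UTF-16 string like this: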

printformat dw __utf16__("%lc"),0

However, it only worked when I did this:

printformat dd __utf32__("%lc"),0

So the equivalent in AT&T syntax is:

.long '%','l','c',0

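I guess the gcc -S listing uses that many .string directives so that each character ends up 32 bits wide:

.string "%" is 16 bits (the percent sign plus the automatic zero terminator), followed by an empty string of 8 bits and another empty string of 8 bits.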

Which Unicode encoding do you want to use? UTF-8 or UTF-16?

I believe I want UTF-8. What is used in my C++ file?

The C code is using UTF-16 (or similar, depending on the CRTL). The assembly is using .int to create the string constant, so it is either UTF-16 or UTF-32 depending on the size the pseudo-op generates, and that size may well be affected by assembler command-line options or environment variables.

I also tried .word and .hword. In my tests it still prints question marks before and after ま.

I have edited my question.