Reserved character codes in Unicode

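Why does Unicode have several reserved character codes?
See the Unicode charts for two languages. Both languages are very old, and I don't see any chance of new characters being added to them.
Edit: then why do they waste some character codes by making them reserved?

Why don't they put the reserved character codes at the end of each language's character set?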

This has to do with how the Unicode Consortium allocates its blocks, scripts, and code points. For example, in Block=Tamil, the beginning runs like this:

$ unichars '\p{Block=Tamil}' | head -20
U+00B82 ‭ ◌ஂ  GC=Mn SC=Tamil        TAMIL SIGN ANUSVARA
U+00B83 ‭ ஃ  GC=Lo SC=Tamil        TAMIL SIGN VISARGA
U+00B85 ‭ அ  GC=Lo SC=Tamil        TAMIL LETTER A
U+00B86 ‭ ஆ  GC=Lo SC=Tamil        TAMIL LETTER AA
U+00B87 ‭ இ  GC=Lo SC=Tamil        TAMIL LETTER I
U+00B88 ‭ ஈ  GC=Lo SC=Tamil        TAMIL LETTER II
U+00B89 ‭ உ  GC=Lo SC=Tamil        TAMIL LETTER U
U+00B8A ‭ ஊ  GC=Lo SC=Tamil        TAMIL LETTER UU
U+00B8E ‭ எ  GC=Lo SC=Tamil        TAMIL LETTER E
U+00B8F ‭ ஏ  GC=Lo SC=Tamil        TAMIL LETTER EE
U+00B90 ‭ ஐ  GC=Lo SC=Tamil        TAMIL LETTER AI
U+00B92 ‭ ஒ  GC=Lo SC=Tamil        TAMIL LETTER O
U+00B93 ‭ ஓ  GC=Lo SC=Tamil        TAMIL LETTER OO
U+00B94 ‭ ஔ  GC=Lo SC=Tamil        TAMIL LETTER AU
U+00B95 ‭ க  GC=Lo SC=Tamil        TAMIL LETTER KA
U+00B99 ‭ ங  GC=Lo SC=Tamil        TAMIL LETTER NGA
U+00B9A ‭ ச  GC=Lo SC=Tamil        TAMIL LETTER CA
U+00B9C ‭ ஜ  GC=Lo SC=Tamil        TAMIL LETTER JA
U+00B9E ‭ ஞ  GC=Lo SC=Tamil        TAMIL LETTER NYA
U+00B9F ‭ ட  GC=Lo SC=Tamil        TAMIL LETTER TTA
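They tend to reserve contiguous rows of 4, 8, or 16 code points for all characters of the same "kind". Yes, that leaves gaps, but it is like a filesystem: once you allocate a sector (or a block, if you don't have separate sectors within a block) to a file, you don't hand the unused bytes to some other process, even if the file doesn't use everything in its final sector. Things always get padded out to block boundaries anyway.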
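To see the gaps for yourself without the tools used above, here is a minimal Python sketch that splits the Tamil block into runs of assigned and unassigned code points. It assumes the block spans U+0B80..U+0BFF (hard-coded, because the standard library's unicodedata module does not expose the Block property) and treats General_Category Cn as "unassigned". The helper names is_assigned and runs are made up for this illustration, and the exact runs you see depend on the Unicode version your Python build ships with.

```python
import unicodedata

# Assumption: the Tamil block spans U+0B80..U+0BFF. It is hard-coded here
# because the standard library does not expose the Block property.
TAMIL_BLOCK = range(0x0B80, 0x0C00)


def is_assigned(cp: int) -> bool:
    """True if the code point has a General_Category other than Cn."""
    return unicodedata.category(chr(cp)) != "Cn"


def runs(block: range) -> list[tuple[int, int, bool]]:
    """Split a block into maximal (first, last, assigned) runs."""
    result: list[tuple[int, int, bool]] = []
    for cp in block:
        assigned = is_assigned(cp)
        if result and result[-1][2] == assigned:
            # Same kind as the previous code point: extend the current run.
            first, _, _ = result[-1]
            result[-1] = (first, cp, assigned)
        else:
            # Kind changed (or first code point): start a new run.
            result.append((cp, cp, assigned))
    return result


if __name__ == "__main__":
    for first, last, assigned in runs(TAMIL_BLOCK):
        label = "assigned" if assigned else "gap"
        print(f"U+{first:04X}..U+{last:04X}  {last - first + 1:3d}  {label}")
```

Running it prints one line per run, so the short gaps between groups of related characters show up as "gap" rows between "assigned" rows.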

It's not as though we are at risk of running out of code points.

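Here the allocated region begins with the "signs", as the first assigned code points in the block show. A gap can mark a change from one kind of character to another. If you examine the properties of the first six code points in the block, you can see that the unassigned ones still have the right Block property: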

$ uniprops -a U+00B80 U+00B81 U+00B82 U+00B83 U+00B84 U+00B85
U+0B80 ‹U+0B80› \N{U+0B80}
    \pC \p{Cn}
    All Any InTamil C Other Cn Unassigned Zzzz Unknown
    Age=Unassigned Bidi_Class=L Bidi_Class=Left_To_Right BC=L Block=Tamil Canonical_Combining_Class=0 Canonical_Combining_Class=Not_Reordered
       CCC=NR Canonical_Combining_Class=NR Decomposition_Type=None DT=None East_Asian_Width=Neutral Grapheme_Cluster_Break=Other GCB=XX
       Grapheme_Cluster_Break=XX Hangul_Syllable_Type=NA Hangul_Syllable_Type=Not_Applicable HST=NA Joining_Group=No_Joining_Group
       JG=NoJoiningGroup Joining_Type=Non_Joining JT=U Joining_Type=U Line_Break=Unknown LB=XX Line_Break=XX Numeric_Type=None NT=None
       Numeric_Value=NaN NV=NaN Present_In=Unassigned IN=Unassigned Script=Unknown SC=Zzzz Script=Zzzz Sentence_Break=Other SB=XX
       Sentence_Break=XX Word_Break=Other WB=XX Word_Break=XX
U+0B81 ‹U+0B81› \N{U+0B81}
    \pC \p{Cn}
    All Any InTamil C Other Cn Unassigned Zzzz Unknown
    Age=Unassigned Bidi_Class=L Bidi_Class=Left_To_Right BC=L Block=Tamil Canonical_Combining_Class=0 Canonical_Combining_Class=Not_Reordered
       CCC=NR Canonical_Combining_Class=NR Decomposition_Type=None DT=None East_Asian_Width=Neutral Grapheme_Cluster_Break=Other GCB=XX
       Grapheme_Cluster_Break=XX Hangul_Syllable_Type=NA Hangul_Syllable_Type=Not_Applicable HST=NA Joining_Group=No_Joining_Group
       JG=NoJoiningGroup Joining_Type=Non_Joining JT=U Joining_Type=U Line_Break=Unknown LB=XX Line_Break=XX Numeric_Type=None NT=None
       Numeric_Value=NaN NV=NaN Present_In=Unassigned IN=Unassigned Script=Unknown SC=Zzzz Script=Zzzz Sentence_Break=Other SB=XX
       Sentence_Break=XX Word_Break=Other WB=XX Word_Break=XX
U+0B82 ‹◌ஂ› \N{TAMIL SIGN ANUSVARA}
    \w \pM \p{Mn}
    All Any Alnum Alpha Alphabetic Assigned InTamil Tamil Is_Tamil Case_Ignorable CI M Mn Gr_Ext Grapheme_Extend Graph GrExt ID_Continue IDC
       Mark Nonspacing_Mark Print Taml Word XID_Continue XIDC X_POSIX_Alnum X_POSIX_Alpha X_POSIX_Graph X_POSIX_Print X_POSIX_Word
    Age=1.1 Bidi_Class=Nonspacing_Mark BC=NSM Bidi_Class=NSM Block=Tamil Canonical_Combining_Class=0 Canonical_Combining_Class=Not_Reordered
       CCC=NR Canonical_Combining_Class=NR Decomposition_Type=None DT=None East_Asian_Width=Neutral Grapheme_Cluster_Break=EX
       Grapheme_Cluster_Break=Extend GCB=EX Hangul_Syllable_Type=NA Hangul_Syllable_Type=Not_Applicable HST=NA Joining_Group=No_Joining_Group
       JG=NoJoiningGroup Joining_Type=T Joining_Type=Transparent JT=T Line_Break=CM Line_Break=Combining_Mark LB=CM Numeric_Type=None NT=None
       Numeric_Value=NaN NV=NaN Present_In=1.1 IN=1.1 Present_In=2.0 IN=2.0 Present_In=2.1 IN=2.1 Present_In=3.0 IN=3.0 Present_In=3.1 IN=3.1
       Present_In=3.2 IN=3.2 Present_In=4.0 IN=4.0 Present_In=4.1 IN=4.1 Present_In=5.0 IN=5.0 Present_In=5.1 IN=5.1 Present_In=5.2 IN=5.2
       Present_In=6.0 IN=6.0 Script=Tamil SC=Taml Script=Taml Sentence_Break=EX Sentence_Break=Extend SB=EX Word_Break=Extend WB=Extend
U+0B83 ‹ஃ› \N{TAMIL SIGN VISARGA}
    \w \pL \p{L_} \p{Lo}
    All Any Alnum Alpha Alphabetic Assigned InTamil Tamil Is_Tamil L Lo Gr_Base Grapheme_Base Graph GrBase ID_Continue IDC ID_Start IDS Letter
       L_ Other_Letter Print Taml Word XID_Continue XIDC XID_Start XIDS X_POSIX_Alnum X_POSIX_Alpha X_POSIX_Graph X_POSIX_Print X_POSIX_Word
    Age=1.1 Bidi_Class=L Bidi_Class=Left_To_Right BC=L Block=Tamil Canonical_Combining_Class=0 Canonical_Combining_Class=Not_Reordered CCC=NR
       Canonical_Combining_Class=NR Decomposition_Type=None DT=None East_Asian_Width=Neutral Grapheme_Cluster_Break=Other GCB=XX
       Grapheme_Cluster_Break=XX Hangul_Syllable_Type=NA Hangul_Syllable_Type=Not_Applicable HST=NA Joining_Group=No_Joining_Group
       JG=NoJoiningGroup Joining_Type=Non_Joining JT=U Joining_Type=U Line_Break=AL Line_Break=Alphabetic LB=AL Numeric_Type=None NT=None
       Numeric_Value=NaN NV=NaN Present_In=1.1 IN=1.1 Present_In=2.0 IN=2.0 Present_In=2.1 IN=2.1 Present_In=3.0 IN=3.0 Present_In=3.1 IN=3.1
       Present_In=3.2 IN=3.2 Present_In=4.0 IN=4.0 Present_In=4.1 IN=4.1 Present_In=5.0 IN=5.0 Present_In=5.1 IN=5.1 Present_In=5.2 IN=5.2
       Present_In=6.0 IN=6.0 Script=Tamil SC=Taml Script=Taml Sentence_Break=LE Sentence_Break=OLetter SB=LE Word_Break=ALetter WB=LE
       Word_Break=LE
U+0B84 ‹U+0B84› \N{U+0B84}
    \pC \p{Cn}
    All Any InTamil C Other Cn Unassigned Zzzz Unknown
    Age=Unassigned Bidi_Class=L Bidi_Class=Left_To_Right BC=L Block=Tamil Canonical_Combining_Class=0 Canonical_Combining_Class=Not_Reordered
       CCC=NR Canonical_Combining_Class=NR Decomposition_Type=None DT=None East_Asian_Width=Neutral Grapheme_Cluster_Break=Other GCB=XX
       Grapheme_Cluster_Break=XX Hangul_Syllable_Type=NA Hangul_Syllable_Type=Not_Applicable HST=NA Joining_Group=No_Joining_Group
       JG=NoJoiningGroup Joining_Type=Non_Joining JT=U Joining_Type=U Line_Break=Unknown LB=XX Line_Break=XX Numeric_Type=None NT=None
       Numeric_Value=NaN NV=NaN Present_In=Unassigned IN=Unassigned Script=Unknown SC=Zzzz Script=Zzzz Sentence_Break=Other SB=XX
       Sentence_Break=XX Word_Break=Other WB=XX Word_Break=XX
U+0B85 ‹அ› \N{TAMIL LETTER A}
    \w \pL \p{L_} \p{Lo}
    All Any Alnum Alpha Alphabetic Assigned InTamil Tamil Is_Tamil L Lo Gr_Base Grapheme_Base Graph GrBase ID_Continue IDC ID_Start IDS Letter
       L_ Other_Letter Print Taml Word XID_Continue XIDC XID_Start XIDS X_POSIX_Alnum X_POSIX_Alpha X_POSIX_Graph X_POSIX_Print X_POSIX_Word
    Age=1.1 Bidi_Class=L Bidi_Class=Left_To_Right BC=L Block=Tamil Canonical_Combining_Class=0 Canonical_Combining_Class=Not_Reordered CCC=NR
       Canonical_Combining_Class=NR Decomposition_Type=None DT=None East_Asian_Width=Neutral Grapheme_Cluster_Break=Other GCB=XX
       Grapheme_Cluster_Break=XX Hangul_Syllable_Type=NA Hangul_Syllable_Type=Not_Applicable HST=NA Joining_Group=No_Joining_Group
       JG=NoJoiningGroup Joining_Type=Non_Joining JT=U Joining_Type=U Line_Break=AL Line_Break=Alphabetic LB=AL Numeric_Type=None NT=None
       Numeric_Value=NaN NV=NaN Present_In=1.1 IN=1.1 Present_In=2.0 IN=2.0 Present_In=2.1 IN=2.1 Present_In=3.0 IN=3.0 Present_In=3.1 IN=3.1
       Present_In=3.2 IN=3.2 Present_In=4.0 IN=4.0 Present_In=4.1 IN=4.1 Present_In=5.0 IN=5.0 Present_In=5.1 IN=5.1 Present_In=5.2 IN=5.2
       Present_In=6.0 IN=6.0 Script=Tamil SC=Taml Script=Taml Sentence_Break=LE Sentence_Break=OLetter SB=LE Word_Break=ALetter WB=LE
       Word_Break=LE
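If you don't have uniprops at hand, a rough equivalent for just the General_Category and the name can be sketched with Python's standard unicodedata module. This is only an approximation of the output above: unicodedata does not report the Block property, and the describe helper and the "<no name>" placeholder are made up for this illustration. Code points that are unassigned in your Python build's Unicode version should come out as GC=Cn with the placeholder name.

```python
import unicodedata


def describe(cp: int) -> str:
    """Return the code point, its General_Category, and its name (if any)."""
    ch = chr(cp)
    category = unicodedata.category(ch)
    # name() raises ValueError for nameless code points unless a default is given.
    name = unicodedata.name(ch, "<no name>")
    return f"U+{cp:04X} GC={category} {name}"


if __name__ == "__main__":
    # The same six code points passed to uniprops above.
    for cp in range(0x0B80, 0x0B86):
        print(describe(cp))
```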
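If you look at other allocated blocks, you will see the same sort of thing. It makes no sense to split a block up among unrelated things.

As I said, they are not going to run out of space, so I don't see what the problem is here.


By the way, the Unicode exploration and processing tools used above can be obtained from me individually, or as a whole suite.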

I understand the curiosity, but is there some other reason you are asking this?
Please clarify: do you mean to ask why there are unassigned slots within those blocks?
@Oded I think you misunderstood his question, because yours is a non sequitur. I'm not sure whether this is off-topic.
@MarkRansom I imagine there could be some programming reason behind it.
@Oded Sorry, the question was a bit confusing. I have made it clearer.