How do I decode UTF-8 in Visual Basic 6?


I have a problem: characters with ANSI codes 127 and above are not being decoded correctly, and I cannot tell why.


For example, Ä is decoded as Ã, and I don't know why.

Here is what I did. Use MultiByteToWideChar, as Comintern said:

Private Const CP_UTF8 As Long = 65001 ' UTF-8 Code Page

'Sys call to convert multiple byte chars to a char
Private Declare Function MultiByteToWideChar Lib "KERNEL32" ( _
    ByVal CodePage As Long, _
    ByVal dwFlags As Long, _
    ByVal lpMultiByteStr As Long, _
    ByVal cchMultiByte As Long, _
    ByVal lpWideCharStr As Long, _
    ByVal cchWideChar As Long) As Long
Note that I specified Windows code page 65001 (CP_UTF8), which tells the call to interpret the input bytes as UTF-8.
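To illustrate why the code page argument matters, here is a minimal sketch that decodes the same two bytes once as UTF-8 and once with the system ANSI code page. It is meant to live in its own standard module; `BytesToString` and `CompareCodePages` are hypothetical names, and the second result assumes the ANSI code page is Windows-1252. This is one common way the Ä to Ã symptom arises, not necessarily the cause in the question.

```vb
Option Explicit

Private Const CP_ACP As Long = 0            ' the system's ANSI code page
Private Const CP_UTF8 As Long = 65001       ' UTF-8

Private Declare Function MultiByteToWideChar Lib "kernel32" ( _
    ByVal CodePage As Long, _
    ByVal dwFlags As Long, _
    ByVal lpMultiByteStr As Long, _
    ByVal cchMultiByte As Long, _
    ByVal lpWideCharStr As Long, _
    ByVal cchWideChar As Long) As Long

' Hypothetical helper: decodes Bytes using the given code page.
Private Function BytesToString(ByRef Bytes() As Byte, ByVal CodePage As Long) As String
    Dim lSize As Long
    Dim lChars As Long

    lSize = UBound(Bytes) - LBound(Bytes) + 1
    lChars = MultiByteToWideChar(CodePage, 0, VarPtr(Bytes(LBound(Bytes))), lSize, 0, 0)
    If lChars = 0 Then Exit Function        ' conversion failed or nothing to convert
    BytesToString = String$(lChars, 0)
    MultiByteToWideChar CodePage, 0, VarPtr(Bytes(LBound(Bytes))), lSize, _
                        StrPtr(BytesToString), lChars
End Function

Public Sub CompareCodePages()
    Dim b(0 To 1) As Byte
    b(0) = &HC3                             ' C3 84 is the UTF-8 encoding of U+00C4 (Ä)
    b(1) = &H84

    ' Decoded as UTF-8, the two bytes form a single character, U+00C4.
    Debug.Print Hex$(AscW(BytesToString(b, CP_UTF8)))
    ' Decoded as ANSI (assuming Windows-1252), the first byte alone becomes U+00C3 (Ã).
    Debug.Print Hex$(AscW(BytesToString(b, CP_ACP)))
End Sub
```

If the second line printed in the Immediate window is C3 on your machine, any code path that treats the UTF-8 bytes as ANSI text would show Ã where Ä was expected.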

Next is my decoding function. I call it DecodeURI:

'------------------------------------------------------------------
' NAME:         DecodeURI (PUBLIC)
' DESCRIPTION:  Decodes a percent-encoded (URL-encoded) UTF-8 string
' CALLED BY:    HandleNavigate
' PARAMETERS:
'  EncodedURI (I,REQ) - the UTF-8 encoded string to decode
' RETURNS:      the decoded string
'------------------------------------------------------------------
Private Function DecodeURI(ByVal EncodedURI As String) As String
    Dim bANSI() As Byte
    Dim bUTF8() As Byte
    Dim lIndex As Long
    Dim lUTFIndex As Long

    If Len(EncodedURI) = 0 Then
        Exit Function
    End If

    EncodedURI = Replace$(EncodedURI, "+", " ")         ' In case encoding isn't used.
    bANSI = StrConv(EncodedURI, vbFromUnicode)          ' Convert from unicode text to ANSI values
    ReDim bUTF8(UBound(bANSI))                          ' Declare dynamic array, get length
    For lIndex = 0 To UBound(bANSI)                     ' from 0 to length of ANSI
        If bANSI(lIndex) = &H25 Then                    ' If we have ASCII 37, %, then
            bUTF8(lUTFIndex) = Val("&H" & Mid$(EncodedURI, lIndex + 2, 2)) ' convert hex to ANSI
            lIndex = lIndex + 2                         ' this character was encoded into two bytes
        Else
            bUTF8(lUTFIndex) = bANSI(lIndex)            ' otherwise don't need to do anything special
        End If
        lUTFIndex = lUTFIndex + 1                       ' advance utf index
    Next
    DecodeURI = FromUTF8(bUTF8, lUTFIndex)              ' convert to string
End Function
And the conversion from UTF-8, using the system call:

'------------------------------------------------------------------
' NAME:         FromUTF8 (Private)
' DESCRIPTION:  Use the system call MultiByteToWideChar to
'               convert UTF-8 bytes (including multi-byte
'               sequences) and return the whole string
' CALLED BY:    DecodeURI
' PARAMETERS:
'  UTF8 (I,REQ)   - the UTF-8 encoded bytes to convert
'  Length (I,REQ) - number of bytes in UTF8 to convert
' RETURNS:      the decoded string
'------------------------------------------------------------------
Private Function FromUTF8(ByRef UTF8() As Byte, ByVal Length As Long) As String
    Dim lDataLength As Long

    lDataLength = MultiByteToWideChar(CP_UTF8, 0, VarPtr(UTF8(0)), Length, 0, 0)  ' Get the length of the data.
    FromUTF8 = String$(lDataLength, 0)                                         ' Allocate a string buffer big enough
    MultiByteToWideChar CP_UTF8, 0, VarPtr(UTF8(0)), _
                        Length, StrPtr(FromUTF8), lDataLength                  ' Convert into the buffer
End Function
Hope that helps! I tested it with your character and it seems to work (as it should for all characters).
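A quick way to check the decoder is to feed it a known percent-encoded sequence and inspect the resulting code point. The sketch below assumes it is pasted into the same module as DecodeURI and FromUTF8 (both are Private); `TestDecodeURI` is a hypothetical name.

```vb
' Usage sketch: must live in the same module as DecodeURI and FromUTF8.
Public Sub TestDecodeURI()
    Dim sResult As String

    sResult = DecodeURI("%C3%84")           ' C3 84 is the UTF-8 encoding of U+00C4 (Ä)
    Debug.Print Len(sResult)                ' 1: the two bytes decode to one character
    Debug.Print Hex$(AscW(sResult))         ' C4

    sResult = DecodeURI("%E2%82%AC")        ' E2 82 AC is the UTF-8 encoding of U+20AC (euro sign)
    Debug.Print Hex$(AscW(sResult))         ' 20AC
End Sub
```

Checking with AscW rather than printing the character avoids depending on the font or code page of the Immediate window.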

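Your post gives me the impression that you don't know the difference between UTF-8 and other encodings. ANSI is not an encoding. That may be related to the error you are seeing.

Convert it to UTF-16.

Yellowantphil: I did not say ANSI is an encoding. I pointed out that ANSI characters (characters of the ANSI character set, e.g. a character set) 127 are not decoded correctly (from UTF-8, an encoding). Comintern: thanks, I am looking into that now, and I appreciate that your answer was meant to help.

ANSI is not a character set either.

An explanation is needed.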
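' Encodes a VB6 (UTF-16) string as UTF-8 and returns one character per byte.
' Chr$ maps each byte through the ANSI code page; surrogate pairs are not combined.
Public Function UTF8ENCODE(ByVal sStr As String) As String
    Dim L As Long
    Dim lChar As Long
    Dim sUtf8 As String

    For L = 1 To Len(sStr)
        lChar = AscW(Mid$(sStr, L, 1)) And &HFFFF&      ' AscW is negative for U+8000 and above

        If lChar < 128 Then
            ' 1 byte: 0xxxxxxx
            sUtf8 = sUtf8 & Chr$(lChar)
        ElseIf lChar < 2048 Then
            ' 2 bytes: 110xxxxx 10xxxxxx
            sUtf8 = sUtf8 & Chr$((lChar \ 64) Or 192)
            sUtf8 = sUtf8 & Chr$((lChar And 63) Or 128)
        Else
            ' 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            sUtf8 = sUtf8 & Chr$((lChar \ 4096) Or 224)
            sUtf8 = sUtf8 & Chr$(((lChar \ 64) And 63) Or 128)
            sUtf8 = sUtf8 & Chr$((lChar And 63) Or 128)
        End If
    Next L

    UTF8ENCODE = sUtf8
End Function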
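To see what the encoder produces for a two-byte character, the following sketch dumps the result as hex. It assumes UTF8ENCODE is in scope and that the system ANSI code page is Windows-1252, since the function returns one ANSI character per byte; `TestUtf8Encode` is a hypothetical name.

```vb
' Usage sketch: prints the bytes produced for U+00C4 (Ä).
Public Sub TestUtf8Encode()
    Dim sEncoded As String
    Dim b() As Byte
    Dim i As Long

    sEncoded = UTF8ENCODE(ChrW$(&HC4))      ' ChrW$ avoids depending on the source file's code page
    b = StrConv(sEncoded, vbFromUnicode)    ' map the ANSI characters back to raw bytes

    For i = 0 To UBound(b)
        Debug.Print Hex$(b(i)); " ";        ' C3 84 expected on a Windows-1252 system
    Next i
    Debug.Print
End Sub
```

Because the bytes travel through the ANSI code page twice (Chr$ on the way out, StrConv on the way back), a byte array may be a safer return type on systems that use a multi-byte ANSI code page.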