Python: removing the u from a list

python google-app-engine unicode

I have read about removing the character "u" from a list, but I am using Google App Engine and it does not seem to work.

def get(self):
    players = db.GqlQuery("SELECT * FROM Player")
    print players
    playerInfo  = {}

    test = []

    for player in players:
        email =  player.email
        gem =  str(player.gem)
        a = "{email:"+email + ",gem:" +gem +"}"

        test.append(a)


    ast.literal_eval(json.dumps(test))
    print test

Final output:

[u'{email:test@gmail.com,gem:0}', u'{email:test,gem:0}', u'{email:test,gem:0}', u'{email:test,gem:0}', u'{email:test,gem:0}', u'{email:test1,gem:0}']

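The "u" is part of the external representation of the string. It means this is a Unicode string rather than a byte string. It is not in the string; it is part of the type.

As an example, you can create a new Unicode string literal using the same syntax: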

>>> sandwich = u"smörgås"
>>> sandwich
u'sm\xf6rg\xe5s'

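This creates a new Unicode string whose value is the Swedish word for "sandwich". You can see that the non-English characters are represented by their Unicode code points: ö is \xf6 and å is \xe5. The "u" prefix appears just as in your example, to signify that the string holds Unicode text.

To get rid of it, you need to encode the Unicode string into some byte-oriented representation, such as UTF-8. You can do that with, for example: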

>>> sandwich.encode("utf-8")
'sm\xc3\xb6rg\xc3\xa5s'

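Here we get a new string without the "u" prefix, since this is a byte string. It contains the bytes that represent the characters of the Unicode string. Each Swedish character takes two bytes, because UTF-8 encodes every non-ASCII character as more than one byte.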

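You are not "removing the character u from a list"; you are encoding Unicode strings. In fact, the strings you have are perfectly fine for most uses. You just need to encode them appropriately before outputting them.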
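To make the encode-before-output step concrete, here is a minimal sketch. It is written to run on both Python 2 and Python 3, and the variable names are only illustrative. It shows that the text and its UTF-8 bytes have different lengths and that decoding restores the original text.

```python
# The same Swedish word as above, written with escapes so the source stays ASCII.
sandwich = u"sm\xf6rg\xe5s"          # Unicode text: 7 characters
encoded = sandwich.encode("utf-8")   # byte string for output: 9 bytes
decoded = encoded.decode("utf-8")    # back to Unicode text

print(len(sandwich))         # number of characters
print(len(encoded))          # number of bytes
print(decoded == sandwich)   # encoding and decoding round-trips
```

Keeping the text as Unicode inside the program and encoding only at the output boundary means the length and slicing operations keep working on characters rather than bytes.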

The u means the strings are Unicode. Convert all the strings to ASCII to get rid of it:

a.encode('ascii', 'ignore')

u'AB' is just the text representation of the corresponding Unicode string. Here are several ways to create exactly the same Unicode string:

L = [u'AB', u'\x41\x42', u'\u0041\u0042', unichr(65) + unichr(66)]
print u", ".join(L)

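This prints AB, AB, AB, AB.

There is no u'' in memory. It is just the way a unicode object is represented in Python 2 (and how you would write a Unicode string literal in Python source code). By default, print L is equivalent to print "[%s]" % ", ".join(map(repr, L)), that is, repr() is called for each list item:

print L
print "[%s]" % ", ".join(map(repr, L))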

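Both lines print [u'AB', u'AB', u'AB', u'AB']. If you are working in a REPL, a customizable sys.displayhook is used, which by default calls repr() on each object: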

>>> L = [u'AB', u'\x41\x42', u'\u0041\u0042', unichr(65) + unichr(66)]
>>> L
[u'AB', u'AB', u'AB', u'AB']
>>> ", ".join(L)
u'AB, AB, AB, AB'
>>> print ", ".join(L)
AB, AB, AB, AB
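The difference between the repr of a list and the text of its items can be checked with a short sketch. It is written to run on both Python 2 and Python 3; on Python 3 the repr simply has no u prefix, so the comparison below holds on either version. The names items, as_repr and as_text are only illustrative.

```python
items = [u"AB", u"\x41\x42", u"\u0041\u0042"]

# What printing the list shows: repr() of every item, wrapped in brackets.
as_repr = "[%s]" % ", ".join(map(repr, items))

# What you usually want to show a user: the text of every item.
as_text = u", ".join(items)

print(as_repr == repr(items))  # the list's repr is built from the items' reprs
print(as_text)
```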

Do not encode to bytes. Print Unicode directly.

In your specific case, I would create a Python list and use json.dumps() to serialize it, instead of using string formatting to create the JSON text:

#!/usr/bin/env python2
import json
# ...
test = [dict(email=player.email, gem=player.gem)
        for player in players]
print test
print json.dumps(test)
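A self-contained variant of the same idea is sketched below. Player here is a hypothetical namedtuple standing in for the datastore entity from the question, and sort_keys=True is added only to make the output order predictable. The sketch runs on both Python 2 and Python 3.

```python
import json
from collections import namedtuple

# Hypothetical stand-in for the datastore Player entity (an assumption).
Player = namedtuple("Player", ["email", "gem"])
players = [Player(u"test@gmail.com", 0), Player(u"test1", 0)]

# Build plain Python data first, then let json.dumps() produce the text.
records = [{"email": p.email, "gem": p.gem} for p in players]
payload = json.dumps(records, sort_keys=True)

print(payload)
```

The resulting JSON text has quoted keys and values and contains no u prefix, because the prefix belongs to Python's repr and not to JSON.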

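[u'{email:test@gmail.com,gem:0}', u'{email:test,gem:0}', u'{email:test,gem:0}', u'{email:test,gem:0}', u'{email:test,gem:0}', u'{email:test1,gem:0}']

The "u" marks a Unicode string. You can remove it by applying the map function to the final list:

map(str, test)

This converts every element to a byte string (str), so the u prefix, which marks the unicode type, no longer appears. It only works while every character is ASCII; in Python 2, str() raises UnicodeEncodeError for anything else.

Another way is to convert each item when you append it to the list:

test.append(str(a))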

For a list of values, such as an index taken from a dataset, use the built-in map() function.

Input, as a list of values:

index = [u'CARBO1004', u'CARBO1006', u'CARBO1008', u'CARBO1009', u'CARBO1020']

encoded_string = map(str, index)

Output:

['CARBO1004', 'CARBO1006', 'CARBO1008', 'CARBO1009', 'CARBO1020']

For a single string input:

index = u'CARBO1004'
# Use either one of the following encodings.
index.encode("utf-8")  # encode to UTF-8 bytes
index.encode('ascii', 'ignore')  # encode to ASCII, silently dropping non-ASCII characters

Output:

'CARBO1004'
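The two encodings behave the same for pure ASCII text but differ as soon as a non-ASCII character appears. The sketch below, with illustrative names and written to run on both Python 2 and Python 3, shows that UTF-8 keeps every character while 'ascii' with 'ignore' silently drops the ones it cannot represent.

```python
names = [u"CARBO1004", u"sm\xf6rg\xe5s"]

# UTF-8 can represent every character, so nothing is lost.
utf8_names = [n.encode("utf-8") for n in names]

# ASCII with 'ignore' drops every character outside the ASCII range.
ascii_names = [n.encode("ascii", "ignore") for n in names]

print(utf8_names[1].decode("utf-8") == names[1])  # lossless round trip
print(ascii_names[1] == b"smrgs")                 # the two Swedish letters are gone
```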

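The character "u" is not in the list. It is in the repr of the Unicode strings, which is what gets printed when you print the whole list. The u marks a Unicode string.

Having Unicode strings in the list does not seem to be a problem in itself, so what is your actual problem? The code ast.literal_eval(json.dumps(test)) computes a value and then throws it away.

Do not confuse a Unicode string (an object in memory) with its text representation (how you would specify the object in Python source code). Consider print(sandwich) vs print(repr(sandwich)). Do not encode text to bytes. You do not need to encode the Unicode string; you can print it directly: print(unicode_string).

That depends on where you are outputting it to. Obviously, though, the default stays: use Unicode to handle text in Python. Do not encode to bytes unless necessary. I am sure you know the concept of the Unicode sandwich.

While this code may help to solve the problem, it does not explain why and/or how it answers the question. Providing this additional context would significantly improve its long-term educational value. Please edit your answer to add an explanation, including which limitations and assumptions apply. Also use StackOverflow's code formatting markup to format the code snippets ;)

This basically duplicates existing answers.

Hey @tripleee, try using timeit on both solutions. You will be able to see the difference. The map method is faster, while test.append(str(a)) builds the list as it goes instead of iterating over it after it has been created, so it saves time.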

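# hiveCursor is assumed to be an already-open Hive cursor (not shown in the original).
tmpColumnsSQL = "show columns in dim.date_dim"
hiveCursor.execute(tmpColumnsSQL)
columnlist = hiveCursor.fetchall()

# Printing each string on its own shows its text, without the u prefix.
for columns in columnlist:
    print columns[0]

# The same loop written with an index.
for i in range(len(columnlist)):
    print columnlist[i][0]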
