Python sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings


python unicode sqlite


Using SQLite3 in Python, I am trying to store a compressed version of a snippet of UTF-8 HTML code.

The code looks like this:

...
c = connection.cursor()
c.execute('create table blah (cid integer primary key,html blob)')
...
c.execute('insert or ignore into blah values (?, ?)',(cid, zlib.compress(html)))

at which point I get the error:

sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.

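If I use 'text' rather than 'blob' and don't compress the HTML snippet, it all works fine (the db is too large, though). When I use 'blob' and compress via the Python zlib library, I get the error message above. I looked around but couldn't find a simple answer to this one.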

Found the solution; I should have spent just a little more time searching.

The solution is to "cast" the value as a Python buffer, like so:

c.execute('insert or ignore into blah values (?, ?)',(cid, buffer(zlib.compress(html))))

Hope this helps somebody else.

If you want to use 8-bit strings instead of unicode strings in sqlite3, set an appropriate text_factory on the sqlite connection:

connection = sqlite3.connect(...)
connection.text_factory = str
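As a rough illustration of what text_factory controls, here is a minimal sketch for Python 3, where the byte-string type is bytes rather than the Python 2 str used above. The in-memory database, the table name t, and the sample value are assumptions made up for this example.

```python
import sqlite3

# Hypothetical in-memory database and table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (s text)")
conn.execute("insert into t values (?)", ("héllo",))

# Default text_factory: TEXT values come back as str (unicode).
default_value = conn.execute("select s from t").fetchone()[0]

# With text_factory = bytes, TEXT values come back as raw UTF-8 bytes.
conn.text_factory = bytes
raw_value = conn.execute("select s from t").fetchone()[0]

conn.close()
```

Note that text_factory only affects how TEXT values are returned; it does not turn compressed binary data into a BLOB.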

You could store the value using repr(html) instead of the raw output, and then use eval(html) when retrieving the value for use.

c.execute('insert or ignore into blah values (?, ?)',(1, repr(zlib.compress(html))))
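The discussion of this answer suggests base64 as a safer encoding than repr/eval when sqlite3.Binary cannot be used for some reason. A minimal sketch of that idea for Python 3 follows; the function name roundtrip_base64 and the in-memory database are assumptions for illustration, and the column is declared text because base64 output is plain ASCII.

```python
import base64
import sqlite3
import zlib


def roundtrip_base64(html):
    """Store zlib-compressed HTML as base64 text, then read it back."""
    conn = sqlite3.connect(":memory:")  # hypothetical throwaway database
    conn.execute("create table blah (cid integer primary key, html text)")

    # Compress, then encode to ASCII text so no binary data reaches SQLite.
    encoded = base64.b64encode(zlib.compress(html.encode("utf-8"))).decode("ascii")
    conn.execute("insert or ignore into blah values (?, ?)", (1, encoded))

    stored = conn.execute("select html from blah where cid = 1").fetchone()[0]
    conn.close()

    # Reverse the steps: base64-decode, decompress, decode UTF-8.
    return zlib.decompress(base64.b64decode(stored)).decode("utf-8")
```

base64 output is about a third larger than the bytes it encodes, so this gives back part of the space saved by compression; storing a real BLOB avoids that overhead.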

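In order to use the BLOB type, you must first convert the zlib-compressed string into binary data, otherwise sqlite will try to process it as a text string. This is done with sqlite3.Binary(). For example: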

c.execute('insert or ignore into blah values (?, ?)',(cid, 
sqlite3.Binary(zlib.compress(html))))
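For reference, a self-contained round trip of the same idea might look like the sketch below. It is written for Python 3, where zlib.compress takes and returns bytes; the function name store_and_load and the in-memory database are assumptions for illustration, while the table definition follows the question.

```python
import sqlite3
import zlib


def store_and_load(html):
    """Compress HTML, store it as a BLOB, then read it back and decompress."""
    conn = sqlite3.connect(":memory:")  # hypothetical throwaway database
    c = conn.cursor()
    c.execute("create table blah (cid integer primary key, html blob)")

    # Encode to UTF-8 first: zlib works on bytes, not on text.
    compressed = zlib.compress(html.encode("utf-8"))
    c.execute("insert or ignore into blah values (?, ?)",
              (1, sqlite3.Binary(compressed)))
    conn.commit()

    # In Python 3 a BLOB value comes back as a bytes object.
    blob = c.execute("select html from blah where cid = ?", (1,)).fetchone()[0]
    conn.close()
    return zlib.decompress(blob).decode("utf-8")
```

In Python 3 a plain bytes object is also stored as a BLOB, so the sqlite3.Binary() wrapper mainly documents intent there.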

Syntax:

There are 5 possible storage classes: NULL, INTEGER, TEXT, REAL and BLOB.

BLOB is generally used to store pickled models or dill-pickled models.

cur.execute('''INSERT INTO Tablename(Col1, Col2, Col3, Col4) VALUES(?,?,?,?)''',
            [TextValue, Real_Value, buffer(model), sqlite3.Binary(model2)])
conn.commit()

# Read data back (Python 2: str() turns the returned buffer into a byte string)
df = pd.read_sql('SELECT * FROM Tablename', con=conn)
model1 = str(df['Col3'].values[0])
model2 = str(df['Col4'].values[0])
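The same pattern can be sketched without pandas in Python 3, using only the standard library. The table models, its columns, and the function name roundtrip_pickled are assumptions for illustration, and a plain dict stands in for a trained model.

```python
import pickle
import sqlite3


def roundtrip_pickled(obj):
    """Pickle an object, store it in a BLOB column, and load it back."""
    conn = sqlite3.connect(":memory:")  # hypothetical throwaway database
    conn.execute("create table models (name text primary key, payload blob)")

    # pickle.dumps returns bytes; sqlite3.Binary marks it as binary data.
    conn.execute("insert into models values (?, ?)",
                 ("demo", sqlite3.Binary(pickle.dumps(obj))))
    conn.commit()

    payload = conn.execute("select payload from models where name = ?",
                           ("demo",)).fetchone()[0]
    conn.close()

    # Only unpickle data from a database you control.
    return pickle.loads(payload)
```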

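Using eval and repr like this is very dirty, no matter how much you trust the data source.

I agree, anything is better than eval() here. The right solution is to use sqlite3.Binary, but if you can't for some reason, it's better to encode the data in a safer way, for example with base64.

When I did this, my database was filled with base36 text, which would make the database larger than storing the blob directly.

This is wrong; you should use sqlite3.Binary instead, as the documentation says.

It looks like sqlite3.Binary() is simply an alias for buffer(), at least as of …

Huh. This part of the pysqlite docs actually encourages the use of buffer(): "The following Python types can thus be sent to SQLite without any problem: ..." [Python type] buffer ... [SQLite type] BLOB

This may give you problems with different encodings, since you are still trying to parse binary data as text. It's better to use sqlite3.Binary instead.

This worked. However, I was wondering why it is needed. Doesn't the type "BLOB" already indicate that the data in this column is binary? Note that in Python 2 a string can be text or binary. Shouldn't sqlite3 just treat the object (the zlib-compressed string) as binary for a BLOB column?

I don't think Python has the whole database schema in memory to look up the correct data types. Most likely it just guesses the type at runtime based on what you pass it, so a binary string can't be distinguished from a text string.

@user1783732 Because SQLite uses dynamic typing.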
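The dynamic-typing point from the comments can be checked directly with SQLite's typeof() function. The sketch below is for Python 3 and uses an in-memory database with made-up sample values: both rows go into the same column declared blob, yet each value keeps the storage class implied by the Python object that was bound.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # hypothetical throwaway database
conn.execute("create table blah (cid integer primary key, html blob)")

# A str is bound as TEXT and a bytes object as BLOB, whatever the column says.
conn.execute("insert into blah values (1, ?)", ("plain text",))
conn.execute("insert into blah values (2, ?)", (b"\x78\x9c binary",))

types = [row[0] for row in
         conn.execute("select typeof(html) from blah order by cid")]
conn.close()

print(types)  # ['text', 'blob']
```

This is why the declared column type alone does not help: the Python 2 module had to decide between TEXT and BLOB from a str value that could have been either.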