Python json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0) when reading JSON files from the Google QuickDraw dataset
I am building an application using the Google QuickDraw dataset ndjson files. I run this function on every line of the file:
def parse_line(ndjson_line):
    """Parse an ndjson line and return ink (as np array) and classname."""
    sample = json.loads(ndjson_line)
    class_name = sample["word"]
    if not class_name:
        print("Empty classname")
        return None, None
    inkarray = sample["drawing"]
    stroke_lengths = [len(stroke[0]) for stroke in inkarray]
    total_points = sum(stroke_lengths)
    np_ink = np.zeros((total_points, 3), dtype=np.float32)
    current_t = 0
    if not inkarray:
        print("Empty inkarray")
        return None, None
    for stroke in inkarray:
        if len(stroke[0]) != len(stroke[1]):
            print("Inconsistent number of x and y coordinates.")
            return None, None
        for i in [0, 1]:
            np_ink[current_t:(current_t + len(stroke[0])), i] = stroke[i]
        current_t += len(stroke[0])
        np_ink[current_t - 1, 2] = 1  # stroke_end
    # Preprocessing.
    # 1. Size normalization.
    lower = np.min(np_ink[:, 0:2], axis=0)
    upper = np.max(np_ink[:, 0:2], axis=0)
    scale = upper - lower
    scale[scale == 0] = 1
    np_ink[:, 0:2] = (np_ink[:, 0:2] - lower) / scale
    # 2. Compute deltas.
    np_ink[1:, 0:2] -= np_ink[0:-1, 0:2]
    np_ink = np_ink[1:, :]
    return np_ink, class_name
It works for most lines, but fails for a few lines such as:
{"word":"wristwatch","countrycode":"FR","timestamp":"2017-01-19 09:30:18.19194 UTC","recognized":true,"key_id":"6721203257475072","drawing":[[[0,143],[66,67]],[[1,170],[35,39]],[[169,169,179,186,187,193,216,228,228,225,249,254,255,249,244,246,251,254,242,226,232,238,237,224,235,234,211,201,197,192,170,160,144,141,142],[39,26,7,9,25,15,2,2,12,19,7,23,36,39,34,32,37,56,54,44,47,58,67,65,74,80,84,82,75,92,73,97,85,71,67]],[[94,96,110],[26,88,89]]]}
I get the following error:
Traceback (most recent call last):
  File "A:\Code\Machine Learning\Software Engineering project\Quick Draw\Create_Dataset_from_ndjson.py", line 168, in <module>
    tf.app.run(main=main,argv=[sys.argv[0]]+unparsed)
  File "C:\Users\shind\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
    _sys.exit(main(argv))
  File "A:\Code\Machine Learning\Software Engineering project\Quick Draw\Create_Dataset_from_ndjson.py", line 124, in main
    classes = convert_to_tfrecord(FLAGS.source_path,FLAGS.destination_path,FLAGS.train_examples_per_class,FLAGS.eval_examples_per_class,FLAGS.classes_path,FLAGS.output_shards)
  File "A:\Code\Machine Learning\Software Engineering project\Quick Draw\Create_Dataset_from_ndjson.py", line 98, in convert_to_tfrecord
    drawing, class_name = parse_line(ndjson_line)
  File "A:\Code\Machine Learning\Software Engineering project\Quick Draw\Create_Dataset_from_ndjson.py", line 13, in parse_line
    sample = json.loads(ndjson_line)
  File "C:\Users\shind\AppData\Local\Programs\Python\Python36\lib\json\__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "C:\Users\shind\AppData\Local\Programs\Python\Python36\lib\json\decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\shind\AppData\Local\Programs\Python\Python36\lib\json\decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
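What could be causing this error? It says "Expecting value", but I did pass the line, so what is it expecting? This line also looks no different from the other lines I passed, which raised no errors, so what is different about it? What do I need to change in the JSON file or in the code? I took the code from the Google GitHub repository itself. Also, I am using the simplified ndjson files from the dataset. The entire dataset is: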
Your dataset may contain an empty line. You can try adding this to see which string fails to decode:
try:
    sample = json.loads(ndjson_line)
except json.decoder.JSONDecodeError as e:
    print("Error Line: {}\n".format(ndjson_line))  # print the string that failed to decode
    raise e  # re-raise to stop the program
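If the failing string is empty, the print above shows nothing visible, which makes it easy to miss. A minimal sketch of a more explicit diagnostic follows; the helper name `describe_decode_failure` is hypothetical and not part of the original code. It formats the line with `repr()` and reports the position stored on the exception:

```python
import json


def describe_decode_failure(line):
    """Return a readable description of why json.loads rejected `line`,
    or None if the line parsed. (Hypothetical helper for illustration.)"""
    try:
        json.loads(line)
    except json.JSONDecodeError as e:
        # repr() makes empty strings and stray whitespace visible.
        return "bad line {!r}: {} (line {}, column {}, char {})".format(
            line, e.msg, e.lineno, e.colno, e.pos)
    return None


# An empty string reproduces the message from the traceback.
print(describe_decode_failure(""))
# A well-formed line parses, so nothing is reported.
print(describe_decode_failure('{"word": "wristwatch"}'))
```

With an empty string, the description contains "Expecting value" at line 1, column 1, char 0, matching the error in the question.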
Update #2: exception handling
try:
    sample = json.loads(ndjson_line)
except json.decoder.JSONDecodeError as e:
    return None, None
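It doesn't print anything. But when I checked the ndjson file, I couldn't find any empty lines. What code should I add to detect an empty line and avoid passing it to the function?
Actually, I found an empty line at the very end of the dataset.
Answer updated; I suggest using try/except.
I did something else and it worked. Before passing the line to the function, I added the following check: if ndjson_line.split():
If the condition is true, the function is called; otherwise the loop continues to the next line.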
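The check described in the last comment can be sketched roughly as follows. The generator name `iter_ndjson` and the in-memory sample data are assumptions made for illustration. `line.strip()` is used here, which is falsy for the same empty or whitespace-only lines as `line.split()`:

```python
import io
import json


def iter_ndjson(lines):
    """Yield one parsed object per non-blank line of an ndjson source.
    (Hypothetical helper; not part of the original script.)"""
    for line in lines:
        # Skip empty strings and whitespace-only lines instead of
        # handing them to json.loads.
        if not line.strip():
            continue
        yield json.loads(line)


# Stand-in for an open ndjson file, including a blank line and a
# whitespace-only line.
sample_file = io.StringIO('{"word": "cat"}\n\n{"word": "dog"}\n   \n')
words = [record["word"] for record in iter_ndjson(sample_file)]
print(words)
```

The same guard also covers the empty string that `readline()` returns once a file has been read to the end.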