Python: unable to capture table data using tabula-py
I am unable to fully extract the value MBLHA10B\rGHH4258\r3. The extracted output contains only BLHA10B\rGHH4258\r3, so the leading character M is being skipped. Please refer to this link.
from tabula import read_pdf
path_s = r'try.pdf'
json_da = read_pdf(path_s, pages=1, output_format='json',silent=True,lattice=True)
Vehicle_Details1=[]
Vehicle_jsondata2 = json_da[0].get('data')
print('============================================================================================')
for i in range(len(Vehicle_jsondata2)):
    for j in range(len(Vehicle_jsondata2[i])):
        Vehicle_Details1.append(Vehicle_jsondata2[i][j].get('text'))
print(len(Vehicle_Details1))
print(Vehicle_Details1)
print('============================================================================================')
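The nested loop above can also be written as a small helper, which makes it easy to check the flattening step separately from the PDF extraction. This is only a sketch: `flatten_cell_text` and the `sample` data are hypothetical names and values made up for illustration, shaped like the list of tables (each with a `data` list of rows of cells carrying a `text` field) that the code above reads from. Unlike the original, it walks every table, not just the first one.

```python
def flatten_cell_text(tables):
    """Collect the 'text' field of every cell, table by table, row by row."""
    texts = []
    for table in tables:
        for row in table.get("data", []):
            for cell in row:
                texts.append(cell.get("text", ""))
    return texts


# Hypothetical sample shaped like the JSON structure used above
# (illustrative values only, not real extraction results).
sample = [
    {
        "data": [
            [{"text": "Make"}, {"text": "Model"}],
            [{"text": "HERO MO-\rTOCORP"}, {"text": "PASSION PRO"}],
        ]
    }
]

cells = flatten_cell_text(sample)
print(len(cells))
print(cells)
```

If the helper returns the same truncated chassis number as the loop, the missing character is already absent from the data returned by read_pdf, so the flattening step is not the cause.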
output:
['Registration\rNo.', 'Make', 'SubType', 'Model', 'CC/KW', 'Mfg year', 'Seat Cap', 'Vehicle/\rTrailer\rChassis\rNo', 'Engine Number', 'JH01CE1936', 'HERO MO-\rTOCORP', 'CAST KICK\rDRUM', 'PASSION PRO', '100', '2016', '2', 'BLHA10B\rGHH4258\r3', 'HA10EVGHH4653\r8']
Expected output:
['Registration\rNo.', 'Make', 'SubType', 'Model', 'CC/KW', 'Mfg year', 'Seat Cap', 'Vehicle/\rTrailer\rChassis\rNo', 'Engine Number', 'JH01CE1936', 'HERO MO-\rTOCORP', 'CAST KICK\rDRUM', 'PASSION PRO', '100', '2016', '2', 'MBLHA10B\rGHH4258\r3', 'HA10EVGHH4653\r8']
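To pinpoint exactly which cells differ between the two lists above, a simple element-by-element comparison can help. The sketch below uses a hypothetical helper, `find_mismatches`, and only the last three entries of each list from the question, so the index it reports is relative to that shortened list.

```python
def find_mismatches(actual, expected):
    """Return (index, actual, expected) for every position where the lists differ."""
    return [
        (i, a, e)
        for i, (a, e) in enumerate(zip(actual, expected))
        if a != e
    ]


# Last three cells of the output and expected output shown above.
actual = ['2', 'BLHA10B\rGHH4258\r3', 'HA10EVGHH4653\r8']
expected = ['2', 'MBLHA10B\rGHH4258\r3', 'HA10EVGHH4653\r8']

mismatches = find_mismatches(actual, expected)
for index, got, wanted in mismatches:
    # repr() keeps the embedded \r characters visible in the printout
    print(index, repr(got), repr(wanted))
```

Here the only difference is the chassis number cell, where the first character of the expected value is missing from the extracted text.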