Python: Converting a PDF to CSV/JSON


I am new to converting files with Python. I am trying to convert a PDF to CSV with this code, and I am working from this git repo:

I get an error saying that processing the English-to-Korean evaluation test suite .pdf failed. Everything works fine except `subprocess.Popen`. What am I doing wrong?

Link to the PDF file (I could not add it as an attachment on git):

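Contents of FortunatoScienceParsed.txt, copied and pasted from the txt file. Sorry, I cannot upload the file as an attachment. If needed, I can send the entire koreanenglish_extracted.txt in chat. Any help is greatly appreciated.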

A Test Suite for Evaluation of English-to-Korean Machine Translation Systems
Sungryong Koh, Jinee Maeng, Ji-Young Lee, Young-Sook Chae, Key-Sun ChoiKorea Terminology Research Center for Language and Knowledge Engineering (KORTERM)
Korea Advanced Institute of Science and Technology (KAIST) 
Kusong-dong Yusong-gu Taejon 305-701 Korea
{koh,aphroditejin,jinny206}@world.kaist.ac.kr
, pinochae@chollian.net
, kschoi@cs.kaist.ac.kr
Abstract
This paper describes KORTERM™s test suite and their practicability.
 The test-sets have been being constructed on the basis of f
ine-
grained classification of linguistic phenomena 
to evaluate the technical st
atus of English-to-Korean 
MT systems systematically.
 They
consist of about 5000 test-sets and are growi
ng.  

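Does the PDF contain data? If so, is it in CSV or JSON format? If not, why convert the PDF to CSV/JSON?

@depperm Yes, the PDF contains data. I cannot upload the PDF document here, but I have provided a link where it can be downloaded or saved. Also, I am trying to convert the PDF to CSV or JSON; the code above is for CSV.

The link does not work.

Git: pdf: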
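
Since the goal stated in the comments is CSV or JSON, here is a minimal sketch of turning already-extracted text (like the contents of the .txt file above) into both formats. It assumes one record per non-empty line, which is only one possible layout; the function names and the `line`/`text` field names are made up for illustration and do not come from the repo.

```python
import csv
import io
import json


def text_to_records(text):
    """Return one {"line", "text"} record per non-empty line of extracted text."""
    records = []
    for number, raw in enumerate(text.splitlines(), start=1):
        stripped = raw.strip()
        if stripped:  # skip blank lines left over from extraction
            records.append({"line": number, "text": stripped})
    return records


def records_to_csv(records):
    """Serialize the records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["line", "text"])
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def records_to_json(records):
    """Serialize the records as JSON, keeping non-ASCII characters readable."""
    return json.dumps(records, ensure_ascii=False, indent=2)
```

To use it, read the extracted file with `open(path, encoding="utf-8").read()`, pass the string to `text_to_records`, and write the returned CSV or JSON string to disk. Words split across lines by the extraction (for example "f" / "ine-" / "grained") are not rejoined by this sketch.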