MRJob fails to launch a new job on EMR when using --pool-emr-job-flows (Python, Hadoop, mrjob)

I am using MRJob to run an iterative Hadoop program on Amazon's EMR.

When I don't use the "--pool-emr-job-flows" option, everything works (but slowly). When I do use this option, I get:

Traceback (most recent call last):
  File "ic_bfs_eval.py", line 297, in <module>
    res = main()
  File "ic_bfs_eval.py", line 262, in main
    frac, mr_rounds = bfs(db_name, T, samples, total_steps_cap)
  File "ic_bfs_eval.py", line 183, in bfs
    runner.run()
  File "/Library/Python/2.7/site-packages/mrjob-0.4.3_dev-py2.7.egg/mrjob/runner.py", line 620, in __exit__
    self.cleanup()
  File "/Library/Python/2.7/site-packages/mrjob-0.4.3_dev-py2.7.egg/mrjob/emr.py", line 987, in cleanup
    super(EMRJobRunner, self).cleanup(mode=mode)
  File "/Library/Python/2.7/site-packages/mrjob-0.4.3_dev-py2.7.egg/mrjob/runner.py", line 566, in cleanup
    self._cleanup_job()
  File "/Library/Python/2.7/site-packages/mrjob-0.4.3_dev-py2.7.egg/mrjob/emr.py", line 1061, in _cleanup_job
    self._opts['ec2_key_pair_file'])
  File "/Library/Python/2.7/site-packages/mrjob-0.4.3_dev-py2.7.egg/mrjob/ssh.py", line 209, in ssh_terminate_single_job
    num_jobs_match = HADOOP_JOB_LIST_NUM_RE.match(job_list_lines[0])
IndexError: list index out of range

Any idea why this happens?

This went away once I set up an SSH key pair. I still think it is a bug, since SSH is supposed to be optional, but the simplest workaround is to set up the key pair.
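To illustrate why the traceback ends in an IndexError: the failing line indexes job_list_lines[0] without checking that the list has any entries, so empty output from the remote command (presumably what you get when SSH is not configured) is enough to trigger it. Below is a minimal sketch of a guarded version. The regex is a stand-in I made up for illustration, not necessarily mrjob's actual HADOOP_JOB_LIST_NUM_RE, and count_running_jobs is a hypothetical helper, not part of mrjob.

```python
import re

# Stand-in pattern (an assumption); mrjob's real HADOOP_JOB_LIST_NUM_RE may differ.
JOB_LIST_NUM_RE = re.compile(r'(\d+) jobs currently running')


def count_running_jobs(job_list_output):
    """Return the job count parsed from the first output line, or None.

    Hypothetical helper: it guards against empty output instead of
    indexing lines[0] unconditionally, which is what raises IndexError.
    """
    lines = job_list_output.splitlines()
    if not lines:
        # Nothing came back from the remote command, so there is nothing to parse.
        return None
    match = JOB_LIST_NUM_RE.match(lines[0])
    return int(match.group(1)) if match else None


print(count_running_jobs(""))                          # None
print(count_running_jobs("1 jobs currently running"))  # 1
```

With a check like this, the cleanup step could skip job termination when no listing is available rather than crash.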

mrJob2 = MRBFSSampleIter(args=["-c", "~/mrjob.conf",
                               "-r", "emr",
                               "--no-output",
                               "--output-dir", tmp_dir_out,
                               # --pool-emr-job-flows is a switch; tmp_dir_in is the input path
                               "--pool-emr-job-flows", tmp_dir_in])
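
If you would rather pass the key pair on the command line than in mrjob.conf, here is a rough sketch of building the same argument list with the key pair added. It assumes the --ec2-key-pair and --ec2-key-pair-file switches as I understand them from mrjob 0.4.x (check --help for your version). build_emr_args is a hypothetical helper, and the key pair name and .pem path are placeholders.

```python
def build_emr_args(conf_path, output_dir, input_path,
                   key_pair=None, key_pair_file=None):
    """Build the argument list for an EMR run with pooled job flows.

    Hypothetical helper. The key pair switches are only added when both
    values are given, since they only make sense as a pair.
    """
    args = ["-c", conf_path,
            "-r", "emr",
            "--no-output",
            "--output-dir", output_dir,
            "--pool-emr-job-flows"]
    if key_pair and key_pair_file:
        # Assumed switch names from mrjob 0.4.x; verify against your version.
        args += ["--ec2-key-pair", key_pair,
                 "--ec2-key-pair-file", key_pair_file]
    args.append(input_path)  # positional input path goes last
    return args


# Placeholder values for illustration only.
args = build_emr_args("~/mrjob.conf", "s3://bucket/out", "s3://bucket/in",
                      key_pair="EMR", key_pair_file="~/EMR.pem")
print(args)
```

The resulting list can be passed as args=... to the job class in the same way as in the call above.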