Can't use os.walk() to traverse a directory tree in Python because it says a name is not defined


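As part of a homework assignment, I have to traverse a directory tree, and os.walk looks like the best option. I am running my Python script under Cygwin. The path of the tree I am trying to traverse is: /cygdrive/c/Users/Kamal/Documents/School/Spring2015/CS410/htmlfiles

The os.walk() call in my code is the outer for loop in the full script at the end of this post.

However, when I run the script, I get the following error: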

$ python testparser.py
Traceback (most recent call last):
  File "testparser.py", line 1, in <module>
    Spring2015/CS410/htmlfiles/testparser.py
NameError: name 'Spring2015' is not defined
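
The traceback points at line 1 of testparser.py, so one quick check is to print the first few lines of the file exactly as Python reads them. The following is only a sketch: first_lines is a hypothetical helper, and the temporary file merely imitates a script whose first line holds stray path text. It is not the asker's actual file.

```python
import tempfile
from pathlib import Path


def first_lines(path, count=3):
    """Return the first `count` lines of a file as repr() strings.

    repr() makes stray text, tabs, and line endings visible.
    """
    with open(path, encoding="utf-8", errors="replace") as handle:
        return [repr(line) for _, line in zip(range(count), handle)]


# Demo on a throwaway file (assumption: it mimics the situation in the traceback).
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "testparser.py"
    script.write_text(
        "Spring2015/CS410/htmlfiles/testparser.py\nimport os\n",
        encoding="utf-8",
    )
    shown = first_lines(script)
    for line in shown:
        print(line)
```

If the first line printed for the real file is not the expected import statement, the stray text there is the likely source of the error.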

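What is on line 1 of the file?

NameError means Python tried to resolve Spring2015 as a variable name. I think you need to post more code, preferably the whole file.

I added the whole file. Why does Python think Spring2015 is a variable name? I guess I made some syntax error that I am not aware of?

Is the file you showed us testparser.py? Python seems to think that file begins with the text Spring2015/CS410/htmlfiles/testparser.py, which is not valid Python.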
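
Strictly speaking, a line like that parses as a chain of divisions between names, which would explain why the error is a NameError and not a SyntaxError. The sketch below illustrates this with the standard ast module. The source string is copied from the traceback, and the rest is an assumption made for the demonstration.

```python
import ast

# The text the traceback shows as line 1 of the script.
source = "Spring2015/CS410/htmlfiles/testparser.py"

# Parsing succeeds: the slashes are division operators and ".py" is an
# attribute access on the name "testparser".
tree = ast.parse(source, mode="eval")
print(type(tree.body).__name__)

# Evaluating fails on the leftmost name, which is looked up first.
try:
    eval(compile(tree, "<demo>", "eval"), {})
    message = None
except NameError as exc:
    message = str(exc)

print(message)
```

The leftmost operand is evaluated first, so Spring2015 is the name reported, matching the traceback in the question.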
from bs4 import BeautifulSoup
import os
import shutil

cnt = 0

print "starting..."
for dirName, subdirList, fileList in os.walk('/cygdrive/c/Users/Kamal/Documents/School/Spring2015/CS410/htmlfiles'):
    for f in fileList:
        #print the path
        print "Processing " + os.path.abspath(f) + "...\n"

        #open the HTML file
        html = open(f)
        soup  = BeautifulSoup(html)

        #Filter out unwanted stuff
        [s.extract() for s in soup(['style', 'script', '[document]',  'head', 'title'])]
        visible_text = soup.getText()
        visible_text_encoded = visible_text.encode('utf-8')
        visible_text_split = visible_text_encoded.split('\n')
        visible_text_filtered = filter(lambda l: l != '', visible_text_split)

        #Generate the name of the output text file
        outfile_name = 'chaya2_' + str(cnt) + '.txt'

        #Open the output file to write in
        outfile = open(outfile_name, "w")

        #Get the URL of the html file using its Path, write it to the first line
        outfile.write(os.path.relpath(f, '/cygdrive/c/Users/Kamal/Documents/School/') + ' \n')

        #Write the visible text to the output file
        for l in visible_text_filtered:
            outfile.write(l+'\n')

        #Done writing, move the output file to the appropriate directory
        shutil.move(os.path.abspath(outfile_name), '/cygdrive/c/Users/Kamal/Documents/School/Spring2015/CS410/txtFiles')

        #Rename the html file 
        html_name = 'chaya2_' + str(cnt) + '.html'
        os.rename(f, html_name)

        #Move the html file to the appropriate directory
        shutil.move(os.path.abspath(html_name), '/cygdrive/c/Users/Kamal/Documents/School/Spring2015/CS410/htmlFilesAfter')

        print html_name  + " converted to " + outfile_name + "\n"

        outfile.close()
        html.close()

        cnt+=1
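
One thing worth noting about the script above: os.walk yields bare file names, so a name usually has to be joined with the directory it was found in before it can be opened from another working directory. The following Python 3 sketch shows that pattern on a throwaway tree. The helper html_files, the ".html" filter, and the sample file names are assumptions made for illustration and are not part of the original script.

```python
import os
import tempfile


def html_files(root):
    """Yield the full path of every .html file found below root."""
    for dir_name, _subdirs, file_names in os.walk(root):
        for name in file_names:
            if name.endswith(".html"):
                # Join with dir_name: `name` alone is relative to dir_name,
                # not to the current working directory.
                yield os.path.join(dir_name, name)


# Build a small temporary tree to walk (hypothetical sample data).
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    samples = (
        "top.html",
        os.path.join("a", "mid.html"),
        os.path.join("a", "b", "deep.html"),
        os.path.join("a", "notes.txt"),
    )
    for rel in samples:
        with open(os.path.join(root, rel), "w") as handle:
            handle.write("<p>hi</p>")

    found = sorted(os.path.relpath(path, root) for path in html_files(root))
    for rel in found:
        print(rel)
```

Each yielded path can be passed to open() directly, regardless of the directory the script is started from.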