Python is truncating my file contents


I've been set a task in Python to encode a long text file, using 1-26 for the letters of the alphabet and 26+ for non-alphanumeric characters. See the code below:

#open the file,read the contents and print out normally
my_file = open("timemachine.txt")
my_text = my_file.read()
print (my_text)

print ""
print ""

#open the file and read each line, taking out the eol chars
with open("timemachine.txt","r") as myfile:
    clean_text = "".join(line.rstrip() for line in myfile)

#close the file to prevent memory hogging
my_file.close()

#print out the result all in lower case 
clean_text_lower = clean_text.lower()
print clean_text_lower

print ""
print ""

#establish a lowercase alphabet as a list   
my_alphabet_list = []
my_alphabet = """ abcdefghijklmnopqrstuvwxyz.,;:-_?!'"()[]  %/1234567890"""+"\n"+"\xef"+"\xbb"+"\xbf"

#find the index for each lowercase letter or non-alphanumeric
for letter in my_alphabet:
    my_alphabet_list.append(letter)
print my_alphabet_list,
print my_alphabet_list.index

print ""
print ""

#go through the text and find the corresponding letter of the alphabet
for letter in clean_text_lower:
    posn = my_alphabet_list.index(letter)
print posn,

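When I run it, I should get (1) the original text, (2) the text reduced to lowercase with the line endings stripped, (3) the code index being used, and finally (4) the converted codes. However, I only get the second half of the original text, or, if I comment out (4), it prints all of the text. Why?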

The bit at the end:

for letter in clean_text_lower:
    posn = my_alphabet_list.index(letter)
print posn,

keeps reassigning posn without ever using it. As a result, you only get my_alphabet_list.index(letter) for the last letter of the clean text.
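To make the reassignment concrete, here is a minimal sketch in Python 3 syntax (the question itself uses Python 2 print statements). The shortened `alphabet` and the sample `text` are stand-ins chosen for illustration, not the asker's data.

```python
# Shortened stand-in for my_alphabet_list: index 0 is a space, 1-26 are a-z.
alphabet = list(" abcdefghijklmnopqrstuvwxyz")
text = "time"

# Same shape as the loop in the question: posn is overwritten on every pass.
for letter in text:
    posn = alphabet.index(letter)

# Only the value from the final iteration is still bound to posn here.
last_only = posn

# Using posn inside the loop body handles every letter instead.
seen = []
for letter in text:
    posn = alphabet.index(letter)
    seen.append(posn)

print(last_only)
print(seen)
```

Indenting the final print so that it sits inside the loop has the same effect as the second loop here: each position is used before the next iteration replaces it.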

There are a few ways to fix this. The first that comes to mind is to initialize a list and append each value to it, i.e.:

posns = []
for letter in clean_text_lower:
    posns.append(my_alphabet_list.index(letter))

print posns,
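As a variation on the same idea, the lookup can be done through a dictionary, which also avoids the ValueError that list.index raises for a character missing from the alphabet. This is only a sketch in Python 3 syntax: the helper names `build_codes` and `encode` and the `unknown=-1` placeholder are hypothetical, and the alphabet string is the one from the question minus the byte-order-mark bytes.

```python
def build_codes(alphabet):
    """Map each character to its first position in alphabet."""
    codes = {}
    for position, char in enumerate(alphabet):
        # setdefault keeps the first occurrence, matching list.index behaviour
        codes.setdefault(char, position)
    return codes


def encode(text, codes, unknown=-1):
    """Return the code for every character of text, lowercased first."""
    return [codes.get(char, unknown) for char in text.lower()]


# Alphabet from the question, without the trailing BOM bytes (an assumption).
alphabet = " abcdefghijklmnopqrstuvwxyz.,;:-_?!'\"()[]  %/1234567890\n"
codes = build_codes(alphabet)

encoded = encode("The Time Machine!", codes)
print(*encoded)
```

The dictionary is built once, so each character costs one lookup rather than a scan of the list, which may matter for a novel-length file.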

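Thanks, but all that seems to do is produce the same code numbers as a comma-separated list. My question remains: why is the original text file (which I took from Project Gutenberg and which is the long narrative of an H. G. Wells novel) not output in full until one of the other output sections (for example, the code numbers) is commented out? I'd like to know what is limiting it.

@grumpster3, I don't know why it wouldn't "output correctly" until you "comment out a section", and I'm not sure what you really mean by that. From your question, I understood that you wanted the codes for the entire text file rather than just the last letter, which is what I answered. Is there something else you need?

The text file is "The Time Machine" by H. G. Wells, from Project Gutenberg. I found that the last line needs to be indented to work properly.

The text file I used is "The Time Machine" by H. G. Wells, from Project Gutenberg, but any long file should do. I think the final print command should be indented. Sorry, by "commenting out" I meant formatting part of the code as a comment to disable it. I hope that is clearer.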
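The thread never establishes why only part of the text appears on screen. One way to narrow it down is to check whether read() returned the whole file, independently of what the console displays. The sketch below (Python 3 syntax) uses a hypothetical helper `report_read` and a temporary stand-in file instead of timemachine.txt so that it runs anywhere.

```python
import os
import tempfile


def report_read(path):
    """Return (size on disk in bytes, number of bytes actually read)."""
    with open(path, "rb") as handle:
        data = handle.read()
    return os.path.getsize(path), len(data)


# Stand-in for timemachine.txt; replace sample_path with the real file name.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"The Time Traveller\n" * 1000)
    sample_path = tmp.name

size_on_disk, size_read = report_read(sample_path)
print(size_on_disk, size_read)

os.remove(sample_path)
```

If the two numbers match for the real file, the contents were read in full, and the missing text was probably lost on the display side, for instance through a limited terminal scrollback buffer. That explanation is an assumption and is not confirmed anywhere in the thread.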