Python: concatenating text files creates a file with a space between every letter


I am trying to concatenate txt files, and almost everything works, but the output file has a space between every letter, like

l o r e m i p s u m

Here is my code

import glob

all = open("all.txt","a");

for f in glob.glob("*.txt"):
    print f
    t = open(f, "r")
    all.write(t.read())
    t.close()

all.close()

I am using Windows 7 and Python 2.7.

Edit
Maybe there is a better way to concatenate files.

Edit 2
I am now getting a decoding error:

Traceback (most recent call last):
  File "P:\bwiki\BWiki\MobileNotes\export\999.py", line 9, in <module>
    all.write( t.read())
  File "C:\Python27\lib\codecs.py", line 671, in read
    return self.reader.read(size)
  File "C:\Python27\lib\codecs.py", line 477, in read
    newchars, decodedbytes = self.decode(data, self.errors)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf3 in position 18: invalid
continuation byte


import codecs
import glob

all =codecs.open("all.txt", "a", encoding="utf-8")

for f in glob.glob("*.txt"):
    print f
    t = codecs.open(f, "r", encoding="utf-8")
    all.write( t.read())


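Your input files are probably UTF-encoded, but you are reading them as ASCII, which is what produces the spaces (they reflect null bytes). Try reading them with an explicit encoding.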
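
To illustrate the null-byte explanation, here is a minimal sketch. The sample string and variable names are assumptions made for the example, not data from the question's files; it only shows that ASCII letters stored as UTF-16 carry a 0x00 byte after each letter.

```python
# Illustrative only: the sample text is an assumption, not data from the question.
text = u"lorem ipsum"

# UTF-16 (little-endian, no BOM) stores each of these characters in two bytes;
# for ASCII letters the second byte is always 0x00.
raw = text.encode("utf-16-le")

# Every odd-indexed byte is a null byte, which many viewers render as a gap.
null_positions_ok = all(b == 0 for b in bytearray(raw[1::2]))

# Decoding with the matching codec recovers the original text.
decoded = raw.decode("utf-16-le")
```

If the files really are UTF-16, copying them byte for byte or decoding them as UTF-16 avoids the stray gaps.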

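Please run this program and edit the output into your question (we probably only need to see the first five or so lines of output). It prints the first 16 bytes of each file in hex. This will help us figure out what is going on.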

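import glob
import sys

def hexdump(s):
    # bytearray yields integers on both Python 2 and Python 3
    return " ".join("{:02x}".format(b) for b in bytearray(s))

# Width of the longest file name, used to align the output columns
l = 0
for f in glob.glob("*.txt"):
    l = max(l, len(f))

for f in glob.glob("*.txt"):
    # Binary mode: we want the raw bytes, not decoded text
    with open(f, "rb") as fp:
        sys.stdout.write("{0:<{1}}  {2}\n".format(f, l, hexdump(fp.read(16))))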
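
One way to read the first bytes of such a dump is to look for a byte order mark (BOM). The sketch below is an assumption about how that check could be written: `guess_encoding_from_bom` is a hypothetical helper, not part of the answer above, and it can only recognise files that actually start with a BOM.

```python
import codecs

def guess_encoding_from_bom(head):
    """Return a codec name if `head` starts with a known BOM, else None."""
    # UTF-32 is checked first: its little-endian BOM begins with the
    # same two bytes as the UTF-16 little-endian BOM.
    if head.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # No BOM: the encoding cannot be determined from these bytes alone.
    return None
```

A dump that begins with `ff fe` or `fe ff` points to UTF-16, and `ef bb bf` points to UTF-8 with a BOM. A result of None does not rule out any encoding.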

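The "spaces" between the letters may indicate that at least some of the files use the utf-16 encoding.
If all the files use the same character encoding, you can copy the files as bytes. Here is the PowerShell command that corresponds to your Python code: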
PS C:\> Get-Content *.txt | Add-Content all.txt

This is different from cat *.txt >> all.txt.

Your code should work if you use binary file mode:
from glob import glob
from shutil import copyfileobj

with open('all.txt', 'ab') as output_file:
    for filename in glob("*.txt"):
        with open(filename, 'rb') as file:
            copyfileobj(file, output_file)

Again, all the files should have the same character encoding, otherwise you may get garbage (mixed content) in the output.
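
If the files do not share one encoding, a possible alternative to byte copying is to decode each file with its own codec and write everything out in a single encoding. This is only a sketch under stated assumptions: `concatenate_as_utf8` is a hypothetical helper, the caller has to know each file's encoding in advance, and UTF-8 is an arbitrary choice for the output.

```python
import io

def concatenate_as_utf8(sources, output_path):
    """Decode each (path, encoding) pair and write the text to one UTF-8 file."""
    # newline="" disables newline translation, so line endings are kept as found.
    with io.open(output_path, "w", encoding="utf-8", newline="") as out:
        for path, encoding in sources:
            with io.open(path, "r", encoding=encoding, newline="") as src:
                out.write(src.read())
```

The output is opened with "w" rather than "a" here, so that text in an unknown encoding left over from an earlier run is not mixed into the result.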
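The best way to concatenate text files is with a simple batch command. You can simply add the files together as if they were numbers.
I suspect this error may be related to your opening all.txt twice: once when you assign it to all, and again when it is opened in the loop. all.txt will match the glob "*.txt".
@AlexBliskovsky I don't think that would produce the described symptoms, but you are right.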
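
The point about all.txt matching the glob can be handled by filtering the output file out of the input list. The following is a minimal sketch, assuming a hypothetical helper named `concatenate_bytes`; it copies bytes unchanged, so it still requires all inputs to share one encoding.

```python
import glob
import os
import shutil

def concatenate_bytes(pattern, output_path):
    """Append every file matching `pattern` to `output_path`, skipping the output itself."""
    # normcase makes the comparison case-insensitive on Windows.
    output_key = os.path.normcase(os.path.abspath(output_path))
    # Build the input list before opening the output, and leave the output out of it.
    inputs = [p for p in sorted(glob.glob(pattern))
              if os.path.normcase(os.path.abspath(p)) != output_key]
    with open(output_path, "ab") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
    return inputs
```

Called as `concatenate_bytes("*.txt", "all.txt")`, it never reads all.txt while writing to it.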