How do I compare two files in Go?


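With Python, I can do the following:

equals = filecmp.cmp(file_old, file_new)

Is there a built-in function in Go that does this? I googled it, but without success.

I could use a hash function from the hash/crc32 package, but that is more work than the Python code above.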

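I'm not sure that function does what you think it does. From the documentation:

Unless shallow is given and is false, files with identical os.stat() signatures are taken to be equal.

Your call compares only the os.stat signature, which includes just:

file mode
modification time
size

You can obtain all three in Go from os.Stat. A match really only indicates that the two are literally the same file, symlinks to the same file, or a copy of that file.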
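As a rough illustration of the metadata-only check described above, here is a minimal sketch that compares the mode, size, and modification time reported by os.Stat. The name shallowEqual is hypothetical, and the sketch is not a faithful port of Python's filecmp; it only shows where the three values come from.

```go
package main

import (
	"fmt"
	"os"
)

// shallowEqual is a hypothetical helper: it compares only the metadata
// returned by os.Stat (mode, size, modification time) and never reads
// the file contents. Equal metadata does not prove equal contents.
func shallowEqual(path1, path2 string) (bool, error) {
	fi1, err := os.Stat(path1)
	if err != nil {
		return false, err
	}
	fi2, err := os.Stat(path2)
	if err != nil {
		return false, err
	}
	same := fi1.Mode() == fi2.Mode() &&
		fi1.Size() == fi2.Size() &&
		fi1.ModTime().Equal(fi2.ModTime())
	return same, nil
}

func main() {
	// Defaults to comparing the current directory with itself so the
	// sketch also runs without arguments.
	p1, p2 := ".", "."
	if len(os.Args) == 3 {
		p1, p2 = os.Args[1], os.Args[2]
	}
	same, err := shallowEqual(p1, p2)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(same)
}
```

A true result here is only a hint, which is why a content comparison is still needed when correctness matters.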

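If you want to go deeper, you can open both files and compare their contents (the Python version reads 8k at a time).

You could hash both files with crc or md5, but if a long file differs near its beginning, you want to stop early. I would recommend reading a fixed number of bytes at a time from each reader and comparing the chunks as you go.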
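The chunk-by-chunk approach recommended above could look roughly like the following sketch. It is one possible reading of the advice, not the answerer's code: the names readersEqual and equalStrings and the 8 KiB chunk size are assumptions chosen for illustration. io.ReadFull is used so that short reads on either side cannot misalign the chunks.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// chunkSize is an assumption, picked to match the 8k figure mentioned above.
const chunkSize = 8 * 1024

// readersEqual compares two streams chunk by chunk and returns as soon as
// a difference is found, so an early mismatch in a long stream is cheap.
func readersEqual(a, b io.Reader) (bool, error) {
	bufA := make([]byte, chunkSize)
	bufB := make([]byte, chunkSize)
	for {
		nA, errA := io.ReadFull(a, bufA)
		nB, errB := io.ReadFull(b, bufB)

		// io.EOF and io.ErrUnexpectedEOF only mean that the stream ended
		// before the buffer was full; anything else is a real read error.
		if errA != nil && errA != io.EOF && errA != io.ErrUnexpectedEOF {
			return false, errA
		}
		if errB != nil && errB != io.EOF && errB != io.ErrUnexpectedEOF {
			return false, errB
		}
		if nA != nB || !bytes.Equal(bufA[:nA], bufB[:nB]) {
			return false, nil
		}
		if errA != nil {
			// A ended, and since nA == nB is below chunkSize, B ended too.
			return true, nil
		}
	}
}

// equalStrings is a demo-only helper that feeds two strings through
// readersEqual.
func equalStrings(a, b string) (bool, error) {
	return readersEqual(strings.NewReader(a), strings.NewReader(b))
}

func main() {
	same, err := equalStrings("hello", "hello")
	fmt.Println(same, err)
	same, err = equalStrings("hello", "help")
	fmt.Println(same, err)
}
```

For files, the two arguments would be the *os.File values returned by os.Open, since they implement io.Reader.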

How about using bytes.Equal?
package main

import (
    "bytes"
    "fmt"
    "log"
    "os"
)

func main() {
    // Per a comment, it is better not to read an entire file into memory;
    // this is simply a trivial example.
    f1, err := os.ReadFile("lines1.txt")
    if err != nil {
        log.Fatal(err)
    }

    f2, err := os.ReadFile("lines2.txt")
    if err != nil {
        log.Fatal(err)
    }

    // Per a comment, this is significantly more performant.
    fmt.Println(bytes.Equal(f1, f2))
}

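To complete @captncraig's answer: if you want to know whether the two paths refer to the same file, you can use the SameFile function from the os package.

SameFile reports whether fi1 and fi2 describe the same file. For example, on Unix this means that the device and inode fields of the two underlying structures are identical.

Otherwise, if you want to check the file contents, here is a solution that checks the two files line by line, avoiding loading the entire files into memory.

First try:

Edit: read by chunks of bytes, and fail fast if the files have different sizes.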
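The two cheap checks mentioned in this answer, os.SameFile and a file-size comparison, can be combined into a pre-check that runs before any content is read. The following is a minimal sketch rather than the answer's original code; the name preCheck and its return values are assumptions made for illustration, and the size test is only meaningful for regular files.

```go
package main

import (
	"fmt"
	"os"
)

// preCheck is a hypothetical helper. It reports decided == true when the
// metadata alone settles the question:
//   - both paths describe the same file  -> equal
//   - the sizes differ                   -> not equal
// Otherwise decided is false and the contents still have to be compared.
func preCheck(path1, path2 string) (equal, decided bool, err error) {
	fi1, err := os.Stat(path1)
	if err != nil {
		return false, false, err
	}
	fi2, err := os.Stat(path2)
	if err != nil {
		return false, false, err
	}
	if os.SameFile(fi1, fi2) {
		return true, true, nil
	}
	// Assumes regular files; sizes of directories or devices say little.
	if fi1.Size() != fi2.Size() {
		return false, true, nil
	}
	return false, false, nil
}

func main() {
	// Defaults to comparing the current directory with itself so the
	// sketch also runs without arguments.
	p1, p2 := ".", "."
	if len(os.Args) == 3 {
		p1, p2 = os.Args[1], os.Args[2]
	}
	equal, decided, err := preCheck(p1, p2)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("equal=%v decided=%v\n", equal, decided)
}
```

When decided is false, the files have the same size but are distinct, so a byte-level comparison is still required.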
You can use a package such as equalfile (github.com/udhos/equalfile).

Main API:
func CompareFile(path1, path2 string) (bool, error)

Example:
package main

import (
    "fmt"
    "os"

    "github.com/udhos/equalfile"
)

func main() {
    if len(os.Args) != 3 {
        fmt.Printf("usage: equal file1 file2\n")
        os.Exit(2)
    }

    file1 := os.Args[1]
    file2 := os.Args[2]

    equal, err := equalfile.CompareFile(file1, file2)
    if err != nil {
        fmt.Printf("equal: error: %v\n", err)
        os.Exit(3)
    }

    if equal {
        fmt.Println("equal: files match")
        os.Exit(0)
    }

    fmt.Println("equal: files differ")
    os.Exit(1)
}

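Here is an io.Reader I whipped up. You can run _, err := io.Copy(io.Discard, newCompareReader(a, b)) and get an error if the two streams do not have equal contents. This implementation is optimized for performance by limiting unnecessary data copying.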
package main

import (
    "bytes"
    "errors"
    "fmt"
    "io"
)

type compareReader struct {
    a    io.Reader
    b    io.Reader
    bBuf []byte // need buffer for comparing B's data with one that was read from A
}

func newCompareReader(a, b io.Reader) io.Reader {
    return &compareReader{
        a: a,
        b: b,
    }
}

func (c *compareReader) Read(p []byte) (int, error) {
    if c.bBuf == nil {
        // assuming p's len() stays the same, so we can optimize for both of their buffer
        // sizes to be equal
        c.bBuf = make([]byte, len(p))
    }

    // read only as much data as we can fit in both p and bBuf
    readA, errA := c.a.Read(p[0:min(len(p), len(c.bBuf))])
    if readA > 0 {
        // bBuf is guaranteed to have at least readA space
        if _, errB := io.ReadFull(c.b, c.bBuf[0:readA]); errB != nil { // docs: "EOF only if no bytes were read"
            if errB == io.ErrUnexpectedEOF {
                return readA, errors.New("compareReader: A had more data than B")
            } else {
                return readA, fmt.Errorf("compareReader: read error from B: %w", errB)
            }
        }

        if !bytes.Equal(p[0:readA], c.bBuf[0:readA]) {
            return readA, errors.New("compareReader: bytes not equal")
        }
    }
    if errA == io.EOF {
        // in the happy case we expect EOF from B as well. this might be an extraneous call b/c
        // we might've got it already from the io.ReadFull above, but it's easier to check here
        readB, errB := c.b.Read(c.bBuf)
        if readB > 0 {
            return readA, errors.New("compareReader: B had more data than A")
        }

        if errB != io.EOF {
            return readA, fmt.Errorf("compareReader: got EOF from A but not from B: %w", errB)
        }
    }

    return readA, errA
}

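The standard way is to stat both files and use os.SameFile.

os.SameFile should do roughly the same thing as Python's filecmp.cmp(f1, f2) with shallow=True, which compares only the file information obtained from stat. The two are not identical, though: SameFile tests whether both results refer to the same underlying file, while the shallow filecmp check compares file type, size, and modification time.

func SameFile(fi1, fi2 FileInfo) bool

SameFile reports whether fi1 and fi2 describe the same file. For example, on Unix this means that the device and inode fields of the two underlying structures are identical; on other systems the decision may be based on the path names. SameFile only applies to results returned by this package's Stat. It returns false in other cases.

But if you really want to compare the files' contents, you have to do it yourself.

Something like this should do it, and it should be memory-efficient compared to the other answers. I looked at github.com/udhos/equalfile and it seemed a bit overkill to me. Before calling compare() here, you should make two os.Stat() calls and compare the file sizes as an early fast path.

The reason to use this implementation over the other answers is that you don't want to hold both files in memory in their entirety if you don't have to. You can read an amount from A and from B, compare, and then move on to the next amount, one buffer-load from each file at a time, until you are done. You just have to be careful, because you may read 50 bytes from A and then 60 bytes from B if one of your reads returned short for some reason.

This implementation assumes that a Read() call will not return N > 0 (some bytes read) together with an error != nil. That is how os.File behaves, but not how other implementations of Read may behave, such as net.TCPConn.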
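import (
    "bytes"
    "errors"
    "io"
    "os"
)

var errNotSame = errors.New("file contents are different")

func compare(p1, p2 string) error {
    var (
        buf1 [8192]byte
        buf2 [8192]byte
    )

    fh1, err := os.Open(p1)
    if err != nil {
        return err
    }
    defer fh1.Close()

    fh2, err := os.Open(p2)
    if err != nil {
        return err
    }
    defer fh2.Close()

    for {
        n1, err1 := fh1.Read(buf1[:])
        n2, err2 := fh2.Read(buf2[:])

        if err1 == io.EOF && err2 == io.EOF {
            // files are the same!
            return nil
        }
        if err1 == io.EOF || err2 == io.EOF {
            return errNotSame
        }
        if err1 != nil {
            return err1
        }
        if err2 != nil {
            return err2
        }

        // NOTE: the source text breaks off at "short read on n1". The rest
        // of this loop is a reconstruction based on the description above.

        // short read on n1: top it up to n2 bytes
        for n1 < n2 {
            more, err := fh1.Read(buf1[n1:n2])
            if err == io.EOF {
                return errNotSame
            }
            if err != nil {
                return err
            }
            n1 += more
        }
        // short read on n2: top it up to n1 bytes
        for n2 < n1 {
            more, err := fh2.Read(buf2[n2:n1])
            if err == io.EOF {
                return errNotSame
            }
            if err != nil {
                return err
            }
            n2 += more
        }
        if !bytes.Equal(buf1[:n1], buf2[:n2]) {
            return errNotSame
        }
    }
}
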
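After reviewing the existing answers, I made a simple package for comparing arbitrary (finite) io.Readers and files, as a convenience: github.com/hlubek/readercomp

Example:

package main

import (
    "fmt"
    "log"
    "os"

    "github.com/hlubek/readercomp"
)

func main() {
    result, err := readercomp.FilesEqual(os.Args[1], os.Args[2])
    if err != nil {
        log.Fatal(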