How do I clone only a subdirectory of a Git repository?


I have a Git repository with two subdirectories at its root:

/finisht
/static
When this was in SVN, /finisht was checked out in one place and /static was checked out elsewhere, like so:

svn co svn+ssh://admin@domain.com/home/admin/repos/finisht/static static

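Is there a way to do this with Git?

Edit: As of Git 2.19, this is finally possible, as shown in another answer.

Consider upvoting that answer.

Note: Git 2.19 implements only client-side support; server-side support is still missing, so this works only when cloning local repositories. Also note that large Git hosts such as GitHub do not actually run the Git server but their own implementations, so even once support appears in the Git server, it will not automatically work on those hosts. (On the other hand, because they do not use the Git server, they could add support to their own implementations before it appears in the Git server.)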


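No, that is not possible in Git.

Implementing something like this in Git would be a substantial effort, and it would mean that the integrity of the client-side repository could no longer be guaranteed. If you are interested, search the git mailing list for discussions of "sparse clone" and "sparse fetch".


In general, the consensus in the Git community is that if you have several directories that are always checked out independently, then they are really two different projects and should live in two different repositories. You can glue them back together using Git submodules.

If you never plan to interact with the repository you cloned from, you can do a full git clone and rewrite your repository with git filter-branch --subdirectory-filter. That way, at least the history is preserved.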

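Git 1.7.0 has "sparse checkouts". See "core.sparseCheckout" in the git config manpage, "Sparse checkout" in the git read-tree manpage, and "Skip-worktree bit" in the git update-index manpage.


The interface is not as convenient as SVN's (for example, there is no way to make a sparse checkout at the time of the initial clone), but the base functionality on which simpler interfaces could be built is now available.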

What you are trying to do is called a sparse checkout; the feature was added in git 1.7.0 (February 2010). The steps for a sparse clone are as follows:

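mkdir <repo>
cd <repo>
git init
git remote add -f origin <url>
# Enable sparse checkout so that .git/info/sparse-checkout is honored
git config core.sparseCheckout true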
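Now you need to define which files and folders you actually want to check out. This is done by listing them in .git/info/sparse-checkout, for example:

echo "some/dir/" >> .git/info/sparse-checkout
echo "another/sub/tree" >> .git/info/sparse-checkout
Last but not least, update your empty repo with the state from the remote: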

git pull origin master
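You will now have files "checked out" for some/dir and another/sub/tree on your file system (with those paths preserved), and no other paths present.

You may also want to look at a longer tutorial, and you should probably read the official documentation for sparse checkout.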
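To see the effect without touching a real server, the steps above can be tried against a throwaway local repository. This is a minimal sketch rather than part of the original answer: the directory names (demo_root, upstream, sparse, unrelated) and file names are made up, and it assumes a git that supports core.sparseCheckout (1.7.0 or later).

```shell
#!/usr/bin/env bash
# Sketch only: every path and name below is hypothetical.
set -eu

demo_root="$(mktemp -d)"

# Build a small stand-in for the remote, with three top-level paths.
git init -q "$demo_root/upstream"
cd "$demo_root/upstream"
git symbolic-ref HEAD refs/heads/master   # force the branch name used below
mkdir -p some/dir another/sub/tree unrelated
echo a > some/dir/a.txt
echo b > another/sub/tree/b.txt
echo c > unrelated/c.txt
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'initial'

# The steps from the answer, pointed at the stand-in remote.
mkdir "$demo_root/sparse"
cd "$demo_root/sparse"
git init -q
git remote add -f origin "$demo_root/upstream"
git config core.sparseCheckout true
mkdir -p .git/info   # normally created by git init; harmless if present
echo "some/dir/" >> .git/info/sparse-checkout
echo "another/sub/tree" >> .git/info/sparse-checkout
git pull -q origin master

# List what actually landed in the working tree.
find . -path ./.git -prune -o -type f -print | sort
```

Only the files under some/dir and another/sub/tree should be listed. The unrelated directory is still fetched into the object database, but it is not written to the working tree.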

As a function:

function git_sparse_clone() (
  rurl="$1" localdir="$2" && shift 2

  mkdir -p "$localdir"
  cd "$localdir"

  git init
  git remote add -f origin "$rurl"

  git config core.sparseCheckout true

  # Loop over the remaining arguments
  for i; do
    echo "$i" >> .git/info/sparse-checkout
  done

  git pull origin master
)
Usage:

git_sparse_clone "http://github.com/tj/n" "./local/location" "/bin"

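Note that this still downloads the whole repository from the server; only the checkout is reduced in size. At the moment it is not possible to clone only a single directory. If you do not need the repository's history, however, you can at least save bandwidth by creating a shallow clone. See below for how to combine shallow clone and sparse checkout.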


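As of git 2.25.0 (January 2020), an experimental sparse-checkout command has been added to git:

git sparse-checkout init
# same as:
# git config core.sparseCheckout true
git sparse-checkout set "A/B"
# same as:
# echo "A/B" >> .git/info/sparse-checkout
git sparse-checkout list
# same as:
# cat .git/info/sparse-checkout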
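A small local experiment makes the effect of these commands visible. The sketch below is an illustration rather than part of the answer: it assumes git 2.25 or later, and the repository layout (README, A/B, A/C) and directory names are invented for the demo.

```shell
#!/usr/bin/env bash
# Sketch only: demo_root, upstream, work and the file layout are hypothetical.
set -eu

demo_root="$(mktemp -d)"

# A stand-in repository with a root file and two sibling directories.
git init -q "$demo_root/upstream"
cd "$demo_root/upstream"
mkdir -p A/B A/C
echo readme > README
echo b > A/B/b.txt
echo c > A/C/c.txt
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'initial'

# Ordinary full clone: everything is checked out at first.
git clone -q "$demo_root/upstream" "$demo_root/work"
cd "$demo_root/work"

# Restrict the working tree to A/B.
git sparse-checkout init
git sparse-checkout set "A/B"

# Show the active patterns and the files left on disk.
git sparse-checkout list
find . -path ./.git -prune -o -type f -print | sort
```

After the set command, A/C should be gone from the working tree while A/B/b.txt remains. Whether the root-level README stays depends on whether the git version in use applies cone mode.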
I wrote a script for this.

Usage:

python get_git_sub_dir.py path/to/sub/dir <RECURSIVE>
This looks far simpler:

git archive --remote=<repo_url> <branch> <path> | tar xvf -
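For a local illustration of what that pipeline produces, the following sketch archives one subdirectory of a throwaway repository and unpacks it elsewhere. The paths (demo_root, repo, docs/guide, out) are invented for the example, and a local path stands in for the remote URL.

```shell
#!/usr/bin/env bash
# Sketch only: all directory and file names are hypothetical.
set -eu

demo_root="$(mktemp -d)"

# Throwaway repository with two directories.
git init -q "$demo_root/repo"
cd "$demo_root/repo"
mkdir -p docs/guide src
echo guide > docs/guide/intro.txt
echo code > src/main.c
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'initial'

# Export only docs/guide from the stand-in remote and unpack it into out/.
mkdir "$demo_root/out"
cd "$demo_root/out"
git archive --remote="$demo_root/repo" HEAD docs/guide | tar xf -

# Show the extracted files.
find . -type f | sort
```

The result is a plain directory of files with no .git metadata, so there is no history and nothing to push from.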
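You can combine the sparse checkout and shallow clone features. The shallow clone cuts off the history, and the sparse checkout pulls only the files matching your patterns: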

git init <repo>
cd <repo>
git remote add origin <url>
git config core.sparsecheckout true
echo "finisht/*" >> .git/info/sparse-checkout
git pull --depth=1 origin master
You need at least git 1.9 for this to work. I have tested it myself only with 2.2.0 and 2.2.2.


This way you will still be able to push, which is not possible with git archive.
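The combined shallow and sparse recipe can be checked locally before trying it on a large repository. The sketch below is only an illustration: the stand-in remote, its two commits, and the directory names are assumptions, and a file:// URL is used so that git goes through its regular transport and --depth behaves as it would over the network.

```shell
#!/usr/bin/env bash
# Sketch only: demo_root, upstream, work and the file names are hypothetical.
set -eu

demo_root="$(mktemp -d)"

# Stand-in remote with two commits touching two directories.
git init -q "$demo_root/upstream"
cd "$demo_root/upstream"
git symbolic-ref HEAD refs/heads/master   # force the branch name used below
mkdir finisht static
echo one > finisht/index.txt
echo css > static/site.css
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'first'
echo two >> finisht/index.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -am 'second'

# Shallow + sparse, following the steps from the answer.
git init -q "$demo_root/work"
cd "$demo_root/work"
git remote add origin "file://$demo_root/upstream"
git config core.sparsecheckout true
mkdir -p .git/info   # normally created by git init; harmless if present
echo "finisht/*" >> .git/info/sparse-checkout
git pull -q --depth=1 origin master

# Number of commits available locally, then the files on disk.
git rev-list --count HEAD
find . -path ./.git -prune -o -type f -print | sort
```

Only the most recent commit should be present locally, and only files under finisht/ should appear in the working tree.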
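It is not possible to clone only a subdirectory with git, but here are a few workarounds.

Filter branch
You may want to rewrite the repository to look as if trunk/public_html/ had been its project root, and discard all other history (using filter-branch). Try it on an already checked-out branch: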

git filter-branch --subdirectory-filter trunk/public_html -- --all
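Notes: the -- separates filter-branch options from revision options, and --all rewrites all branches and tags. All information, including original commit times and merge information, is preserved. This command honors the .git/info/grafts file and refs in the refs/replace/ namespace, so if you have any grafts or replacement refs defined, running this command will make them permanent.

Warning! The rewritten history will have different object names for all objects and will not converge with the original branch. You will not be able to easily push and distribute the rewritten branch on top of the original branch. Do not use this command if you do not know the full implications, and avoid using it...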
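The consequence described in the warning is easy to observe on a throwaway repository. The following is a minimal sketch under stated assumptions: the repository layout and names are invented, and FILTER_BRANCH_SQUELCH_WARNING is set only to skip the notice that newer git versions print before filter-branch runs.

```shell
#!/usr/bin/env bash
# Sketch only: demo_root, repo and the file layout are hypothetical.
set -eu

# Newer git versions print a notice and pause before filter-branch starts;
# this variable suppresses that.
export FILTER_BRANCH_SQUELCH_WARNING=1

demo_root="$(mktemp -d)"
git init -q "$demo_root/repo"
cd "$demo_root/repo"
git config user.name demo
git config user.email demo@example.com

mkdir -p trunk/public_html other
echo '<html></html>' > trunk/public_html/index.html
echo notes > other/notes.txt
git add .
git commit -q -m 'initial'

# Record the commit id, rewrite history, and record it again.
before="$(git rev-parse HEAD)"
git filter-branch --subdirectory-filter trunk/public_html -- --all
after="$(git rev-parse HEAD)"

echo "before: $before"
echo "after:  $after"
ls
```

The two commit ids differ, which is why the rewritten branch cannot simply be pushed on top of the original one, and index.html now sits at the repository root.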
git clone --no-checkout git@foo/bar.git
cd bar
git config core.sparseCheckout true
echo "trunk/public_html/*"> .git/info/sparse-checkout
git checkout master
svn export <repo>/trunk/<folder>
svn export https://github.com/lodash/lodash.com/trunk/docs
git-download(){
    folder=${@/tree\/master/trunk}
    folder=${folder/blob\/master/trunk}
    svn export $folder
}
localRepo=$1
remoteRepo=$2
subDir=$3


# Create local repository for subdirectory checkout, make it hidden to avoid having to drill down to the subfolder
mkdir "./.$localRepo"
cd "./.$localRepo"
git init
git remote add -f origin "$remoteRepo"
git config core.sparseCheckout true

# Add the subdirectory of interest to the sparse checkout.
echo "$subDir" >> .git/info/sparse-checkout

git pull origin master

# Create convenience symlink to the subdirectory of interest
cd ..
ln -s "./.$localRepo/$subDir" "$localRepo"
git clone \
  --depth 1  \
  --filter=blob:none  \
  --sparse \
  https://github.com/cirosantilli/test-git-partial-clone \
;
cd test-git-partial-clone
git sparse-checkout set d1
git clone \
  --depth 1  \
  --filter=blob:none  \
  --sparse \
  https://github.com/cirosantilli/test-git-partial-clone-big-small \
;
cd test-git-partial-clone-big-small
git sparse-checkout set small
git clone \
  --depth 1  \
  --filter=blob:none  \
  --no-checkout \
  https://github.com/cirosantilli/test-git-partial-clone \
;
cd test-git-partial-clone
git checkout master -- d1
  --filter=blob:none \
  --filter=tree:0 \
fatal: invalid filter-spec 'combine:blob:none+tree:0'
git verify-pack -v .git/objects/pack/*.pack
git clone \
  --depth 1 \
  --filter=blob:none \
  --no-checkout \
  https://github.com/cirosantilli/test-git-partial-clone \
;
cd test-git-partial-clone
git sparse-checkout init
git sparse-checkout set d1
git config --local uploadpack.allowfilter 1
git config --local uploadpack.allowanysha1inwant 1
#!/usr/bin/env bash
set -eu

list-objects() (
  git rev-list --all --objects
  echo "master commit SHA: $(git log -1 --format="%H")"
  echo "mybranch commit SHA: $(git log -1 --format="%H")"
  git ls-tree master
  git ls-tree mybranch | grep mybranch
  git ls-tree master~ | grep root
)

# Reproducibility.
export GIT_COMMITTER_NAME='a'
export GIT_COMMITTER_EMAIL='a'
export GIT_AUTHOR_NAME='a'
export GIT_AUTHOR_EMAIL='a'
export GIT_COMMITTER_DATE='2000-01-01T00:00:00+0000'
export GIT_AUTHOR_DATE='2000-01-01T00:00:00+0000'

rm -rf server_repo local_repo
mkdir server_repo
cd server_repo

# Create repo.
git init --quiet
git config --local uploadpack.allowfilter 1
git config --local uploadpack.allowanysha1inwant 1

# First commit.
# Directories present in all branches.
mkdir d1 d2
printf 'd1/a' > ./d1/a
printf 'd1/b' > ./d1/b
printf 'd2/a' > ./d2/a
printf 'd2/b' > ./d2/b
# Present only in root.
mkdir 'root'
printf 'root' > ./root/root
git add .
git commit -m 'root' --quiet

# Second commit only on master.
git rm --quiet -r ./root
mkdir 'master'
printf 'master' > ./master/master
git add .
git commit -m 'master commit' --quiet

# Second commit only on mybranch.
git checkout -b mybranch --quiet master~
git rm --quiet -r ./root
mkdir 'mybranch'
printf 'mybranch' > ./mybranch/mybranch
git add .
git commit -m 'mybranch commit' --quiet

echo "# List and identify all objects"
list-objects
echo

# Restore master.
git checkout --quiet master
cd ..

# Clone. Don't checkout for now, only .git/ dir.
git clone --depth 1 --quiet --no-checkout --filter=blob:none "file://$(pwd)/server_repo" local_repo
cd local_repo

# List missing objects from master.
echo "# Missing objects after --no-checkout"
git rev-list --all --quiet --objects --missing=print
echo

echo "# Git checkout fails without internet"
mv ../server_repo ../server_repo.off
! git checkout master
echo

echo "# Git checkout fetches the missing directory from internet"
mv ../server_repo.off ../server_repo
git checkout master -- d1/
echo

echo "# Missing objects after checking out d1"
git rev-list --all --quiet --objects --missing=print
# List and identify all objects
c6fcdfaf2b1462f809aecdad83a186eeec00f9c1
fc5e97944480982cfc180a6d6634699921ee63ec
7251a83be9a03161acde7b71a8fda9be19f47128
62d67bce3c672fe2b9065f372726a11e57bade7e
b64bf435a3e54c5208a1b70b7bcb0fc627463a75 d1
308150e8fddde043f3dbbb8573abb6af1df96e63 d1/a
f70a17f51b7b30fec48a32e4f19ac15e261fd1a4 d1/b
84de03c312dc741d0f2a66df7b2f168d823e122a d2
0975df9b39e23c15f63db194df7f45c76528bccb d2/a
41484c13520fcbb6e7243a26fdb1fc9405c08520 d2/b
7d5230379e4652f1b1da7ed1e78e0b8253e03ba3 master
8b25206ff90e9432f6f1a8600f87a7bd695a24af master/master
ef29f15c9a7c5417944cc09711b6a9ee51b01d89
19f7a4ca4a038aff89d803f017f76d2b66063043 mybranch
1b671b190e293aa091239b8b5e8c149411d00523 mybranch/mybranch
c3760bb1a0ece87cdbaf9a563c77a45e30a4e30e
a0234da53ec608b54813b4271fbf00ba5318b99f root
93ca1422a8da0a9effc465eccbcb17e23015542d root/root
master commit SHA: fc5e97944480982cfc180a6d6634699921ee63ec
mybranch commit SHA: fc5e97944480982cfc180a6d6634699921ee63ec
040000 tree b64bf435a3e54c5208a1b70b7bcb0fc627463a75    d1
040000 tree 84de03c312dc741d0f2a66df7b2f168d823e122a    d2
040000 tree 7d5230379e4652f1b1da7ed1e78e0b8253e03ba3    master
040000 tree 19f7a4ca4a038aff89d803f017f76d2b66063043    mybranch
040000 tree a0234da53ec608b54813b4271fbf00ba5318b99f    root

# Missing objects after --no-checkout
?f70a17f51b7b30fec48a32e4f19ac15e261fd1a4
?8b25206ff90e9432f6f1a8600f87a7bd695a24af
?41484c13520fcbb6e7243a26fdb1fc9405c08520
?0975df9b39e23c15f63db194df7f45c76528bccb
?308150e8fddde043f3dbbb8573abb6af1df96e63

# Git checkout fails without internet
fatal: '/home/ciro/bak/git/test-git-web-interface/other-test-repos/partial-clone.tmp/server_repo' does not appear to be a git repository
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

# Git checkout fetches the missing directory from internet
remote: Enumerating objects: 1, done.
remote: Counting objects: 100% (1/1), done.
remote: Total 1 (delta 0), reused 0 (delta 0)
Receiving objects: 100% (1/1), 45 bytes | 45.00 KiB/s, done.
remote: Enumerating objects: 1, done.
remote: Counting objects: 100% (1/1), done.
remote: Total 1 (delta 0), reused 0 (delta 0)
Receiving objects: 100% (1/1), 45 bytes | 45.00 KiB/s, done.

# Missing objects after checking out d1
?8b25206ff90e9432f6f1a8600f87a7bd695a24af
?41484c13520fcbb6e7243a26fdb1fc9405c08520
?0975df9b39e23c15f63db194df7f45c76528bccb
git clone --single-branch -b {branch} git@github.com:{user}/{repo}.git
git filter-branch --subdirectory-filter {path/to/folder} HEAD
git remote remove origin
git remote add origin git@github.com:{user}/{new-repo}.git
git push -u origin master
git config --global alias.sparse-checkout "!f(){ [ $# -eq 2 ] && L=${1##*/} L=${L%.git} || L=$2; mkdir -p \"$L/.git/info\" && cd \"$L\" && git init --template= && git remote add origin \"$1\" && git config core.sparseCheckout 1; [ $# -eq 2 ] && echo \"$2\" >> .git/info/sparse-checkout || { shift 2; for i; do echo $i >> .git/info/sparse-checkout; done }; git pull --depth 1 origin master;};f"
git config --global alias.sparse-checkout '!f(){ [ $# -eq 2 ] && L=${1##*/} L=${L%.git} || L=$2; mkdir -p "$L/.git/info" && cd "$L" && git init --template= && git remote add origin "$1" && git config core.sparseCheckout 1; [ $# -eq 2 ] && echo "$2" >> .git/info/sparse-checkout || { shift 2; for i; do echo $i >> .git/info/sparse-checkout; done }; git pull --depth 1 origin master;};f'
# Makes a directory ForStackExchange with Plug checked out
git sparse-checkout https://github.com/YenForYang/ForStackExchange Plug

# To do more than 1 directory, you have to specify the local directory:
git sparse-checkout https://github.com/YenForYang/ForStackExchange ForStackExchange Plug Folder
git clone https://github.com/{user}/{repo}.git ~/my-project
ln -s ~/my-project/my-subfolder ~/Desktop/my-subfolder
cd ~/Desktop/my-subfolder
git status
--- /tmp » git-scp https://github.com/dgraph-io/dgraph/blob/master/contrib/config/kubernetes/helm
A    helm
A    helm/Chart.yaml
A    helm/README.md
A    helm/values.yaml
Exported revision 6367.

--- /tmp » ls | grep helm
Permissions Size User    Date Modified    Name
drwxr-xr-x     - anthony 2020-01-07 15:53 helm/
mkdir app1
cd app1
git init
git remote add origin git@github.com:some-user/full-repo.git
git config core.sparsecheckout true
echo "app1/" >> .git/info/sparse-checkout
git pull origin master
echo "wpm/*" >> .git/info/sparse-checkout
wpm/*
git config core.sparsecheckout true
wpm/*
git checkout master
"mydir/myfolder"
mydir/myfolder
git sparse-checkout set *