How do I clone the repository with only part of the history?
It depends on what part you want. It's possible to have shallow clones (which is exactly what you need, only a part of the commit history), and the documentation says there are the following options:
--depth <depth>: Create a shallow clone with a history truncated to the specified number of commits. Implies --single-branch unless --no-single-branch is given to fetch the histories near the tips of all branches. If you want to clone submodules shallowly, also pass --shallow-submodules.
--shallow-since=<date>: Create a shallow clone with a history after the specified time.
--shallow-exclude=<revision>: Create a shallow clone with a history, excluding commits reachable from a specified remote branch or tag. This option can be specified multiple times.
So you can choose the maximum depth, a start date, or a revision you don't want included in the shallow clone. Which one to use depends on what you need.
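To make the three options concrete, here is a minimal sketch that builds a throwaway repository and clones it once with each option. The names upstream, by_depth, by_date, by_exclude, the tag v1, and the commit dates are all made up for this demo; the file:// URL is used because Git ignores shallow options when cloning from a plain local path.

```shell
#!/bin/sh
set -e

# Scratch area; every name below is hypothetical and only serves the demo.
work=$(mktemp -d)
cd "$work"

git init -q upstream
cd upstream
git config user.email "demo@example.com"
git config user.name "Demo"
git symbolic-ref HEAD refs/heads/master   # fix the branch name regardless of local defaults

# Create an empty commit with a fixed author/committer date.
commit_at() {
  GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" \
    git commit -q --allow-empty -m "$2"
}

commit_at "2023-01-01T12:00:00" "first"
git tag v1                                # tag the oldest commit
commit_at "2023-06-01T12:00:00" "second"
commit_at "2023-08-01T12:00:00" "third"

cd "$work"
url="file://$work/upstream"

# One clone per option described above.
git clone -q --depth=1 "$url" by_depth
git clone -q --shallow-since="2023-05-01" "$url" by_date
git clone -q --shallow-exclude=v1 "$url" by_exclude

depth_count=$(git -C by_depth rev-list --count HEAD)
date_count=$(git -C by_date rev-list --count HEAD)
exclude_count=$(git -C by_exclude rev-list --count HEAD)

echo "--depth=1:               $depth_count commit(s)"
echo "--shallow-since=2023-05: $date_count commit(s)"
echo "--shallow-exclude=v1:    $exclude_count commit(s)"
```

On this linear history each option cuts at a predictable place; the rest of the answer is about what happens when the history is not linear.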
The question asks for "a few recent commits" and "the last 5 commits", and I'm afraid that the available options (especially
--depth, mentioned in another answer) might not work in all cases. With
--shallow-since=<date> you can set a start date, but you can't control the number of commits. And with
--depth, it's not guaranteed that the number of commits will be the same as the depth.
When you clone a repository, Git also sets your local HEAD to be the same HEAD set in the remote repository (or a specific branch, if you provide one in the command line, such as
git clone url --branch=somebranch). And the
--depth option will fetch all commits reachable from that HEAD, stopping at the specified depth. But setting a depth to some value X doesn't mean that it'll fetch exactly X commits. This option only tells Git the maximum number of "levels" to fetch, which might or might not result in the same number of commits.
For instance, if a commit has more than one parent (which is pretty common when a merge happens), then all parents will be at the same depth (at the same "level"), hence all will be fetched/downloaded.
I've made a quick test here: first I checked out the master branch, then I merged 3 branches all at once, so the resulting commit has 4 parents (master's previous commit, plus the tip of the 3 merged branches).
I pushed this to a remote repo, then cloned it with
git clone --depth=2 remote_url, and 5 commits were fetched. I checked this with
git log --graph --format="%C(#3299ff)%ad %C(auto)%h %C(#cdcd51)[%p] %C(#eeeeee bold)(%an)%C(auto)%d: %s" --decorate, and the output was:
*---. 2023-08-21 13:30:41 534da95 [99be355 8896d09 7519854 615db8f] (Hugo Kotsubo) (HEAD -> master, origin/master, origin/HEAD): Merge branches 'b1', 'b2' and 'b3'
|\ \ \
| | | * 2023-08-21 13:28:43 615db8f  (Hugo Kotsubo) (grafted): b3
| | * 2023-08-21 13:28:24 7519854  (Hugo Kotsubo) (grafted): b2
| * 2023-08-21 13:28:08 8896d09  (Hugo Kotsubo) (grafted): b1
* 2023-08-21 13:27:02 99be355  (Hugo Kotsubo) (grafted): new file
We can see that "level 1" is the remote's HEAD (in this case, the master branch). And "level 2" contains the tip of branches b1 (commit 8896d09), b2 (commit 7519854) and b3 (615db8f), and also the commit 99be355, which was the master branch before the merge.
--depth tells Git the maximum depth I want, but the number of commits won't necessarily match it. In the example above, I set the maximum depth to 2, but 5 commits were fetched (because one level has more than 1 commit).
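The test above can be reproduced locally with a sketch like the following. It is not the exact set of commands I ran: the directory names origin_repo and shallow_copy and the helper add_commit are made up for the demo, and a file:// URL stands in for the remote so that --depth is honored.

```shell
#!/bin/sh
set -e

# Scratch area; origin_repo, shallow_copy and add_commit are hypothetical names.
work=$(mktemp -d)
cd "$work"

git init -q origin_repo
cd origin_repo
git config user.email "demo@example.com"
git config user.name "Demo"
git symbolic-ref HEAD refs/heads/master   # fix the branch name regardless of local defaults

# Add one file and commit it, so every branch changes something different.
add_commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

add_commit base
for b in b1 b2 b3; do
  git checkout -q -b "$b" master
  add_commit "$b"
done
git checkout -q master
add_commit new_file

# Merge the three branches at once: the merge commit gets 4 parents.
git merge -q --no-edit b1 b2 b3 >/dev/null

cd "$work"
full=$(git -C origin_repo rev-list --count HEAD)

# Depth 2 = the merge commit (level 1) plus all of its parents (level 2).
git clone -q --depth=2 "file://$work/origin_repo" shallow_copy
shallow=$(git -C shallow_copy rev-list --count HEAD)

echo "full history:  $full commits"
echo "depth=2 clone: $shallow commits"
```

The shallow clone ends up with more commits than the depth value because all four parents of the merge sit on the same level.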
Setting the maximum depth also doesn't guarantee that it'll get the most recent commits of the whole repository. What if another branch has lots of recent commits that haven't been merged into master yet? With --depth, the only guarantee I have is that I've got the last commits in the branch that corresponds to the remote's HEAD.
Of course I could do
git clone --depth=5 url --branch=anotherbranch, but then I'll get only the most recent commits of that branch - and I'll need to have prior knowledge that that specific branch has the most recent commits, if I want "the most recent of all".
The same applies to
--shallow-since: it'll fetch the commits on the remote's HEAD (or a branch specified by
--branch option), but it won't work in cases where another branch has the most recent commits.
Actually, it's more complicated than that. What if the most recent commit is in one branch, the second most recent is in another branch, and so on? Then cloning one single branch won't do the trick.
If you want to know the most recent commits across all branches, I'm afraid there's no way to do it with
git clone and shallow clones (but I'd love to see an answer proving me wrong). Anyway, for that case, the only solution I can think of is: you'll have to clone the entire repository and then search for those commits (for instance, with something like
git branch --sort=-committerdate or
git for-each-ref --sort=-committerdate refs/heads/, and then getting the first N lines).
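As a rough illustration of that last step, the sketch below builds a small repository with branch tips committed at different times and then lists the branches by most recent commit. The repository name demo, the branch names older and newer, the helper commit_at, and the timestamps are all invented for the example; --count is used in place of piping through head to take the first N entries.

```shell
#!/bin/sh
set -e

# Scratch area; demo, older, newer and commit_at are hypothetical names.
work=$(mktemp -d)
cd "$work"

git init -q demo
cd demo
git config user.email "demo@example.com"
git config user.name "Demo"
git symbolic-ref HEAD refs/heads/master   # fix the branch name regardless of local defaults

# Create an empty commit at a fixed Unix timestamp (UTC).
commit_at() {
  GIT_AUTHOR_DATE="$1 +0000" GIT_COMMITTER_DATE="$1 +0000" \
    git commit -q --allow-empty -m "$2"
}

commit_at 1690000000 "base"
git checkout -q -b older master
commit_at 1690003600 "older tip"
git checkout -q -b newer master
commit_at 1690007200 "newer tip"
git checkout -q master

# The two branches whose tips were committed most recently, newest first.
newest=$(git for-each-ref --sort=-committerdate --count=2 \
  --format='%(refname:short)' refs/heads/)
echo "$newest"

# My own alternative, not from the commands above: list individual commits
# across all refs instead of branch tips.
git log --all -n 3 --format='%cd %h %s' --date=iso
```

Note that in a fresh clone only the checked-out branch exists under refs/heads/; the other branches are remote-tracking refs, so there you would probably point the command at refs/remotes/origin/ instead.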