The Git Commit Hash

栏目: IT技术 · 发布时间: 3年前

内容简介：This might look familiar to you. The sign of a commit which is built up of a few components: the commit hash - a 40 character long string, followed by the author, date and lastly the commit message.This blog post will focus on theReading this, I assume you

commit 13c988d4f15e06bcdd0b0af290086a3079cdadb0
Author: Mike Street
Date:   Sun Mar 3 16:04:33 2019 +0000

    Initial commit

This might look familiar to you. The sign of a commit which is built up of a few components: the commit hash - a 40 character long string, followed by the author, date and lastly the commit message.

This blog post will focus on the commit hash , a seemingly random mish-mash of letters and numbers that you sometimes have to copy and paste about. What is it? How is it made? Can it change? All those questions answered in this blog post (hopefully).

Reading this, I assume you have basic knowledge of Git and have at least committed a few times and hopefully used branches.

This blog post is intended as a primer to the commit hash. There is a lot more logic & magic behind the scenes that goes into making and using the git commit hash which is beyond the scope of this article

What is a hash?

Before we dive into the git specifics, I thought I would give a very brief overview of what a hash is.

There are many different hashing algorithms - MD5 and SHA-1 are examples of these. What a hash allows you to do is take an arbitrary amount of content (be it one word, 100 words or the whole contents of a JavaScript library) and produce a unique fixed string of characters representing that. The length of the string is dependent on which method you choose.

The string (in “theory”) cannot be reverse engineered (e.g. given the hash it is difficult to work out the contents), but would allow you to compare two things to see if they are the same.

Note: I put theory in " , because there are workarounds to hashing; so be wary if you are using a hashing algorithm to encode passwords

One use case of hashes is to compare the contents of two files. You may have seen the integrity check on script tags appearing ( MDN docs for reference ). This stores a hash of the file you expect and compares it with a hash of the file the browser downloads. If the hashes match, you know the files are the same. This saves computers (or humans) meticulously checking each line of a file.

Different hashing algorithms have different methods of hashing (and different length resulting hashes). Of course, collisions can happen (as mentioned on Stack Overflow ) but they are rare and would be extremely unlikely in a git scenario (especially within the same repository).

How it is made?

The commit hash is an SHA-1 hash made up of a few properties from the commit itself. As mentioned above, it is a lot more complex than this post can go into, but understanding the fundamentals is a great first step.

The git hash is made up of the following:

The commit message
The file changes
The commit author (and committer- they can be different)
The date
The parent commit hash

When you take all these into consideration, hopefully you will begin to see how various actions might impact how the commit hash is formed.

One other thing to note is the Git tree doesn't really "exist" as such - it is constructed by following the parent hash. This may seem like the same thing, but there are some slight nuances.

If a commit hash isn't contained in another commit hash as a parent, then that can create an orphaned commit. The exception to this is branches and you HEAD, which point to a specific commit hash.

Git is entirely a text based system - in your files you will have a .git folder, which contains all the commits, branches and other information about your repository as text files - it is worth a look around.

How can it change?

So now we know and understand what the hash is made off, how can it change and what impact would that have on the repository and git history.

Amending a Commit

If you amend the commit message, or the files in a commit, this will change the git hash. You can amend the last commit by running git commit --amend . This allows you to edit the message, what files and changes are included in a commit and even the author.

All of these are things the hash is based on, so amending the commit will change the hash

Cherry Picking

If you have made a commit on a different branch and wish to have it on your current branch, you may be advised to git cherry-pick <hash> . This will work, however, the hash for the commit will change. This is because it will have a new parent commit and so, a new hash will need to be calculated.

This may also cause issues when you come to merge the other branch, as git will see two commits with different hashes that apply the same change - so be careful if you ever cherry pick

Rebasing

Rebasing one branch onto another will have a similar effect to cherry picking. When you rebase, it essentially removes all of your commits on your branch, updates it with the source branch and then reapplies your commits afterwards.

This changes the parent commit of your first commit and so, all of the following commits also need new hashes generating as their parent has changed.

Rebasing can quickly become a mess! It's a hugely powerful tool, but with great power comes great responsibility.

Squash

If you choose to squash commits while merging, a brand new hash will be generated as it is all of your previous changes in one. Because of this. merge/pull requests can get confused and believe they have not been merged in yet as they compare hashes between branches.

How does it impact the repository?

Changing the commit hash is fine as long as you haven't shared the commit . Git relies on these hashes to navigate its timeline and history between different sources, so changing these can have unwanted side effects. Think of it as a Sci-fi movie, you can go back in time but you shouldn't affect someone else's timeline.

If you make a change, and a hash changes, Git will see this as a new commit. If the original one has been pushed and shared, Git will see two different commits in your timeline and want to merge the two different variants when you later want to push again.

if you are working in your own, or are confident in how it can affect things you can force push ( git push --force ) to tell the target repository (generally origin ) that your version is the source of truth.

The End

I hope that has helped further your understanding of Git and that next time something happens, you know why and how! Let me know if you have any questions, need something expanding or have any other feedback.

以上就是本文的全部内容，希望本文的内容对大家的学习或者工作能带来一定的帮助，也希望大家多多支持码农网

查看所有标签

猜你喜欢:

The Git Commit Hash

本站部分资源来源于网络，本站转载出于传递更多信息之目的，版权归原作者或者来源机构所有，如转载稿涉及版权问题，请联系我们。

码农书籍

数据库索引设计与优化

【美】Tapio Lahdenmaki、【美】Michael Leach / 曹怡倩、赵建伟 / 电子工业出版社 / 2015-6 / 79.00元

《数据库索引设计与优化》提供了一种简单、高效、通用的关系型数据库索引设计方法。作者通过系统的讲解及大量的案例清晰地阐释了关系型数据库的访问路径选择原理，以及表和索引的扫描方式，详尽地讲解了如何快速地估算SQL 运行的CPU 时间及执行时间，帮助读者从原理上理解SQL、表及索引结构、访问方式等对关系型数据库造成的影响，并能够运用量化的方法进行判断和优化，指导关系型数据库的索引设计。《数据库索......一起来看看《数据库索引设计与优化》这本书的介绍吧!

码农工具