Use the Git History to Identify Pain Points in Any Project

Category: IT Technology · Published: 6 years ago


Have you heard of Adam Tornhill's work? If not, I highly recommend setting some time aside to check out Your Code as a Crime Scene or Software Design X-Rays. In both books, the author dives into somewhat unexplored territory: looking at the evolution of a codebase as a function of its changes over time.

Much of the tooling we use to mitigate tech debt is preventive. Compilers, linters, static analysis tools, and so on all try to prevent developers from checking in code that might cause problems in the future. What all of these fail to catch is that the code might be perfectly fine and still not lead to a well-functioning system.

The process of software development is as much about developer-to-self and developer-to-others interaction as it is about making the machine do certain things. This interaction can only be left to grow and then reflected upon at certain intervals. And what better tool to help us do that than the one we use on a daily basis: git.

Git to the rescue

In his books, Tornhill discusses multiple variations on the same basic idea: files that change often (with some exceptions) tend to be the ones where most issues occur, hence the need to change them. We rarely think of this simple fact when we work on the same project for a long period of time. Yet when onboarding a new team member, looking up an unfamiliar piece of code, or simply reflecting on your own code, such knowledge can be invaluable.

The code is surprisingly simple:

git log --format=format: --name-only | grep -v '^$' | sort | uniq -c | sort -rn | head -10

What I like to do is add such commands to my list of git aliases. Open up your ~/.gitconfig file and add the following two lines to the [alias] section:

code-changes = "!git log --format=format: --name-only | grep -v '^$' | sort | uniq -c | sort -rn | head -10"
cc = "!git code-changes"

This sorts the files in your project by the number of commits that touched them and takes the top 10. Those are the files where most changes have occurred over time; consequently, there is a higher chance that they will require the most changes in the future.
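If the top of the list is dominated by files you do not care about (documentation, for instance), the same pipeline can be narrowed with standard git log options. The sketch below is only one way to do it: count_changes is a hypothetical helper name, and the 12-month window and the '*.go' pathspec are arbitrary choices for illustration.

```shell
# count_changes: hypothetical helper, not part of git.
# Reads one file path per line on stdin (blank lines are ignored) and
# prints the most frequently occurring paths, most frequent first.
# Optional first argument: how many entries to show (default 10).
count_changes() {
  limit="${1:-10}"
  grep -v '^$' | sort | uniq -c | sort -rn | head -n "$limit"
}

# Possible usage inside a repository (assumed window and file pattern):
#   git log --since="12 months ago" --format=format: --name-only -- '*.go' \
#     | count_changes 10
```

Keeping the counting step separate from the git log call makes it easy to try different filters without retyping the whole pipeline.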

Let's look at an example. I chose (more or less at random) Gorm, one of the popular Go ORMs. These are the top 10 files at the time of writing:

272 main.go
246 scope.go
208 README.md
155 scope_private.go
117 main_test.go
116 gorm_test.go
105 model_struct.go
97 do.go
81 model.go
80 utils.go

Excluding README.md, one can clearly see that a few files dominate the rest. Many Go projects start out as a single main.go file, and over time logic moves out into other files and packages. That is certainly not what happened here. Gorm's main.go is one big chunk of code that could easily be split into two or more files, especially since multiple files can share the same Go package.

I'll go into more detail on Adam Tornhill's work another time. There's even more interesting stuff, like identifying which files get changed together. For now, take this simple trick and try it on the projects you're working on, or on the libraries you frequently work with.
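As a rough taste of the "changed together" idea, the same counting approach can be applied to pairs of files that appear in the same commit. This is a minimal sketch and not Tornhill's actual method: cochange_pairs is a hypothetical helper, the '@@@ ' commit marker is an arbitrary choice, and it assumes no file path starts with that marker or contains a tab.

```shell
# cochange_pairs: hypothetical helper, not part of git.
# Expects `git log --format="@@@ %h" --name-only` output on stdin:
# a marker line per commit, followed by the paths that commit touched.
# Prints the file pairs that most often appear in the same commit.
# Optional first argument: how many pairs to show (default 10).
cochange_pairs() {
  awk '
    # Emit every unordered pair of files collected for the current commit.
    function flush(   i, j) {
      for (i = 0; i < n; i++)
        for (j = i + 1; j < n; j++) {
          if ((files[i] "") < (files[j] ""))
            print files[i] "\t" files[j]
          else
            print files[j] "\t" files[i]
        }
      n = 0
    }
    /^@@@ / { flush(); next }   # a new commit starts: emit the previous one
    NF      { files[n++] = $0 } # non-blank line: a path in this commit
    END     { flush() }
  ' | sort | uniq -c | sort -rn | head -n "${1:-10}"
}

# Possible usage inside a repository:
#   git log --format="@@@ %h" --name-only | cochange_pairs 10
```

Commits that touch many files produce many pairs, so large reformatting or vendoring commits can skew the result; treat the output as a starting point for questions rather than a verdict.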

What do you see?

