O.F.K.

Notes/rabbit hole: auto-generating Git co-authors

Tags: tech, refactoring, git

As a refactoring freak, I sometimes run into the scenario where someone reaches out to me about code that I’ve “written” - but in reality all I’ve done is move it to a new file. Because my name is on the commit message for this code, and there’s no other history for it, it looks like I’m the actual author.

Right now, I’m reorganizing a package in a monorepo and I want to make sure I attribute credit where it’s due / don’t erase the git history and end up having to “support” questions about the content of the files.

So I am digging into git stuff and I have a few TILs. They’re not all relevant. Then some notes about the actual solutions towards the bottom.

# Background/goal

GitHub can parse commit messages for co-authors when you put trailers at the end of the commit message in a certain format, i.e.

My commit msg.

Co-authored-by: NAME <NAME@EXAMPLE.COM>
Co-authored-by: ANOTHER-NAME <ANOTHER-NAME@EXAMPLE.COM>
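To see the format in action, here's a minimal sketch in a throwaway repo. The names, emails and file are made up for illustration. Each extra -m becomes its own paragraph, which gives the trailers the blank line they need before them.

```shell
# Throwaway repo so nothing real gets touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Me"
git config user.email "me@example.com"

echo "hello" > foo.md
git add foo.md

# The second -m becomes a separate paragraph: the trailers sit at the
# end of the message, after a blank line.
git commit -q -m "Move foo into its own file" \
  -m "Co-authored-by: Ada <ada@example.com>
Co-authored-by: Grace <grace@example.com>"

# Print the full message of the last commit to check the result.
msg=$(git log -1 --format=%B)
echo "$msg"
```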

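This is particularly applicable for moving stuff between files, since renaming a whole file is already well-supported: git detects renames by content similarity, so the history can follow the file (e.g. with git log --follow). I use VSCode and I believe that when you rename a file there, the result is the same as git mv once it's staged.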

So basically I want to do something like: when I move functionality around, I add (in a somewhat straightforward way) the original author(s) as (a) co-author(s).

Say you got a repo like the below:

repo
├── README.md
├── ui
│   ├── foo.md
│   └── bar.md
└── api
    ├── foo
    └── bar

You want to move the ui/ folder out of this repo and into its own repo. You kind of don’t want to “take credit” for all the files, aka lose all the git history and stamp your own name all over the files like you pulled them out of your booty.

So you could add everyone as a co-author there too. I just found out about something that could help me a bunch to do that and I’m excited: filtering git log! You can filter in pretty nifty ways, like per directory or file, or search by author name, or aggregate the output e.g. by uniqueness, and even format the output.

There must be a better way to do this - like, chop down the history for a set of files and/or import it - but this is really fun and maybe I’ll research that afterwards.

*** Update I asked my friend and git expert Pauline Vos (author of the upcoming course Git Legit!!) and she said in this case, she’d fork the repo, and then delete everything except the ui folder to keep the history!! genius.
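A rough sketch of Pauline's suggestion as I understand it, with a local clone standing in for the fork. The repo layout, author and commit messages are made up for illustration.

```shell
# Build a tiny stand-in for the monorepo (hypothetical files and author).
src=$(mktemp -d)
cd "$src"
git init -q
git config user.name "Me"
git config user.email "me@example.com"
mkdir ui api
echo "ui stuff" > ui/foo.md
echo "api stuff" > api/foo
git add .
git commit -q --author="Ada <ada@example.com>" -m "add ui and api"

# "Fork" it: a local clone plays the role of the fork here.
fork=$(mktemp -d)
git clone -q "$src" "$fork/ui-repo"
cd "$fork/ui-repo"
git config user.name "Me"
git config user.email "me@example.com"

# Delete everything except ui/ and commit that.
git rm -q -r api
git commit -q -m "remove everything except ui"

# The original author still shows up in the history for ui/.
ui_authors=$(git log --pretty=format:"%an" -- ui)
echo "$ui_authors"
```

The removal commit is mine, but it doesn't touch ui/, so the log for that folder only lists the people who actually wrote it.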

So basically this post is about use case 3: moving functionality from one file to another.

# TILs

Basically, TIL that git log is a lot more powerful than I thought.

# git log

# You can filter git logs??!

  1. by directory or file path: git log -- {path}

returns the commit history for the path.

  2. if within a file, by line! git log -L{number},{number}:{filePath} e.g.
git log -L44,44:foo/bar.ts

will return the commit history for line 44. (Spell out both ends of the range: as far as I can tell, if you leave the end off, like -L44:foo/bar.ts, git traces from line 44 to the end of the file instead of just that one line.)

  3. by GROUP OF LINES bruh 🤯 git log -L{number},+{count}:{filePath} e.g.
git log -L44,+35:foo/bar.ts

will return the commit history for the 35 lines starting at line 44, i.e. lines 44 to 78.

NOW, by default, this’ll output the patch (the diff for those lines) - you can suppress that with --no-patch.

  4. by author. This was a dead end in this case, but I also liked it a lot for future ref.
git log --author=0rnella

and many more!
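Here's a small sketch that tries each of those filters in a throwaway repo, so you can see the counts change. The files, authors and commit messages are all made up.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Me"
git config user.email "me@example.com"

# Commit 1 (Ada): three lines in foo.txt.
printf 'one\ntwo\nthree\n' > foo.txt
git add foo.txt
git commit -q --author="Ada <ada@example.com>" -m "add foo"

# Commit 2 (Grace): only line 3 changes.
printf 'one\ntwo\nTHREE\n' > foo.txt
git commit -q -a --author="Grace <grace@example.com>" -m "shout line 3"

# Commit 3 (Grace): an unrelated file.
echo "hi" > bar.txt
git add bar.txt
git commit -q --author="Grace <grace@example.com>" -m "add bar"

# Commits touching foo.txt at all.
by_path=$(git log --oneline -- foo.txt | wc -l)
# Commits touching line 1 only.
by_line=$(git log --oneline --no-patch -L1,1:foo.txt | wc -l)
# Commits touching the 2 lines starting at line 2 (lines 2 and 3).
by_range=$(git log --oneline --no-patch -L2,+2:foo.txt | wc -l)
# Commits authored by Grace, across the whole repo.
by_author=$(git log --oneline --author=Grace | wc -l)

echo "path: $by_path, line: $by_line, range: $by_range, author: $by_author"
```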

# side note on git blame

Git blame is aight but I’m having trouble working with the output for my use case. It basically tells you which commit last touched each line, but I am trying to get all the commits for each line and so I’m p sure I need log for that.

# format the git log output

  1. --pretty flag: with the --pretty flag you can format your output with some presets like oneline, short, medium, full or even email
git log --pretty=oneline

outputs the commit history with only 1 line per commit e.g.

1896157796c1427d284f2b46a1d3a47fbef4b18b (HEAD -> main, origin/main) update to react composability
aac0a034f2df2a9a48a131beedcfb68b66cae377 react composability notes
5c46bd45c807988013e19a871981d4ab827b7da7 new post
91073e4272614229bac38cea4596ba7f0b4d36c5 forgotten update from january

(side note, kind of meta that this is the git history for this project no?)

  2. custom formatsssss????? then it gets really fucking interesting because you can pass a custom format, referencing specific info in the commit with shorthand!
git log --pretty=format:"%an + %ae"

%an stands for author name, and %ae for author email. This will output the commit history as a list of names + emails.
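A quick sketch of a custom format against a throwaway repo, mixing a few placeholders (%s is the commit subject). The authors and file are invented for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Me"
git config user.email "me@example.com"

echo "a" > notes.md
git add notes.md
git commit -q --author="Ada <ada@example.com>" -m "first"
echo "b" >> notes.md
git commit -q -a --author="Grace <grace@example.com>" -m "second"

# %an = author name, %ae = author email, %s = subject line.
formatted=$(git log --pretty=format:"%an <%ae>: %s")
echo "$formatted"
# Grace <grace@example.com>: second
# Ada <ada@example.com>: first
```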

# unix commands

TIL about a bunch of unix commands from stackoverflow - didn’t think to note down all the urls, but here’s the main one. I also got sort and uniq from github copilot - can’t attribute credit further bc gen AI.

# wc to output a count

TIL about the wc command in unix which outputs the word count (or line count, or whatever) of text you pass it.

# uniq for unique values

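TIL also about uniq, which drops repeated lines from text - but only when the repeats are right next to each other (so you’d usually sort first, or dedupe some other way if order matters).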

# sort for sorting

TIL also about sort, which sorts text that it’s fed. By default it sorts alphabetically. It’s not useful for me though because I’d prefer to write the authors in reverse chronological order, which is what git outputs. But worth noting.
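To make the difference between these concrete, here's a tiny sketch on a made-up author list. The awk one-liner isn't something I covered above: it's a common idiom that keeps the first occurrence of each line without changing the order.

```shell
# Made-up author list, newest first, with a repeat that isn't adjacent.
authors=$(printf 'grace\ngrace\nada\ngrace\n')

# uniq only collapses repeats that are next to each other.
with_uniq=$(printf '%s\n' "$authors" | uniq)

# sort -u removes every duplicate but loses the original order.
with_sort=$(printf '%s\n' "$authors" | sort -u)

# awk keeps the first occurrence of each line and preserves the order.
with_awk=$(printf '%s\n' "$authors" | awk '!seen[$0]++')

# wc -l counts the lines that are left.
count=$(printf '%s\n' "$with_awk" | wc -l)

printf 'uniq:\n%s\nsort -u:\n%s\nawk:\n%s\ncount: %s\n' \
  "$with_uniq" "$with_sort" "$with_awk" "$count"
```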

# Something useful

Finally!

# 1. aggregating co-authors for work you’re moving

Going back to scenario 3, where I am moving some functionality to a different file: I can create a list of co-authors for the given set of lines that I’m moving like this:

git log --pretty=format:"Co-authored-by: %an <%ae>"  -L {startLine},{endLine}:{fileName} --no-patch

e.g. (let’s get meta and use something from this repo)

git log --pretty=format:"Co-authored-by: %an <%ae>"  -L 3,28:layouts/blog/single.html --no-patch

This outputs what I want to stick onto the end of my commit message!!

Co-authored-by: friggito <[email protected]>
Co-authored-by: friggito <[email protected]>
Co-authored-by: friggito <[email protected]>
Co-authored-by: friggito <[email protected]>
Co-authored-by: friggito <[email protected]>

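now let’s make this unique by sticking | uniq at the end (heads up: uniq only collapses duplicates that are next to each other, which is enough here since every line is the same)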

git log --pretty=format:"Co-authored-by: %an <%ae>"  -L 3,28:layouts/blog/single.html --no-patch | uniq

the result:

Co-authored-by: friggito <[email protected]>

obviously I could have used something with more co-authors…. but was lazy.
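For a case with more than one author, here's a sketch of a little helper. coauthors is a name I made up, and it swaps uniq for the awk idiom so an author who shows up again later in the history still only gets listed once. The repo, file and authors are invented.

```shell
# Print one Co-authored-by line per distinct author of a line range,
# newest first. awk keeps only the first time each line appears.
coauthors() {
  # $1 = start line, $2 = end line, $3 = file path
  git log --no-patch --pretty=format:"Co-authored-by: %an <%ae>" \
    -L"$1,$2:$3" | awk '!seen[$0]++'
}

# Throwaway repo where Ada, Grace, then Ada again edit the same line.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Me"
git config user.email "me@example.com"

printf 'one\n' > foo.txt
git add foo.txt
git commit -q --author="Ada <ada@example.com>" -m "add foo"
printf 'uno\n' > foo.txt
git commit -q -a --author="Grace <grace@example.com>" -m "spanish"
printf 'eins\n' > foo.txt
git commit -q -a --author="Ada <ada@example.com>" -m "german"

trailers=$(coauthors 1 1 foo.txt)
echo "$trailers"

# For comparison: plain uniq leaves Ada in twice (3 lines instead of 2).
uniq_lines=$(git log --no-patch --pretty=format:"Co-authored-by: %an <%ae>" \
  -L1,1:foo.txt | uniq | wc -l)
```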

# 2. Counting…?

shit, I had something with the commit count, but I forgot. fuck. I know at some point I was wondering how to count commits - ah, yeah! to verify that I was really filtering correctly. So this is more of a test.

If I’m getting the commit history for a file, it should be more commits (or the same) than the number of commits for a specific line in that file. So I wanted to test that filtering by a specific line, or group of lines, was working. And my idea to do that was to output the number of commits for that group of lines and verify it’s less than or equal to the number for the whole file.

By the way, by doing that, I did realize that my initial filtering command was malformatted (things won’t always error out in this case, unfortunately) so yeah, this served its purpose.

I’m going to make each commit output on one line, then count the number of lines. (We’re talking git log output lines here, not lines in the file.)

git log --oneline -- {filePath} | wc -l

that outputs a commit count for the whole file.

now commit count for a specific line or group of lines (for a single line, use the same number for start and end):

git log --oneline -L {startLine},{endLine}:{filePath} --no-patch | wc -l

--no-patch is crucial here bc otherwise you end up counting the lines in the patch too!

So anyway this is good for testing.
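The check can be wrapped up as a small sketch like this. check_range_filter is a made-up helper name, and the repo and authors are invented. It also shows one gotcha I'm confident about: --pretty=format: puts newlines between entries but not after the last one, so piping it into wc -l comes up one short, whereas --oneline doesn't have that problem.

```shell
# Succeeds when the line-range history is no longer than the file history.
check_range_filter() {
  # $1 = file path, $2 = start line, $3 = end line
  file_count=$(git log --oneline -- "$1" | wc -l)
  range_count=$(git log --oneline --no-patch -L"$2,$3:$1" | wc -l)
  echo "file: $file_count commits, lines $2-$3: $range_count commits"
  [ "$range_count" -le "$file_count" ]
}

# Throwaway repo: two commits touch foo.txt, only the first touches line 1.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Me"
git config user.email "me@example.com"

printf 'one\ntwo\n' > foo.txt
git add foo.txt
git commit -q --author="Ada <ada@example.com>" -m "add foo"
printf 'one\nTWO\n' > foo.txt
git commit -q -a --author="Grace <grace@example.com>" -m "shout line 2"

check_range_filter foo.txt 1 1 && result=ok || result=suspicious
echo "$result"

# Gotcha: no newline after the last entry, so this counts 1, not 2.
short_count=$(git log --pretty=format:%h -- foo.txt | wc -l)
```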

# Conclusion

Am I going to use this? Probably not, but at least it’s written down somewhere. Byeeee!