- more sensible/performant defaults: default to primary repositories
only for the last year rather than all repositories forever
- allow specifying more than one user at a time
- output the breakdown of contributions without needing `--csv`
- add a space before the `--csv` output
- consolidate some code
- avoid counting authored commits twice, to improve performance
- retry failed GitHub API calls (this happens often when querying all
maintainers)
- stop counting after we find 1000 commits for a given user to avoid
excessive API queries/pagination
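
A minimal sketch of the retry and early-stop ideas above, not the actual implementation. The helper names `with_retries` and `count_commits`, the attempt limit, and the block-based page fetcher are all assumptions made for illustration.

```ruby
# Hypothetical helper: re-run a flaky block a few times before giving up.
def with_retries(max_attempts: 3, base_delay: 0)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError
    # Re-raise once we have used up all attempts.
    raise if attempts >= max_attempts

    sleep(base_delay * attempts)
    retry
  end
end

# Hypothetical pager: the block returns one page of commits (an array).
# Counting stops at `limit` so we never paginate further than needed.
def count_commits(limit: 1000)
  count = 0
  page = 1
  loop do
    batch = with_retries { yield page }
    count += batch.length
    break if batch.empty? || count >= limit

    page += 1
  end
  [count, limit].min
end
```

In real use the block passed to `count_commits` would make the API request for the given page; here it can be any callable returning an array.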
- It's possible to hide your contribution graph and not be searchable on
GitHub. Let's make sure `brew contributions` doesn't fall over if the
user's profile is private (determined by the `/events` user endpoint
returning []).
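
As a rough illustration of that check: the helper name `private_profile?` is made up here, and it assumes the raw JSON body of the events response is already in hand.

```ruby
require "json"

# Hypothetical check: an empty events array is treated as "profile is private".
def private_profile?(events_json)
  JSON.parse(events_json).empty?
end
```

A caller could then skip the user (or print a notice) rather than failing.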
- The usage of this in `brew contributions` wasn't correct: for a user
with 5 authored commits to homebrew/cask that had been committed by
other people, the numbers would turn out as 5 authored, 5 committed.
- I decided to do this properly by getting the SHAs for author and
committer and determining the differences between the two arrays.
This also accounts for when authored commits are 0, or committed
commits are 0, or both.
- Add tests, because I don't want to fix this a third time!
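
The SHA comparison described above can be sketched like this. The method name `contribution_counts` and the hash keys are assumptions for illustration, not the real code.

```ruby
# Hypothetical sketch: count commits a user committed but did not author
# by taking the array difference of the two SHA lists.
def contribution_counts(author_shas, committer_shas)
  committed_only = committer_shas - author_shas
  { author: author_shas.length, committer: committed_only.length }
end

# Five commits authored by the user but committed by other people:
# the committer list for this user is empty, so nothing is double counted.
contribution_counts(%w[a1 a2 a3 a4 a5], [])
```

Because `Array#-` returns only the elements missing from the other array, this also behaves sensibly when either list is empty.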
- For a situation where `authored = 3`, `committed = 4`, the previous
calculation was `3 - 4` which meant that `committed = -1` in the end.
- This was incorrect, since a user can't have negative contributions!
- Instead, only do the subtraction to get the deduplicated `committed`
count if the number of authored commits is higher than the number of
committed commits. This approach should achieve the desired "don't
double count things that the user authored and committed, but do count
things that another person authored that the user committed".
- Double counting is artificially inflating folks' contributions (sadly ;-)).
- Since I'm not going to enumerate every possible author to filter by *both*
fields via the API, let's do some arithmetic to figure out the unique
committer numbers for a user.
- The GitHub list commits API now supports this filtering
(https://docs.github.com/en/rest/commits/commits?apiVersion=2022-11-28#list-commits--parameters),
because I wrote it. :-)
- Authoring a commit and committing a commit are two separate concepts: author
is the person who wrote the code and, in old parlance, the committer is the
person who applied the patch (remember when we sent patches to mailing lists?).
- In practice for us in Homebrew, this occurs when we make a change in GitHub's
web editor, or, more obviously, when BrewTestBot pushes `homebrew-core`
commits from users (then, `BrewTestBot` is the `committer`).
- The `reviewed-by` filter retrieved all reviews for a user, including
those they'd added to their own PRs. Since it's impossible to click
the "approve" button on one's own PR, filter this to `review:approved`
to get "further project goals" kinds of reviews.
- Suggested in https://github.com/Homebrew/brew/pull/14813#discussion_r1118696385.
- Signoffs were just a stopgap until we implemented getting "real"
reviews for a user via the GitHub API. They were a suboptimal way of getting
reviews because they only really exist in Homebrew/homebrew-core where
BrewTestBot adds signoffs for each maintainer who reviewed the PR.
- Now `brew contributions --from=2023-02-23 --to=2023-02-26` works to limit the
results for reviews. I forgot this in the original implementation, again,
ugh.
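
A sketch of how such a review search query might be assembled. The method name `review_search_query` is hypothetical, and using the `created:` qualifier for the date range is an assumption; the `reviewed-by:` and `review:approved` qualifiers are the ones mentioned above.

```ruby
# Hypothetical builder for a GitHub search query string for approved reviews.
def review_search_query(user:, repo:, from: nil, to: nil)
  parts = ["is:pr", "repo:#{repo}", "reviewed-by:#{user}", "review:approved"]

  # Assumption: limit by PR creation date using GitHub's range syntax.
  if from && to
    parts << "created:#{from}..#{to}"
  elsif from
    parts << "created:>=#{from}"
  elsif to
    parts << "created:<=#{to}"
  end

  parts.join(" ")
end

review_search_query(user: "issyl0", repo: "Homebrew/brew",
                    from: "2023-02-23", to: "2023-02-26")
```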
- The search APIs don't have that high a rate limit but we shouldn't need to
worry about that too much because, to get counts, the JSON response comes
with a `total_count` number.
- Turns out in my head a few days ago I was overcomplicating this. I had a
brainwave while in the shower.
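
To illustrate why no pagination is needed for counts: a search response carries `total_count` at the top level, so a single request is enough. The helper name `search_result_count` and the sample body below are made up for this sketch.

```ruby
require "json"

# Hypothetical helper: read the count straight off a search response body.
def search_result_count(response_body)
  JSON.parse(response_body).fetch("total_count")
end

# Illustrative response body, trimmed to the relevant fields.
sample = '{"total_count": 42, "incomplete_results": false, "items": []}'
search_result_count(sample)
```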
- Some refactoring so that we call `totals` to sum up the hash of hashes less,
since the grand total numbers are now used in multiple places.
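
Roughly, summing a hash of hashes once and reusing the result could look like the following. The exact shape of the results hash (repo name mapped to per-kind counts) is an assumption, and `grand_total` is a hypothetical name.

```ruby
# Sum per-kind counts across all repos, e.g. { commits: .., signoffs: .. }.
def totals(results)
  results.each_with_object(Hash.new(0)) do |(_repo, counts), sums|
    counts.each { |kind, number| sums[kind] += number }
  end
end

# Hypothetical helper: one number for the summary sentence.
def grand_total(results)
  totals(results).values.sum
end

results = {
  "brew" => { commits: 332, coauthorships: 13, signoffs: 0 },
  "core" => { commits: 473, coauthorships: 24, signoffs: 326 },
  "cask" => { commits: 4, coauthorships: 0, signoffs: 0 },
}
summed = totals(results) # compute once, reuse wherever totals are needed
```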
- `brew contributions --user=issyl0` was taking forever because it went
through all maintainers first, because the conditionals were in the
wrong order.
- This was too quiet, far too quiet, for something that takes so long.
- Now verbose mode tells you what repos it's scanning for a user.
- With `brew contributions`, this will output a list of stats
(across the specified time period, or all time) for people in the
"maintainers" team on GitHub.
- Add a `--user` flag for getting stats for a specific user (either
username, name or email address).
- This assumes that their Git committer details match the name set on their
GitHub profile.
- Show an error message if trying to generate a CSV for the full maintainer
list, since I haven't worked out how to best show all of that info yet (or
even how best to show only the totals across everything for every user) in
that format.
- Using `git log` was brittle with name changes and email address changes for
contributors over the years, unless we maintained a Git `mailmap` file, which
brings its own maintenance overhead.
- Let's use the GitHub commits API (importantly _not_ the search API) so that
we can give it a username and it will return contributions associated with
every email address on that user's account:
https://docs.github.com/en/rest/commits/commits?apiVersion=2022-11-28#list-commits--parameters.
- This is significantly slower, but it's worth it for correctness,
especially when we get to all maintainers' contributions (in a separate PR).
- The commits API does not (yet?) support trailers or commit "committer"s, just
authors.
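
For illustration, a request URL for the list commits endpoint filtered by author might be built like this. The method name `commits_url` and its keyword arguments are hypothetical; `author`, `since`, `until`, `per_page` and `page` are query parameters of that endpoint.

```ruby
require "uri"

# Hypothetical builder: nwo is "owner/repo", e.g. "Homebrew/brew".
def commits_url(nwo, author:, since: nil, until_date: nil, page: 1)
  params = { author: author, per_page: 100, page: page }
  params[:since] = since if since
  params[:until] = until_date if until_date
  "https://api.github.com/repos/#{nwo}/commits?#{URI.encode_www_form(params)}"
end

commits_url("Homebrew/brew", author: "issyl0")
```

Passing the username as `author` is what lets GitHub match commits across every email address on that account, which `git log` could not do without a `mailmap`.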
- For annual "has this person contributed enough", we focus on the main
Homebrew repos: brew, core and cask. Let's make that easier than
`--repositories=brew,core,cask`.
```
$ brew contributions issyl0 --csv
The user issyl0 has made 1202 contributions in all time.
user,repo,commits,coauthorships,signoffs,total
issyl0,brew,332,13,0,345
issyl0,core,473,24,326,823
issyl0,cask,4,0,0,4
issyl0,aliases,0,0,0,0
issyl0,autoupdate,1,0,0,1
issyl0,bundle,14,2,0,16
issyl0,command-not-found,1,0,0,1
issyl0,test-bot,3,0,0,3
issyl0,services,9,0,0,9
issyl0,cask-drivers,0,0,0,0
issyl0,cask-fonts,0,0,0,0
issyl0,cask-versions,0,0,0,0
```
- This gives users of this command a `--csv` option to pass to... you guessed
it, generate a CSV that's `pbcopy`able elsewhere, for more granular
breakdowns of where a person contributed.
- Inspiration was taken from the mockup in
https://github.com/Homebrew/brew/issues/13642#issuecomment-1254535251
but without the extra dependency of the TerminalTable gem.
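
A dependency-free sketch of producing that kind of CSV. The method name `contributions_csv` and the shape of the results hash are assumptions; it also assumes no field contains commas or quotes, which a real CSV library would handle.

```ruby
# Hypothetical generator: one header row, then one row per repository.
def contributions_csv(user, results)
  header = %w[user repo commits coauthorships signoffs total]
  rows = results.map do |repo, counts|
    [user, repo, counts[:commits], counts[:coauthorships], counts[:signoffs],
     counts.values.sum]
  end
  ([header] + rows).map { |row| row.join(",") }.join("\n")
end

puts contributions_csv("issyl0", {
  "brew" => { commits: 332, coauthorships: 13, signoffs: 0 },
  "cask" => { commits: 4, coauthorships: 0, signoffs: 0 },
})
```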
- Always print a condensed "total contributions" sentence.
Output:
```
$ brew contributions issyl0
The user issyl0 has made 1201 contributions in all time.
$ brew contributions issyl0 --csv
user,repo,commits,coauthorships,signoffs
issyl0,brew,331,13,0
issyl0,core,473,24,326
issyl0,cask,4,0,0
issyl0,aliases,0,0,0
issyl0,autoupdate,1,0,0
issyl0,bundle,14,2,0
issyl0,command-not-found,1,0,0
issyl0,test-bot,3,0,0
issyl0,services,9,0,0
issyl0,cask-drivers,0,0,0
issyl0,cask-fonts,0,0,0
issyl0,cask-versions,0,0,0
```
❯ brew contributions mikemcquaid
mikemcquaid directly authored 23766 commits, co-authored 241 commits, and signed-off 6730 commits across all Homebrew repos in all time. Total: 30737.
- This doesn't require "all" to be specified as part of the command,
it's the default, so usage is now just:
```
$ brew contributions "Issy Long"
$ brew contributions "Issy Long" --repositories=brew,core
$ brew contributions me@issyl0.co.uk --repositories=cask,bundle
```
- As we discussed in the PR review before, `comma_array` doesn't allow
two names, so we can't (yet) do `comma_array "--repositories",
"--repos"` like we can with `flag`. That's an enhancement for the future
if we want to make the flags here less verbose. But now that "all" is
the default, that's maybe less necessary.
- Also stop skipping a "versions" repo. Since
023261038192a4f55c95a4d2486873ec1c9a728a the
`Homebrew/homebrew-cask-versions` tap won't get mistaken for the
`Homebrew/homebrew-versions` tap.