This resolves `uninitialized constant` errors in `brew contributions`
(`Tap`, `GitHub`) and `Utils::GitHub` (`Utils::Curl`).
This also preemptively adds some requires to `Utils::GitHub` and
`GitHub::API`, to avoid similar errors.
- The inconsistency between "author", "committer" and "coauthorship" (only "coauthor" gained a "ship" suffix) has annoyed me ever since I wrote it. It has finally annoyed me enough to fix it.
- This was broken (I did have a commit SHA for the breakage but I can't find it now) because `from` and `args.from` are different variables (one can be nil, the other has a default value).
- So it was reporting very high counts because, despite the message, the `from` restriction was not being passed to `count_repo_commits`.
- more sensible/performant defaults: default to primary repositories
only for the last year rather than all repositories forever
- allow specifying more than one user at a time
- output the breakdown of contributions without needing `--csv`
- add a space before the `--csv` output
- consolidate some code
- avoid counting authored commits twice, to improve performance
- retry failed GitHub API calls (this happens often when querying all
maintainers)
- stop counting after we find 1000 commits for a given user to avoid
excessive API queries/pagination
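
A minimal sketch of the retry-and-cap behaviour from the last two bullets, in plain Ruby. Everything here other than the 1000-commit cap is an assumption for illustration: `with_retries`, `count_commits_capped`, `MAX_RETRIES` and the `fetch_page` callable are hypothetical names, not the actual Homebrew code.

```ruby
# Hypothetical constants: the 1000 cap comes from the text above,
# the retry count is an assumption.
MAX_COMMITS = 1000
MAX_RETRIES = 3

# Retry a block a few times before giving up. StandardError stands in for
# whatever error class the API client raises.
def with_retries(max_retries: MAX_RETRIES)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue StandardError
    retry if attempts < max_retries
    raise
  end
end

# Sum per-page commit counts, stopping once the cap is reached.
# `fetch_page` is a hypothetical callable that returns the number of
# commits on a given page (0 when there are no more pages).
def count_commits_capped(fetch_page, max: MAX_COMMITS)
  total = 0
  page = 1
  loop do
    count = with_retries { fetch_page.call(page) }
    break if count.zero?

    total += count
    # Stop paginating as soon as the cap is hit.
    return max if total >= max

    page += 1
  end
  total
end
```

The cap is checked after every page, so a prolific user costs a bounded number of API calls, and a single failed request no longer aborts the whole run.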
- It's possible to hide your contribution graph and not be searchable on
GitHub. Let's make sure `brew contributions` doesn't fall over if the
user's profile is private (determined by the `/events` user endpoint
returning []).
- The usage of this in `brew contributions` wasn't correct: for a user
with 5 authored commits to homebrew/cask that had been committed by
other people, the numbers would turn out as 5 authored, 5 committed.
- I decided to do this properly by getting the SHAs for author and
committer and determining the differences between the two arrays.
This also accounts for cases where authored commits, committed
commits, or both are 0.
- Add tests, because I don't want to fix this a third time!
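
A minimal sketch of the array-difference approach described above, in plain Ruby. The method name `dedupe_commit_counts` and the sample SHAs are hypothetical and only illustrate the idea; they are not the actual `brew contributions` code.

```ruby
# Hypothetical helper: given the SHAs a user authored and the SHAs they
# committed, count committed commits only when someone else authored them.
def dedupe_commit_counts(authored_shas, committed_shas)
  {
    author:    authored_shas.length,
    # Array#- drops every SHA that also appears in authored_shas.
    committer: (committed_shas - authored_shas).length,
  }
end

# 5 commits authored by the user but committed by other people.
p dedupe_commit_counts(%w[a1 b2 c3 d4 e5], [])
# Nothing authored, 2 commits committed on behalf of others.
p dedupe_commit_counts([], %w[f6 g7])
```

Because the difference of two arrays can never have a negative length, the zero cases need no special handling.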
- For a situation where `authored = 3`, `committed = 4`, the previous
calculation was `3 - 4` which meant that `committed = -1` in the end.
- This was incorrect, since a user can't have negative contributions!
- Instead, only do the subtraction to get the deduplicated `committed`
count if the number of authored commits is higher than the number of
committed commits. This approach should achieve the desired "don't
double count things that the user authored and committed, but do count
things that another person authored that the user committed".
- Double counting is artificially inflating folks' contributions (sadly ;-)).
- Since I'm not going to enumerate every possible author to filter by *both*
fields via the API, let's do some arithmetic to figure out the unique
committer numbers for a user.
- The GitHub list commits API now supports this filtering
(https://docs.github.com/en/rest/commits/commits?apiVersion=2022-11-28#list-commits--parameters),
because I wrote it. :-)
- Authoring a commit and committing a commit are two separate concepts: author
is the person who wrote the code and, in old parlance, the committer is the
person who applied the patch (remember when we sent patches to mailing lists?).
- In practice for us in Homebrew, this occurs when we make a change in GitHub's
web editor, or, more obviously, when BrewTestBot pushes `homebrew-core`
commits from users (then, `BrewTestBot` is the `committer`).
- The `reviewed-by` filter retrieved all reviews for a user, including
those they'd added to their own PRs. Since it's impossible to click
the "approve" button on one's own PR, filter this to `review:approved`
to get "further project goals" kinds of reviews.
- Suggested in https://github.com/Homebrew/brew/pull/14813#discussion_r1118696385.
- Signoffs were just a stopgap until we implemented getting "real"
reviews for a user via the GitHub API. They were a suboptimal way of getting
reviews because they only really exist in Homebrew/homebrew-core where
BrewTestBot adds signoffs for each maintainer who reviewed the PR.
- Now `brew contributions --from=2023-02-23 --to=2023-02-26` works to limit the
results for reviews. I forgot this in the original implementation, again,
ugh.
- The search APIs don't have that high a rate limit but we shouldn't need to
worry about that too much because, to get counts, the JSON response comes
with a `total_count` number.
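
As a rough illustration of why `total_count` keeps the number of requests low, here is a sketch in plain Ruby that reads the count straight from a search response body. The sample body and the `review_count` helper are hypothetical; only the `total_count` field comes from the text above.

```ruby
require "json"

# Hypothetical sample of a search API response body. Real responses carry
# full item objects, but only `total_count` is needed to get a count.
sample_body = <<~JSON
  { "total_count": 42, "incomplete_results": false, "items": [] }
JSON

# Hypothetical helper: read the count without paginating through `items`.
def review_count(response_body)
  JSON.parse(response_body).fetch("total_count")
end

puts review_count(sample_body)
```

One request per query is enough to get the number, regardless of how many results match.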
- Turns out in my head a few days ago I was overcomplicating this. I had a
brainwave while in the shower.
- Some refactoring so that we call `totals` to sum up the hash of hashes less,
since the grand total numbers are now used in multiple places.
- `brew contributions --user=issyl0` was taking forever because it went
through all maintainers first, because the conditionals were in the
wrong order.
- This was too quiet, far too quiet, for something that takes so long.
- Now verbose mode tells you what repos it's scanning for a user.
- With `brew contributions`, this will output a list of stats
(across the specified time period, or all time) for people in the
"maintainers" team on GitHub.
- Add a `--user` flag for getting stats for a specific user (either
username, name or email address).
- This assumes that their Git committer details match the name set on their
GitHub profile.
- Show an error message if trying to generate a CSV for the full maintainer
list, since I haven't worked out how to best show all of that info yet (or
even how best to show only the totals across everything for every user) in
that format.