289 lines
11 KiB
Ruby
Raw Normal View History

2023-04-01 18:56:42 -07:00
# typed: true
# frozen_string_literal: true
module Homebrew
module Livecheck
# The `Livecheck::Strategy` module contains the various strategies as well
# as some general-purpose methods for working with them. Within the context
# of the `brew livecheck` command, strategies are established procedures
# for finding new software versions at a given source.
#
# @api private
module Strategy
extend Utils::Curl
module_function
2021-08-10 18:38:21 -04:00
# {Strategy} priorities informally range from 1 to 10, where 10 is the
# highest priority. 5 is the default priority because it's roughly in
# the middle of this range. Strategies with a priority of 0 (or lower)
# are ignored.
DEFAULT_PRIORITY = 5
# cURL's default `--connect-timeout` value can be up to two minutes, so
# we need to use a more reasonable duration (in seconds) to avoid a
# lengthy wait when a connection can't be established.
CURL_CONNECT_TIMEOUT = 10
# cURL does not set a default `--max-time` value, so we provide a value
# to ensure cURL will time out in a reasonable amount of time.
CURL_MAX_TIME = CURL_CONNECT_TIMEOUT + 5
# The `curl` process will sometimes hang indefinitely (despite setting
# the `--max-time` argument) and it needs to be quit for livecheck to
# continue. This value is used to set the `timeout` argument on
2021-08-10 18:38:21 -04:00
# `Utils::Curl` method calls in {Strategy}.
CURL_PROCESS_TIMEOUT = CURL_MAX_TIME + 5
# The maximum number of redirections that `curl` should allow.
MAX_REDIRECTIONS = 5
# This value is passed to `#parse_curl_output` to ensure that the limit
# for the number of responses it will parse corresponds to the maximum
# number of responses in this context. The `+ 1` here accounts for the
# situation where there are exactly `MAX_REDIRECTIONS` number of
# redirections, followed by a final `200 OK` response.
MAX_PARSE_ITERATIONS = MAX_REDIRECTIONS + 1
2021-08-10 18:38:21 -04:00
# Baseline `curl` arguments used in {Strategy} methods.
DEFAULT_CURL_ARGS = [
# Follow redirections to handle mirrors, relocations, etc.
"--location",
"--max-redirs", MAX_REDIRECTIONS.to_s,
# Avoid progress bar text, so we can reliably identify `curl` error
# messages in output
"--silent"
].freeze
# `curl` arguments used in `Strategy#page_content` method.
PAGE_CONTENT_CURL_ARGS = ([
"--compressed",
# Include HTTP response headers in output, so we can identify the
# final URL after any redirections
"--include",
] + DEFAULT_CURL_ARGS).freeze
2021-08-10 18:38:21 -04:00
# Baseline `curl` options used in {Strategy} methods.
DEFAULT_CURL_OPTIONS = {
2021-09-06 22:56:25 -04:00
print_stdout: false,
print_stderr: false,
debug: false,
verbose: false,
timeout: CURL_PROCESS_TIMEOUT,
connect_timeout: CURL_CONNECT_TIMEOUT,
max_time: CURL_MAX_TIME,
retries: 0,
}.freeze
# A regex used to identify a tarball extension at the end of a string.
2021-08-17 19:15:07 -04:00
TARBALL_EXTENSION_REGEX = /
\.t
(?:ar(?:\.(?:bz2|gz|lz|lzma|lzo|xz|Z|zst))?|
b2|bz2?|z2|az|gz|lz|lzma|xz|Z|aZ|zst)
$
/ix
Standardize valid strategy block return types Valid `strategy` block return types currently vary between strategies. Some only accept a string whereas others accept a string or array of strings. [`strategy` blocks also accept a `nil` return (to simplify early returns) but this was already standardized across strategies.] While some strategies only identify one version by default (where a string is an appropriate return type), it could be that a strategy block identifies more than one version. In this situation, the strategy would need to be modified to accept (and work with) an array from a `strategy` block. Rather than waiting for this to become a problem, this modifies all strategies to standardize on allowing `strategy` blocks to return a string or array of strings (even if only one of these is currently used in practice). Standardizing valid return types helps to further simplify the mental model for `strategy` blocks and reduce cognitive load. This commit extracts related logic from `#find_versions` into methods like `#versions_from_content`, which is conceptually similar to `PageMatch#page_matches` (renamed to `#versions_from_content` for consistency). This allows us to write tests for the related code without having to make network requests (or stub them) at this point. In general, this also helps to better align the structure of strategies and how the various `#find_versions` methods work with versions. There's still more planned work to be done here but this is a step in the right direction.
2021-08-10 11:09:55 -04:00
# An error message to use when a `strategy` block returns a value of
# an inappropriate type.
INVALID_BLOCK_RETURN_VALUE_MSG = "Return value of a strategy block must be a string or array of strings."
# Creates and/or returns a `@strategies` `Hash`, which maps a snake
# case strategy name symbol (e.g. `:page_match`) to the associated
2021-08-10 18:38:21 -04:00
# strategy.
#
# At present, this should only be called after tap strategies have been
# loaded, otherwise livecheck won't be able to use them.
# @return [Hash]
sig { returns(T::Hash[Symbol, T.untyped]) }
def strategies
return @strategies if defined? @strategies
@strategies = {}
Strategy.constants.sort.each do |const_symbol|
constant = Strategy.const_get(const_symbol)
next unless constant.is_a?(Class)
2023-03-07 10:11:59 -08:00
key = Utils.underscore(const_symbol).to_sym
@strategies[key] = constant
end
@strategies
end
private_class_method :strategies
2021-08-10 18:38:21 -04:00
# Returns the strategy that corresponds to the provided `Symbol` (or
# `nil` if there is no matching strategy).
#
2021-08-10 18:38:21 -04:00
# @param symbol [Symbol, nil] the strategy name in snake case as a
# `Symbol` (e.g. `:page_match`)
# @return [Class, nil]
2022-01-27 12:23:40 -05:00
sig { params(symbol: T.nilable(Symbol)).returns(T.untyped) }
def from_symbol(symbol)
strategies[symbol] if symbol.present?
end
# Returns an array of strategies that apply to the provided URL.
#
# @param url [String] the URL to check for matching strategies
2021-08-10 18:38:21 -04:00
# @param livecheck_strategy [Symbol] a strategy symbol from the
# `livecheck` block
# @param url_provided [Boolean] whether a url is provided in the
# `livecheck` block
# @param regex_provided [Boolean] whether a regex is provided in the
# `livecheck` block
2021-08-10 18:38:21 -04:00
# @param block_provided [Boolean] whether a `strategy` block is provided
# in the `livecheck` block
# @return [Array]
sig {
params(
url: String,
livecheck_strategy: T.nilable(Symbol),
url_provided: T::Boolean,
regex_provided: T::Boolean,
block_provided: T::Boolean,
).returns(T::Array[T.untyped])
}
def from_url(url, livecheck_strategy: nil, url_provided: false, regex_provided: false, block_provided: false)
usable_strategies = strategies.select do |strategy_symbol, strategy|
if strategy == PageMatch
# Only treat the strategy as usable if the `livecheck` block
# contains a regex and/or `strategy` block
next if !regex_provided && !block_provided
elsif [Json, Xml, Yaml].include?(strategy)
# Only treat the strategy as usable if the `livecheck` block
# specifies the strategy and contains a `strategy` block
next if (livecheck_strategy != strategy_symbol) || !block_provided
elsif strategy.const_defined?(:PRIORITY) &&
2023-04-01 18:56:42 -07:00
!strategy.const_get(:PRIORITY).positive? &&
livecheck_strategy != strategy_symbol
# Ignore strategies with a priority of 0 or lower, unless the
# strategy is specified in the `livecheck` block
next
end
strategy.respond_to?(:match?) && strategy.match?(url)
end.values
# Sort usable strategies in descending order by priority, using the
# DEFAULT_PRIORITY when a strategy doesn't contain a PRIORITY constant
usable_strategies.sort_by do |strategy|
2023-04-01 18:56:42 -07:00
(strategy.const_defined?(:PRIORITY) ? -strategy.const_get(:PRIORITY) : -DEFAULT_PRIORITY)
end
end
2021-08-10 18:38:21 -04:00
# Collects HTTP response headers, starting with the provided URL.
# Redirections will be followed and all the response headers are
# collected into an array of hashes.
#
# @param url [String] the URL to fetch
# @param homebrew_curl [Boolean] whether to use brewed curl with the URL
2021-08-10 18:38:21 -04:00
# @return [Array]
sig { params(url: String, homebrew_curl: T::Boolean).returns(T::Array[T::Hash[String, String]]) }
def self.page_headers(url, homebrew_curl: false)
headers = []
[:default, :browser].each do |user_agent|
begin
parsed_output = curl_headers(
"--max-redirs",
MAX_REDIRECTIONS.to_s,
url,
wanted_headers: ["location", "content-disposition"],
use_homebrew_curl: homebrew_curl,
user_agent: user_agent,
**DEFAULT_CURL_OPTIONS,
)
rescue ErrorDuringExecution
next
end
parsed_output[:responses].each { |response| headers << response[:headers] }
break if headers.present?
end
headers
end
2020-12-22 22:46:52 -05:00
# Fetches the content at the URL and returns a hash containing the
# content and, if there are any redirections, the final URL.
2020-12-14 13:03:10 -05:00
# If `curl` encounters an error, the hash will contain a `:messages`
# array with the error message instead.
2020-12-22 22:46:52 -05:00
#
# @param url [String] the URL of the content to check
# @param homebrew_curl [Boolean] whether to use brewed curl with the URL
2020-12-22 22:46:52 -05:00
# @return [Hash]
sig { params(url: String, homebrew_curl: T::Boolean).returns(T::Hash[Symbol, T.untyped]) }
def self.page_content(url, homebrew_curl: false)
2023-04-01 18:56:42 -07:00
stderr = T.let(nil, T.nilable(String))
[:default, :browser].each do |user_agent|
stdout, stderr, status = curl_output(
*PAGE_CONTENT_CURL_ARGS, url,
**DEFAULT_CURL_OPTIONS,
use_homebrew_curl: homebrew_curl,
user_agent: user_agent
)
next unless status.success?
2020-12-14 13:03:10 -05:00
# stdout contains the header information followed by the page content.
# We use #scrub here to avoid "invalid byte sequence in UTF-8" errors.
output = stdout.scrub
2020-12-14 13:03:10 -05:00
# Separate the head(s)/body and identify the final URL (after any
# redirections)
parsed_output = parse_curl_output(output, max_iterations: MAX_PARSE_ITERATIONS)
final_url = curl_response_last_location(parsed_output[:responses], absolutize: true, base_url: url)
2020-12-14 13:03:10 -05:00
data = { content: parsed_output[:body] }
data[:final_url] = final_url if final_url.present? && final_url != url
return data
2020-12-14 13:03:10 -05:00
end
error_msgs = stderr&.scan(/^curl:.+$/)
{ messages: error_msgs.presence || ["cURL failed without a detectable error"] }
end
Standardize valid strategy block return types Valid `strategy` block return types currently vary between strategies. Some only accept a string whereas others accept a string or array of strings. [`strategy` blocks also accept a `nil` return (to simplify early returns) but this was already standardized across strategies.] While some strategies only identify one version by default (where a string is an appropriate return type), it could be that a strategy block identifies more than one version. In this situation, the strategy would need to be modified to accept (and work with) an array from a `strategy` block. Rather than waiting for this to become a problem, this modifies all strategies to standardize on allowing `strategy` blocks to return a string or array of strings (even if only one of these is currently used in practice). Standardizing valid return types helps to further simplify the mental model for `strategy` blocks and reduce cognitive load. This commit extracts related logic from `#find_versions` into methods like `#versions_from_content`, which is conceptually similar to `PageMatch#page_matches` (renamed to `#versions_from_content` for consistency). This allows us to write tests for the related code without having to make network requests (or stub them) at this point. In general, this also helps to better align the structure of strategies and how the various `#find_versions` methods work with versions. There's still more planned work to be done here but this is a step in the right direction.
2021-08-10 11:09:55 -04:00
# Handles the return value from a `strategy` block in a `livecheck`
# block.
#
# @param value [] the return value from a `strategy` block
# @return [Array]
sig { params(value: T.untyped).returns(T::Array[String]) }
def self.handle_block_return(value)
case value
when String
[value]
when Array
value.compact.uniq
when nil
[]
else
raise TypeError, INVALID_BLOCK_RETURN_VALUE_MSG
end
end
end
end
end
require_relative "strategy/apache"
require_relative "strategy/bitbucket"
2020-12-11 04:31:14 +01:00
require_relative "strategy/cpan"
require_relative "strategy/crate"
require_relative "strategy/electron_builder"
2021-04-04 03:00:34 +02:00
require_relative "strategy/extract_plist"
require_relative "strategy/git"
2020-12-02 18:04:22 +05:30
require_relative "strategy/github_latest"
require_relative "strategy/github_releases"
require_relative "strategy/gnome"
require_relative "strategy/gnu"
require_relative "strategy/hackage"
require_relative "strategy/header_match"
require_relative "strategy/json"
require_relative "strategy/launchpad"
require_relative "strategy/npm"
require_relative "strategy/page_match"
require_relative "strategy/pypi"
require_relative "strategy/sourceforge"
2020-12-12 21:59:04 +01:00
require_relative "strategy/sparkle"
require_relative "strategy/xml"
require_relative "strategy/xorg"
require_relative "strategy/yaml"