This new type converts a comma-separated list of values into a range, only keeping the first and last value.
For example, '1.0, 1.1, 1.2' becomes '1.0 - 1.2'.
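As a rough illustration of the conversion described above, a minimal sketch could look like the following. The function name `to_range` is hypothetical, and the handling of empty or single-value input is an assumption, not something stated here.

```python
def to_range(value: str) -> str:
    """Collapse a comma-separated list into 'first - last'.

    Hypothetical helper: the name and the edge-case behavior are assumptions.
    """
    items = [item.strip() for item in value.split(",") if item.strip()]
    if not items:
        return ""  # assumption: empty input yields an empty string
    if len(items) == 1:
        return items[0]  # assumption: a single value is returned unchanged
    # Only the first and last values are kept.
    return f"{items[0]} - {items[-1]}"
```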
- Change headers_selector to header_selector: there was no need to ask for a selection of the header cells, so it now works the same way as the rows selection.
- Set header_selector default value to 'thead tr'.
- Set rows_selector default value to 'tbody tr'.
Make the Apple script compatible with the way update.py now works, which is 'product' oriented, meaning the script will be called once for each product.
To minimize the impact, responses are now cached to avoid rate-limiting by support.apple.com.
Version patterns have also been moved to product's auto configuration to make future changes simpler.
- add an 'ignore_empty_releases' option to exclude empty releases (which are future releases for Debian),
- improve logging,
- add the 'YYYY-mm' month_year date format.
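Parsing a 'YYYY-mm' value can be sketched with the standard library as below. The function name `parse_month_year` is hypothetical, and defaulting to the first day of the month is simply what `strptime` does here; the script may pick a different day.

```python
from datetime import datetime


def parse_month_year(text: str) -> datetime:
    """Parse a 'YYYY-mm' string such as '2024-03'.

    Hypothetical helper: strptime fills the missing day with 1,
    which may differ from what the real script does.
    """
    return datetime.strptime(text.strip(), "%Y-%m")
```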
It has been observed that in many cases, such as for amazon-neptune or amazon-corretto, data is split into two tables: supported and unsupported. Those tables usually have similar column names, so allowing multiple tables simplifies the configuration.
Also fix script's documentation and improve logging.
Replace request_html with playwright, as request_html is [not maintained anymore](https://pypi.org/project/requests-html/) and scripts using it, such as artifactory.py, started to fail.
Release data were not loaded in the `ProductData#__enter__` method. Data would be lost for auto configuration declaring an auto method updating releases followed by an auto method updating versions.
Also raise an error when product data is completely empty after the update, preventing the product data from being updated at all. This does not catch all types of errors (what if the second script silently fails completely?), but it's a start.
- Add strict typing to the fields. This makes the script fail if some column does not have the expected type (for example because of a change in the HTML page).
- Support regex and templating for all fields (not only the releaseCycle). This makes it possible to extract only the necessary information without having to do some sort of 'magic' cleanup (replacements in dates have been reverted).
- Do not inject 'releaseCycle' anymore in the JSON (there is already the name).
- The script did not raise any error when no table was selected.
- It does not really make sense to support selecting multiple tables for now as there is no use case, and using multiple configurations is always possible.
Also fix the way the releaseCycle is handled.
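The regex-plus-template field extraction mentioned above could be sketched roughly as follows. This is only an illustration: `extract_field` is a hypothetical name, and `str.format` stands in for whatever templating engine the script actually uses. Raising on a non-match mirrors the idea of failing when a column does not have the expected shape.

```python
import re


def extract_field(raw: str, regex: str, template: str) -> str:
    """Extract named groups from a cell and render them through a template.

    Hypothetical helper: str.format is an assumed stand-in for the
    script's real templating mechanism.
    """
    match = re.search(regex, raw)
    if match is None:
        # Fail loudly instead of silently keeping unexpected content.
        raise ValueError(f"{raw!r} does not match {regex!r}")
    return template.format(**match.groupdict())
```

For instance, `extract_field("Version 1.2 (LTS)", r"(?P<major>\d+)\.(?P<minor>\d+)", "{major}.{minor}")` keeps only the numeric part of the cell.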
Support retrieving and updating generic release-level data, such as support and eol dates. The JSON format has been changed accordingly to add a new top-level `releases` key.
The `aws-lambda.py` script has been updated to make use of this new feature.
Until now products could declare multiple auto-update methods, but they all had to be of the same kind.
For example if you used the git auto-update method, you could not use an additional github_releases or custom auto-update method.
This is an issue as it prevents us from extending the auto-update process, for example by having a product use the 'git' auto-update method to retrieve all the versions, and a custom script to retrieve support and EOL dates.
This improves the orchestration of script execution so that it supports auto configurations using a mix of methods, meaning:
- multiple kinds of methods, such as git and github_releases,
- or multiple custom methods.
A side-effect of those changes is that now a failure in a generic script does not cancel the update of subsequent products.
Another side-effect, unwanted this time, is that custom scripts managing multiple products, such as apple.py, are now executed multiple times instead of once.
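The orchestration described above can be pictured with the sketch below. It is not the code of update.py: `update_all`, the `kind` key and the runner signature are all hypothetical. It only shows the two properties mentioned, namely mixed methods per product and a failure that does not cancel the following products.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical runner signature: (product name, method configuration).
Runner = Callable[[str, dict], None]


def update_all(configs: Dict[str, List[dict]],
               runners: Dict[str, Runner]) -> List[Tuple[str, str]]:
    """Run every configured method for every product, one product at a time."""
    failures = []
    for product, methods in configs.items():
        try:
            # Methods of different kinds can be mixed for the same product.
            for method in methods:
                runners[method["kind"]](product, method)
        except Exception as exc:
            # Record the failure and continue with the next product.
            failures.append((product, str(exc)))
    return failures
```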
Generic support for cumulative updates has been added to speed up execution time of some scripts that were very long (in comparison with the vast majority of products), usually because they were involving a lot of HTTP requests.
This feature was developed particularly for the firefox.py and unity.py scripts, which often took a long time to execute (a minute or more according to GHA summaries). Those scripts have been updated to make use of this new feature.
This way the writing of the JSON file is handled automatically if the update does not fail.
It paves the way for further global improvements, such as better error handling.
Make endoflife.list_products return the product instead of just the product name, to avoid having to reload the product a second time to get more information.
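One way to get "written automatically unless the update fails" is a context manager, sketched below. The class name `ProductDataSketch`, its `versions` attribute and the file layout are assumptions for illustration and do not reproduce the real `ProductData` class.

```python
import json
from pathlib import Path


class ProductDataSketch:
    """Hypothetical stand-in for a product data holder used with `with`."""

    def __init__(self, path: Path):
        self.path = path
        self.versions = {}

    def __enter__(self):
        # Load existing data so that successive updates build on each other.
        if self.path.exists():
            self.versions = json.loads(self.path.read_text())
        return self

    def __exit__(self, exc_type, exc, tb):
        # Write the file only when the block finished without an exception.
        if exc_type is None:
            self.path.write_text(json.dumps(self.versions, indent=2))
        return False  # never swallow the exception
```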
Up to now extra version fields were ignored: only name and date fields were accepted. This changes that by retaining the full JSON data when reading the file, making it possible in the future to support custom fields.
This also fixes a bug with versions released on the same date: they were not ordered as expected (reverse order).
Support a new regex_exclude option to describe versions that must be excluded from the list of retrieved versions.
This will be useful for products such as KDE Plasma, for which beta releases are designated by the use of minor or patch version >= 80.
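A minimal sketch of how such an exclusion could be applied is shown below. The function `filter_versions` and both patterns are hypothetical; the exclusion pattern is only one possible way to express "minor or patch version >= 80" and is not taken from any product configuration.

```python
import re
from typing import List, Optional

# Illustrative patterns only (assumptions, not the real configuration).
INCLUDE = r"\d+\.\d+(?:\.\d+)*"
# Minor >= 80, or patch >= 80: two digits starting with 8 or 9, or 3+ digits.
EXCLUDE = (r"\d+\.(?:[89]\d|\d{3,})(?:\.\d+)*"
           r"|\d+\.\d+\.(?:[89]\d|\d{3,})(?:\.\d+)*")


def filter_versions(versions: List[str], regex: str,
                    regex_exclude: Optional[str] = None) -> List[str]:
    """Keep versions matching `regex` and not matching `regex_exclude`."""
    include = re.compile(regex)
    exclude = re.compile(regex_exclude) if regex_exclude else None
    return [
        v for v in versions
        if include.fullmatch(v) and not (exclude and exclude.fullmatch(v))
    ]
```

With these patterns, `filter_versions(["5.27.4", "5.27.80", "5.90.0", "6.0.0"], INCLUDE, EXCLUDE)` keeps `5.27.4` and `6.0.0`.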
This makes the format open for extension, such as adding release cycle level data (such as EOL dates).
Version data is still accessible by the version's name. While this repeats the version name, it's also much more convenient for users of those data.
A few other things have also been updated in the process:
- verbosity of the diff has been increased in update.py to make workflow summaries more readable,
- dates without a timezone are now set to UTC by default (this was already assumed, so no impact is expected here).
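The UTC default from the last item can be sketched as follows. The helper name `ensure_utc` is hypothetical; it only shows the idea of attaching UTC to naive dates while leaving timezone-aware ones untouched.

```python
from datetime import datetime, timezone


def ensure_utc(value: datetime) -> datetime:
    """Attach UTC to naive datetimes; aware datetimes are returned as-is."""
    if value.tzinfo is None:
        # The clock fields are kept; only the timezone information is added.
        return value.replace(tzinfo=timezone.utc)
    return value
```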