Support a new regex_exclude option to describe versions that must be excluded from the list of retrieved versions.
This will be useful for products such as KDE Plasma, for which beta releases are designated by a minor or patch version >= 80.
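As an illustration, the exclusion could work along these lines. This is a minimal sketch, not the actual implementation: both patterns, the `v` tag prefix, and the `select_versions` helper are assumptions made up for the example.

```python
import re

# Hypothetical patterns: the real configuration for KDE Plasma may differ.
VERSION_REGEX = re.compile(r"^v(?P<version>\d+\.\d+\.\d+)$")
# Matches tags whose minor or patch component is 80 or above.
EXCLUDE_REGEX = re.compile(
    r"^v\d+\.(?:[89]\d|\d{3,})\.\d+$"      # minor >= 80
    r"|^v\d+\.\d+\.(?:[89]\d|\d{3,})$"     # patch >= 80
)


def select_versions(tags):
    """Return versions that match VERSION_REGEX and do not match EXCLUDE_REGEX."""
    versions = []
    for tag in tags:
        match = VERSION_REGEX.match(tag)
        if match and not EXCLUDE_REGEX.match(tag):
            versions.append(match.group("version"))
    return versions


tags = ["v5.27.0", "v5.27.80", "v5.90.0", "v6.0.0", "not-a-version"]
selected = select_versions(tags)  # the two beta-style tags are dropped
```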
This makes the format open for extension, for example by adding release-cycle-level data such as EOL dates.
Version data is still accessible by the version's name. While this repeats the version name, it's also much more convenient for users of those data.
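A rough picture of such a layout, shown as a Python dictionary. Every key name here other than the version names themselves (`versions`, `name`, `date`, `releases`, `eol`) is an assumption for illustration, not the actual schema.

```python
import json

# Hypothetical layout: key names other than the version names are assumptions.
product_data = {
    "versions": {
        "1.2.0": {"name": "1.2.0", "date": "2023-11-02"},
        "1.1.0": {"name": "1.1.0", "date": "2023-06-15"},
    },
}

# Lookup by version name stays direct.
date_of_1_2_0 = product_data["versions"]["1.2.0"]["date"]

# Each entry carries its own name, which is convenient when iterating values.
names = [version["name"] for version in product_data["versions"].values()]

# New top-level data can be added without disturbing readers of "versions".
product_data["releases"] = {"1.2": {"name": "1.2", "eol": "2025-01-01"}}
serialized = json.dumps(product_data, indent=2)
```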
A few other things have also been updated in the process:
- verbosity of the diff has been increased in update.py to make workflow summaries more readable,
- dates without timezone are now set to UTC by default (this was already assumed, so no impact is expected here).
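The UTC default mentioned in the list above can be sketched as follows. The `ensure_utc` name is hypothetical; only the behavior (naive dates become UTC) comes from the description.

```python
from datetime import datetime, timezone


def ensure_utc(value: datetime) -> datetime:
    """Attach UTC to naive datetimes; leave timezone-aware datetimes unchanged."""
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value


naive = datetime(2023, 12, 1, 10, 30)
aware = ensure_utc(naive)
```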
It may not be the best place for that (gha.py would have been better), but it's the shortest / fastest way to do it for now.
Moreover, it now uses logging for writing the group. The logger format has been updated for this to work. This was done to fix issues in GitHub Actions logs, where groups were declared after the logs.
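A minimal sketch of writing groups through logging, assuming the `::group::` / `::endgroup::` workflow commands and a message-only log format so that each command starts its line. The `log_group` helper is hypothetical; sending groups and regular messages through the same logger is presumably what keeps them in order.

```python
import logging
from contextlib import contextmanager

# Message-only format, so that "::group::" appears at the start of the line.
logging.basicConfig(format="%(message)s", level=logging.INFO)


@contextmanager
def log_group(title: str):
    """Wrap log output in a collapsible GitHub Actions group (hypothetical helper)."""
    logging.info("::group::%s", title)
    try:
        yield
    finally:
        # Always close the group, even if the body raises.
        logging.info("::endgroup::")


with log_group("my-product"):
    logging.info("fetched 3 versions")
```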
- Move frontmatter-related operations from Product to ProductFrontmatter. This makes more sense, as we are manipulating different files / kinds of data.
- Use Product directly to load old versions.
- make the script more resilient to changes in the page by using column names,
- use the product release releaseDate as the date, else the date the version was first found, else the current date (previously the date the version was first found was not used),
- move some code to the Product class.
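The date fallback order described in the list above could look like this. The function and parameter names are assumptions; only the order of preference comes from the description.

```python
from datetime import date
from typing import Optional


def pick_release_date(
    release_date: Optional[date],
    first_found_date: Optional[date],
    today: Optional[date] = None,
) -> date:
    """Return the first available date, in order of preference."""
    if release_date is not None:
        return release_date  # the product release's releaseDate
    if first_found_date is not None:
        return first_found_date  # the date the version was first found
    return today if today is not None else date.today()
```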
Make the script more readable, mostly by:
- using the Product and AutoConfig classes,
- removing the use of functions when unnecessary,
- a little bit of renaming and documentation.
Note that this also changed the module used for regexes in endoflife.py. The regex module is now used because the Python re module does not support identically named groups (as used in the mariadb product). The regex module being backwards-compatible with the standard re module, this should not be an issue.
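The limitation can be reproduced with a small pattern. The pattern below is made up for the example and is not the one used for mariadb; the second half only runs if the third-party `regex` package is installed.

```python
import re

# Hypothetical pattern: the same group name appears in two alternatives.
PATTERN = r"^(?:v(?P<version>\d+\.\d+)|release-(?P<version>\d+\.\d+))$"

try:
    re.compile(PATTERN)
    re_accepts_pattern = True
except re.error:
    # The standard library rejects the redefinition of a group name.
    re_accepts_pattern = False

try:
    import regex  # third-party package (pip install regex)
except ImportError:
    regex = None

if regex is not None:
    # The regex module lets both alternatives share the "version" group.
    compiled = regex.compile(PATTERN)
    versions = [compiled.match(tag).group("version") for tag in ("v10.5", "release-10.6")]
```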
Make the script more readable, mostly by:
- using the Product and AutoConfig classes,
- removing the use of functions when unnecessary,
- a little bit of renaming and documentation.
Make the script more readable, mostly by:
- slightly changing the logic,
- using the Product and AutoConfig classes,
- removing the use of functions when unnecessary,
- a little bit of renaming and documentation.
The previously disabled web.archive.org links have also been re-enabled, because the HTTP retry mechanism has been improved and should handle timeouts a lot better.
Make the script more readable, mostly by:
- using the Product and AutoConfig classes,
- removing the use of functions when unnecessary,
- a little bit of renaming and documentation.
Make the script more readable, mostly by:
- using the endoflife.Product class,
- introducing the endoflife.AutoConfig class to make it easier to manage such configuration,
- removing the unnecessary use of functions,
- a little bit of renaming.
Make the script more readable, mostly by:
- using the endoflife.Product class,
- removing the unnecessary use of functions,
- a little bit of renaming.
Make the script more readable, mostly by:
- using the endoflife.Product class,
- removing the unnecessary use of functions,
- a little bit of renaming.
This will be useful in future PRs:
- add a few more supported formats,
- clean up strings so that we don't have to deal with some special characters in formats.
Create common functions parse_date, parse_month_year_date and parse_datetime.
Those functions support trying multiple formats, and come with default format lists that support most of the date formats encountered so far.
Notable change: year-month dates are now set to the end of month (impacted couchbase-server and ibm-aix).
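A minimal sketch of how such helpers might be structured. The function names come from the description above, but the format lists, signatures, and return types are assumptions; the real defaults are presumably longer.

```python
import calendar
from datetime import date, datetime

# Hypothetical default format lists.
DATE_FORMATS = ("%Y-%m-%d", "%d %B %Y", "%B %d, %Y", "%d %b %Y")
MONTH_YEAR_FORMATS = ("%B %Y", "%b %Y", "%Y-%m")


def parse_datetime(text: str, formats=DATE_FORMATS) -> datetime:
    """Try each format in turn and return the first successful parse."""
    cleaned = text.strip()
    for fmt in formats:
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            continue
    raise ValueError(f"'{text}' does not match any of {formats}")


def parse_date(text: str, formats=DATE_FORMATS) -> date:
    """Same as parse_datetime, but drop the time component."""
    return parse_datetime(text, formats).date()


def parse_month_year_date(text: str, formats=MONTH_YEAR_FORMATS) -> date:
    """Parse a year-month string and return the last day of that month."""
    first_day = parse_datetime(text, formats)
    last_day = calendar.monthrange(first_day.year, first_day.month)[1]
    return first_day.replace(day=last_day).date()
```

Month names are parsed according to the current locale, so this sketch assumes an English (or C) locale.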
When a ChunkedEncodingError occurs, request and response are not set and there is no way to get the URL that causes the error.
With this change all URLs are retried. The max_retries parameter is decreased each time so that we do not get stuck in an infinite loop.
I also considered waiting before retrying, but for now I don't see any benefit to it.
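The retry logic can be sketched as below. To keep the example runnable without network access, `ChunkedEncodingError` is a local stand-in for `requests.exceptions.ChunkedEncodingError`, and the `fetch_all` parameter is a hypothetical hook for the actual fetching code.

```python
class ChunkedEncodingError(Exception):
    """Stand-in for requests.exceptions.ChunkedEncodingError."""


def fetch_urls(urls, fetch_all, max_retries=3):
    """Fetch every URL; on a chunked-encoding failure, retry the whole batch."""
    try:
        return fetch_all(urls)
    except ChunkedEncodingError:
        if max_retries <= 0:
            raise  # retry budget exhausted, so no infinite loop
        # The failing URL is unknown, so all URLs are fetched again.
        return fetch_urls(urls, fetch_all, max_retries - 1)


attempts = []


def flaky_fetch_all(urls):
    """Fail on the first two calls, then succeed."""
    attempts.append(len(urls))
    if len(attempts) < 3:
        raise ChunkedEncodingError("Connection broken")
    return [f"body of {url}" for url in urls]


responses = fetch_urls(["https://example.org/a", "https://example.org/b"], flaky_fetch_all)
```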
Relates to #188.
Intermittent ChunkedEncodingErrors occur while fetching URLs. This change tries to fix them by retrying.
According to https://stackoverflow.com/a/44511691/374236, most servers transmit all data, but that's not what was observed.
For future reference the traceback was:
```
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/runner/work/release-data/release-data/src/firefox.py", line 36, in <module>
    for response in endoflife.fetch_urls(urls):
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/release-data/release-data/src/common/endoflife.py", line 55, in fetch_urls
    return [future.result() for future in as_completed(futures)]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/release-data/release-data/src/common/endoflife.py", line 55, in <listcomp>
    return [future.result() for future in as_completed(futures)]
            ^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.6/x64/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.6/x64/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/opt/hostedtoolcache/Python/3.11.6/x64/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.6/x64/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.6/x64/lib/python3.11/site-packages/requests/sessions.py", line 747, in send
    r.content
  File "/opt/hostedtoolcache/Python/3.11.6/x64/lib/python3.11/site-packages/requests/models.py", line 899, in content
    self._content = b"".join(self.iter_content(CONTENT_CHUNK_SIZE)) or b""
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.6/x64/lib/python3.11/site-packages/requests/models.py", line 818, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read))
```
Move the parallel URL fetching from firefox.py to endoflife.py to make it available to all scripts.
Also add a fix found on https://stackoverflow.com/a/44511691/374236 to avoid ChunkedEncodingError.
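A minimal sketch of parallel fetching with a thread pool, in the spirit of the `as_completed(futures)` pattern. The `fetch_one` parameter and the `max_workers` default are assumptions, and the stub below replaces a real HTTP call so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_urls(urls, fetch_one, max_workers=8):
    """Run fetch_one on every URL in a thread pool and collect the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(fetch_one, url) for url in urls]
        # Results arrive in completion order, not in the order of `urls`.
        return [future.result() for future in as_completed(futures)]


def stub_fetch(url):
    """Stand-in for an HTTP request: return the URL and its length."""
    return url, len(url)


results = fetch_urls(["https://example.org/a", "https://example.org/bb"], stub_fetch)
```

Because results come back in completion order, callers that need to match a response to its URL should carry the URL in the result, as the stub does here.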
The purpose of this new script is to be alerted of new runtimes, while not making updates to the original product file (because release dates cannot be fetched from AWS documentation).
The main reason for doing this is to have some common code between scripts, so that it is easier to change the JSON schema globally and normalize a few things (such as release order).
The Ruby code was kept as is so we can quickly roll back if necessary.
Recently, links to web.archive.org were added for various products. Such links are very slow to load. This increases the default timeout value so that such links do not make the update fail.
The script only gets versions 4.x and above; see the script comments for more information.
Note that the code related to git has been extracted to a common script so that it can be reused for the Debian script.
Create a common function to write resulting JSON files to the releases directory.
It makes this task simpler to read and maintain, while making it modifiable at a central point in the future.
One example of such modification could be the sorting of the versions in a uniform way for all the scripts.
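Such a common function might look like the sketch below. The `write_releases` name, the `{version: date}` shape, and the sort order are assumptions for illustration; the sorting is included only to show the kind of change that becomes possible at a central point.

```python
import json
import tempfile
from pathlib import Path


def write_releases(product_name, versions, releases_dir):
    """Write {version: date} to <releases_dir>/<product_name>.json, newest first."""
    ordered = dict(
        sorted(versions.items(), key=lambda item: (item[1], item[0]), reverse=True)
    )
    path = Path(releases_dir) / f"{product_name}.json"
    path.write_text(json.dumps(ordered, indent=2) + "\n", encoding="utf-8")
    return path


# Usage, with a temporary directory standing in for the releases directory.
with tempfile.TemporaryDirectory() as tmp:
    written = write_releases("example", {"1.0": "2022-01-10", "1.1": "2023-03-02"}, tmp)
    content = json.loads(written.read_text(encoding="utf-8"))
```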
This creates a common function to fetch HTTP URLs, with enhanced capabilities (retry, use of a known User-Agent).
It makes scripts that need those capabilities simpler, while improving other scripts.
This commit also fixes some scripts that did not log properly (cos.py, eks.py, haproxy.py, palo-alto-networks.py, rhel.py, ros.py, unrealircd.py).
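A rough sketch of a fetch helper with retries and a fixed User-Agent. It uses the standard library's `urllib` rather than whatever HTTP client the scripts rely on, and the User-Agent value, parameter names, and `opener` hook are all assumptions.

```python
import urllib.request
from urllib.error import URLError

# Hypothetical value: the real User-Agent string is not given here.
USER_AGENT = "Mozilla/5.0 (compatible; example-release-bot)"


def fetch_url(url, retries=3, timeout=5, opener=urllib.request.urlopen):
    """Return the decoded body of `url`, retrying on network errors."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    last_error = None
    for _ in range(retries + 1):  # one initial attempt plus `retries` retries
        try:
            with opener(request, timeout=timeout) as response:
                return response.read().decode("utf-8")
        except (URLError, TimeoutError) as error:
            last_error = error
    raise last_error
```

The `opener` parameter exists only so the function can be exercised without network access; by default it calls `urllib.request.urlopen`.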