Make endoflife.list_products return the product itself instead of just its name, so that callers do not have to load the product a second time to get more information.
Support a new regex_exclude option to describe versions that must be excluded from the list of retrieved versions.
This will be useful for products such as KDE Plasma, where beta releases are identified by a minor or patch version >= 80.
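A minimal sketch of how such an exclusion pattern could filter versions. The pattern and the `filter_versions` helper are hypothetical (not taken from the codebase), and the pattern only covers two-digit components from 80 to 99 in `major.minor.patch` versions:

```python
import re

# Hypothetical exclusion pattern: matches versions whose minor or patch
# component is between 80 and 99 (for example 5.26.90 or 5.80.0).
REGEX_EXCLUDE = re.compile(r"^\d+\.(?:[89]\d\.\d+|\d+\.[89]\d)$")


def filter_versions(versions, exclude=REGEX_EXCLUDE):
    """Return the versions that do not match the exclusion pattern."""
    return [v for v in versions if not exclude.match(v)]


candidates = ["5.26.5", "5.26.90", "5.27.0", "5.80.0"]
print(filter_versions(candidates))  # ['5.26.5', '5.27.0']
```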
This makes the format open for extension, such as adding release cycle level data (such as EOL dates).
Version data is still accessible by the version's name. While this repeats the version name, it is also much more convenient for users of that data.
A few other things have also been updated in the process:
- verbosity of the diff has been increased in update.py to make workflow summaries more readable,
- dates without a timezone are now set to UTC by default (this was already assumed, so no impact is expected here).
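The UTC default can be sketched as follows. `ensure_utc` is a hypothetical helper name used for illustration only, not the function used in the scripts:

```python
from datetime import datetime, timezone


def ensure_utc(value: datetime) -> datetime:
    """Attach UTC to naive datetimes; leave aware datetimes untouched."""
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value


naive = datetime(2023, 11, 26, 12, 0)
print(ensure_utc(naive).isoformat())  # 2023-11-26T12:00:00+00:00
```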
It may not be the best place for that (gha.py would have been better), but it's the shortest / fastest way to do it for now.
Moreover, it now uses logging to write the group. The logger format has been updated for this to work. This was done to fix issues in GitHub Actions logs, where groups were declared after the logs.
- Move frontmatter-related operations from Product to ProductFrontmatter. This makes more sense, as we are manipulating different files / kinds of data.
- Use Product directly to load old versions.
- make the script more resilient to changes in the page by using column names,
- use the product release's releaseDate as the date, falling back to the date the version was first found, then to the current date (previously, the date the version was first found was not used),
- move some code to the Product class.
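The date fallback order described above could look like the sketch below. The function and parameter names are hypothetical and only illustrate the precedence, not the script's actual code:

```python
from datetime import date
from typing import Optional


def pick_version_date(release_date: Optional[date],
                      first_found: Optional[date],
                      today: Optional[date] = None) -> date:
    """Prefer the release date, then the first-found date, then today."""
    if release_date is not None:
        return release_date
    if first_found is not None:
        return first_found
    # `today` is injectable only to keep the sketch easy to test.
    return today if today is not None else date.today()


print(pick_version_date(None, date(2023, 5, 1)))  # 2023-05-01
```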
Make the script more readable, mostly by:
- using the Product and AutoConfig classes,
- removing the use of functions when unnecessary,
- a little bit of renaming and documentation.
Note that this also changed the module used for regexes in endoflife.py. The regex module is now used because the Python re module does not support identically named groups (as used in the mariadb product). Since the regex module is backwards-compatible with the standard re module, this should not be an issue.
Make the script more readable, mostly by:
- using the Product and AutoConfig classes,
- removing the use of functions when unnecessary,
- a little bit of renaming and documentation.
Make the script more readable, mostly by:
- slightly changing the logic,
- using the Product and AutoConfig classes,
- removing the use of functions when unnecessary,
- a little bit of renaming and documentation.
web.archive.org, which had been disabled, has also been re-enabled because the HTTP retry mechanism has been improved and should handle timeouts a lot better.
Make the script more readable, mostly by:
- using the Product and AutoConfig classes,
- removing the use of functions when unnecessary,
- a little bit of renaming and documentation.
Make the script more readable, mostly by:
- using the endoflife.Product class,
- introducing the endoflife.AutoConfig class to make it easier to manage such configuration,
- removing the unnecessary use of functions,
- a little bit of renaming.
Make the script more readable, mostly by:
- using the endoflife.Product class,
- removing the unnecessary use of functions,
- a little bit of renaming.
Make the script more readable, mostly by:
- using the endoflife.Product class,
- removing the unnecessary use of functions,
- a little bit of renaming.
When a ChunkedEncodingError occurs, request and response are not set, so there is no way to get the URL that caused the error.
With this change all URLs are retried. The max_retries parameter is decreased each time so that we do not get stuck in an infinite loop.
I also considered waiting before retrying, but for now I don't see any benefit to it.
Relates to #188.
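A minimal sketch of the bounded retry described above. `TransientFetchError` stands in for `requests.exceptions.ChunkedEncodingError`, and `fetch_all_with_retry` and its `fetch_all` argument are hypothetical names, so this is an illustration of the idea rather than the code in endoflife.py:

```python
class TransientFetchError(Exception):
    """Stand-in for requests.exceptions.ChunkedEncodingError in this sketch."""


def fetch_all_with_retry(fetch_all, urls, max_retries=5):
    """Retry the whole batch, since the failing URL cannot be identified."""
    try:
        return fetch_all(urls)
    except TransientFetchError:
        if max_retries <= 0:
            raise  # Budget exhausted: give up instead of looping forever.
        # Decrease the budget on every retry so the recursion terminates.
        return fetch_all_with_retry(fetch_all, urls, max_retries - 1)


calls = {"count": 0}


def flaky_fetch_all(urls):
    """Fail on the first two calls, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientFetchError("connection broken")
    return [f"content of {url}" for url in urls]


print(fetch_all_with_retry(flaky_fetch_all, ["https://example.org/a"]))
```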
Intermittent ChunkedEncodingErrors occur while fetching URLs. This change tries to fix them by retrying.
According to https://stackoverflow.com/a/44511691/374236, most servers transmit all data, but that's not what was observed.
For future reference the traceback was:
```
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/runner/work/release-data/release-data/src/firefox.py", line 36, in <module>
    for response in endoflife.fetch_urls(urls):
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/release-data/release-data/src/common/endoflife.py", line 55, in fetch_urls
    return [future.result() for future in as_completed(futures)]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/release-data/release-data/src/common/endoflife.py", line 55, in <listcomp>
    return [future.result() for future in as_completed(futures)]
            ^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.6/x64/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.6/x64/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/opt/hostedtoolcache/Python/3.11.6/x64/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.6/x64/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.6/x64/lib/python3.11/site-packages/requests/sessions.py", line 747, in send
    r.content
  File "/opt/hostedtoolcache/Python/3.11.6/x64/lib/python3.11/site-packages/requests/models.py", line 899, in content
    self._content = b"".join(self.iter_content(CONTENT_CHUNK_SIZE)) or b""
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.6/x64/lib/python3.11/site-packages/requests/models.py", line 818, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read))
```
Move the parallel URL fetching from firefox.py to endoflife.py to make it available to all scripts.
Also add a fix found at https://stackoverflow.com/a/44511691/374236 to avoid ChunkedEncodingError.
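The general shape of parallel fetching with a thread pool can be sketched as below. `fetch_in_parallel` and the injected `fetch` callable are hypothetical names chosen for this illustration; the real function performs HTTP requests, which are replaced here by a stub so the sketch runs offline:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_in_parallel(urls, fetch, max_workers=8):
    """Run `fetch` on every URL concurrently and collect the results.

    Results come back in completion order, not in the order of `urls`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(fetch, url) for url in urls]
        return [future.result() for future in as_completed(futures)]


def fake_fetch(url):
    """Stub standing in for a real HTTP GET."""
    return f"body of {url}"


results = fetch_in_parallel(["https://example.org/a", "https://example.org/b"], fake_fetch)
print(sorted(results))
```

Because results arrive in completion order, callers that need to map a response back to its URL have to carry the URL in the returned value.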
The purpose of this new script is to be alerted of new runtimes, while not making updates to the original product file (because release dates cannot be fetched from AWS documentation).
Links to web.archive.org were recently added for various products. Such links take a long time to load. This increases the default timeout value so that they do not make the update fail.
Create a common function to write resulting JSON files to the releases directory.
It makes this task simpler to read and maintain, while making it modifiable at a central point in the future.
One example of such modification could be the sorting of the versions in a uniform way for all the scripts.
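A rough sketch of what a shared writer with uniform sorting could look like. The `write_releases` name, its signature, the flat version-to-date mapping, and the ordering (newest date first, then version name) are all assumptions made for this example:

```python
import json
import tempfile
from pathlib import Path


def write_releases(product, releases, directory):
    """Write a version -> ISO date mapping to <directory>/<product>.json."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    # Assumed ordering: newest date first, ties broken by version name.
    ordered = dict(sorted(releases.items(),
                          key=lambda item: (item[1], item[0]),
                          reverse=True))
    path = directory / f"{product}.json"
    with path.open("w", encoding="utf-8") as f:
        json.dump(ordered, f, indent=2)
        f.write("\n")
    return path


with tempfile.TemporaryDirectory() as tmp:
    written = write_releases("example", {"1.0": "2022-01-01", "1.1": "2023-01-01"}, tmp)
    print(written.read_text(encoding="utf-8"))
```

Centralizing the write this way means a later change to the ordering or the file layout only touches one function.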
This creates a common function to fetch HTTP URLs, with enhanced capabilities (retry, use of a known User-Agent).
It makes scripts that need those capabilities simpler, while improving other scripts.
This commit also fixes some scripts that did not log properly (cos.py, eks.py, haproxy.py, palo-alto-networks.py, rhel.py, ros.py, unrealircd.py).
Generic scripts are scripts that handle multiple products based on some identifier (URL, coordinates...).
This creates a common function to list and load product configurations for those scripts.
It makes them simpler to read and maintain, while making the way they work much more consistent.