Mike Fährmann
9dbe33b6de
replace old %-formatted and .format(…) strings with f-strings ( #7671 )
...
mostly using flynt
https://github.com/ikamensh/flynt
2025-06-29 17:50:19 +02:00
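A minimal sketch of the kind of rewrite the commit above describes. The variable names and the message text are made up for illustration and are not taken from the gallery-dl source.

```python
# Hypothetical values, chosen only for illustration.
name = "example"
count = 3

# Old styles: %-formatting and str.format()
old_percent = "%s: %d files" % (name, count)
old_format = "{}: {} files".format(name, count)

# New style: f-string with the same placeholders inlined
new_fstring = f"{name}: {count} files"

# All three spellings produce the same string.
assert old_percent == old_format == new_fstring
```

Tools such as flynt apply this transformation mechanically, so reviewing the result by hand is still advisable.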
Mike Fährmann
d969dbbab1
[common] fix bug in df6f4e53 when first group is empty
2025-06-28 09:02:33 +02:00
Mike Fährmann
df6f4e5307
[common] only auto-set page_url when first group starts with /
2025-06-26 23:35:53 +02:00
Mike Fährmann
26e81e4162
[common] rename 'gallery_url'/'manga_url' to 'page_url'
2025-06-26 22:06:57 +02:00
Mike Fährmann
b0d7de3603
support using system certificates via 'truststore' ( #6582 )
...
https://github.com/mikf/gallery-dl/issues/6582#issuecomment-2989290495
2025-06-20 19:55:01 +02:00
Mike Fährmann
fcd1b8a155
[common] add a 'kwdict' member to extractor instances
...
to allow setting general metadata at any point and without having to
rely on a manually implemented 'metadata()' method
2025-06-19 19:08:35 +02:00
Mike Fährmann
a80d55d974
[common] improve 'user-agent' override logic
...
Prevent a general 'user-agent' setting, i.e. extractor.user-agent, from
overriding the User-Agent header set by an extractor's 'browser'
option, default or otherwise, meaning only extractor-level or top-level
'user-agent' settings will override the 'browser' User-Agent.
https://github.com/mikf/gallery-dl/issues/7382#issuecomment-2985296321
2025-06-19 18:58:39 +02:00
Mike Fährmann
41191bb60a
'match.group(N)' -> 'match[N]' ( #7671 )
...
'match[N]' is about 2.5x faster than 'match.group(N)'
2025-06-18 13:05:58 +02:00
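For reference, the two spellings in the commit above are equivalent: match objects have supported index access since Python 3.6. The pattern and input string below are hypothetical examples, not taken from any extractor.

```python
import re

# Hypothetical pattern and URL path, for illustration only.
pattern = re.compile(r"/(\w+)/(\d+)")
match = pattern.search("/gallery/12345")

# Index access returns the same groups as .group(N).
assert match[1] == match.group(1) == "gallery"
assert match[2] == match.group(2) == "12345"

# Index 0 is the whole match, just like .group(0).
assert match[0] == match.group(0) == "/gallery/12345"
```

Any speed difference depends on the Python version and is best measured locally, for example with the timeit module.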
Mike Fährmann
efa2ab5903
[common] allow GalleryExtractors to return additional asset files
...
like thumbnails or covers
2025-06-16 22:45:52 +02:00
Mike Fährmann
e08ec7e083
update copyright notices
2025-06-13 00:03:41 +02:00
Mike Fährmann
b4aed5e2c9
[common] allow overriding 'user-agent' when 'browser' is used ( #7647 )
2025-06-10 22:05:28 +02:00
Mike Fährmann
4cfddc144a
[common] import 'datetime' class directly
2025-06-10 21:35:15 +02:00
Mike Fährmann
e68555defa
[common] improve cookie-related logging messages
2025-06-10 21:34:27 +02:00
Mike Fährmann
511cf2363c
[common] update expired cookie messages ( #7644 )
...
- prefix with 'cookies:'
- include domain
- include exact time when it expired
2025-06-09 18:48:04 +02:00
Mike Fährmann
72a01bc4d4
[common] use util.re_compile() in _dump_response
2025-06-05 15:24:22 +02:00
Mike Fährmann
efd49aef73
allow using predefined Firefox/Chrome 'headers' & 'ciphers'
2025-06-05 14:24:38 +02:00
Mike Fährmann
85124fe251
[common] add 'request_json()' convenience function
2025-06-05 09:13:25 +02:00
Mike Fährmann
a7bbccbd7b
[common] add 'request_xml()' convenience function
2025-06-04 23:10:16 +02:00
Mike Fährmann
75b6c8f3d8
re-implement 'category-map' ( #7612 )
2025-06-04 07:57:27 +02:00
Mike Fährmann
8dc5794972
[common] return NullResponse for non-fatal requests ( #7598 )
...
Make Extractor.request(…, fatal=False) actually non-fatal
by returning an empty response instead of raising an exception
when a request fails due to connection issues.
2025-05-29 20:31:41 +02:00
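The idea in the commit above can be sketched with a null-object pattern. Everything below is an assumption made for illustration: the NullResponse attributes, the request() helper, and its 'fetch' parameter are hypothetical and do not reproduce gallery-dl's actual implementation.

```python
class NullResponse:
    """Hypothetical empty stand-in for a response that never arrived."""

    def __init__(self, url, reason=""):
        self.url = url
        self.reason = str(reason)
        self.status_code = 0
        self.text = ""
        self.content = b""
        self.headers = {}

    def json(self):
        # An empty mapping lets callers use .get() without extra checks.
        return {}


def request(url, fetch, fatal=True):
    """Call fetch(url); on a connection error, raise only if 'fatal' is true.

    'fetch' is a hypothetical callable standing in for the real HTTP call.
    """
    try:
        return fetch(url)
    except ConnectionError as exc:
        if fatal:
            raise
        return NullResponse(url, exc)
```

With this shape, a caller that passes fatal=False can inspect the returned object as usual and simply finds it empty when the connection failed.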
Mike Fährmann
ed9c960bb9
[kemonoparty] remove '_prepare_ddosguard_cookies()'
2025-05-24 18:05:03 +02:00
Mike Fährmann
e199396872
[common] simplify 'user' extractors by using 'Dispatch' mixin
2025-05-24 18:04:53 +02:00
Mike Fährmann
88f1541a83
[common] add 'request_location()' convenience function
2025-04-19 16:45:05 +02:00
Mike Fährmann
4c8c98a14d
use internal, non-caching version of re.compile for extractor patterns
...
speeds up total compile time of extractor patterns by ~10ms
2025-04-15 22:47:19 +02:00
Mike Fährmann
567f5d0bc6
[common] add 'subdomains' argument to 'cookies_check()' ( #7188 )
2025-03-18 14:59:41 +01:00
Mike Fährmann
c9488cee30
[util] move Cloudflare/DDoS-Guard detection into 'detect_challenge()'
2025-03-06 14:21:35 +01:00
Mike Fährmann
ddb2c4d69d
[executables] fix SSLError when using HTTPAdapter ( #6393 )
...
always load certifi certificates instead of relying on
'load_default_certs()', which might load no certs at all
2025-01-31 20:36:41 +01:00
Mike Fährmann
1ae3ac5e39
[common] add '_extract_nextdata' method
2025-01-12 11:48:36 +01:00
Mike Fährmann
3f48e2f820
[common] add '_extract_jsonld' method ( #5272 )
2025-01-12 11:07:48 +01:00
Mike Fährmann
4853406fe3
[common] allow MangaExtractors to skip loading manga_url
2025-01-10 21:30:58 +01:00
Mike Fährmann
041baf8441
[common] compute and use latest Firefox UA
...
instead of the latest ESR UA
2024-12-17 22:20:37 +01:00
Mike Fährmann
0802e42c90
[common] use random unused port for '"user-agent": "browser"'
2024-12-17 21:40:20 +01:00
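The commit above does not say how the port is chosen. A common way to obtain an unused port, shown here as a sketch with a hypothetical helper name, is to bind to port 0 and let the operating system pick one.

```python
import socket


def find_unused_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port on 'host'.

    Hypothetical helper; binding to port 0 makes the OS assign a free port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # getsockname() returns (address, port) for AF_INET sockets.
        return sock.getsockname()[1]


port = find_unused_port()
```

The port is released when the socket closes, so another process could claim it before it is reused. Keeping the bound socket open and handing it to the server avoids that race.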
Mike Fährmann
e8826ed3d4
[common] simplify HTTP error messages
...
[warning] HTTPSConnectionPool(host='domain.tld', port=443): Max retries
exceeded with url: /a.jpg (Caused by NameResolutionError("<urllib3.
connection.HTTPSConnection object at 0x7247fe436ea0>: Failed to resolve
'domain.tld' ([Errno -2] Name or service not known)")) (1/5)
->
[warning] NameResolutionError: Failed to resolve 'domain.tld'
([Errno -2] Name or service not known) (1/5)
2024-12-10 17:13:44 +01:00
Mike Fährmann
86f3f3f763
[common] detect DDoS-Guard challenge pages
2024-12-08 21:39:04 +01:00
Mike Fährmann
7091904b20
[common] restore using environment proxies by default ( #6553 , #6609 )
...
change 'proxy-env' default to 'true'
2024-12-07 17:38:44 +01:00
Mike Fährmann
57f8227473
[common] improve handling of 'user-agent' settings ( #6594 )
...
improves 5412b22dae
ignore 'extractor.user-agent' only for extractors using a custom
'User-Agent' header
2024-12-03 10:55:41 +01:00
Mike Fährmann
5412b22dae
[common] allow overriding more default 'User-Agent' headers ( #6496 )
...
ignore 'extractor.user-agent' if it is the default useragent value
and an extractor wants to set its own custom value
2024-11-26 21:50:28 +01:00
Mike Fährmann
c82f3db098
[common] add 'proxy-env' option ( #6134 , #6455 )
...
disable using environment proxies by default
2024-11-15 18:03:56 +01:00
Mike Fährmann
0a72a5009c
[common] disable Authorization header injection from .netrc auth ( #6134 , #6455 )
2024-11-15 17:37:04 +01:00
Mike Fährmann
390b8ddd3e
[common] emit logging messages for --write-pages files
2024-11-03 20:38:33 +01:00
Mike Fährmann
ee61256054
[output] define and use global TTY_STD... values
2024-10-28 14:59:14 +01:00
Mike Fährmann
3946fe5ac4
[cookies] return loaded cookies as list
...
don't set_cookie() them immediately into a CookieJar
also, give some more consistent names to chrome/chromium functions
2024-10-14 14:24:27 +02:00
Mike Fährmann
6d8d882dbf
[common] allow request() to accept all HTTP status codes
...
by passing Ellipsis/... as 'fatal' argument
2024-10-11 19:49:16 +02:00
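Using Ellipsis as a third value next to True and False can be sketched as follows. The accept_status() helper and its rules are hypothetical and deliberately simpler than whatever request() really does; the only point is the 'fatal is ...' identity check.

```python
def accept_status(status_code, fatal=True):
    """Hypothetical helper: decide whether a response status is accepted.

    fatal=...    accept every status code
    fatal=True   raise on any status code >= 400
    fatal=False  report failure by returning False instead of raising
    """
    if fatal is ...:
        # Ellipsis is a singleton, so an identity check is enough.
        return True
    if status_code < 400:
        return True
    if fatal:
        raise RuntimeError(f"HTTP error {status_code}")
    return False
```

Ellipsis works as a sentinel here because it is distinct from True, False, and None, which may already carry their own meaning for the argument.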
Mike Fährmann
f8f67dab22
[cookies] add 'cookies-select' option
2024-09-27 10:41:26 +02:00
Mike Fährmann
0db3c11ab0
[common] use 'cf-mitigated' header to detect challenges
2024-09-07 20:16:06 +02:00
Mike Fährmann
6110e3f940
[common] fix Logger names of BaseCategory extractors
...
update of d11ec009
fixes regressions introduced in 0c178846
2024-07-12 22:51:46 +02:00
Mike Fährmann
eb3ef13d28
include 'zstd' in Accept-Encoding header when supported
...
… and slightly update optional dependency list
2024-07-10 00:33:35 +02:00
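One way to build such a header is sketched below. It assumes that zstd support is signalled by the optional 'zstandard' package being importable; the function name and the base encoding list are hypothetical and not taken from the source.

```python
import importlib.util


def build_accept_encoding():
    """Return an Accept-Encoding value, adding 'zstd' only when usable.

    Assumption: the 'zstandard' package being installed means zstd
    responses can be decoded.
    """
    encodings = ["gzip", "deflate"]
    if importlib.util.find_spec("zstandard") is not None:
        encodings.append("zstd")
    return ", ".join(encodings)


header_value = build_accept_encoding()
```

Checking with find_spec() avoids importing the package just to test for its presence.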
Mike Fährmann
8aca0e6970
update default User-Agent header to Firefox 128 ESR
2024-07-09 20:42:06 +02:00
Mike Fährmann
11421cf940
[skeb] fix '429 Too Many Requests' errors ( #5766 )
...
Introduce '_handle_429' method to make it easier for Extractors to react
to 429 errors regardless of 'sleep-429' settings.
2024-06-21 00:12:05 +02:00
Mike Fährmann
60b4541199
improve a1bb3279, fix oauth:pixiv ( #5757 )
...
Check 'input' option only when required.
This also fixes an exception in oauth:pixiv caused by using the same
'_input' name as a method defined there.
2024-06-18 16:50:04 +02:00