37 Commits

Author SHA1 Message Date
Mike Fährmann
e006d26c8e Revert "use f-strings when building 'pattern'"
revert d7c97d5a97.
2025-12-20 22:07:37 +01:00
Mike Fährmann
d7c97d5a97 use f-strings when building 'pattern' 2025-10-20 21:23:11 +02:00
Mike Fährmann
9bf76c1352 replace 'util.re()' with 'text.re()'
remove unnecessary 'util' imports
2025-10-20 17:44:58 +02:00
Mike Fährmann
9dbe33b6de replace old %-formatted and .format(…) strings with f-strings (#7671)
mostly using flynt
https://github.com/ikamensh/flynt
2025-06-29 17:50:19 +02:00
Mike Fährmann
26e81e4162 [common] rename 'gallery_url'/'manga_url' to 'page_url' 2025-06-26 22:06:57 +02:00
Mike Fährmann
41191bb60a 'match.group(N)' -> 'match[N]' (#7671)
2.5x faster
2025-06-18 13:05:58 +02:00
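For readers unfamiliar with the change in 41191bb60a: since Python 3.6, a match object supports index access, so `match[N]` returns the same value as `match.group(N)`. The sketch below only illustrates that equivalence. The pattern, the URL, and the helper names are hypothetical and are not taken from the extractor module, and the "2.5x faster" figure from the commit message is not reproduced here.

```python
import re

# Hypothetical pattern and URL, chosen only for illustration.
pattern = re.compile(r"/manga/([^/?#]+)/([^/?#]+)")
match = pattern.search("https://example.org/manga/some-title/chapter-1/")


def groups_via_method(m):
    """Old style: explicit method call per group."""
    return m.group(1), m.group(2)


def groups_via_index(m):
    """New style: index access through Match.__getitem__ (Python 3.6+)."""
    return m[1], m[2]


if __name__ == "__main__":
    print(groups_via_method(match))
    print(groups_via_index(match))
```

Both helpers return the same tuple; any speed difference depends on the interpreter version and would need to be measured separately, for example with `timeit`.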
Mike Fährmann
e08ec7e083 update copyright notices 2025-06-13 00:03:41 +02:00
Mike Fährmann
b5c88b3d3e replace standard library 're' uses with 'util.re()' 2025-06-06 13:24:52 +02:00
Mike Fährmann
83e50e43a8 [hiperdex] update domain to 'hiperdex.com' 2025-01-26 19:26:03 +01:00
Mike Fährmann
2cdb7e86ca [hiperdex] fix 'description' extraction 2025-01-26 19:17:10 +01:00
Mike Fährmann
0b3ddd01af [hiperdex] update domain to 'hipertoon.com' (#6420)
and fix 'description' extraction
2024-11-05 15:54:42 +01:00
Mike Fährmann
0761b22a7f [hiperdex] update domain to 'hiperdex.top' (#5635) 2024-05-24 17:13:10 +02:00
Mike Fährmann
26bc2d55f4 [hiperdex] update URL patterns & fix 'manga' metadata (#5340) 2024-03-18 17:36:16 +01:00
Mike Fährmann
fc1101779c [hiperdex] fix 'manga' metadata 2023-11-26 01:24:42 +01:00
Mike Fährmann
3ecb512722 send Referer headers by default 2023-09-19 00:02:04 +02:00
Mike Fährmann
a453335a9f remove test results in extractor modules
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
0b6e5b8161 [hiperdex] send Referer headers during file downloads (#4490) 2023-09-02 12:59:30 +02:00
Mike Fährmann
a05821f8b4 [hiperdex] fix 'manga' metadata
remove trailing ' Manga'
2023-09-02 12:59:30 +02:00
Mike Fährmann
1baf83a9e5 [hiperdex] fix for unicode titles (#4325) 2023-07-22 16:20:57 +02:00
Mike Fährmann
9b5e7ce8b9 [hiperdex] fix extraction 2023-03-25 18:18:27 +01:00
Mike Fährmann
17bd053d94 [hiperdex] fix extraction (#3768) 2023-03-15 14:28:03 +01:00
Mike Fährmann
58c008e30a [hiperdex] update domain (#3572) 2023-01-26 12:01:16 +01:00
Mike Fährmann
b0cb4a1b9c replace 'text.extract()' with 'text.extr()' where possible 2022-11-05 01:14:09 +01:00
Mike Fährmann
783eae6fc5 [hiperdex] fix extraction 2021-09-30 00:32:34 +02:00
Mike Fährmann
b519bf567c [hiperdex] use domain from input URL 2021-07-02 23:23:42 +02:00
Mike Fährmann
fb4b4725ba [hiperdex] match 'hiperdex2.com' URLs
still doesn't properly work due to Cloudflare CAPTCHA and IUAM page
2021-06-18 00:50:11 +02:00
Mike Fährmann
968d3e8465 remove '&' from URL patterns
'/?&#' -> '/?#' and '?&#' -> '?#'

According to https://www.ietf.org/rfc/rfc3986.txt, URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
2020-10-22 23:31:25 +02:00
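The effect of dropping '&' from a delimiter character class, as described in 968d3e8465, can be sketched as follows. This is a minimal illustration under assumptions: the two patterns and the example URL are hypothetical and are not the patterns used in the extractor modules. It shows that a negated class which includes '&' stops a path segment early, while a class limited to the RFC 3986 component delimiters '/', '?', and '#' does not.

```python
import re

# Hypothetical patterns; the real URL patterns in the project differ.
OLD = re.compile(r"/manga/([^/?&#]+)")  # treats '&' as a delimiter
NEW = re.compile(r"/manga/([^/?#]+)")   # only '/', '?', '#' delimit components

# Hypothetical URL whose path segment contains a literal '&'.
URL = "https://example.org/manga/tom-&-jerry/?page=2#top"


def segment(pattern, url):
    """Return the first captured path segment, or None if nothing matches."""
    m = pattern.search(url)
    return m[1] if m else None


if __name__ == "__main__":
    print(segment(OLD, URL))
    print(segment(NEW, URL))
```

With the older class the captured segment ends at the '&'; with the newer class the whole segment up to the next '/' is captured.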
Mike Fährmann
45baa13615 update extractor test results
- don't run Instagram tests on Travis anymore
- replace Twitter test because timeline was made private
- update Hiperdex domain to '.com' (again ...)
2020-05-28 02:18:06 +02:00
Mike Fährmann
2d6724180b [hiperdex] update domain to hiperdex.info 2020-05-12 17:00:51 +02:00
Mike Fährmann
a4fd620a25 [hiperdex] revert domain back to hiperdex.com 2020-04-27 20:42:31 +02:00
Mike Fährmann
d5273f9b0c [hiperdex] update domain to hiperdex.net 2020-04-16 20:39:56 +02:00
Mike Fährmann
a6286bb551 [hiperdex] add 'artist' extractor (#606) 2020-04-12 02:32:37 +02:00
Mike Fährmann
291033720a [hiperdex] fix manga extraction 2020-04-12 02:27:13 +02:00
Mike Fährmann
762c758af4 [hiperdex] fix extraction 2020-04-03 21:25:25 +02:00
Mike Fährmann
39b48d665b [hiperdex] use proper name for 'chapter_minor' 2020-02-29 00:18:54 +01:00
Mike Fährmann
1d4a369ea2 update extractor test results 2020-02-27 22:15:40 +01:00
Mike Fährmann
cc5079c844 [hiperdex] add chapter and manga extractors (closes #606) 2020-02-22 03:09:29 +01:00