92 Commits

Author SHA1 Message Date
Mike Fährmann
00c6821a3f replace 2-element f-strings with simple '+' concatenations
Python's 'ast' module and its 'NodeVisitor' class
were incredibly helpful in identifying these
2025-12-22 11:26:04 +01:00
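As an illustration of the approach this commit message mentions, the following is a minimal sketch of how an 'ast.NodeVisitor' subclass could locate f-strings made of exactly two parts. The class and function names are hypothetical and are not taken from the repository; the actual script used for the commit is not shown in this log.

```python
import ast


class TwoPartFStringFinder(ast.NodeVisitor):
    """Collect line numbers of f-strings that consist of exactly two parts."""

    def __init__(self):
        self.hits = []

    def visit_JoinedStr(self, node):
        # An f-string is parsed into a JoinedStr node whose 'values' list
        # holds literal pieces (Constant) and replacement fields
        # (FormattedValue).
        if len(node.values) == 2:
            self.hits.append(node.lineno)
        # Keep walking so nested expressions are inspected as well.
        self.generic_visit(node)


def find_two_part_fstrings(source):
    """Return the line numbers of all 2-element f-strings in 'source'."""
    finder = TwoPartFStringFinder()
    finder.visit(ast.parse(source))
    return finder.hits


# Hypothetical input: only the first line is a 2-element f-string.
sample = 'a = f"x{b}"\nc = f"{d}-{e}"\nf = f"{g}"\n'
hits = find_two_part_fstrings(sample)
```

A match such as f"x{b}" can then be rewritten by hand or by a follow-up script as "x" + b, provided the replacement field is already a string.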
Mike Fährmann
c38856bd3f [dt] use 'parse_datetime_iso()' for ISO formats 2025-10-19 21:52:05 +02:00
Mike Fährmann
085616e0a8 [dt] replace 'text.parse_datetime()' & 'text.parse_timestamp()' 2025-10-17 17:43:06 +02:00
Mike Fährmann
a097a373a9 simplify if statements by using walrus operators (#7671) 2025-07-22 20:57:54 +02:00
Mike Fährmann
8764f32ea7 [hitomi] fix negative tag searches (#7694) 2025-06-30 18:40:05 +02:00
Mike Fährmann
95338ff0ec replace 'result' with 'results' for lists
more consistent names
2025-06-30 12:10:57 +02:00
Mike Fährmann
9dbe33b6de replace old %-formatted and .format(…) strings with f-strings (#7671)
mostly using flynt
https://github.com/ikamensh/flynt
2025-06-29 17:50:19 +02:00
Mike Fährmann
26e81e4162 [common] rename 'gallery_url'/'manga_url' to 'page_url' 2025-06-26 22:06:57 +02:00
Mike Fährmann
41191bb60a 'match.group(N)' -> 'match[N]' (#7671)
2.5x faster
2025-06-18 13:05:58 +02:00
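For reference, a short sketch of the equivalence behind this change: since Python 3.6, match objects support subscripting, so 'match[N]' returns the same value as 'match.group(N)'. The pattern and input string below are made-up examples, and the speed figure quoted in the commit message is not measured here.

```python
import re

# Hypothetical pattern and input, chosen only for illustration.
pattern = re.compile(r"/galleries/(\d+)\.(\w+)")
match = pattern.search("https://example.org/galleries/12345.html")

# Old style: explicit method call.
gallery_id_old = match.group(1)

# New style: subscript access on the match object.
gallery_id_new = match[1]
extension = match[2]
```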
Mike Fährmann
e08ec7e083 update copyright notices 2025-06-13 00:03:41 +02:00
Mike Fährmann
b5c88b3d3e replace standard library 're' uses with 'util.re()' 2025-06-06 13:24:52 +02:00
Mike Fährmann
fd8f652490 [hitomi] fix extractors (#7230) 2025-03-23 20:32:27 +01:00
Mike Fährmann
428eb53086 [hitomi] provide 'search_tags' metadata for search/tag results
(#1015, #6756)
2025-01-02 17:49:30 +01:00
Mike Fährmann
f9d3603bfc [hitomi] fix searches (#6713) 2024-12-24 09:36:29 +01:00
Mike Fährmann
6f54328a39 [hitomi] update
- remove f-strings
- fix flake8 warnings
- move tests to test/results/hitomi.py
2024-10-29 16:56:52 +01:00
space-nuko
f170d73ffc [hitomi] add 'index' and 'search' extractors
- Support hitomi.la multiple tag searches
- Support hitomi.la index searches
- Fix tests
2024-10-29 16:55:52 +01:00
Mike Fährmann
6af26a424a [hitomi] extract 'extension_original' metadata (#6049) 2024-08-18 12:56:50 +02:00
Mike Fährmann
f160859c5c [hitomi] extract 'title_jpn' metadata (#5706) 2024-06-08 00:05:19 +02:00
Mike Fährmann
fc8f86bf24 [hitomi] recognize 'imageset' gallery URLs (#4756) 2023-11-02 15:29:44 +01:00
Mike Fährmann
a453335a9f remove test results in extractor modules
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
a383eca7f6 decouple extractor initialization
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().

This makes it possible, for example, to adjust config access inside a Job
before most of it has already happened when calling 'extractor.find()'.
2023-07-25 22:16:16 +02:00
Mike Fährmann
dd884b02ee replace json.loads with direct calls to JSONDecoder.decode 2023-02-09 15:22:00 +01:00
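A minimal sketch of the idea named in this commit message, assuming no custom decoding options are needed: one 'json.JSONDecoder' instance is created and its bound 'decode' method is called directly instead of going through 'json.loads()'. The name 'json_loads' is a hypothetical alias used only for this example.

```python
import json

# Create one decoder up front and reuse its bound 'decode' method.
json_loads = json.JSONDecoder().decode

# Hypothetical payload, used only to show that both calls agree.
payload = '{"id": 123, "tags": ["a", "b"]}'
via_decoder = json_loads(payload)
via_loads = json.loads(payload)
```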
sudo
a6305d031c [hitomi] apply format check for every image (#3030) (#3280) 2022-11-27 15:55:25 +01:00
Mike Fährmann
b2b0b1c455 [hitomi] fall back to webp when format not available (#3030) 2022-10-11 10:48:28 +02:00
Mike Fährmann
2eb0ddd083 [hitomi] fix error when number of tag results is multiple of 25
(#2870)
2022-08-28 17:06:11 +02:00
Mike Fährmann
946643c23c [hitomi] use maxage for gg.js cache (#2863)
cached values become invalid after 1-2 hours
2022-08-26 17:57:17 +02:00
Mike Fährmann
37d584a9b2 [hitomi] update metadata extraction (fixes #2444)
remove 'hitomi.metadata' option, as it is no longer necessary
to make additional HTTP requests to fetch all metadata.
2022-03-26 12:46:18 +01:00
Mike Fährmann
dee0d22561 update extractor test results 2022-02-06 21:39:24 +01:00
Mike Fährmann
86fa412b47 [hitomi] add 'format' option (#2260)
default is 'webp' since downloading original files is no longer allowed
2022-02-03 23:32:19 +01:00
Mike Fährmann
17c9c47ca0 [hitomi] fix 'tag' extraction (fixes #2189) 2022-01-13 16:45:46 +01:00
Mike Fährmann
8b910dd8ae [hitomi] fix image URLs
again and again ...
2022-01-06 18:21:26 +01:00
Mike Fährmann
38e2af29d6 [hitomi] fix image URLs
update '_parse_gg()' yet again
2022-01-03 16:41:00 +01:00
Mike Fährmann
1e0278702d [hitomi] update '_parse_gg()' 2022-01-01 17:55:58 +01:00
Mike Fährmann
becc7f85a6 [hitomi] fix image URLs 2021-12-29 22:46:17 +01:00
Mike Fährmann
099ed72de7 [hitomi] disable extra 'metadata' by default
saves one HTTP request that is not needed with default filename settings
2021-12-16 22:21:07 +01:00
Mike Fährmann
211de95dd0 update extractor test results 2021-11-01 02:58:53 +01:00
YongChan Cho
14852f7050 [hitomi] fix image path (#1988) 2021-10-30 21:45:01 +02:00
Ryu juheon
d4614e5ba4 [hitomi] fix image URLs (#1982) 2021-10-28 19:29:48 +02:00
Ryu juheon
6b6d92d51c [hitomi]: fix image URLs (#1975) 2021-10-26 19:35:01 +02:00
Mike Fährmann
47a780942c update extractor test results 2021-09-03 19:36:12 +02:00
Ryu JuHeon
9429eaa0a3 [hitomi]: fix image URLs (#1765) 2021-08-12 14:39:10 +02:00
Mike Fährmann
5612ca31c2 [hitomi] fix image URLs (closes #1679) 2021-07-09 18:01:49 +02:00
Mike Fährmann
e98fa01c44 [hitomi] update image URL code (fixes #1637) 2021-06-18 16:44:22 +02:00
Mike Fährmann
968d3e8465 remove '&' from URL patterns
'/?&#' -> '/?#' and '?&#' -> '?#'

According to https://www.ietf.org/rfc/rfc3986.txt, URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
2020-10-22 23:31:25 +02:00
Mike Fährmann
ffd38215a4 [hitomi] fix image URLs and URL pattern
- non-webp files are now hosted on [a-c]b.hitomi.la
- removed ampersand from invalid slug characters
2020-10-22 15:15:34 +02:00
Mike Fährmann
7cd383c0f9 update extractor test results 2020-09-20 21:54:39 +02:00
Mike Fährmann
deaacc70bb [hitomi] update URL pattern for tag searches 2020-08-27 22:46:03 +02:00
Mike Fährmann
7140fe7e6d [hitomi] fix redirect processing 2020-08-23 15:18:44 +02:00
Mike Fährmann
a3de234e70 [hitomi] add extractor for tag searches (closes #697) 2020-04-20 21:55:19 +02:00
Mike Fährmann
55ac408bdf [hitomi] fix extraction of galleries without tags 2020-04-20 21:42:14 +02:00