Mike Fährmann
00c6821a3f
replace 2-element f-strings with simple '+' concatenations
...
Python's 'ast' module and its 'NodeVisitor' class
were incredibly helpful in identifying these
2025-12-22 11:26:04 +01:00
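The commit above credits 'ast.NodeVisitor' with locating the affected f-strings. A minimal sketch of how such a search might look, assuming a "2-element" f-string means one whose parsed 'JoinedStr' node has exactly two values; the class and function names are hypothetical and not taken from the repository:

```python
import ast


class TwoPartFStringFinder(ast.NodeVisitor):
    """Collect line numbers of f-strings made of exactly two parts."""

    def __init__(self):
        self.hits = []

    def visit_JoinedStr(self, node):
        # An f-string such as f"{root}/path" parses to a JoinedStr whose
        # 'values' list holds its literal and formatted pieces.
        if len(node.values) == 2:
            self.hits.append(node.lineno)
        # Keep walking so nested expressions are inspected as well.
        self.generic_visit(node)


def find_two_part_fstrings(source):
    """Return the line numbers of all two-part f-strings in 'source'."""
    finder = TwoPartFStringFinder()
    finder.visit(ast.parse(source))
    return finder.hits
```

Each reported line is only a candidate for rewriting as a '+' concatenation, for example f"{root}/path" as root + "/path", which is equivalent only when the formatted value is already a string.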
Mike Fährmann
c38856bd3f
[dt] use 'parse_datetime_iso()' for ISO formats
2025-10-19 21:52:05 +02:00
Mike Fährmann
085616e0a8
[dt] replace 'text.parse_datetime()' & 'text.parse_timestamp()'
2025-10-17 17:43:06 +02:00
Mike Fährmann
a097a373a9
simplify if statements by using walrus operators ( #7671 )
2025-07-22 20:57:54 +02:00
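A brief illustration of the kind of simplification a walrus operator allows. The function and the 'tags' key are hypothetical examples, not code from the repository:

```python
def first_tag_before(info):
    # Assignment and test as two separate statements.
    tags = info.get("tags")
    if tags:
        return tags[0]
    return None


def first_tag_after(info):
    # The walrus operator (Python 3.8+) binds and tests in one step.
    if tags := info.get("tags"):
        return tags[0]
    return None
```

Both functions behave identically; the second only saves a line and keeps the name next to the condition that uses it.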
Mike Fährmann
8764f32ea7
[hitomi] fix negative tag searches ( #7694 )
2025-06-30 18:40:05 +02:00
Mike Fährmann
95338ff0ec
replace 'result' with 'results' for lists
...
more consistent names
2025-06-30 12:10:57 +02:00
Mike Fährmann
9dbe33b6de
replace old %-formatted and .format(…) strings with f-strings ( #7671 )
...
mostly using flynt
https://github.com/ikamensh/flynt
2025-06-29 17:50:19 +02:00
Mike Fährmann
26e81e4162
[common] rename 'gallery_url'/'manga_url' to 'page_url'
2025-06-26 22:06:57 +02:00
Mike Fährmann
41191bb60a
'match.group(N)' -> 'match[N]' ( #7671 )
...
2.5x faster
2025-06-18 13:05:58 +02:00
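For reference, a small sketch of the two equivalent spellings. Match objects have supported subscripting since Python 3.6; the URL pattern and function name below are made up for illustration and are not the extractor's actual pattern:

```python
import re


def parse_gallery_id(url):
    """Return the numeric ID in front of '.html', or None if absent."""
    match = re.search(r"/(\d+)\.html", url)
    if match is None:
        return None
    # match[1] is shorthand for match.group(1); both return the same string.
    assert match[1] == match.group(1)
    return match[1]
```

The speedup quoted in the commit is the author's own measurement and is not reproduced here.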
Mike Fährmann
e08ec7e083
update copyright notices
2025-06-13 00:03:41 +02:00
Mike Fährmann
b5c88b3d3e
replace standard library 're' uses with 'util.re()'
2025-06-06 13:24:52 +02:00
Mike Fährmann
fd8f652490
[hitomi] fix extractors ( #7230 )
2025-03-23 20:32:27 +01:00
Mike Fährmann
428eb53086
[hitomi] provide 'search_tags' metadata for search/tag results
...
(#1015 , #6756 )
2025-01-02 17:49:30 +01:00
Mike Fährmann
f9d3603bfc
[hitomi] fix searches ( #6713 )
2024-12-24 09:36:29 +01:00
Mike Fährmann
6f54328a39
[hitomi] update
...
- remove f-strings
- fix flake8 warnings
- move tests to test/results/hitomi.py
2024-10-29 16:56:52 +01:00
space-nuko
f170d73ffc
[hitomi] add 'index' and 'search' extractors
...
- Support hitomi.la multiple tag searches
- Support hitomi.la index searches
- Fix tests
2024-10-29 16:55:52 +01:00
Mike Fährmann
6af26a424a
[hitomi] extract 'extension_original' metadata ( #6049 )
2024-08-18 12:56:50 +02:00
Mike Fährmann
f160859c5c
[hitomi] extract 'title_jpn' metadata ( #5706 )
2024-06-08 00:05:19 +02:00
Mike Fährmann
fc8f86bf24
[hitomi] recognize 'imageset' gallery URLs ( #4756 )
2023-11-02 15:29:44 +01:00
Mike Fährmann
a453335a9f
remove test results in extractor modules
...
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
a383eca7f6
decouple extractor initialization
...
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().
This makes it possible, for example, to adjust config access inside
a Job before it takes place; previously, most of it had already
happened when calling 'extractor.find()'.
2023-07-25 22:16:16 +02:00
Mike Fährmann
dd884b02ee
replace json.loads with direct calls to JSONDecoder.decode
2023-02-09 15:22:00 +01:00
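A minimal sketch of the replacement described above, assuming the input is always a 'str' and no keyword arguments are needed; the helper name 'parse_json' is hypothetical:

```python
import json

# Bind the decoder's 'decode' method once and reuse it for every call.
_decode = json.JSONDecoder().decode


def parse_json(text):
    """Parse a JSON document given as 'str'.

    Unlike json.loads(), this does not accept bytes input or
    per-call keyword arguments such as 'object_hook'.
    """
    return _decode(text)
```

For plain string input the result matches 'json.loads(text)'; the direct call just skips the argument handling that 'json.loads' performs first.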
sudo
a6305d031c
[hitomi] apply format check for every image ( #3030 ) ( #3280 )
2022-11-27 15:55:25 +01:00
Mike Fährmann
b2b0b1c455
[hitomi] fall back to webp when format not available ( #3030 )
2022-10-11 10:48:28 +02:00
Mike Fährmann
2eb0ddd083
[hitomi] fix error when number of tag results is multiple of 25
...
(#2870 )
2022-08-28 17:06:11 +02:00
Mike Fährmann
946643c23c
[hitomi] use maxage for gg.js cache ( #2863 )
...
cached values become invalid after 1-2 hours
2022-08-26 17:57:17 +02:00
Mike Fährmann
37d584a9b2
[hitomi] update metadata extraction ( fixes #2444 )
...
remove 'hitomi.metadata' option, as it is no longer necessary
to make additional HTTP requests to fetch all metadata.
2022-03-26 12:46:18 +01:00
Mike Fährmann
dee0d22561
update extractor test results
2022-02-06 21:39:24 +01:00
Mike Fährmann
86fa412b47
[hitomi] add 'format' option ( #2260 )
...
default is 'webp' since downloading original files is no longer allowed
2022-02-03 23:32:19 +01:00
Mike Fährmann
17c9c47ca0
[hitomi] fix 'tag' extraction ( fixes #2189 )
2022-01-13 16:45:46 +01:00
Mike Fährmann
8b910dd8ae
[hitomi] fix image URLs
...
again and again ...
2022-01-06 18:21:26 +01:00
Mike Fährmann
38e2af29d6
[hitomi] fix image URLs
...
update '_parse_gg()' yet again
2022-01-03 16:41:00 +01:00
Mike Fährmann
1e0278702d
[hitomi] update '_parse_gg()'
2022-01-01 17:55:58 +01:00
Mike Fährmann
becc7f85a6
[hitomi] fix image URLs
2021-12-29 22:46:17 +01:00
Mike Fährmann
099ed72de7
[hitomi] disable extra 'metadata' by default
...
saves one HTTP request that is not needed with default filename settings
2021-12-16 22:21:07 +01:00
Mike Fährmann
211de95dd0
update extractor test results
2021-11-01 02:58:53 +01:00
YongChan Cho
14852f7050
[hitomi] fix image path ( #1988 )
2021-10-30 21:45:01 +02:00
Ryu juheon
d4614e5ba4
[hitomi] fix image URLs ( #1982 )
2021-10-28 19:29:48 +02:00
Ryu juheon
6b6d92d51c
[hitomi] fix image URLs ( #1975 )
2021-10-26 19:35:01 +02:00
Mike Fährmann
47a780942c
update extractor test results
2021-09-03 19:36:12 +02:00
Ryu JuHeon
9429eaa0a3
[hitomi] fix image URLs ( #1765 )
2021-08-12 14:39:10 +02:00
Mike Fährmann
5612ca31c2
[hitomi] fix image URLs ( closes #1679 )
2021-07-09 18:01:49 +02:00
Mike Fährmann
e98fa01c44
[hitomi] update image URL code ( fixes #1637 )
2021-06-18 16:44:22 +02:00
Mike Fährmann
968d3e8465
remove '&' from URL patterns
...
'/?&#' -> '/?#' and '?&#' -> '?#'
According to https://www.ietf.org/rfc/rfc3986.txt , URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
2020-10-22 23:31:25 +02:00
Mike Fährmann
ffd38215a4
[hitomi] fix image URLs and URL pattern
...
- non-webp files are now hosted on [a-c]b.hitomi.la
- removed ampersand from invalid slug characters
2020-10-22 15:15:34 +02:00
Mike Fährmann
7cd383c0f9
update extractor test results
2020-09-20 21:54:39 +02:00
Mike Fährmann
deaacc70bb
[hitomi] update URL pattern for tag searches
2020-08-27 22:46:03 +02:00
Mike Fährmann
7140fe7e6d
[hitomi] fix redirect processing
2020-08-23 15:18:44 +02:00
Mike Fährmann
a3de234e70
[hitomi] add extractor for tag searches ( closes #697 )
2020-04-20 21:55:19 +02:00
Mike Fährmann
55ac408bdf
[hitomi] fix extraction of galleries without tags
2020-04-20 21:42:14 +02:00