81 Commits

Author SHA1 Message Date
Mike Fährmann
53cdfaac37 [common] add reference to 'exception' module to Extractor class
- remove 'exception' imports
- replace with 'self.exc'
2026-02-15 10:57:22 +01:00
Mike Fährmann
12f5e24ab5 use sets for ' in { ... }' checks 2026-02-11 22:55:01 +01:00
Mike Fährmann
e006d26c8e Revert "use f-strings when building 'pattern'"
revert d7c97d5a97.
2025-12-20 22:07:37 +01:00
Mike Fährmann
d7c97d5a97 use f-strings when building 'pattern' 2025-10-20 21:23:11 +02:00
Mike Fährmann
085616e0a8 [dt] replace 'text.parse_datetime()' & 'text.parse_timestamp()' 2025-10-17 17:43:06 +02:00
Mike Fährmann
0837bc2b70 [gelbooru] update 'api-key' & 'user-id' docs to match 'gelbooru_v02' 2025-08-22 17:06:19 +02:00
Mike Fährmann
fca1cd51f5 [gelbooru] improve error for 401 responses (#7674) 2025-07-14 16:54:21 +02:00
Mike Fährmann
f2a72d8d1e replace 'request(…).json()' with 'request_json(…)' 2025-06-29 17:50:19 +02:00
Mike Fährmann
9dbe33b6de replace old %-formatted and .format(…) strings with f-strings (#7671)
mostly using flynt
https://github.com/ikamensh/flynt
2025-06-29 17:50:19 +02:00
Mike Fährmann
41191bb60a 'match.group(N)' -> 'match[N]' (#7671)
2.5x faster
2025-06-18 13:05:58 +02:00
Mike Fährmann
e08ec7e083 update copyright notices 2025-06-13 00:03:41 +02:00
Mike Fährmann
00f3b48eb8 [gelbooru] don't hardcode image server domains (#7392) 2025-04-20 00:09:28 +02:00
Mike Fährmann
85808a954f [gelbooru] fix video URLs (#7345)
update subdomain to 'img4'
2025-04-12 08:41:10 +02:00
Mike Fährmann
257e9fb435 [gelbooru] improve pagination logic for meta tags (#5478)
similar to 494acabd38
2024-04-15 23:14:48 +02:00
Mike Fährmann
31e7ca73b6 [gelbooru] add 'order-posts' option for favorites (#5220) 2024-03-23 13:30:09 +01:00
Mike Fährmann
6d93295fea [gelbooru] add 'date_favorited' metadata field 2024-03-18 20:46:11 +01:00
Mike Fährmann
0d69af94d5 [gelbooru] detect returned favorites order (#5220) 2024-03-18 20:45:06 +01:00
Mike Fährmann
93b4120e77 [gelbooru] support 'all' and empty tag (#5076) 2024-01-18 21:49:33 +01:00
Mike Fährmann
bbf96753e2 [gelbooru] only log "Incomplete API response" for favorites (#5045) 2024-01-10 17:27:46 +01:00
Mike Fährmann
cbfb7bfdf1 [gelbooru] display error for invalid API responses (#4903) 2024-01-06 14:28:35 +01:00
Mike Fährmann
a86775f617 [gelbooru] fix 'favorite' extractor (#4903)
lots of +1/-1 and </<= mistakes
2024-01-04 15:17:49 +01:00
Mike Fährmann
27ec653991 fix bug in test_init and update example URLs 2023-09-14 13:27:03 +02:00
Mike Fährmann
a453335a9f remove test results in extractor modules
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
d94aa1ee02 [gelbooru] fix --range for favorites (#3704) 2023-03-23 22:58:13 +01:00
Mike Fährmann
1f82b00b8f [gelbooru] fix and improve --range for pools 2023-03-23 18:22:46 +01:00
Mike Fährmann
dcb8af659a [gelbooru] extract favorites without needing cookies (#3704)
TODO: fix --range
2023-03-15 19:21:35 +01:00
Mike Fährmann
b756dc13aa [gelbooru] warn about missing cookies for favorites (#3704)
and add docstring so it shows up in --list-extractors
2023-03-15 14:58:55 +01:00
Mike Fährmann
b14f8d5817 [gelbooru] add 'favorite' extractor (#3704)
requires logged in cookies to work
2023-02-27 18:03:47 +01:00
Mike Fährmann
ed2d715019 fix 'keywords' in extractor tests (#3491) 2023-01-03 15:14:23 +01:00
Mike Fährmann
b0cb4a1b9c replace 'text.extract()' with 'text.extr()' where possible 2022-11-05 01:14:09 +01:00
Mike Fährmann
775895f44b [booru] refactor 'tags' and 'notes' extraction
- move HTML request for post pages into its own function
- move gelbooru_v02.py notes extraction to gelbooru.py
  since it only works there
- clean up some code
2022-10-31 12:01:19 +01:00
KJ16609
300bc03deb [gelbooru] allow alternate parameter order in post URLs (#2821) 2022-10-21 14:59:29 +02:00
Mike Fährmann
d508b2c049 [gelbooru] implement 'pool' pagination (#2853) 2022-08-26 17:57:17 +02:00
Mike Fährmann
f225247670 [gelbooru] add support for api_key and user_id (#2767) 2022-07-18 18:46:31 +02:00
Mike Fährmann
e2be199124 [gelbooru] improve and fix pagination (#2230, #2232)
Use 'id:<POSTID' as a tag instead of going through pages with 'pid'.

Something similar was already implemented in 93cef784,
but that got broken again in 3085aac4.
2022-01-27 17:44:47 +01:00
Mike Fährmann
cdc96e1217 [gelbooru] improve video file detection (fixes #2188)
not all files from 'https://video-cdnN.gelbooru.com' are videos
2022-01-12 21:33:02 +01:00
Mike Fährmann
3085aac4d8 [gelbooru] handle changed API response format (#2157) 2022-01-03 16:42:48 +01:00
Mike Fährmann
3e4ffb0821 [gelbooru] add extractor for '/redirect.php' URLs (#1530) 2021-05-07 15:34:53 +02:00
thatfuckingbird
dff03a6605 [booru] add an option to extract notes (only gelbooru for now) (#1457)
* [booru] add an option to extract notes (currently implemented only for gelbooru)

* appease linter

* [gelbooru] rename "text" to "body" in note extraction

* add a code comment about reusing return value of _extended_tags
2021-04-13 23:40:24 +02:00
thatfuckingbird
918b0441fb [gelbooru] fix tag category extraction (#1455) 2021-04-10 19:05:00 +02:00
Mike Fährmann
780bac4c8a [gelbooru] update video server (fixes #1368)
from 'https://img2.gelbooru.com' to 'https://img3.gelbooru.com'
and provide fallback URLs
2021-03-10 01:48:07 +01:00
Mike Fährmann
08d7934c6e move extractors from booru.py into their own gelbooru_v02 module 2021-02-17 00:26:24 +01:00
Mike Fährmann
e41e2be2f9 [booru] split '_prepare_post()' 2020-12-24 01:13:54 +01:00
Mike Fährmann
a3a863fc13 [booru] add generalized extractors for *booru sites
similar to cc15fbe7
2020-12-08 18:34:30 +01:00
Mike Fährmann
7a0ba370d1 [gelbooru] rewrite mp4 video URLs (fixes #1048) 2020-10-15 15:14:18 +02:00
Mike Fährmann
fda9e296dd [gelbooru] fix extraction without API 2020-08-28 22:33:37 +02:00
Mike Fährmann
9b4635917f [gelbooru] simplify and fix pool extraction
use 'pool:<pool id>' as search tag to get pool posts
2020-05-18 19:04:51 +02:00
Mike Fährmann
2188db6284 [gelbooru] fix non-API tag extraction 2019-12-10 21:31:55 +01:00
Mike Fährmann
7a5e78741c [booru] build directory path for each file (#385) 2019-08-18 23:28:33 +02:00
Mike Fährmann
17a3426845 [gelbooru] enable all content when not using API 2019-07-27 11:13:38 +02:00