107 Commits

Author SHA1 Message Date
Mike Fährmann
b5c88b3d3e replace standard library 're' uses with 'util.re()' 2025-06-06 13:24:52 +02:00
Mike Fährmann
c3e8af945d [sankaku] fix passing cookies (#7333)
to allow '"tags": "extended"' to work properly
2025-05-23 19:21:56 +02:00
Mike Fährmann
9c06acb385 [sankaku] compile extended 'tags' pattern only once
per extractor run
2025-05-22 22:30:41 +02:00
Mike Fährmann
7b5dd61e17 [sankaku] implement support for new 'tags' categories (#7333 #7553) 2025-05-22 12:41:03 +02:00
Mike Fährmann
f395a3ec79 [sankaku] fix potential infinite loop (#7155)
https://github.com/mikf/gallery-dl/issues/7155#issuecomment-2723019761
2025-03-14 08:35:54 +01:00
Mike Fährmann
898a09bf7f [sankaku] fix 'tags' metadata (#7155)
rename 'tag_names' to 'tags'
2025-03-12 17:07:40 +01:00
Mike Fährmann
94bbbbb16b [sankaku] fix categorized tags for posts with >100 tags (#7155) 2025-03-11 21:01:46 +01:00
Mike Fährmann
1254c4e3d9 [sankaku] update API URLs (#7154 #7155)
and fix errors due to other changes
2025-03-11 18:45:45 +01:00
Mike Fährmann
7afd5bae03 [sankaku] increase wait time on 429 errors (#7129)
to 10 minutes
2025-03-07 20:19:17 +01:00
Mike Fährmann
3ef23cc99b [sankaku] fix search tag limit check 2025-03-07 20:18:34 +01:00
Mike Fährmann
8256a7a8e4 [sankaku] fix extraction (#7071 #7072)
omit 'Platform: web-app' API header to get sankaku to include
'file_url' data in API responses again
2025-02-28 10:28:29 +01:00
Mike Fährmann
0c584f9be7 [sankaku] support alphanumeric book/pool IDs (#6757) 2025-01-02 15:49:07 +01:00
Mike Fährmann
d5fa1d6aba [sankaku] improve tag categorization code
translate tag type ID to name for each category
instead of for each tag
2024-11-03 09:21:39 +01:00
Mike Fährmann
ef4c1b4fc5 [sankaku] restore old 'tags' format (#6043)
lowercase + words separated by underscores
2024-08-17 19:25:19 +02:00
Mike Fährmann
84eefeebd6 [sankaku] match URLs with 'www' subdomain (#5907) 2024-07-30 17:05:22 +02:00
Mike Fährmann
287a7d13cf [sankaku] implement 'notes' extraction (#5865) 2024-07-18 20:44:49 +02:00
Mike Fährmann
34a4ddc399 [sankaku] add 'id-format' option (#5073) 2024-01-26 17:56:08 +01:00
Mike Fährmann
a416d4c3d5 [sankaku] support post URLs with alphanumeric IDs (#5073) 2024-01-18 16:23:14 +01:00
Mike Fährmann
57fc6fcf83 replace '24*3600' with '86400'
and generalize cache maxage values
2023-12-18 23:57:22 +01:00
Mike Fährmann
645b4627ef [sankaku] update URL patterns 2023-11-24 02:41:52 +01:00
Mike Fährmann
c9a2be36d4 [sankaku] support '/posts/' tag search URLs (#4740) 2023-10-29 13:48:42 +01:00
Mike Fährmann
b52fd91ac6 [sankaku] support '/posts/' URLs (#4688) 2023-10-21 13:20:35 +02:00
Mike Fährmann
3ecb512722 send Referer headers by default 2023-09-19 00:02:04 +02:00
Mike Fährmann
a453335a9f remove test results in extractor modules
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
a60db454af [sankaku] update/fix API headers
'Referer' and 'Origin' were both empty
2023-08-04 17:14:43 +02:00
Mike Fährmann
d97b8c2fba consistent cookie-related names
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
3845c0256d [sankaku] improve warnings for unavailable posts 2023-07-01 19:11:41 +02:00
Mike Fährmann
7f25cab56e [sankaku] support post URLs with MD5 hashes (#3952) 2023-04-23 16:46:40 +02:00
Mike Fährmann
faca32a850 [sankaku] sanitize 'date:…' tags (#1790) 2023-04-19 20:09:11 +02:00
Mike Fährmann
107c60c973 [sankaku] update URL pattern (#3523)
match tag searches with language codes without a trailing slash
2023-01-18 21:38:01 +01:00
Mike Fährmann
b0cb4a1b9c replace 'text.extract()' with 'text.extr()' where possible 2022-11-05 01:14:09 +01:00
Mike Fährmann
775895f44b [booru] refactor 'tags' and 'notes' extraction
- move HTML request for post pages into its own function
- move gelbooru_v02.py notes extraction to gelbooru.py
  since it only works there
- clean up some code
2022-10-31 12:01:19 +01:00
Mike Fährmann
5fd4374036 [sankaku] improve 429 and tag limit handling 2022-10-01 11:49:47 +02:00
Mike Fährmann
4089bceddd [sankaku] implement 'refresh' option (#2958) 2022-09-30 19:55:48 +02:00
Mike Fährmann
850608551c [sankaku] detect expired links (#2958) 2022-09-23 11:51:30 +02:00
Mike Fährmann
32c75d12e8 [sankaku] rewrite URLs to s.sankakucomplex.com (#2746) 2022-07-11 12:46:04 +02:00
Mike Fährmann
05d4a0215a [sankaku] extend URL patterns (fixes #2647)
- support URLs with ISO 639-1 language codes
- support black.… and white.… subdomains
2022-06-01 21:31:11 +02:00
Mike Fährmann
211de95dd0 update extractor test results 2021-11-01 02:58:53 +01:00
Mike Fährmann
9ed13703cc [sankaku] handle empty tags (fixes #1617) 2021-06-14 16:20:10 +02:00
Mike Fährmann
c5ca7905ce add 'noop()' and 'identity()' functions 2021-05-04 19:27:17 +02:00
Mike Fährmann
6fa20d456b [sankaku] update invalid-token detection (fixes #1515) 2021-04-30 22:04:45 +02:00
Mike Fährmann
bdfcc9c4b1 update extractor test results 2021-04-18 20:28:15 +02:00
Mike Fährmann
0e601de67b [sankaku] simplify 'pool' tags (#1388)
normalize 'tags' and 'artist_tags' to a string-list
2021-03-23 18:45:45 +01:00
Mike Fährmann
d085ade9d5 [sankaku] add 'tag_string' metadata field (#1388)
The 'join()'ed version of 'tags'.
Handling lists in format strings isn't properly supported yet.
2021-03-23 15:42:13 +01:00
Mike Fährmann
2dffd231b7 [sankaku] add enumeration index for books (#1388) 2021-03-23 15:32:54 +01:00
Mike Fährmann
96a51ff169 [sankaku] update invalid-token detection (fixes #1309) 2021-02-11 19:49:24 +01:00
Mike Fährmann
2da9068ea8 [sankaku] simplify login process 2021-01-12 00:15:22 +01:00
Mike Fährmann
b0beed7a06 [sankaku] add support for book searches (closes #1204) 2020-12-29 17:36:37 +01:00
Mike Fährmann
47a7a51944 [sankaku] fix 'invalid_token' detection 2020-12-27 02:31:01 +01:00
Mike Fährmann
e41e2be2f9 [booru] split '_prepare_post()' 2020-12-24 01:13:54 +01:00