Mike Fährmann
53cdfaac37
[common] add reference to 'exception' module to Extractor class
- remove 'exception' imports
- replace with 'self.exc'
2026-02-15 10:57:22 +01:00
Mike Fährmann
a97c320a38
[sankaku] fix re-authentication (#8779)
Unset the `Authorization` header before performing a re-login
2025-12-30 17:25:24 +01:00
Mike Fährmann
00c6821a3f
replace 2-element f-strings with simple '+' concatenations
Python's 'ast' module and its 'NodeVisitor' class
were incredibly helpful in identifying these
2025-12-22 11:26:04 +01:00
Mike Fährmann
e006d26c8e
Revert "use f-strings when building 'pattern'"
revert d7c97d5a97.
2025-12-20 22:07:37 +01:00
Mike Fährmann
ae41de3be5
[sankaku][idolcomplex] fix download URLs (#8666)
2025-12-09 10:02:43 +01:00
Mike Fährmann
1b4249ed37
[sankaku][idolcomplex] support URLs with locale code (#8667)
2025-12-09 08:23:40 +01:00
Mike Fährmann
d7c97d5a97
use f-strings when building 'pattern'
2025-10-20 21:23:11 +02:00
Mike Fährmann
9bf76c1352
replace 'util.re()' with 'text.re()'
remove unnecessary 'util' imports
2025-10-20 17:44:58 +02:00
Mike Fährmann
085616e0a8
[dt] replace 'text.parse_datetime()' & 'text.parse_timestamp()'
2025-10-17 17:43:06 +02:00
Mike Fährmann
bf989ae80e
[sankaku] improve API error messages
2025-08-11 23:14:58 +02:00
Mike Fährmann
e491d56dc3
[idolcomplex] update to new domain and interface (#7559 #8009)
2025-08-11 22:24:04 +02:00
Mike Fährmann
a097a373a9
simplify if statements by using walrus operators (#7671)
2025-07-22 20:57:54 +02:00
Mike Fährmann
d8ef1d693f
rename 'StopExtraction' to 'AbortExtraction'
for cases where StopExtraction was used to report errors
2025-07-09 21:07:28 +02:00
Mike Fährmann
755b2a7eb2
[sankaku] fix extracting extended tag categories (#7744)
by sending a proper Referer header
and not one from https://sankaku.app/
2025-06-29 22:15:20 +02:00
Mike Fährmann
22b40fc787
[sankaku] remove 'id-format' option (#5073 #6808)
2025-06-29 17:50:19 +02:00
Mike Fährmann
9dbe33b6de
replace old %-formatted and .format(…) strings with f-strings (#7671)
mostly using flynt
https://github.com/ikamensh/flynt
2025-06-29 17:50:19 +02:00
Mike Fährmann
41191bb60a
'match.group(N)' -> 'match[N]' (#7671)
2.5x faster
2025-06-18 13:05:58 +02:00
Mike Fährmann
e08ec7e083
update copyright notices
2025-06-13 00:03:41 +02:00
Mike Fährmann
b5c88b3d3e
replace standard library 're' uses with 'util.re()'
2025-06-06 13:24:52 +02:00
Mike Fährmann
c3e8af945d
[sankaku] fix passing cookies (#7333)
to allow '"tags": "extended"' to work properly
2025-05-23 19:21:56 +02:00
Mike Fährmann
9c06acb385
[sankaku] compile extended 'tags' pattern only once
per extractor run
2025-05-22 22:30:41 +02:00
Mike Fährmann
7b5dd61e17
[sankaku] implement support for new 'tags' categories (#7333 #7553)
2025-05-22 12:41:03 +02:00
Mike Fährmann
f395a3ec79
[sankaku] fix potential infinite loop (#7155)
https://github.com/mikf/gallery-dl/issues/7155#issuecomment-2723019761
2025-03-14 08:35:54 +01:00
Mike Fährmann
898a09bf7f
[sankaku] fix 'tags' metadata (#7155)
rename 'tag_names' to 'tags'
2025-03-12 17:07:40 +01:00
Mike Fährmann
94bbbbb16b
[sankaku] fix categorized tags for posts with >100 tags (#7155)
2025-03-11 21:01:46 +01:00
Mike Fährmann
1254c4e3d9
[sankaku] update API URLs (#7154 #7155)
and fix errors due to other changes
2025-03-11 18:45:45 +01:00
Mike Fährmann
7afd5bae03
[sankaku] increase wait time on 429 errors (#7129)
to 10 minutes
2025-03-07 20:19:17 +01:00
Mike Fährmann
3ef23cc99b
[sankaku] fix search tag limit check
2025-03-07 20:18:34 +01:00
Mike Fährmann
8256a7a8e4
[sankaku] fix extraction (#7071 #7072)
omit 'Platform: web-app' API header to get sankaku to include
'file_url' data in API responses again
2025-02-28 10:28:29 +01:00
Mike Fährmann
0c584f9be7
[sankaku] support alphanumeric book/pool IDs (#6757)
2025-01-02 15:49:07 +01:00
Mike Fährmann
d5fa1d6aba
[sankaku] improve tag categorization code
translate tag type ID to name for each category
instead of for each tag
2024-11-03 09:21:39 +01:00
Mike Fährmann
ef4c1b4fc5
[sankaku] restore old 'tags' format (#6043)
lowercase + words separated by underscores
2024-08-17 19:25:19 +02:00
Mike Fährmann
84eefeebd6
[sankaku] match URLs with 'www' subdomain (#5907)
2024-07-30 17:05:22 +02:00
Mike Fährmann
287a7d13cf
[sankaku] implement 'notes' extraction (#5865)
2024-07-18 20:44:49 +02:00
Mike Fährmann
34a4ddc399
[sankaku] add 'id-format' option (#5073)
2024-01-26 17:56:08 +01:00
Mike Fährmann
a416d4c3d5
[sankaku] support post URLs with alphanumeric IDs (#5073)
2024-01-18 16:23:14 +01:00
Mike Fährmann
57fc6fcf83
replace '24*3600' with '86400'
and generalize cache maxage values
2023-12-18 23:57:22 +01:00
Mike Fährmann
645b4627ef
[sankaku] update URL patterns
2023-11-24 02:41:52 +01:00
Mike Fährmann
c9a2be36d4
[sankaku] support '/posts/' tag search URLs (#4740)
2023-10-29 13:48:42 +01:00
Mike Fährmann
b52fd91ac6
[sankaku] support '/posts/' URLs (#4688)
2023-10-21 13:20:35 +02:00
Mike Fährmann
3ecb512722
send Referer headers by default
2023-09-19 00:02:04 +02:00
Mike Fährmann
a453335a9f
remove test results in extractor modules
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
a60db454af
[sankaku] update/fix API headers
'Referer' and 'Origin' were both empty
2023-08-04 17:14:43 +02:00
Mike Fährmann
d97b8c2fba
consistent cookie-related names
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
3845c0256d
[sankaku] improve warnings for unavailable posts
2023-07-01 19:11:41 +02:00
Mike Fährmann
7f25cab56e
[sankaku] support post URLs with MD5 hashes (#3952)
2023-04-23 16:46:40 +02:00
Mike Fährmann
faca32a850
[sankaku] sanitize 'date:…' tags (#1790)
2023-04-19 20:09:11 +02:00
Mike Fährmann
107c60c973
[sankaku] update URL pattern (#3523)
match tag searches with language codes without a trailing slash
2023-01-18 21:38:01 +01:00
Mike Fährmann
b0cb4a1b9c
replace 'text.extract()' with 'text.extr()' where possible
2022-11-05 01:14:09 +01:00
Mike Fährmann
775895f44b
[booru] refactor 'tags' and 'notes' extraction
- move HTML request for post pages into its own function
- move gelbooru_v02.py notes extraction to gelbooru.py
since it only works there
- clean up some code
2022-10-31 12:01:19 +01:00