39 Commits

Author SHA1 Message Date
Mike Fährmann
e006d26c8e Revert "use f-strings when building 'pattern'"
revert d7c97d5a97.
2025-12-20 22:07:37 +01:00
Mike Fährmann
1b4249ed37 [sankaku][idolcomplex] support URLs with locale code (#8667) 2025-12-09 08:23:40 +01:00
Mike Fährmann
d7c97d5a97 use f-strings when building 'pattern' 2025-10-20 21:23:11 +02:00
Mike Fährmann
e491d56dc3 [idolcomplex] update to new domain and interface (#7559 #8009) 2025-08-11 22:24:04 +02:00
Mike Fährmann
a097a373a9 simplify if statements by using walrus operators (#7671) 2025-07-22 20:57:54 +02:00
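A minimal sketch of the kind of rewrite this commit message describes. The pattern, URL shape, and function names are hypothetical and not taken from the repository. An assignment expression (Python 3.8+) folds the "assign, then test" pair into the 'if' itself:

```python
import re

# Hypothetical pattern, for illustration only.
POST_RE = re.compile(r"/posts?/(\w+)")


def post_id_before(url):
    # Two steps: assign the match, then test it.
    match = POST_RE.search(url)
    if match:
        return match[1]
    return None


def post_id_after(url):
    # Walrus operator: assign and test in one expression.
    if match := POST_RE.search(url):
        return match[1]
    return None
```

Both functions behave identically; the second only saves a line and keeps the name 'match' next to the condition that uses it.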
Mike Fährmann
d8ef1d693f rename 'StopExtraction' to 'AbortExtraction'
for cases where StopExtraction was used to report errors
2025-07-09 21:07:28 +02:00
Mike Fährmann
41191bb60a 'match.group(N)' -> 'match[N]' (#7671)
2.5x faster
2025-06-18 13:05:58 +02:00
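For reference, a small sketch of the two spellings named in this commit. Match objects from the standard 're' module support indexing since Python 3.6, so 'match[N]' returns the same value as 'match.group(N)'. The pattern and input string are made up for illustration, and the timing helper is only a way to measure the difference locally; the "2.5x" figure is the commit's own claim and will vary by machine and Python version.

```python
import re
import timeit

# Hypothetical pattern and input, for illustration only.
match = re.match(r"(\w+)-(\d+)", "post-42")

# Both spellings return the same capture groups.
old_style = (match.group(1), match.group(2))
new_style = (match[1], match[2])


def time_both(number=100_000):
    """Return (seconds for match.group(1), seconds for match[1])."""
    t_group = timeit.timeit(lambda: match.group(1), number=number)
    t_index = timeit.timeit(lambda: match[1], number=number)
    return t_group, t_index
```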
Mike Fährmann
e08ec7e083 update copyright notices 2025-06-13 00:03:41 +02:00
Mike Fährmann
b13f464c9a [idolcomplex:pool] add 'Login required' warning 2025-05-21 07:50:22 +02:00
Mike Fährmann
ee3fdb60e9 [idolcomplex] fix 429 error during login by adding a 10s delay 2025-05-21 07:37:11 +02:00
Mike Fährmann
96f7f626d5 [idolcomplex] fix/update pagination logic (#7549) 2025-05-21 07:35:43 +02:00
Mike Fährmann
32262a048b [idolcomplex] fix metadata extraction
- replace legacy 'id' values with alphanumeric ones, since the former are
  no longer available
- approximate 'vote_average', since the real value is no longer
  available
- fix 'vote_count'
2024-03-22 01:43:05 +01:00
Mike Fährmann
77ab015df2 [idolcomplex] support new pool URLs 2024-03-22 01:38:25 +01:00
Mike Fährmann
6414dc6bca [idolcomplex] fix pagination for tags containing ':' (#5171) 2024-02-09 17:51:08 +01:00
Mike Fährmann
aee5580c62 [idolcomplex] extract 'id_alnum' metadata (#5171) 2024-02-08 18:29:54 +01:00
Mike Fährmann
6ef143ea31 [idolcomplex] support alphanumeric post IDs (#5171) 2024-02-07 14:57:13 +01:00
Mike Fährmann
63f649cd92 [idolcomplex] fix extraction & update URL patterns (#5002) 2024-01-01 17:38:32 +01:00
Mike Fährmann
57fc6fcf83 replace '24*3600' with '86400'
and generalize cache maxage values
2023-12-18 23:57:22 +01:00
Mike Fährmann
1f9b16a70b replace static 'sleep-request' defaults with dynamic ones 2023-12-18 22:06:26 +01:00
Mike Fährmann
013ca21543 [idolcomplex] update to site layout changes 2023-11-27 18:27:08 +01:00
Mike Fährmann
75dec71253 [idolcomplex] disable Referer headers by default (#4726) 2023-10-26 18:02:31 +02:00
Mike Fährmann
a453335a9f remove test results in extractor modules
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
a383eca7f6 decouple extractor initialization
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().

This makes it possible, for example, to adjust config access inside a Job
before most of it has already happened during the 'extractor.find()' call.
2023-07-25 22:16:16 +02:00
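The pattern described in the commit above can be sketched roughly as follows. This is not the project's actual code: 'SketchExtractor', its attributes, and the stand-in "session" dictionary are all hypothetical, chosen only to show a cheap constructor paired with a separate, explicitly called setup step.

```python
class SketchExtractor:
    """Hypothetical extractor with two-phase initialization."""

    def __init__(self, url):
        # Cheap: only record what was passed in. No session,
        # cookies, or config lookups happen here.
        self.url = url
        self.config = {"sleep-request": 0.0}
        self.settings = None
        self._initialized = False

    def initialize(self):
        # Expensive setup, done once, at a time the caller chooses.
        if self._initialized:
            return
        # Stand-in for session/cookie/config setup: snapshot the
        # config as it looks *now*, after any caller adjustments.
        self.settings = dict(self.config)
        self._initialized = True


# A caller (a "Job" in the commit's terms) can adjust config
# between construction and initialization.
extractor = SketchExtractor("https://example.org/post/1")
extractor.config["sleep-request"] = 2.0
extractor.initialize()
```

Because the constructor no longer reads any options, whatever the caller changes before 'initialize()' is what the setup step sees.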
Mike Fährmann
d97b8c2fba consistent cookie-related names
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
b0cb4a1b9c replace 'text.extract()' with 'text.extr()' where possible 2022-11-05 01:14:09 +01:00
Mike Fährmann
a3bf878329 [idolcomplex] improve and fix pagination (#1601)
always rely on the 'next-page-url' value and its query parameters
2021-06-04 20:31:08 +02:00
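One way to read "rely on the 'next-page-url' value and its query parameters" is sketched below using only the standard library. The function name, the example URL, and its parameter names are assumptions for illustration; the extractor's real code may differ.

```python
from urllib.parse import parse_qs, urlsplit


def params_from_next_url(next_page_url):
    """Return the query parameters of a next-page URL as a flat dict."""
    query = urlsplit(next_page_url).query
    # parse_qs maps each key to a list of values; keep the first one.
    return {key: values[0] for key, values in parse_qs(query).items()}


# Hypothetical value of a page's 'next-page-url' attribute.
params = params_from_next_url("/?tags=example&next=12345&page=2")
```

The resulting dictionary can then be sent as the parameters of the next request, instead of computing page numbers or offsets by hand.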
Mike Fährmann
78f89d2e61 [idolcomplex] fix pagination (closes #1594) 2021-06-02 15:32:46 +02:00
Mike Fährmann
bae874f370 replace 'wait-min/-max' with 'sleep-request'
on exhentai, idolcomplex, reactor
2021-03-02 22:55:45 +01:00
Mike Fährmann
b2c55f0a72 [sankaku] remove login support
The old login method for 'https://chan.sankakucomplex.com/user/login'
and the cookies it produces have no effect on the results from
'beta.sankakucomplex.com'.
2020-12-08 21:05:47 +01:00
Mike Fährmann
ecdea799dd [sankaku] use 'beta.sankakucomplex.com' API endpoints 2020-12-05 22:08:58 +01:00
Mike Fährmann
6284731107 simplify extractor constants
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
2019-02-08 13:45:40 +01:00
Mike Fährmann
ecad69100a [photobucket] add 'image' extractor (#117) 2019-01-22 17:24:43 +01:00
Mike Fährmann
d69db60e2a update unit test results 2018-10-02 20:37:46 +02:00
Mike Fährmann
d98e47817d [deviantart] reduce refresh-token usage
Instead of using a refresh-token-based access-token for every API
request, such tokens are now only used for paginated results.

API requests to get a user's profile and the original download URL
now always use a public access-token.
2018-07-24 17:32:46 +02:00
Mike Fährmann
269dc2bbd5 [sankaku] add 'tags' option (#94) 2018-07-14 09:53:01 +02:00
Mike Fährmann
829ddf4ac1 [sankaku] general improvements
- simplify regex
- unquote search tags
- increase default wait-time between HTTP requests
  - downloading several hundreds of images always resulted
    in '429 Too Many Requests' eventually
- circumvent paging restrictions for unauthenticated users by only
  using the 'next' parameter
  - setting 'page' to a constant, low value (or simply omitting it)
    does the trick
2018-02-27 16:51:14 +01:00
Jad
49463f76bb support multi-page URL (#79)
* support multi-page URL

* fix

* all done.

* fix, again
2018-02-26 11:13:49 +01:00
Mike Fährmann
e420a28bbc fix cookie tests 2018-01-09 21:43:52 +01:00
Mike Fährmann
b33efc99a4 [idolcomplex] add support for idol.sankakucomplex.com 2018-01-09 17:54:37 +01:00