Commit Graph

3706 Commits

Author SHA1 Message Date
Mike Fährmann
a383eca7f6 decouple extractor initialization
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().

This allows, for example, adjusting config access inside a Job
before most of it has already happened when calling 'extractor.find()'.
2023-07-25 22:16:16 +02:00
Mike Fährmann
1baf83a9e5 [hiperdex] fix for unicode titles (#4325) 2023-07-22 16:20:57 +02:00
Mike Fährmann
7da954f810 [flickr] update default API credentials (#4332)
and add a delay between API requests
2023-07-22 15:38:33 +02:00
Mike Fährmann
a45a17ddb7 [pixiv] ignore 'limit_sanity_level' images (#4328) 2023-07-22 14:57:38 +02:00
Mike Fährmann
088e8d5fcf [pornhub] fix extraction (#4301) 2023-07-22 14:05:40 +02:00
Mike Fährmann
d97b8c2fba consistent cookie-related names
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
ceebacc9e1 remove 'pyopenssl' option 2023-07-19 20:44:07 +02:00
Mike Fährmann
3c2c7e21dd merge #4319: [zerochan] fix 'tags' extraction 2023-07-18 18:37:50 +02:00
Mike Fährmann
0ba8d1f168 merge #4312: [redgifs] add 'niches' extractor 2023-07-18 18:36:15 +02:00
Mike Fährmann
c5565f79f7 merge #4096: [danbooru] add support for booru.borvar.art instance 2023-07-18 18:33:08 +02:00
Mike Fährmann
63326e3168 [danbooru] add tests for booruvar 2023-07-18 18:29:57 +02:00
Mike Fährmann
5171d8975c [E621] support 'e6ai.net' (#4320) 2023-07-18 18:16:30 +02:00
Mike Fährmann
a996d936d2 [imagefap] fix pagination (#3013) 2023-07-18 17:56:33 +02:00
Mike Fährmann
22099422ca [deviantart] fix shortened URLs (#4316) 2023-07-18 17:55:13 +02:00
Mike Fährmann
90231f2d5a [twitter] add 'tweet-endpoint' option (#4307)
use the newer TweetResultByRestId only for guests by default
2023-07-18 17:19:32 +02:00
Mike Fährmann
20ed647f6f [twitter] add 'user' extractor and 'include' option (#4275) 2023-07-18 16:42:55 +02:00
Mike Fährmann
86be197d11 [twitter] remove '/search/adaptive.json' 2023-07-18 15:45:37 +02:00
enduser420
d52ed2bc5a [zerochan] fix 'tags' extraction 2023-07-18 16:38:04 +05:30
enduser420
12cd85658b [redgifs] add 'niches' extractor 2023-07-16 21:22:06 +05:30
Mike Fährmann
bc9123cfee [naverwebtoon] fix 'comic' metadata extraction 2023-07-14 22:41:36 +02:00
Mike Fährmann
ab5dde7221 [mangaread] fix 'tags' extraction 2023-07-14 22:36:06 +02:00
Mike Fährmann
c9a82c9313 [erome] ignore duplicate album IDs 2023-07-14 22:21:02 +02:00
Mike Fährmann
c84397023a [slideshare] fix extraction 2023-07-14 21:52:53 +02:00
Mike Fährmann
ffbbbd3baf [gelbooru_v01] 'vidyart' -> 'vidyart2' 2023-07-14 15:09:39 +02:00
Mike Fährmann
e40b90e137 merge #4303: [gelbooru_v01] fix 'source' (#4302) 2023-07-14 15:00:58 +02:00
Mike Fährmann
c6b31a2169 [reddit] set default 0.6s delay between requests (#4292)
to limit API requests to 100 per minute
https://www.reddit.com/r/redditdev/comments/14nbw6g/
2023-07-14 14:41:16 +02:00
Mike Fährmann
20da41018d [pornhub] set 'accessAgeDisclaimerPH' cookie (#4301) 2023-07-14 14:30:27 +02:00
ncaat
75757c4ace [gelbooru_v01] fix 'source' (#4302) 2023-07-14 12:53:24 +02:00
Mike Fährmann
2dd6942d1c [jpgfish] update domain to 'jpeg.pet' 2023-07-13 23:21:01 +02:00
Mike Fährmann
1137b89ed4 [lineblog] remove module
"LINE BLOG ended its service on June 29, 2023"
2023-07-13 20:46:04 +02:00
Mike Fährmann
86560fe0cd [bcy] remove module
"The website was shut down on July 12, 2023"
https://danbooru.donmai.us/wiki_pages/bcy
2023-07-13 20:46:04 +02:00
Mike Fährmann
fceabee433 [philomena] use API interface class
handle 429 errors and retry after 10min (#4288)
2023-07-13 20:46:04 +02:00
Mike Fährmann
f079d9a703 [reddit] notify users about registering an oauth application
(#4292, #4253, #3943)
2023-07-12 21:43:00 +02:00
Mike Fährmann
fb3d1462b1 merge #4291: [wikifeet] fix 'tag' extraction 2023-07-10 14:42:56 +02:00
Mike Fährmann
0b08e2e8a8 merge #4287: [twitter] Fix following extractor not getting all users 2023-07-10 14:41:00 +02:00
Mike Fährmann
f6553ffd2f [twitter] simplify '_pagination_users'
- remove 'stop' variable
- call 'cursor.startswith()' only once
2023-07-10 14:39:09 +02:00
Mike Fährmann
1590124aae [twibooru] fix '--range' 2023-07-10 14:12:56 +02:00
enduser420
a2111dd025 [wikifeet] fix 'tag' extraction 2023-07-09 12:48:47 +05:30
Mike Fährmann
a1ffa1ff09 [philomena] fix '--range' (#4288) 2023-07-08 23:17:27 +02:00
Mike Fährmann
a27dbe8c82 [twitter] use 'TweetResultByRestId' endpoint (#4250)
allows accessing single Tweets without login
2023-07-08 23:17:10 +02:00
Mike Fährmann
d3d639a159 [twitter] don't treat missing 'TimelineAddEntries' as fatal (#4278) 2023-07-08 22:49:34 +02:00
ActuallyKit
c321c773f2 make the code less ugly 2023-07-09 02:52:04 +07:00
ActuallyKit
a437a34bcf fix lint i guess? 2023-07-09 02:41:46 +07:00
ActuallyKit
6cbc434b54 Fix users pagination 2023-07-09 02:28:35 +07:00
Mike Fährmann
d5b6802774 [seiga] set 'skip_fetish_warning' cookie (#4242) 2023-07-07 20:51:49 +02:00
Mike Fährmann
88d1e29401 [bunkr] use '.la' TLD for 'media-files12' servers (#4147, #4276) 2023-07-07 20:10:28 +02:00
Mike Fährmann
f0cb951566 [paheal] unescape 'source' 2023-07-07 20:03:00 +02:00
Mike Fährmann
b480b7076a [paheal] fix a78f8ce5 for enabled 'metadata' (#4262) 2023-07-07 20:00:49 +02:00
Mike Fährmann
384337d3dd [fantia] send 'X-Requested-With' header only for API requests (#4273) 2023-07-07 15:16:18 +02:00
Mike Fährmann
c2ac665ff7 [fantia] send 'X-Requested-With' header (#4273) 2023-07-06 19:03:53 +02:00
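
Note on a383eca7f6 (decouple extractor initialization): the message describes moving the actual setup (session, cookies, config options) out of the constructor into a separate 'initialize()' call, so that a Job can adjust config first. A minimal sketch of that pattern follows. The class bodies, the 'overrides' parameter, the dict-based session, and the '_initialized' flag are all hypothetical stand-ins for illustration and are not taken from the project's code.

```python
class Extractor:
    """Hypothetical stand-in for an extractor, for illustration only."""

    def __init__(self, url):
        # The constructor only records cheap state; no session,
        # cookie, or config work happens here.
        self.url = url
        self.config = {}
        self.session = None
        self.cookies = None
        self._initialized = False

    def initialize(self):
        # The actual init (session, cookies, config options),
        # callable separately from __init__().
        if self._initialized:
            return
        user_agent = self.config.get("user-agent", "default")
        self.session = {"headers": {"User-Agent": user_agent}}
        self.cookies = dict(self.config.get("cookies", {}))
        self._initialized = True


class Job:
    """Hypothetical job that adjusts config before initialization."""

    def __init__(self, extractor, overrides=None):
        self.extractor = extractor
        if overrides:
            # Config can still be changed here, because the extractor
            # has not read any options yet.
            extractor.config.update(overrides)

    def run(self):
        self.extractor.initialize()
        return self.extractor.session["headers"]["User-Agent"]
```

With this split, constructing an Extractor has no side effects, and whatever a Job changes before calling 'initialize()' is what the session ends up using.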