3541 Commits

Author SHA1 Message Date
Mike Fährmann
fd6b413f3c [exhentai] fix 'domain' option (#4458)
regression from a383eca7
2023-08-23 23:58:04 +02:00
Mike Fährmann
fdfb22c91f [instagram] fix video preview archive IDs (#2135, #4455) 2023-08-23 12:29:32 +02:00
Mike Fährmann
2b88ad19e9 [twitter] accept 'x.com' URLs (#4452) 2023-08-21 19:47:07 +02:00
Mike Fährmann
8dceea3384 [shimmie2] move 'giantessbooru' back into shimmie module (#4373)
Do the same thing as for 'realbooru' and override 'posts()'
instead of using a separate module.
2023-08-18 15:25:28 +02:00
Mike Fährmann
6482f9453b [behance] fix cookie usage (#4417) 2023-08-18 14:48:20 +02:00
Mike Fährmann
d34195b41d [behance] fix and update 'user' extractor (#4417) 2023-08-17 16:06:35 +02:00
Mike Fährmann
4d3cf709da [behance] add 'date' metadata field (#4417) 2023-08-17 15:33:47 +02:00
Mike Fährmann
c689cd9720 [behance] show error for mature content (#4417) 2023-08-17 15:31:37 +02:00
Mike Fährmann
33d912490f merge #4419: [bunkr] Fix extracting wmv files 2023-08-17 15:28:29 +02:00
Mike Fährmann
01610a6e9e merge #4412: [bunkr] fix media domain for cdn9 2023-08-17 15:18:49 +02:00
ClosedPort22
6dc8be5e48 [issuu] fix extraction 2023-08-13 21:13:50 +08:00
Luc Ritchie
85a070b9e6 [bunkr] Fix extracting wmv files 2023-08-12 16:53:14 -04:00
Mike Fährmann
3f8ff692a7 [bunkr] fix media domain for cdn9
Fixes #4386
2023-08-11 18:14:47 -04:00
Mike Fährmann
391a7d74c8 [giantessbooru] fix and move to separate module (#4373)
too many differences to the other shimmie2 sites
2023-08-09 18:36:56 +02:00
Mike Fährmann
089d1a4f67 [twitter] fix 'TweetWithVisibilityResults' (#4369) 2023-08-06 22:08:50 +02:00
Mike Fährmann
a4f7f7da17 add '_dump()' convenience method to Extractor 2023-08-06 17:03:09 +02:00
Mike Fährmann
df5c7ee03e [deviantart] fix search (#4384)
send correct usernames instead of 'u'
2023-08-04 17:16:04 +02:00
Mike Fährmann
a60db454af [sankaku] update/fix API headers
'Referer' and 'Origin' were both empty
2023-08-04 17:14:43 +02:00
Mike Fährmann
fb3f0453db [twitter] improve error messages for single Tweets (#4369)
also fixes '"quoted": false' not having any effect
2023-08-03 22:02:07 +02:00
Mike Fährmann
541bff5a37 [pururin] fix extraction (#4375)
- rename 'title_jp' to 'title_ja'
- change type of 'collection', 'convention', and 'scanlator' to list
2023-08-03 14:40:44 +02:00
Mike Fährmann
6a87c314af [instagram] fix private posts with long shortcodes (#4362) 2023-08-03 13:51:03 +02:00
Mike Fährmann
f899fac4c5 [giantessbooru] fix extraction (#4373)
This does not fix anything Cloudflare related,
just other things caused by a site update.
2023-08-03 13:40:11 +02:00
Mike Fährmann
136283d402 [shimmie2] update base URL pattern
to match new giantessbooru URLs
2023-08-03 13:34:48 +02:00
Mike Fährmann
c79359eb3a [fantia] improve metadata extraction (#4126)
extract all metadata and URLs before starting to download
2023-07-31 22:31:50 +02:00
Mike Fährmann
48ef062867 fix issues with 'Extractor.finalize()'
- prevent crash in InstagramUserExtractor (#4359)
- call it at the end of every DownloadJob
- add it to tests
2023-07-29 13:43:27 +02:00
Mike Fährmann
ed21908fda initial support for child extractor options
Using "parent-category>child-category" as extractor category in a config
file allows setting options for a child extractor when it is spawned by
that parent.

For example "reddit>gfycat" to set gfycat options for when it was found
in a reddit post.

{
    "extractor": {
        "gfycat": {
            "filename": "regular filename"
        },
        "reddit>gfycat": {
            "filename": "reddit-specific filename"
        }
    }
}

Note: This currently does not work for most imgur links due to how its
extractor hierarchy is structured.
2023-07-28 17:07:25 +02:00
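
The "parent>child" lookup described in ed21908fda can be pictured with a small sketch. This is not gallery-dl's actual code: the helper name lookup_option and the fallback order (child-specific section first, then the plain category) are assumptions made for illustration only.

```python
def lookup_option(config, category, key, parent=None, default=None):
    """Hypothetical helper: prefer 'parent>category' options over plain ones."""
    extractor_conf = config.get("extractor", {})

    # Assumed search order: most specific section first.
    sections = []
    if parent is not None:
        sections.append(f"{parent}>{category}")
    sections.append(category)

    for name in sections:
        section = extractor_conf.get(name, {})
        if key in section:
            return section[key]
    return default


# Same structure as the config example in the commit message.
config = {
    "extractor": {
        "gfycat": {"filename": "regular filename"},
        "reddit>gfycat": {"filename": "reddit-specific filename"},
    }
}

standalone = lookup_option(config, "gfycat", "filename")
from_reddit = lookup_option(config, "gfycat", "filename", parent="reddit")
```

With this fallback order, a gfycat extractor spawned from a reddit post would pick up the reddit-specific value, while a standalone gfycat extractor keeps the regular one.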
Mike Fährmann
255d08b79e add test for 'Extractor.initialize()' (#4359) 2023-07-28 16:58:16 +02:00
Mike Fährmann
2bcf0a4c49 [instagram] fix initialization order (#4359)
regression caused by the changes in a383eca7
2023-07-28 14:25:37 +02:00
Mike Fährmann
7eab101144 [acidimg] fix extraction
swap ' and " again (2e309a13)
and add a fallback in case this happens yet another time
2023-07-28 14:23:11 +02:00
Mike Fährmann
62fce6a75f [imagehosts] adjust variable names (#4358)
prefix them with underscores to prevent a clash
with the new 'self.cookies' from d97b8c2f
2023-07-28 14:18:47 +02:00
Mike Fährmann
e8299b459a [moebooru] match search URLs with empty 'tags' (#4354) 2023-07-26 18:02:26 +02:00
Mike Fährmann
7fbc304ae9 [twitter] fix crash on private user (#4349) 2023-07-26 17:53:51 +02:00
Mike Fährmann
1ece3b92ff [mangadex] allow multiple values for 'lang' (#4093)
This was already possible by setting 'lang' to a list of strings,
but now it can also be done as a more command-line friendly string.

-o lang=fr,it
2023-07-26 17:39:27 +02:00
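
A rough sketch of how an option like 'lang' could accept both forms mentioned in 1ece3b92ff. The function normalize_lang is hypothetical and only illustrates the idea of treating a comma-separated string like a list of strings; the real extractor may handle this differently.

```python
def normalize_lang(value):
    """Hypothetical helper: accept a list of strings or a comma-separated string."""
    if value is None:
        return []
    if isinstance(value, str):
        # "fr,it" -> ["fr", "it"]; empty segments are dropped
        return [part.strip() for part in value.split(",") if part.strip()]
    return list(value)


from_cli = normalize_lang("fr,it")           # e.g. from '-o lang=fr,it'
from_config = normalize_lang(["fr", "it"])   # e.g. from a config file list
```

Both calls yield the same list, so the rest of the code only has to deal with one representation.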
Mike Fährmann
52053b58f0 [lensdump] fix extraction (#4352) 2023-07-26 14:24:19 +02:00
Mike Fährmann
11f71a9cba remove 'mememuseum' module
This was forgotten when adding generic Shimmie2 support in 7865067d
2023-07-25 22:22:27 +02:00
Mike Fährmann
a383eca7f6 decouple extractor initialization
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().

This makes it possible, for example, to adjust config access inside a Job
before most of it has already happened when calling 'extractor.find()'.
2023-07-25 22:16:16 +02:00
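
The two-phase setup from a383eca7f6 might look roughly like the sketch below. Only the method name initialize() comes from the commit message; the class SketchExtractor, its attributes, and the 'sleep' option are assumptions and do not mirror gallery-dl's real Extractor class.

```python
class SketchExtractor:
    """Hypothetical extractor with construction and initialization decoupled."""

    def __init__(self, url):
        # Cheap step: only remember what identifies this extractor.
        self.url = url
        self.initialized = False
        self.options = {}
        self.cookies = None

    def initialize(self, config=None):
        # Config-dependent setup (options, cookies) happens here, on demand.
        if self.initialized:
            return
        self.options = dict(config or {})
        self.cookies = {}
        self.initialized = True


extractor = SketchExtractor("https://example.org/gallery/1")

# A caller (for example a job object) can still adjust configuration here,
# because nothing has been read from it yet.
job_config = {"sleep": 1}
extractor.initialize(job_config)
```

The point of the split is ordering: creating the object no longer reads any configuration, so whoever owns it decides when that happens.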
Mike Fährmann
1baf83a9e5 [hiperdex] fix for unicode titles (#4325) 2023-07-22 16:20:57 +02:00
Mike Fährmann
7da954f810 [flickr] update default API credentials (#4332)
and add a delay between API requests
2023-07-22 15:38:33 +02:00
Mike Fährmann
a45a17ddb7 [pixiv] ignore 'limit_sanity_level' images (#4328) 2023-07-22 14:57:38 +02:00
Mike Fährmann
088e8d5fcf [pornhub] fix extraction (#4301) 2023-07-22 14:05:40 +02:00
Mike Fährmann
d97b8c2fba consistent cookie-related names
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
ceebacc9e1 remove 'pyopenssl' option 2023-07-19 20:44:07 +02:00
Mike Fährmann
3c2c7e21dd merge #4319: [zerochan] fix 'tags' extraction 2023-07-18 18:37:50 +02:00
Mike Fährmann
0ba8d1f168 merge #4312: [redgifs] add 'niches' extractor 2023-07-18 18:36:15 +02:00
Mike Fährmann
c5565f79f7 merge #4096: [danbooru] add support for booru.borvar.art instance 2023-07-18 18:33:08 +02:00
Mike Fährmann
63326e3168 [danbooru] add tests for booruvar 2023-07-18 18:29:57 +02:00
Mike Fährmann
5171d8975c [E621] support 'e6ai.net' (#4320) 2023-07-18 18:16:30 +02:00
Mike Fährmann
a996d936d2 [imagefap] fix pagination (#3013) 2023-07-18 17:56:33 +02:00
Mike Fährmann
22099422ca [deviantart] fix shortened URLs (#4316) 2023-07-18 17:55:13 +02:00
Mike Fährmann
90231f2d5a [twitter] add 'tweet-endpoint' option (#4307)
use the newer TweetResultByRestId only for guests by default
2023-07-18 17:19:32 +02:00