Mike Fährmann
a783c4f0fe
[pornhub] add 'gif' support ( #4463 )
2023-08-29 19:34:27 +02:00
Mike Fährmann
ba842981af
[imagevenue] fix extraction ( #4473 )
2023-08-29 12:06:30 +02:00
Mike Fährmann
7defb24e1e
[reddit] provide video previews if available ( #4322 )
2023-08-28 22:22:10 +02:00
Mike Fährmann
fd65f27ede
[reddit] fix 'preview.redd.it' URLs ( #4470 )
2023-08-28 17:17:03 +02:00
Mike Fährmann
06aaedded5
[twitter] extract 'source' metadata ( #4459 )
2023-08-28 16:31:57 +02:00
Mike Fährmann
14af15bd18
[reddit] download preview for 404ed imgur links ( #4322 )
...
This is a pretty ugly hack as the internal infrastructure doesn't
really support switching from external URL to regular download in
case the former fails, but it kind of works ...
Can be disabled by setting 'reddit.fallback' to 'false'.
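
The fallback idea described above can be sketched roughly as follows. This is only an illustration, not the code from this commit: `download_with_fallback`, the `fetch` callable, and the use of `LookupError` to signal a 404 are all hypothetical names chosen for the example.

```python
def download_with_fallback(primary_url, fallback_url, fetch, use_fallback=True):
    """Try the external URL first and fall back to a preview URL on failure.

    'fetch' is a hypothetical callable that returns the downloaded data
    or raises LookupError when the resource is gone (e.g. HTTP 404).
    """
    try:
        return fetch(primary_url)
    except LookupError:
        # Mirrors the idea of a 'fallback' option that can be set to false.
        if not use_fallback or fallback_url is None:
            raise
        return fetch(fallback_url)


def fake_fetch(url):
    # Stand-in for a real downloader: only the preview URL "exists".
    if "preview" in url:
        return "data from " + url
    raise LookupError(url)


result = download_with_fallback(
    "https://example.org/external", "https://example.org/preview", fake_fetch)
```

With `use_fallback=False` the original error propagates unchanged, which corresponds to disabling the behaviour.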
2023-08-24 15:41:05 +02:00
Mike Fährmann
d12a5e440a
update docs/supportedsites
2023-08-24 15:01:26 +02:00
Mike Fährmann
3a27150479
[instagram] add 'following' extractor ( #1848 )
2023-08-23 23:58:12 +02:00
Mike Fährmann
e0829ff0fd
[twitter] add 'date_original' metadata for retweets ( #4337 , #4443 )
2023-08-23 23:58:11 +02:00
Mike Fährmann
5ed245317d
[exhentai] add 'fav' option ( #4409 )
...
The name 'favorite' is already taken as extractor subcategory
2023-08-23 23:58:11 +02:00
Mike Fährmann
fd6b413f3c
[exhentai] fix 'domain' option ( #4458 )
...
regression from a383eca7
2023-08-23 23:58:04 +02:00
Mike Fährmann
fdfb22c91f
[instagram] fix video preview archive IDs ( #2135 , #4455 )
2023-08-23 12:29:32 +02:00
Mike Fährmann
2b88ad19e9
[twitter] accept 'x.com' URLs ( #4452 )
2023-08-21 19:47:07 +02:00
Mike Fährmann
8dceea3384
[shimmie2] move 'giantessbooru' back into shimmie module ( #4373 )
...
Do the same thing as for 'realbooru' and override 'posts()'
instead of using a separate module.
2023-08-18 15:25:28 +02:00
Mike Fährmann
6482f9453b
[behance] fix cookie usage ( #4417 )
2023-08-18 14:48:20 +02:00
Mike Fährmann
d34195b41d
[behance] fix and update 'user' extractor ( #4417 )
2023-08-17 16:06:35 +02:00
Mike Fährmann
4d3cf709da
[behance] add 'date' metadata field ( #4417 )
2023-08-17 15:33:47 +02:00
Mike Fährmann
c689cd9720
[behance] show error for mature content ( #4417 )
2023-08-17 15:31:37 +02:00
Mike Fährmann
33d912490f
merge #4419 : [bunkr] Fix extracting wmv files
2023-08-17 15:28:29 +02:00
Mike Fährmann
01610a6e9e
merge #4412 : [bunkr] fix media domain for cdn9
2023-08-17 15:18:49 +02:00
ClosedPort22
6dc8be5e48
[issuu] fix extraction
2023-08-13 21:13:50 +08:00
Luc Ritchie
85a070b9e6
[bunkr] Fix extracting wmv files
2023-08-12 16:53:14 -04:00
Mike Fährmann
3f8ff692a7
[bunkr] fix media domain for cdn9
...
Fixes #4386
2023-08-11 18:14:47 -04:00
Mike Fährmann
391a7d74c8
[giantessbooru] fix and move to separate module ( #4373 )
...
too many differences to the other shimmie2 sites
2023-08-09 18:36:56 +02:00
Mike Fährmann
089d1a4f67
[twitter] fix 'TweetWithVisibilityResults' ( #4369 )
2023-08-06 22:08:50 +02:00
Mike Fährmann
a4f7f7da17
add '_dump()' convenience method to Extractor
2023-08-06 17:03:09 +02:00
Mike Fährmann
df5c7ee03e
[deviantart] fix search ( #4384 )
...
send correct usernames instead of 'u'
2023-08-04 17:16:04 +02:00
Mike Fährmann
a60db454af
[sankaku] update/fix API headers
...
'Referer' and 'Origin' were both empty
2023-08-04 17:14:43 +02:00
Mike Fährmann
fb3f0453db
[twitter] improve error messages for single Tweets ( #4369 )
...
also fixes '"quoted": false' not having any effect
2023-08-03 22:02:07 +02:00
Mike Fährmann
541bff5a37
[pururin] fix extraction ( #4375 )
...
- rename 'title_jp' to 'title_ja'
- change type of 'collection', 'convention', and 'scanlator' to list
2023-08-03 14:40:44 +02:00
Mike Fährmann
6a87c314af
[instagram] fix private posts with long shortcodes ( #4362 )
2023-08-03 13:51:03 +02:00
Mike Fährmann
f899fac4c5
[giantessbooru] fix extraction ( #4373 )
...
This does not fix anything Cloudflare related,
just other things caused by a site update.
2023-08-03 13:40:11 +02:00
Mike Fährmann
136283d402
[shimmie2] update base URL pattern
...
to match new giantessbooru URLs
2023-08-03 13:34:48 +02:00
Mike Fährmann
c79359eb3a
[fantia] improve metadata extraction ( #4126 )
...
extract all metadata and URLs before starting to download
2023-07-31 22:31:50 +02:00
Mike Fährmann
48ef062867
fix issues with 'Extractor.finalize()'
...
- prevent crash in InstagramUserExtractor ( #4359 )
- call it at the end of every DownloadJob
- add it to tests
2023-07-29 13:43:27 +02:00
Mike Fährmann
ed21908fda
initial support for child extractor options
...
Using "parent-category>child-category" as extractor category in a config
file makes it possible to set options for a child extractor that was
spawned by that parent.
For example, "reddit>gfycat" sets gfycat options that apply only when a
gfycat link was found in a reddit post.
{
    "extractor": {
        "gfycat": {
            "filename": "regular filename"
        },
        "reddit>gfycat": {
            "filename": "reddit-specific filename"
        }
    }
}
Note: This currently does not work for most imgur links due to how the
imgur extractor hierarchy is structured.
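
One way to picture the lookup order implied by the config above is the sketch below. It is an assumption-laden illustration, not the project's actual config code: `lookup_option` and its parameters are hypothetical, and the real implementation may resolve options differently.

```python
def lookup_option(config, category, key, parent=None, default=None):
    """Return an option, preferring the "parent>child" section if present.

    Hypothetical helper: checks "parent>category" first (when a parent
    is given), then the plain "category" section, then the default.
    """
    extractor_conf = config.get("extractor", {})
    sections = []
    if parent:
        sections.append(f"{parent}>{category}")
    sections.append(category)

    for name in sections:
        section = extractor_conf.get(name)
        if section is not None and key in section:
            return section[key]
    return default


config = {
    "extractor": {
        "gfycat": {"filename": "regular filename"},
        "reddit>gfycat": {"filename": "reddit-specific filename"},
    }
}

standalone = lookup_option(config, "gfycat", "filename")
from_reddit = lookup_option(config, "gfycat", "filename", parent="reddit")
```

In this sketch a parent without its own "parent>gfycat" section simply falls through to the regular "gfycat" options.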
2023-07-28 17:07:25 +02:00
Mike Fährmann
255d08b79e
add test for 'Extractor.initialize()' ( #4359 )
2023-07-28 16:58:16 +02:00
Mike Fährmann
2bcf0a4c49
[instagram] fix initialization order ( #4359 )
...
regression caused by the changes in a383eca7
2023-07-28 14:25:37 +02:00
Mike Fährmann
7eab101144
[acidimg] fix extraction
...
swap ' and " again (2e309a13 )
and add a fallback in case this happens yet another time
2023-07-28 14:23:11 +02:00
Mike Fährmann
62fce6a75f
[imagehosts] adjust variable names ( #4358 )
...
prefix them with underscores to prevent a clash
with the new 'self.cookies' from d97b8c2f
2023-07-28 14:18:47 +02:00
Mike Fährmann
e8299b459a
[moebooru] match search URLs with empty 'tags' ( #4354 )
2023-07-26 18:02:26 +02:00
Mike Fährmann
7fbc304ae9
[twitter] fix crash on private user ( #4349 )
2023-07-26 17:53:51 +02:00
Mike Fährmann
1ece3b92ff
[mangadex] allow multiple values for 'lang' ( #4093 )
...
This was already possible by setting 'lang' to a list of strings,
but now it can also be done as a more command-line friendly string.
-o lang=fr,it
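
Accepting both forms could look roughly like the sketch below. `normalize_langs` is a hypothetical helper written for illustration only; the extractor's real handling of 'lang' is not shown in this commit message.

```python
def normalize_langs(value):
    """Turn a 'lang' setting into a list of language codes.

    Accepts a comma-separated string ("fr,it"), a list of strings,
    or None (meaning no language filter). Hypothetical helper.
    """
    if value is None:
        return None
    if isinstance(value, str):
        # Split the command-line friendly form and drop empty parts.
        return [part.strip() for part in value.split(",") if part.strip()]
    return list(value)


from_string = normalize_langs("fr,it")
from_list = normalize_langs(["fr", "it"])
```

Both calls yield the same list, so later code only has to deal with one representation.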
2023-07-26 17:39:27 +02:00
Mike Fährmann
52053b58f0
[lensdump] fix extraction ( #4352 )
2023-07-26 14:24:19 +02:00
Mike Fährmann
11f71a9cba
remove 'mememuseum' module
...
This was forgotten when adding generic Shimmie2 support in 7865067d
2023-07-25 22:22:27 +02:00
Mike Fährmann
a383eca7f6
decouple extractor initialization
...
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().
This makes it possible, for example, to adjust config access inside a
Job before initialization runs. Previously, most of it had already
happened when calling 'extractor.find()'.
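
The separation described above can be illustrated with a minimal sketch. All names here (`SketchExtractor`, `find`, `run_job`, the "from-job" key) are hypothetical stand-ins and do not reproduce the project's real classes or signatures.

```python
class SketchExtractor:
    """Toy extractor whose constructor stays cheap."""

    def __init__(self, url):
        # Only store what is needed to identify the target.
        self.url = url
        self.options = {}
        self._initialized = False

    def initialize(self, config=None):
        # The expensive or config-dependent setup happens here,
        # and only once, whenever the caller decides to run it.
        if self._initialized:
            return
        self.options = dict(config or {})
        self._initialized = True


def find(url):
    # Stand-in for locating a matching extractor: construct, but do not init.
    return SketchExtractor(url)


def run_job(url, config):
    extractor = find(url)
    # The job can still adjust config before initialization takes place.
    adjusted = dict(config)
    adjusted["from-job"] = True
    extractor.initialize(adjusted)
    return extractor


extractor = run_job("https://example.org/", {"a": 1})
```

The point of the split is that anything done between `find()` and `initialize()` is still visible to the setup code.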
2023-07-25 22:16:16 +02:00
Mike Fährmann
1baf83a9e5
[hiperdex] fix for unicode titles ( #4325 )
2023-07-22 16:20:57 +02:00
Mike Fährmann
7da954f810
[flickr] update default API credentials ( #4332 )
...
and add a delay between API requests
2023-07-22 15:38:33 +02:00
Mike Fährmann
a45a17ddb7
[pixiv] ignore 'limit_sanity_level' images ( #4328 )
2023-07-22 14:57:38 +02:00
Mike Fährmann
088e8d5fcf
[pornhub] fix extraction ( #4301 )
2023-07-22 14:05:40 +02:00