3842 Commits

Author SHA1 Message Date
Mike Fährmann
8a70b94245 [twitter] implement constant 'user' for tweet URLs 2022-07-29 19:44:29 +02:00
Mike Fährmann
8cf5981ded [skeb] add option to download 'article' images (#1031) 2022-07-29 16:32:00 +02:00
Mike Fährmann
43ec315a7f [deviantart] use public access token for journals (#2702)
and retry with a private token if needed
2022-07-29 16:18:09 +02:00
Mike Fährmann
3f08a91131 [bunkr] fix extraction (#2788)
... again
2022-07-29 15:53:46 +02:00
Mike Fährmann
5038893cdd [blogger] emit metadata for posts without files (#2789) 2022-07-29 13:38:39 +02:00
Mike Fährmann
98af5a0409 [zerochan] implement login with username & password (#1434) 2022-07-29 12:56:20 +02:00
Mike Fährmann
3a8addfe45 [zerochan] add 'tag' and 'image' extractors (#1434) 2022-07-27 22:58:23 +02:00
Mike Fährmann
e660e48a60 [vk] prevent exceptions for broken/invalid photos (#2774) 2022-07-27 18:52:43 +02:00
Mike Fährmann
f559943d77 [instagram] fix empty 'params' in '_pagination_api()' 2022-07-27 13:02:24 +02:00
Mike Fährmann
1540d0e695 [twitter] use filter:links (#2766) 2022-07-27 12:17:43 +02:00
Mike Fährmann
8d0801ad8e [twitter] fall back to unfiltered search (#2766) 2022-07-27 12:16:53 +02:00
Marius Kaufmann
0aa8345a13 [mastodon] allow downloading without access token (#2782)
Most Mastodon instances allow access to /api/v1/accounts/XXXX/statuses and /api/v1/statuses/XXXX without an API access token.
This commit lets users download at least some links from Mastodon instances for which the extractor has no hard-coded access tokens.
The user extractor only works on links that include the user ID, such as https://mastodon.tld/@id:12345. Status links work as-is.
2022-07-27 12:07:06 +02:00
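A minimal sketch of the URL handling described in this commit message, assuming only the two endpoint paths and the '@id:12345' link form quoted above. The function names are hypothetical and are not taken from the extractor's code.

```python
import re


def account_id_from_url(url):
    """Return the numeric ID from an '@id:12345' style link, or None."""
    match = re.search(r"/@id:(\d+)", url)
    return match.group(1) if match else None


def account_statuses_url(root, account_id):
    """Build the statuses endpoint URL for an account (no token needed)."""
    return f"{root.rstrip('/')}/api/v1/accounts/{account_id}/statuses"


def status_url(root, status_id):
    """Build the endpoint URL for a single status (no token needed)."""
    return f"{root.rstrip('/')}/api/v1/statuses/{status_id}"
```

A link without the '@id:' form yields None here, which mirrors the limitation noted in the message: without a token there is no lookup from a user name to an ID.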
thatfuckingbird
ea5ffb19a6 fanbox: download cover images in original size (#2784) 2022-07-27 10:53:04 +02:00
Chew Shee Yang
977d53b640 [Instagram] Add support for user's saved collection (#2769)
* [Instagram] Add support for user's saved collection
* [Instagram] Run formatter
* [Instagram] Simplify collection_id retrieval and add metadata
* [Instagram] Fix bug when params is not passed to _pagination_api
2022-07-27 10:49:45 +02:00
blankie
5b63df46c0 [tumblr] attempt to get higher-quality images (#2761) 2022-07-27 10:47:43 +02:00
blankie
59b16b3f70 [artstation] add 'num' and 'count' metadata fields (#2764) 2022-07-19 14:25:07 +02:00
Mike Fährmann
0c73914848 [postprocessor:metadata] implement 'mode: modify' (#2640) 2022-07-19 12:24:26 +02:00
Mike Fährmann
f3de6b7a87 [postprocessor:metadata] implement 'mode: delete' (#2640) 2022-07-19 00:57:29 +02:00
Mike Fährmann
eb68d45544 add global 'warnings' option (#2762) 2022-07-18 22:20:30 +02:00
Mike Fährmann
f225247670 [gelbooru] add support for api_key and user_id (#2767) 2022-07-18 18:46:31 +02:00
Mike Fährmann
77bdd8fe0f [twitter] implement constant 'user' for 'from:…' searches 2022-07-17 19:14:32 +02:00
Mike Fährmann
a267a05a3f [twitter] update 'quote_id' and 'quote_by'
- 'quote_id' is now non-null for quoted Tweets and has the ID of the
  quoting Tweet, instead of the other way round as before
- 'quote_by' is now the 'screen_name' of the quoting user
  (it was previously what the new 'quote_id' is now)
2022-07-17 18:50:21 +02:00
Mike Fährmann
749802c7bd [twitter] update 'user' and 'author' fields
- 'author' is always the user who authored a tweet
- 'user' is always the user specified in the input URL
  or equal to 'author' when the former is not given
2022-07-17 17:04:24 +02:00
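The rule for 'user' and 'author' stated in this commit message can be sketched as follows. This is an illustration only: the function name and the dict layout are assumptions, not the extractor's actual code.

```python
def assign_user_fields(tweet, url_user=None):
    """Return a copy of 'tweet' with 'user' set per the rule above.

    'author' is left untouched: it is always the tweet's author.
    'user' is the user from the input URL, or 'author' if none was given.
    """
    result = dict(tweet)
    result["user"] = url_user if url_user is not None else tweet["author"]
    return result
```

For a retweet found on another account's timeline, 'user' would be the timeline owner while 'author' stays the original poster.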
Mike Fährmann
51b1999d4b release version 1.22.4 2022-07-15 19:30:48 +02:00
Mike Fährmann
a566e63cdf [tumblr] support '/blog/view' URLs (#2760) 2022-07-15 15:22:54 +02:00
Mike Fährmann
46f11a3118 [bunkr] fix extraction (#2732)
move bunkr.is code to its own module
2022-07-15 13:00:57 +02:00
Mike Fährmann
baf3815ebd [nozomi] small code optimizations 2022-07-14 14:59:11 +02:00
Mike Fährmann
9704c04172 [postprocessor:zip] ensure target directory exists (#2758) 2022-07-14 11:55:39 +02:00
blankie
836402bf58 [twitter] unescape content (#2756) (#2757)
Fixes #2756
2022-07-13 19:45:14 +02:00
Mike Fährmann
62cc47755b [nozomi] reduce memory consumption during searches (#2754)
only load and use the entire 'index.nozomi' database
if there are only negative search terms
2022-07-13 17:16:10 +02:00
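One way to read the strategy in this commit message is as set arithmetic over post IDs: intersect the results of positive terms, and fall back to the full index only when there is nothing positive to start from. The sketch below assumes two hypothetical loader callables, `load_tag` and `load_index`; it is not the extractor's implementation.

```python
def search(load_tag, load_index, positive, negative):
    """Return the set of post IDs matching the given search terms.

    load_tag(tag) -> iterable of IDs for one tag (hypothetical loader)
    load_index()  -> iterable of all IDs; large, so loaded only if needed
    """
    if positive:
        # Start from the first positive term and narrow down.
        result = set(load_tag(positive[0]))
        for tag in positive[1:]:
            result &= set(load_tag(tag))
    else:
        # Only negative terms: the full index is the starting point.
        result = set(load_index())
    for tag in negative:
        result -= set(load_tag(tag))
    return result
```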
Mike Fährmann
467a2a4d35 [instagram] add 'pinned' metadata field (#2752)
'pinned' is a list of user IDs for which a post is pinned
and empty if not pinned anywhere.
2022-07-13 15:54:08 +02:00
Mike Fährmann
fe2b3d57d4 [komikcast] update domain 2022-07-12 23:07:58 +02:00
Mike Fährmann
4e11ca737e [hentaifoundry] fix metadata extraction 2022-07-12 22:19:22 +02:00
Mike Fährmann
f2e59cc906 [slideshare] fix 'description' extraction 2022-07-12 18:38:44 +02:00
Mike Fährmann
31e868fca1 [khinsider] extract 'platform' metadata 2022-07-12 18:31:31 +02:00
Mike Fährmann
c6a9bab019 update extractor test results 2022-07-12 15:49:22 +02:00
Mike Fährmann
539e3bbed9 [weibo] handle invalid/broken status objects 2022-07-12 15:49:09 +02:00
Mike Fährmann
32c75d12e8 [sankaku] rewrite URLs to s.sankakucomplex.com (#2746) 2022-07-11 12:46:04 +02:00
Mike Fährmann
d5ded11aa8 [pixiv] fix default filenames for backgrounds 2022-07-11 12:45:38 +02:00
Mike Fährmann
e1f501ed14 [mangakakalot] update domain 2022-07-11 00:29:25 +02:00
Mike Fährmann
2dc57637cf [foolfuuka] remove archive.wakarimasen.moe 2022-07-10 23:13:49 +02:00
Mike Fährmann
98744977cf [itaku] fix 'date' parsing 2022-07-10 20:45:51 +02:00
Mike Fährmann
b590774f67 [twitter] add 'count' metadata field (#2741) 2022-07-10 14:37:04 +02:00
Mike Fährmann
7c0505868c [kemonoparty] ensure all files have an 'extension' (#2740) 2022-07-10 13:53:07 +02:00
Mike Fährmann
74865adae5 implement 'format-separator' option (#2737)
a global option that serves as a workaround for shortcomings due to
the lack of a proper format string parser
2022-07-10 13:31:43 +02:00
bradenhilton
117eeefda0 [postprocessor:mtime] add 'value' option (#2739) 2022-07-08 20:56:01 +02:00
Mike Fährmann
90ae48c40c [formatter] implement 'O' format specifier (#2736)
to apply a UTC offset to 'date' values and other datetime objects
2022-07-08 12:51:03 +02:00
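Applying a UTC offset to a datetime value amounts to adding a signed timedelta. The sketch below shows that idea with the standard library only. The offset string forms it accepts ('+02:00', '-0530', '+02') are assumptions made for illustration, and the code makes no claim about the exact syntax of the 'O' specifier.

```python
from datetime import datetime, timedelta


def apply_utc_offset(dt, offset):
    """Shift 'dt' by an offset given as '+HH:MM', '-HHMM', or '+HH'."""
    sign = -1 if offset.startswith("-") else 1
    digits = offset.lstrip("+-").replace(":", "")
    hours = int(digits[:2])
    minutes = int(digits[2:4] or 0)
    return dt + sign * timedelta(hours=hours, minutes=minutes)
```

With an offset of '+02:00', a UTC value of 10:51:03 becomes 12:51:03, which matches the +02:00 timestamps shown in this log.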
Mike Fährmann
e4f48cc810 make it easier to disable default 'browser' settings
Previously it was necessary to set 'browser' to a non-empty, non-string
value to disable any default 'browser' value.
Now '-o browser=' or '-o browser=false' is enough.
2022-07-07 11:17:43 +02:00
Mike Fährmann
92b75bcdce limit path length for --write-pages output on Windows (#2733) 2022-07-06 18:56:23 +02:00
Mike Fährmann
04bed1eba3 [formatter] allow for custom "format" functions (#2721) 2022-07-05 12:22:01 +02:00