747 Commits

Author SHA1 Message Date
Mike Fährmann
6894e0bc70 [arcalive] extend 'gifs' option
use fallback URLs by default
2025-03-26 20:59:18 +01:00
Mike Fährmann
24bbcbcfa3 [danbooru] add 'favgroup' extractor 2025-03-26 20:58:49 +01:00
Mike Fährmann
e1aabf01e4 merge #7220: [deviantart] add subfolder support (#4988 #7185) 2025-03-24 18:43:11 +01:00
Mike Fährmann
fd8f652490 [hitomi] fix extractors (#7230) 2025-03-23 20:32:27 +01:00
Mike Fährmann
b52c21186b [deviantart] add 'subfolders' option 2025-03-23 17:58:58 +01:00
Mike Fährmann
4a74bc6e30 [kemonoparty] extract 'archives' metadata (#7195)
add 'archives' option for additional data
2025-03-22 18:38:21 +01:00
Mike Fährmann
f8ef9a7b35 [kemonoparty] enable 'username'/'user_profile' metadata by default 2025-03-21 20:01:06 +01:00
Mike Fährmann
dbe8820b9e [arcalive] add 'gifs' option (#5657) 2025-03-14 22:34:45 +01:00
hdk5
d900e868e4 [arcalive] add support (#5657 #7100)
* [arca.live] Add extractor skeleton

* [arcalive] update names and formatting

* [arcalive] implement initial file extraction code

* [arcalive] improve '_extract_media()' performance

compile and cache regex on demand

* [arcalive] improve image extraction

- extract 'data-originalurl' URLs if available
- replace URL query strings with 'type=orig'
- ignore emoticons by default

* [arcalive] update defaults

- include 'title' in filenames
- use 0.5-1.5s delay between requests

* [arcalive] use ext from 'data-orig' if available

* [arcalive] update docs/supportedsites

* [arcalive] add tests

* [arcalive] update 'board' extractor pattern

so it doesn't also match 'post' URLs

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2025-03-14 10:52:21 +01:00
Mike Fährmann
22d46f2462 [batoto] add 'domain' option (#7174)
allow legacy domains by default
2025-03-14 10:31:49 +01:00
Mike Fährmann
cd1ddb0a67 [wikimedia] add 'subcategories' option (#2340)
https://github.com/mikf/gallery-dl/pull/2340#issuecomment-2707177295
2025-03-12 22:05:44 +01:00
Mike Fährmann
3e64ec4f15 [tiktok] implement audio extraction without ytdl 2025-03-12 22:05:24 +01:00
Mike Fährmann
486e307ecd [reddit] add 'selftext' option (#7111) 2025-03-08 09:01:50 +01:00
Mike Fährmann
3b81d89fef [bunkr] add 'endpoint' option (#7097) 2025-03-08 08:50:16 +01:00
Mike Fährmann
2f3265a8ae [tenor] add initial support (#6075) 2025-03-03 19:04:50 +01:00
Mike Fährmann
f232a07faf [danbooru:pool] download posts in pool order (#7091)
- add 'order-posts' option
- add 'num' metadata field for pool position
- update default filenames to order by pool position
2025-03-03 16:46:43 +01:00
Mike Fährmann
4ecc40ce4b [docs] fix 'tiktok-range' default value (#7098) 2025-03-02 14:34:30 +01:00
Mike Fährmann
afde4ad343 [tiktok] add 'avatar' option 2025-02-26 21:09:57 +01:00
Mike Fährmann
5e87aee32d [tiktok] add 'audio' option (#7060) 2025-02-26 21:02:33 +01:00
Mike Fährmann
13c3fa45f7 [docs] add 'tiktok' options (#7060) 2025-02-26 20:45:25 +01:00
Mike Fährmann
203c2e3492 [vipergirls] change default 'domain' to 'viper.click' (#4166)
https://github.com/mikf/gallery-dl/issues/4166#issuecomment-2684014628

and update general 'domain' handling
2025-02-26 10:41:51 +01:00
Mike Fährmann
876169ded5 [furaffinity] use a 1s delay between requests by default (#7054) 2025-02-25 20:12:54 +01:00
Luca Russo
95c446fcd1 [discord] add support (#6836)
* first commit

* add --

* skip video embeds

* fix typo

* removed ambiguity

* add category support

* code tweaks

* more reliable embed extraction

* handle 403 errors (testing done)

* added "parent_id" keyword

* added "parent", "parent_type" keywords

the extractor should be now ready to merge!

* removed unnecessary dict unpacking

* added empty text messages extraction

* added "channel_topic"

* even more metadata extraction

can now extract all embeds images & text, as well as server banners. also code is much better.

* added user avatar and banner

* better pagination

* fix regression

* minor tweaks

* Made requested changes
2025-02-18 18:45:39 +01:00
Mike Fährmann
fd4de02e67 [archive] support PostgreSQL archives for post processors (#6152) 2025-02-17 14:58:14 +01:00
Mike Fährmann
8daf496a22 [archive] add 'archive-table' option (#6152) 2025-02-17 11:41:13 +01:00
Mike Fährmann
841bc9f66f [archive] implement support for PostgreSQL databases (#6152) 2025-02-16 17:56:52 +01:00
Mike Fährmann
35307608f2 [dl:http] add 'sleep-429' option (#6996) 2025-02-15 17:42:03 +01:00
Mike Fährmann
182b544217 [ytdl] support specifying filesystem paths as 'module' (#6991) 2025-02-14 19:58:25 +01:00
Mike Fährmann
51f978e027 [weibo] add 'movies' option (#6988)
disable 'movie' downloads by default
2025-02-13 18:01:20 +01:00
Mike Fährmann
873cbf6b36 [docs] add more details to 'user-agent' and 'browser' docs (#6917) 2025-02-03 20:54:27 +01:00
Mike Fährmann
4874c8e1d1 [artstation] restore 'browser' and 'tls12' defaults
partially revert 954796a466
2025-01-28 11:36:06 +01:00
Mike Fährmann
954796a466 [artstation] prevent CF challenges (#5817, #5658, #5564, #5554) 2025-01-26 16:00:16 +01:00
Mike Fährmann
438c61601b [xfolio] add initial support (#5514, #6351, #6837) 2025-01-18 15:57:56 +01:00
Mike Fährmann
dc7b46be21 [khinsider] add 'covers' option (#6844) 2025-01-18 15:57:56 +01:00
Mike Fährmann
bde99cc6ce [cohost] remove module
cohost.org now redirects to archive.org
2025-01-13 14:38:35 +01:00
Mike Fährmann
91bd3e37f2 [pexels] add support (#2286, #4214, #6769) 2025-01-12 16:50:12 +01:00
Mike Fährmann
1d75c8308c [weebcentral] add support (#6778) 2025-01-10 23:04:51 +01:00
Mike Fährmann
b1ffb62644 [docs] update 'sleep-request' value for 'wallhaven' 2025-01-06 17:24:04 +01:00
Mike Fährmann
2dd2c71c53 [docs] update configuration.rst 2025-01-02 17:54:47 +01:00
Mike Fährmann
998f949db1 [civitai] add 'user-videos' extractor (#6644) 2024-12-26 10:18:54 +01:00
Mike Fährmann
7f6a53c347 [cohost] add 'avatar' and 'background' options (#6656) 2024-12-14 20:16:28 +01:00
Mike Fährmann
94d7df186f [bluesky] default to /posts if reposts/quoted is enabled (#6583) 2024-12-13 22:24:37 +01:00
Mike Fährmann
7091904b20 [common] restore using environment proxies by default (#6553, #6609)
change 'proxy-env' default to 'true'
2024-12-07 17:38:44 +01:00
Mike Fährmann
34e157e166 [zerochan] download webp and gif files, add 'extensions' option (#6576) 2024-12-05 21:25:44 +01:00
Mike Fährmann
a526a3d00d [patreon] add 'format-images' option (#6569) 2024-12-04 21:38:01 +01:00
Luca Russo
e9370b7b8a merge #5626: [facebook] add support (#470, #2612)
* [facebook] add initial support

* renamed extractors & subcategories

* better stability, modularity & naming

* added single photo extractor, warnings & retries

* more metadata + extract author followups

* renamed "album" mentions to "set" for consistency

* cookies are now only used when necessary

also added author followups for singular images

* removed f-strings

* added way to continue extraction from where it left off

also fixed some bugs

* fixed bug wrong subcategory

* added individual video extraction

* extract audio + added ytdl option

* updated setextract regex

* added option to disable start warning

the extractor should be ready :)

* fixed description metadata bug

* removed cookie "safeguard" + fixed for private profiles

I have removed the cookie "safeguard" (not using cookies until they are necessary) as I've come to the conclusion that it does more harm than good. There is no way to detect whether the extractor has skipped private images, that could have been possibly extracted otherwise. Also, doing this provides little to no advantages.

* fixed a few bugs regarding profile parsing

* a few bugfixes

Fixed some metadata attributes from not decoding correctly from non-latin languages, or not showing at all.
Also improved few patterns.

* retrigger checks

* Final cleanups

-Added tests
-Fixed video extractor giving incorrect URLs
-Removed start warning
-Listed supported site correctly

* fixed regex

* trigger checks

* fixed livestream playback extraction + bugfixes

I've chosen to remove the "reactions", "comments" and "views" attributes as I've felt that they require additional maintenance even though nobody would ever actually use them to order files.
I've also removed the "title" and "caption" video attributes for their inconsistency across different videos.
Feel free to share your thoughts.

* fixed regex

* fixed filename fallback

* fixed retrying when a photo url is not found

* fixed end line

* post url fix + better naming

* fix posts

* fixed tests

* added profile.php url

* made most of the requested changes

* flake

* archive: false

* removed unnecessary url extract

* [facebook] update

- more 'Sec-Fetch-…' headers
- simplify 'text.nameext_from_url()' calls
- replace 'sorted(…)[-1]' with 'max(…)'
- fix '_interval_429' usage
- use replacement fields in logging messages

* [facebook] update URL patterns

get rid of '.*' and '.*?'

* added few remaining tests

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2024-11-26 21:49:11 +01:00
Mike Fährmann
cb09273670 [koharu] implement 'tags' option 2024-11-15 23:49:58 +01:00
Mike Fährmann
c82f3db098 [common] add 'proxy-env' option (#6134, #6455)
disable using environment proxies by default
2024-11-15 18:03:56 +01:00
Mike Fährmann
e763efd36c [bilibili] add workarounds for getting rate-limited (#6443)
- set 3-6 second request_interval by default
- retry request after waiting 5 minutes
2024-11-14 23:06:26 +01:00
Mike Fährmann
0b99d9e6b9 [util] add "defaultdict" filters-environment
allows accessing undefined values without raising an exception,
but preserves other errors like TypeError, AttributeError, etc
2024-11-14 22:47:25 +01:00