1260 Commits

Author SHA1 Message Date
Mike Fährmann
d746e025a0 [zerochan] parse JSON-LD data (#7178) 2025-03-17 19:59:44 +01:00
Mike Fährmann
6532cf9075 [deviantart] match '/gallery/recommended-for-you' URLs (#7168) 2025-03-17 09:49:11 +01:00
Mike Fährmann
8bdd543935 [deviantart:stash] fix legacy sta.sh links (#7181)
follow redirects instead of rewriting them to deviantart.com/stash/…
2025-03-16 19:38:56 +01:00
Mike Fährmann
bf927cbd4f [config] fix using same key multiple times with 'apply' (#7127) 2025-03-16 19:37:04 +01:00
Mike Fährmann
dbe8820b9e [arcalive] add 'gifs' option (#5657) 2025-03-14 22:34:45 +01:00
Mike Fährmann
5fa5a45f03 [tests] improve error message of multi type/value tests
improvement of 42070240ae
2025-03-14 18:59:57 +01:00
Mike Fährmann
31e57bafab [arcalive] add 'user' extractor (#5657) 2025-03-14 18:58:10 +01:00
hdk5
d900e868e4 [arcalive] add support (#5657 #7100)
* [arca.live] Add extractor skeleton

* [arcalive] update names and formatting

* [arcalive] implement initial file extraction code

* [arcalive] improve '_extract_media()' performance

compile and cache regex on demand

* [arcalive] improve image extraction

- extract 'data-originalurl' URLs if available
- replace URL query strings with 'type=orig'
- ignore emoticons by default

* [arcalive] update defaults

- include 'title' in filenames
- use 0.5-1.5s delay between requests

* [arcalive] use ext from 'data-orig' if available

* [arcalive] update docs/supportedsites

* [arcalive] add tests

* [arcalive] update 'board' extractor pattern

so it doesn't also match 'post' URLs

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2025-03-14 10:52:21 +01:00
Mike Fährmann
22d46f2462 [batoto] add 'domain' option (#7174)
allow legacy domains by default
2025-03-14 10:31:49 +01:00
Mike Fährmann
f395a3ec79 [sankaku] fix potential infinite loop (#7155)
https://github.com/mikf/gallery-dl/issues/7155#issuecomment-2723019761
2025-03-14 08:35:54 +01:00
Mike Fährmann
cd1ddb0a67 [wikimedia] add 'subcategories' option (#2340)
https://github.com/mikf/gallery-dl/pull/2340#issuecomment-2707177295
2025-03-12 22:05:44 +01:00
Mike Fährmann
898a09bf7f [sankaku] fix 'tags' metadata (#7155)
rename 'tag_names' to 'tags'
2025-03-12 17:07:40 +01:00
Mike Fährmann
d40f8a82be [tests] add support for skipping an extractor result test 2025-03-12 16:41:46 +01:00
Mike Fährmann
e1bdcd97e1 [furaffinity] extract 'scraps' metadata (#7015)
boolean value indicating whether a post is part of a user's Scraps
folder or the main gallery
2025-03-12 16:29:16 +01:00
Mike Fährmann
a12ff281e7 merge #7159: [furaffinity] add 'folder' extractor (#1817) 2025-03-12 14:08:32 +01:00
Deer-Spangle
859f1e7d04 [furaffinity] Adding a FuraffinityFolderExtractor, which extracts a single folder
- Ensure FuraffinityGalleryExtractor doesn't detect folder links
- Fix example URL for folder extractor
- Reordering classes a bit
- Another tweak of the regex
- One more go at the regex..
- cleanup
2025-03-12 14:00:50 +01:00
Mike Fährmann
94bbbbb16b [sankaku] fix categorized tags for posts with >100 tags (#7155) 2025-03-11 21:01:46 +01:00
Mike Fährmann
1254c4e3d9 [sankaku] update API URLs (#7154 #7155)
and fix errors due to other changes
2025-03-11 18:45:45 +01:00
Mike Fährmann
518865c7de [civitai] fix/improve query parameter handling (#7138) 2025-03-10 20:18:57 +01:00
Mike Fährmann
ce01835995 [facebook] improve 'date' extraction (#7151)
use 'created_time' as alternative when 'publish_time' isn't available
2025-03-10 17:35:32 +01:00
Mike Fährmann
04464b6cf0 [text] add second argument to 'parse_query_list()' (#7138)
return only values whose name is in 'as_list' as a list
2025-03-10 09:36:50 +01:00
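
The behavior described in 04464b6cf0 can be illustrated with a minimal sketch. This is not gallery-dl's own code: the function name `parse_query_with_lists` is hypothetical, and keeping only the first value for names outside 'as_list' is an assumption.

```python
from urllib.parse import unquote_plus


def parse_query_with_lists(qs, as_list=()):
    """Parse a query string into a dict.

    Names listed in 'as_list' map to a list of all their values;
    every other name maps to its first value only (an assumption).
    """
    result = {}
    for pair in qs.split("&"):
        name, sep, value = pair.partition("=")
        if not sep:
            continue  # skip fragments without '='
        name = unquote_plus(name)
        value = unquote_plus(value)
        if name in as_list:
            # collect every occurrence of this name
            result.setdefault(name, []).append(value)
        elif name not in result:
            # keep the first occurrence, ignore later duplicates
            result[name] = value
    return result
```

Calling `parse_query_with_lists("a=1&tags=x&tags=y&a=2", ("tags",))` keeps 'a' as a plain string and gathers both 'tags' values into one list.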
Mike Fährmann
d6281b5685 [tenor] relax '/view/' URL pattern (#6075) 2025-03-08 15:54:46 +01:00
Mike Fährmann
52aa5bad4f [tenor] rename 'content_description' to just 'description' 2025-03-08 10:05:49 +01:00
Mike Fährmann
486e307ecd [reddit] add 'selftext' option (#7111) 2025-03-08 09:01:50 +01:00
Mike Fährmann
18b9ffe8c3 [redgifs:search] support '/search?query=...' URLs (#7118) 2025-03-07 12:35:35 +01:00
Mike Fährmann
8582af3483 [tenor] support '/official/' user URLs (#6075) 2025-03-07 11:34:33 +01:00
Mike Fährmann
639ddc95e7 [tenor] support URLs with language codes (#6075) 2025-03-07 11:19:18 +01:00
Mike Fährmann
984116ada7 [furaffinity] improve 'artist_url' extraction (#7115 #7123) 2025-03-06 14:15:11 +01:00
Mike Fährmann
f5073605f6 [tenor] add 'user' extractor (#6075) 2025-03-04 21:47:16 +01:00
Mike Fährmann
198593bf46 [vsco] fix extracting videos from '/gallery' results (#7113) 2025-03-04 16:10:48 +01:00
Mike Fährmann
3a5adbf644 [vsco] fix 'video' extractor (#7113)
fixes regression introduced in 6420210b0f
2025-03-04 09:43:05 +01:00
Mike Fährmann
2f3265a8ae [tenor] add initial support (#6075) 2025-03-03 19:04:50 +01:00
Mike Fährmann
f232a07faf [danbooru:pool] download posts in pool order (#7091)
- add 'order-posts' option
- add 'num' metadata field for pool position
- update default filenames to order by pool position
2025-03-03 16:46:43 +01:00
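
A rough sketch of what ordering by pool position in f232a07faf might look like. The helper name `order_by_pool` and the dict shapes are hypothetical, and skipping posts that are missing from the result set is an assumption rather than the extractor's confirmed behavior.

```python
def order_by_pool(posts, pool_post_ids):
    """Yield posts in pool order, each tagged with its 1-based pool position.

    posts:         iterable of post dicts with an 'id' key, in any order
    pool_post_ids: list of post IDs in the order the pool defines
    """
    by_id = {post["id"]: post for post in posts}
    for num, post_id in enumerate(pool_post_ids, 1):
        post = by_id.get(post_id)
        if post is None:
            continue  # listed in the pool but not returned (assumption: skip)
        # copy the post and add 'num' so filenames can sort by pool position
        yield {**post, "num": num}
```

Because 'num' reflects the position in the pool rather than a running counter, a skipped post leaves a gap in the numbering in this sketch.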
Mike Fährmann
db19990a82 [text] allow calling 'extract_iter' with invalid arguments 2025-03-02 10:44:06 +01:00
Luca Russo
95c1feab1c [discord] add single message support 2025-02-26 22:16:53 +01:00
Mike Fährmann
5e87aee32d [tiktok] add 'audio' option (#7060) 2025-02-26 21:02:33 +01:00
Mike Fährmann
d2cad599f7 [twitter] support 'grok' cards content (#7040) 2025-02-25 20:47:31 +01:00
CasualYouTuber31
daac2c6e04 [tiktok] add support (#3061 #4177 #5646 #6878 #6708)
* Add TikTok photo support

#3061
#4177

* Address linting errors

* Fix more test failures

* Forgot to update category names in tests

* Looking into re issue

* Follow default yt-dlp output template

* Fix format string error on 3.5

* Support downloading videos and audio

Respond to comments
Improve archiving and file naming

* Forgot to update supportedsites.md

* Support user profiles

* Fix indentation

* Prevent matching with more than one TikTok extractor

* Fix TikTok regex

* Support TikTok profile avatars

* Fix supportedsites.md

* TikTok: Ignore no formats error

In my limited experience, this doesn't mean that gallery-dl can't download the photo post (but this could mean that you can't download the audio)

* Fix error reporting message

* TikTok: Support more URL formats

vt.tiktok.com
www.tiktok.com/t/

* TikTok: Only download avatar when extracting user profile

* TikTok: Document profile avatar limitation

* TikTok: Add support for www.tiktokv.com/share links

* Address Share -> Sharepost issue

* TikTok: Export post's creation date in JSON (ISO 8601)

* [tiktok] update

* [tiktok] update 'vmpost' handling

just perform a HEAD request and handle its response

* [tiktok] build URLs from post IDs

instead of reusing unchanged input URLs

* [tiktok] combine 'post' and 'sharepost' extractors

* [tiktok] update default filenames

put 'id' and 'num' first to ensure better file order

* [tiktok] improve ytdl usage

- speed up extraction by passing '"extract_flat": True'
- pass more user options and cookies
- pre-define 'TikTokUser' extractor usage

* [tiktok] Add _COOKIES entry to AUTH_MAP

* [tiktok] Always download user avatars

* [tiktok] Add more documentation to supportedsites.md

* [tiktok] Address review comments

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2025-02-25 20:10:48 +01:00
Mike Fährmann
a9853cd273 merge #6781: [bilibili] add 'user-articles-favorite' extractor (#6725) 2025-02-23 18:19:51 +01:00
mmmpipi
e4cc3419c5 add bilibili User Articles FavList support
- fix whitespace
- fix extractor names
- Add favlist url user check
- apply changes
- add test
- update docs/supportedsites
2025-02-23 18:18:45 +01:00
Mike Fährmann
fe958ed5d9 merge #6768: [boosty] add 'direct-messages' extractor 2025-02-23 18:17:10 +01:00
Dominik Prange
ff5f6fe70f [boosty] added new direct message extractor
- formatting
- fixed linting formatting errors
- fixed E999 SyntaxError: invalid syntax
- fixed class naming
- fixed mandatory extractor.boosty.metadata as true requirement
- update
  - apply changes
  - add test
  - update docs/supportedsites
- improve 'dialog' pagination logic
2025-02-23 18:14:59 +01:00
Mike Fährmann
613f05afa3 fix cmdline arguments not overriding extractor-downloader options 2025-02-22 17:40:27 +01:00
Mike Fährmann
18ed39c1cf implement 'downloader' options per extractor category
by setting options inside 'http' or 'ytdl' inside extractor options
or inside subcategory options

{
    "extractor": {
        "mastodon": {
            "http": {
                "rate": "10k"
            }
        },
        "mastodon.social": {
            "http": {
                "rate": "100k"
            }
        }
    },
    "downloader": {
        "rate": "100m"
    }
}

Sets download speed to
- 100k for mastodon.social URLs
-  10k for mastodon sites in general
- 100m for all other sites
2025-02-22 10:08:59 +01:00
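
The lookup order implied by the config in 18ed39c1cf can be sketched as follows. This is an illustration only, following the JSON example as written: the function name `resolve_downloader_option` and the idea of passing an ordered list of section names are hypothetical and not taken from gallery-dl's config module.

```python
CONFIG = {
    "extractor": {
        "mastodon": {"http": {"rate": "10k"}},
        "mastodon.social": {"http": {"rate": "100k"}},
    },
    "downloader": {"rate": "100m"},
}


def resolve_downloader_option(config, sections, scheme, key, default=None):
    """Return the most specific value for a downloader option.

    sections: extractor section names, most specific first
              (e.g. ["mastodon.social", "mastodon"])
    scheme:   downloader name used as a key inside extractor options
              ("http" or "ytdl")
    """
    extractor_conf = config.get("extractor", {})
    for section in sections:
        options = extractor_conf.get(section, {}).get(scheme, {})
        if key in options:
            return options[key]
    # nothing set at extractor level: fall back to global downloader options
    return config.get("downloader", {}).get(key, default)
```

With this sketch, `["mastodon.social", "mastodon"]` resolves 'rate' from the "mastodon.social" block, `["mastodon"]` alone resolves it from the "mastodon" block, and any section without an 'http' entry falls through to the global "downloader" value.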
Mike Fährmann
52d4e1a100 [imhentai] inherit from BaseExtractor
combine all imhentai-like sites into one module
2025-02-19 22:14:52 +01:00
Mike Fährmann
7a11d02e7a [reddit] restrict subreddit search results (#7025) 2025-02-19 20:05:48 +01:00
Mike Fährmann
d4c56b08d7 [hentaiera] add support (#3046 #6952 #7020) 2025-02-19 17:42:04 +01:00
Mike Fährmann
4396029d36 [furry34] add support (#1078 #7018) 2025-02-19 16:35:48 +01:00
Mike Fährmann
82493a6672 [hentairox] add support (#7003) 2025-02-18 21:45:30 +01:00
Luca Russo
95c446fcd1 [discord] add support (#6836)
* first commit

* add --

* skip video embeds

* fix typo

* removed ambiguity

* add category support

* code tweaks

* more reliable embed extraction

* handle 403 errors (testing done)

* added "parent_id" keyword

* added "parent", "parent_type" keywords

the extractor should be now ready to merge!

* removed unnecessary dict unpacking

* added empty text messages extraction

* added "channel_topic"

* even more metadata extraction

can now extract all embeds images & text, as well as server banners. also code is much better.

* added user avatar and banner

* better pagination

* fix regression

* minor tweaks

* Made requested changes
2025-02-18 18:45:39 +01:00