1226 Commits

Author SHA1 Message Date
Luca Russo
95c1feab1c [discord] add single message support 2025-02-26 22:16:53 +01:00
Mike Fährmann
5e87aee32d [tiktok] add 'audio' option (#7060) 2025-02-26 21:02:33 +01:00
Mike Fährmann
d2cad599f7 [twitter] support 'grok' cards content (#7040) 2025-02-25 20:47:31 +01:00
CasualYouTuber31
daac2c6e04 [tiktok] add support (#3061 #4177 #5646 #6878 #6708)
* Add TikTok photo support

#3061
#4177

* Address linting errors

* Fix more test failures

* Forgot to update category names in tests

* Looking into re issue

* Follow default yt-dlp output template

* Fix format string error on 3.5

* Support downloading videos and audio

Respond to comments
Improve archiving and file naming

* Forgot to update supportedsites.md

* Support user profiles

* Fix indentation

* Prevent matching with more than one TikTok extractor

* Fix TikTok regex

* Support TikTok profile avatars

* Fix supportedsites.md

* TikTok: Ignore no formats error

In my limited experience, this doesn't mean that gallery-dl can't download the photo post (but this could mean that you can't download the audio)

* Fix error reporting message

* TikTok: Support more URL formats

vt.tiktok.com
www.tiktok.com/t/

* TikTok: Only download avatar when extracting user profile

* TikTok: Document profile avatar limitation

* TikTok: Add support for www.tiktokv.com/share links

* Address Share -> Sharepost issue

* TikTok: Export post's creation date in JSON (ISO 8601)

* [tiktok] update

* [tiktok] update 'vmpost' handling

just perform a HEAD request and handle its response

* [tiktok] build URLs from post IDs

instead of reusing unchanged input URLs

* [tiktok] combine 'post' and 'sharepost' extractors

* [tiktok] update default filenames

put 'id' and 'num' first to ensure better file order

* [tiktok] improve ytdl usage

- speed up extraction by passing '"extract_flat": True'
- pass more user options and cookies
- pre-define 'TikTokUser' extractor usage

* [tiktok] Add _COOKIES entry to AUTH_MAP

* [tiktok] Always download user avatars

* [tiktok] Add more documentation to supportedsites.md

* [tiktok] Address review comments

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2025-02-25 20:10:48 +01:00
Mike Fährmann
a9853cd273 merge #6781: [bilibili] add 'user-articles-favorite' extractor (#6725) 2025-02-23 18:19:51 +01:00
mmmpipi
e4cc3419c5 add bilibili User Articles FavList support
- fix whitespace
- fix extractor names
- Add favlist url user check
- apply changes
- add test
- update docs/supportedsites
2025-02-23 18:18:45 +01:00
Mike Fährmann
fe958ed5d9 merge #6768: [boosty] add 'direct-messages' extractor 2025-02-23 18:17:10 +01:00
Dominik Prange
ff5f6fe70f [boosty] added new direct message extractor
- formatting
- fixed linting formatting errors
- fixed E999 SyntaxError: invalid syntax
- fixed class naming
- fixed mandatory extractor.boosty.metadata as true requirement
- update
  - apply changes
  - add test
  - update docs/supportedsites
- improve 'dialog' pagination logic
2025-02-23 18:14:59 +01:00
Mike Fährmann
613f05afa3 fix cmdline arguments not overriding extractor-downloader options 2025-02-22 17:40:27 +01:00
Mike Fährmann
18ed39c1cf implement 'downloader' options per extractor category
by setting options inside 'http' or 'ytdl' inside extractor options
or inside subcategory options

{
    "extractor": {
        "mastodon": {
            "http": {
                "rate": "10k"
            }
        },
        "mastodon.social": {
            "http": {
                "rate": "100k"
            }
        }
    },
    "downloader": {
        "rate": "100m"
    }
}

Sets download speed to
- 100k for mastodon.social URLs
-  10k for mastodon sites in general
- 100m for all other sites
2025-02-22 10:08:59 +01:00
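The layered lookup described in commit 18ed39c1cf can be sketched roughly as follows. This is a minimal illustration, not gallery-dl's actual implementation: the function name `downloader_option`, its parameters, and the `mastodon.example` instance are hypothetical, and it assumes that the most specific extractor entry wins over the base category, which in turn wins over the global 'downloader' section.

```python
import json

# Same structure as the configuration shown in the commit message.
CONFIG = json.loads("""
{
    "extractor": {
        "mastodon": {"http": {"rate": "10k"}},
        "mastodon.social": {"http": {"rate": "100k"}}
    },
    "downloader": {"rate": "100m"}
}
""")


def downloader_option(config, category, basecategory, scheme, key,
                      default=None):
    """Return the most specific value for a downloader option.

    Hypothetical helper: looks in the extractor's own category first,
    then in its base category, then in the global downloader options.
    """
    extractor = config.get("extractor", {})
    for name in (category, basecategory):
        try:
            return extractor[name][scheme][key]
        except (KeyError, TypeError):
            pass  # no override at this level, try the next one

    downloader = config.get("downloader", {})
    scheme_opts = downloader.get(scheme, {})
    if key in scheme_opts:
        return scheme_opts[key]
    return downloader.get(key, default)


if __name__ == "__main__":
    for cat, base in (("mastodon.social", "mastodon"),
                      ("mastodon.example", "mastodon"),
                      ("pixiv", "")):
        print(cat, downloader_option(CONFIG, cat, base, "http", "rate"))
```

Trying the specific category before the base category keeps per-instance overrides possible while still letting a whole family of sites share one default.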
Mike Fährmann
52d4e1a100 [imhentai] inherit from BaseExtractor
combine all imhentai-like sites into one module
2025-02-19 22:14:52 +01:00
Mike Fährmann
7a11d02e7a [reddit] restrict subreddit search results (#7025) 2025-02-19 20:05:48 +01:00
Mike Fährmann
d4c56b08d7 [hentaiera] add support (#3046 #6952 #7020) 2025-02-19 17:42:04 +01:00
Mike Fährmann
4396029d36 [furry34] add support (#1078 #7018) 2025-02-19 16:35:48 +01:00
Mike Fährmann
82493a6672 [hentairox] add support (#7003) 2025-02-18 21:45:30 +01:00
Luca Russo
95c446fcd1 [discord] add support (#6836)
* first commit

* add --

* skip video embeds

* fix typo

* removed ambiguity

* add category support

* code tweaks

* more reliable embed extraction

* handle 403 errors (testing done)

* added "parent_id" keyword

* added "parent", "parent_type" keywords

the extractor should be now ready to merge!

* removed unnecessary dict unpacking

* added empty text messages extraction

* added "channel_topic"

* even more metadata extraction

can now extract all embeds images & text, as well as server banners. also code is much better.

* added user avatar and banner

* better pagination

* fix regression

* minor tweaks

* Made requested changes
2025-02-18 18:45:39 +01:00
Mike Fährmann
7ae09c6b29 [imgur] add support for (hidden) personal posts (#6990)
https://imgur.com/user/me
https://imgur.com/user/me/hidden
2025-02-14 19:28:55 +01:00
Mike Fährmann
cd9fa1ef75 [bunkr] implement fast '--range' support (#6985) 2025-02-14 18:21:32 +01:00
Mike Fährmann
195b52284a [tests] move 'e621:frontend' tests into regular results/e621.py
having both e621.py and E621.py in the same directory messes with
Windows

6e919a3695 (commitcomment-152557303)
2025-02-14 17:44:14 +01:00
Mike Fährmann
51f978e027 [weibo] add 'movies' option (#6988)
disable 'movie' downloads by default
2025-02-13 18:01:20 +01:00
Mike Fährmann
b8b541fded [itaku] support gallery section URLs (#6951) 2025-02-13 14:29:36 +01:00
Mike Fährmann
1cf2870f81 [patreon] extract 'campaign' metadata (#6989) 2025-02-13 14:13:43 +01:00
Mike Fährmann
f1f27eb2ab [vsco] support '/video/' URLs (#4295 #6973)
requires yt-dlp/youtube-dl to handle m3u8 manifests
2025-02-12 19:12:00 +01:00
Mike Fährmann
d1a8142dcf [bunkr] provide fallback URLs for 403 download links (#6732 #6972) 2025-02-12 19:12:00 +01:00
Mike Fährmann
55034d9638 [imhentai] add support (#1660 #3046 #3824 #4338 #5936) 2025-02-10 21:42:07 +01:00
Mike Fährmann
be77465e1b [weebcentral] fix extracting wrong number of chapter pages (#6966)
When downloading multiple chapters at once, all chapters after the first
one would download only as many pages per chapter as the first one had,
due to reusing a cached/shared dict in the wrong way.
2025-02-10 16:00:13 +01:00
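The kind of bug described in commit be77465e1b can be reproduced in a few lines. The sketch below is a generic illustration of reusing a shared dict incorrectly, not the actual weebcentral extractor code; the function names and the 'count' key are hypothetical.

```python
def page_counts_buggy(shared, counts):
    """Reuse one dict for every chapter (illustrates the bug)."""
    results = []
    for count in counts:
        data = shared                     # same object for all chapters
        data.setdefault("count", count)   # only takes effect the first time
        results.append(data["count"])
    return results


def page_counts_fixed(shared, counts):
    """Give every chapter its own copy of the shared metadata."""
    results = []
    for count in counts:
        data = shared.copy()              # per-chapter copy
        data.setdefault("count", count)
        results.append(data["count"])
    return results


if __name__ == "__main__":
    chapters = [10, 20, 30]
    print(page_counts_buggy({"manga": "example"}, chapters))
    print(page_counts_fixed({"manga": "example"}, chapters))
```

In the buggy variant every chapter after the first inherits the first chapter's page count, which matches the symptom described in the commit message.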
Mike Fährmann
587205bf68 [pixiv] prevent exceptions during 'comments' extraction (#6965)
- wrap in try-except block
- do not attempt to fetch comments for 'sanity_level' works
2025-02-10 09:56:09 +01:00
Mike Fährmann
6c2b6d50cc [patreon] support '/profile/creators' URLs 2025-02-09 15:52:54 +01:00
Mike Fährmann
23c4bc8ac5 [b4k] keep support for previous 'arch.b4k.co' domain 2025-02-09 11:11:38 +01:00
NecRaul
dae82f1519 [b4k] update domain to arch.b4k.dev 2025-02-09 01:28:23 +04:00
Mike Fährmann
28385bec7a [bunkr] extract 'id_url' metadata (#6935)
and use it as 'id' alternative instead of 'name' in default archive IDs
2025-02-06 20:40:35 +01:00
Mike Fährmann
b9675ea764 [bunkr] update default archive ID format (#6935)
use 'name' when there is no proper 'id' value available.
2025-02-05 21:54:43 +01:00
Mike Fährmann
5807daa19a [issuu] unescape HTML entities 2025-02-02 18:33:18 +01:00
Mike Fährmann
6c9b20fe45 [philomena] download 'full' URLs (#6922)
'view_url' URLs sometimes result in 404 errors
2025-02-02 18:23:46 +01:00
Mike Fährmann
463e123283 [twibooru] match URLs with 'www' subdomain (#6903) 2025-01-30 19:20:02 +01:00
Mike Fährmann
de81f8e7c7 merge #6891: [vsco] fix 'JSONDecodeError' (#6887) 2025-01-28 14:46:31 +01:00
CasualYT31
a8c4665b5a VSCO: prevPageToken Bugfix
#6887
2025-01-28 11:50:31 +00:00
Mike Fährmann
1b5e0c0e87 [issuu] fix 'user' extractor 2025-01-27 21:56:11 +01:00
Mike Fährmann
d110dfd2da [tests] update extractor results 2025-01-27 17:15:32 +01:00
Mike Fährmann
bf361ec7d3 [urlgalleries] support new URL format
... but the site itself is broken, i.e. image pages are empty.
2025-01-26 17:20:28 +01:00
Mike Fährmann
254ffd3fcd [shimmie2] remove 'tentaclerape.net'
"Site Not Found"
2025-01-26 17:02:07 +01:00
Mike Fährmann
d2164af63d [komikcast] update domain to 'komikcast.la' 2025-01-26 16:54:14 +01:00
Mike Fährmann
804fd048ef [szurubooru] remove 'booru.foalcon.com'
DNS record of foalcon.com no longer exists
2025-01-26 16:42:49 +01:00
Mike Fährmann
b271a874ed [fanleaks] remove module
DNS record of fanleaks.club no longer exists
2025-01-26 16:35:46 +01:00
Mike Fährmann
f5add4048e [nekohouse] fix pagination (#6871)
use distinct names for URL values
2025-01-24 10:18:52 +01:00
Mike Fährmann
4d609e284a merge #6833: [kemonoparty] Support /posts endpoint and Creator Tag Calls 2025-01-21 19:22:00 +01:00
BishopRed
b11434a069 [kemonoparty] Support /posts endpoint and Creator Tag Calls
- Adding support for calling a creator with a tag selected.
    It is using a legacy endpoint but there is no other way currently
    documented to get the users post filtered by a tag.
- Fixing the User Tags feature to be paginated
    offset is not defined in the API but it is supported.
- Fixed the `/posts` endpoint not working:
    1. Added check along with metadata to make sure there is a
       creator/service information as that is a requirement
    2. Fixed the parameter from tags -> tag.
    3. Fixed the _paginate call to exit correctly when there is
       a key required for the data (it was prematurely exiting)
- Adding a type of caching mechanism for the metadata/user information.
    The current logic would work just fine if looking up for a
    singular user, however for the multiple posts via normal
    filtering would cause it to either:
    This builds a local cache during the process so it should
    only make a call for the user info once during the process.
- Updating to meet standards
    Fixes
      1. Reset formatting for unnecessary line changes
      2. Removed Type Hinting
      3. Replaced f-string with "".format
    Updates
      Renamed function creator_posts_tags -> creator_tagged_posts
      for clarity of what it does (get posts tags vs get tagged posts)
- Fixing check for the length of response:
    1. If it is list - just check len
    2. If there is a key - check that the key length is less
       than the batch.
- add test for '?tag=...' user URLs
    plus some code simplifications
2025-01-21 19:20:22 +01:00
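The response-length check from commit b11434a069 (plain list versus a list stored under a key) can be sketched like this. It is an assumption-laden illustration rather than the extractor's real `_paginate` method: `paginate`, `make_fetch`, the 'posts' key, and the batch size of 50 are all hypothetical.

```python
def paginate(fetch, batch=50, key=None):
    """Yield items from an offset-based API until a short page arrives.

    'fetch' is any callable taking an offset and returning either a list
    or, when 'key' is given, a dict holding the list under that key.
    """
    offset = 0
    while True:
        data = fetch(offset)
        items = data[key] if key else data
        yield from items
        # A page shorter than the batch size means there is nothing left.
        if len(items) < batch:
            return
        offset += batch


def make_fetch(total, batch, key=None):
    """Build a fake endpoint serving 'total' integers in pages."""
    posts = list(range(total))

    def fetch(offset):
        page = posts[offset:offset + batch]
        return {key: page} if key else page

    return fetch


if __name__ == "__main__":
    print(len(list(paginate(make_fetch(120, 50), 50))))
    print(len(list(paginate(make_fetch(120, 50, "posts"), 50, "posts"))))
```

Measuring the length of the keyed list instead of the enclosing dict is what prevents the loop from exiting prematurely when the API wraps its results.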
Mike Fährmann
05fa6dd354 [nekohouse] add initial support (#5241, #6738) 2025-01-20 20:15:34 +01:00
Mike Fährmann
f867e690c1 merge #6855: [turboimagehost] add support for galleries 2025-01-19 17:51:48 +01:00
arebokert
556fbb1a44 [turboimagehost] add support for galleries
- added support
- raise error if gallery not found
- fix test
- fix lint issues
- simplify
2025-01-19 17:28:45 +01:00