[tiktok] remove yt-dlp dependency & add support for more post types (#8715)

#7246 #8035 #8466 #8730

* [tiktok] support extracting videos directly without yt-dlp
* [tiktok] support extracting users directly without yt-dlp
* [tiktok] fixing logic, tests, linting errors
* [tiktok] implement tiktok-range support for non-yt-dlp user extractor
* [tiktok] Skip range filter if no ranges are given
* [tiktok] Remove debug code
* [tiktok] only check for faulty device IDs during the first couple of passes
    I think the original yt-dlp solution assumes that if a device ID works once, it will always work.
    Plus, my approach would cause needless retries in cases where hasMorePrevious turns out to be wrong, which the original algorithm accounts for. So let's copy the original algorithm here, too.
* [tiktok] support stories
* [tiktok] you can now extract audio without extracting photos
* [tiktok] add TiktokFollowingExtractor
* [tiktok] update supportedsites to include stories
* [tiktok] Keep tiktok-range option for no content user account test
    It acts as a nice guard against that account suddenly having lots of posts to extract
* [tiktok] TiktokUserExtractor and TiktokFollowingExtractor rewrite
* [tiktok] Fix avatar naming convention to match that of posts
* [tiktok] remove type hints for compatibility with older Python versions
* [tiktok] Improve performance of TiktokFollowingExtractor
    This was largely achieved using the story/batch/item_list endpoint
* [tiktok] Forgot to run flake8
* [tiktok] remove old constant
* [tiktok] Support order-posts config item
* [tiktok] flake8
* [tiktok] Older Python versions don't support match
* [tiktok] always ask for posts in chronological order when in "desc" mode
    We should aim to avoid having pinned posts returned before non-pinned ones
* [tiktok] Add liked posts extraction
* [tiktok] Add reposts extraction
* [tiktok] Add saved posts extraction

* cleanup imports
* remove '# MARK:' comments
* remove & simplify 'except' statements
    KeyboardInterrupt & SystemExit inherit from BaseException (not Exception)
    and therefore don't need special handling
* split 'user' extractor
* move PATTERNs into their respective functions
* use dict comprehensions
* add only-matching test URLs for split user extractors
* update config docs
    rename 'tiktok-user-extractor' to 'ytdl'
* document '"popular"' 'order-posts' value
* inline and remove 'util.chunk()'
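
The note on simplified 'except' statements can be checked with a short Python sketch. It is an illustration only, not code from this commit; 'run_step', 'fail' and 'interrupt' are hypothetical names.

```python
# KeyboardInterrupt and SystemExit derive from BaseException, not Exception,
# so a plain 'except Exception' never swallows them.
assert not issubclass(KeyboardInterrupt, Exception)
assert not issubclass(SystemExit, Exception)
assert issubclass(KeyboardInterrupt, BaseException)
assert issubclass(SystemExit, BaseException)


def run_step(step):
    """Hypothetical helper: run 'step' and report ordinary errors only."""
    try:
        return step()
    except Exception as exc:
        # Only regular errors end up here; Ctrl+C and sys.exit() propagate.
        return "error: {}".format(exc)


def fail():
    raise ValueError("bad value")


def interrupt():
    raise KeyboardInterrupt()
```

Calling 'run_step(fail)' returns an error string, while 'run_step(interrupt)' lets the KeyboardInterrupt reach the caller without any extra 'except' clause.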
CasualYouTuber31
2025-12-30 16:17:57 +00:00
committed by GitHub
parent c8c4575c7f
commit a6c845bdc8
6 changed files with 1299 additions and 112 deletions

@@ -5845,6 +5845,16 @@ Description
Download video covers.
extractor.tiktok.photos
-----------------------
Type
``bool``
Default
``true``
Description
Download photos.
extractor.tiktok.videos
-----------------------
Type
@@ -5855,18 +5865,52 @@ Description
Download videos using |ytdl|.
extractor.tiktok.user.avatar
----------------------------
extractor.tiktok.tiktok-range
-----------------------------
Type
``string``
Default
``""``
Example
``"1-20"``
Description
Range or playlist indices of ``tiktok`` posts to extract.
When using `ytdl`, see
`ytdl/playlist_items <https://github.com/yt-dlp/yt-dlp/blob/3042afb5fe342d3a00de76704cd7de611acc350e/yt_dlp/YoutubeDL.py#L289>`__
for details.
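
To illustrate what a value such as ``"1-20"`` selects, here is a minimal Python sketch. ``parse_range`` and ``in_range`` are hypothetical helpers, not gallery-dl's or yt-dlp's implementation. The sketch assumes 1-based, inclusive, comma-separated ranges only; ytdl's ``playlist_items`` accepts a richer syntax.

```python
def parse_range(spec):
    """Hypothetical helper: turn "1-3,7" into [(1, 3), (7, 7)]."""
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # an empty spec yields no ranges at all
        first, sep, last = part.partition("-")
        start = int(first)
        stop = int(last) if sep else start
        ranges.append((start, stop))
    return ranges


def in_range(index, ranges):
    """Return True if 'index' is selected; no ranges means no filtering."""
    if not ranges:
        return True
    return any(start <= index <= stop for start, stop in ranges)
```

With the default ``""`` the parsed list is empty, so every post passes the filter, which mirrors the "skip range filter if no ranges are given" behavior named in the commit message.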
extractor.tiktok.posts.order-posts
----------------------------------
Type
``string``
Default
``"desc"``
Description
Controls the order in which
posts are processed.
``"asc"`` | ``"reverse"``
Ascending order (oldest first)
``"desc"``
Descending order (newest first)
``"popular"``
*Popular* order
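
The accepted values can be summarized in a small Python sketch. ``normalize_order`` is a hypothetical helper and not part of gallery-dl; it only shows that ``"reverse"`` is an alias for ``"asc"``. Rejecting unknown values with an error is an assumption.

```python
def normalize_order(value):
    """Hypothetical helper: map an 'order-posts' value to a canonical name."""
    aliases = {
        "asc": "asc",
        "reverse": "asc",      # documented alias for ascending order
        "desc": "desc",
        "popular": "popular",
    }
    try:
        return aliases[value]
    except KeyError:
        # Assumption: unknown values are rejected rather than ignored.
        raise ValueError("unsupported order-posts value: {!r}".format(value))
```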
extractor.tiktok.posts.ytdl
---------------------------
Type
``bool``
Default
``true``
``false``
Description
Download user avatars.
Extract user posts with |ytdl|
extractor.tiktok.user.module
----------------------------
extractor.tiktok.posts.module
-----------------------------
Type
|Module|_
Default
@@ -5878,20 +5922,25 @@ Description
See `extractor.ytdl.module`_.
extractor.tiktok.user.tiktok-range
----------------------------------
extractor.tiktok.user.include
-----------------------------
Type
``string``
* ``string``
* ``list`` of ``strings``
Default
``""``
Example
``"1-20"``
``["avatar", "posts"]``
Description
Range or playlist indices of ``tiktok`` user posts to extract.
See
`ytdl/playlist_items <https://github.com/yt-dlp/yt-dlp/blob/3042afb5fe342d3a00de76704cd7de611acc350e/yt_dlp/YoutubeDL.py#L289>`__
for details.
A (comma-separated) list of subcategories to include
when processing a user profile.
Supported Values
* ``avatar``
* ``posts``
* ``reposts``
* ``stories``
* ``likes``
* ``saved``
Note
It is possible to use ``"all"`` instead of listing all values separately.
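
As a rough illustration of how such a value could be interpreted, consider the Python sketch below. ``resolve_include`` is a hypothetical helper, not gallery-dl's actual code; returning the result in the documented order and silently dropping unknown names are both assumptions.

```python
# Subcategory names as listed under "Supported Values".
SUBCATEGORIES = ("avatar", "posts", "reposts", "stories", "likes", "saved")


def resolve_include(value):
    """Hypothetical helper: expand an 'include' value into subcategories."""
    if isinstance(value, str):
        # Accept the comma-separated string form, e.g. "avatar,posts".
        value = [name.strip() for name in value.split(",") if name.strip()]
    if "all" in value:
        return list(SUBCATEGORIES)
    # Assumption: keep the documented order and ignore unknown names.
    return [name for name in SUBCATEGORIES if name in value]
```

Both ``"avatar,posts"`` and ``["avatar", "posts"]`` resolve to the same two subcategories in this sketch, and ``"all"`` expands to all six.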
extractor.tumblr.avatar

@@ -819,13 +819,18 @@
"tiktok":
{
"audio" : true,
"videos": true,
"covers": false,
"photos": true,
"videos": true,
"tiktok-range": "",
"posts": {
"order-posts": "desc",
"ytdl" : false,
"module": null
},
"user": {
"avatar": true,
"module": null,
"tiktok-range": ""
"include": ["avatar", "posts"]
}
},
"tsumino":

@@ -1090,7 +1090,7 @@ Consider all listed sites to potentially be NSFW.
<tr id="tiktok" title="tiktok">
<td>TikTok</td>
<td>https://www.tiktok.com/</td>
<td>Posts, User Profiles, VM Posts</td>
<td>Avatars, Followed Users (Stories Only), Likes, Posts, User Posts, Reposts, Saved Posts, Stories, User Profiles, VM Posts</td>
<td><a href="https://github.com/mikf/gallery-dl#cookies">Cookies</a></td>
</tr>
<tr id="tmohentai" title="tmohentai">