[tiktok] remove yt-dlp dependency & add support for more post types (#8715)
#7246 #8035 #8466 #8730

* [tiktok] support extracting videos directly without yt-dlp
* [tiktok] support extracting users directly without yt-dlp
* [tiktok] fixing logic, tests, linting errors
* [tiktok] implement tiktok-range support for non-yt-dlp user extractor
* [tiktok] Skip range filter if no ranges are given
* [tiktok] Remove debug code
* [tiktok] only check for faulty device IDs during the first couple of passes

  I think the original yt-dlp solution assumes that if a device ID works once, it will always work. Plus, my approach would cause needless retries in certain cases if hasMorePrevious does end up being wrong like the original algorithm accounts for. So let's copy the original algorithm here, too.
* [tiktok] support stories
* [tiktok] you can now extract audio without extracting photos
* [tiktok] add TiktokFollowingExtractor
* [tiktok] update supportedsites to include stories
* [tiktok] Keep tiktok-range option for no content user account test

  It acts as a nice guard against that account suddenly having lots of posts to extract
* [tiktok] TiktokUserExtractor and TiktokFollowingExtractor rewrite
* [tiktok] Fix avatar naming convention to match that of posts
* [tiktok] remove type hints for compatibility with older Python versions
* [tiktok] Improve performance of TiktokFollowingExtractor

  This was largely achieved using the story/batch/item_list endpoint
* [tiktok] Forgot to run flake8
* [tiktok] remove old constant
* [tiktok] Support order-posts config item
* [tiktok] flake8
* [tiktok] Older Python versions don't support match
* [tiktok] always ask for posts in chronological order when in "desc" mode

  We should aim to avoid having pinned posts returned before non-pinned ones
* [tiktok] Add liked posts extraction
* [tiktok] Add reposts extraction
* [tiktok] Add saved posts extraction
* cleanup imports
* remove '# MARK:' comments
* remove & simplify 'except' statements

  KeyboardInterrupt & SystemExit inherit from BaseException (not Exception) and therefore don't need special handling
* split 'user' extractor
* move PATTERNs into their respective functions
* use dict comprehensions
* add only-matching test URLs for split user extractors
* update config docs

  rename 'tiktok-user-extractor' to 'ytdl'
* document '"popular"' 'order-posts' value
* inline and remove 'util.chunk()'
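The note about 'except' statements in the message above rests on Python's exception hierarchy. The following is a minimal, standalone sketch of that behaviour; `run`, `fail` and `interrupt` are hypothetical names used only for illustration and are not taken from the extractor code.

```python
def run(task):
    """Call task(), handling ordinary errors only."""
    try:
        return task()
    except Exception as exc:
        # Only subclasses of Exception are caught here.
        # KeyboardInterrupt and SystemExit derive from BaseException
        # directly, so they pass through without a dedicated re-raise.
        return "handled: " + type(exc).__name__


def fail():
    raise ValueError("bad value")


def interrupt():
    raise KeyboardInterrupt


print(issubclass(KeyboardInterrupt, Exception))
print(issubclass(SystemExit, BaseException))
print(run(fail))

try:
    run(interrupt)
except KeyboardInterrupt:
    print("KeyboardInterrupt propagated past 'except Exception'")
```

Because the interrupt is never swallowed by `except Exception`, an extra `except (KeyboardInterrupt, SystemExit): raise` clause in front of it adds nothing.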
@@ -5845,6 +5845,16 @@ Description
 Download video covers.


+extractor.tiktok.photos
+-----------------------
+Type
+``bool``
+Default
+``true``
+Description
+Download photos.
+
+
 extractor.tiktok.videos
 -----------------------
 Type
@@ -5855,18 +5865,52 @@ Description
 Download videos using |ytdl|.


-extractor.tiktok.user.avatar
-----------------------------
+extractor.tiktok.tiktok-range
+-----------------------------
+Type
+``string``
+Default
+``""``
+Example
+``"1-20"``
+Description
+Range or playlist indices of ``tiktok`` posts to extract.
+
+When using `ytdl`, see
+`ytdl/playlist_items <https://github.com/yt-dlp/yt-dlp/blob/3042afb5fe342d3a00de76704cd7de611acc350e/yt_dlp/YoutubeDL.py#L289>`__
+for details.
+
+
+extractor.tiktok.posts.order-posts
+----------------------------------
+Type
+``string``
+Default
+``"desc"``
+Description
+Controls the order in which
+posts are processed.
+
+``"asc"`` | ``"reverse"``
+Ascending order (oldest first)
+``"desc"``
+Descending order (newest first)
+``"popular"``
+*Popular* order
+
+
+extractor.tiktok.posts.ytdl
+---------------------------
 Type
 ``bool``
 Default
-``true``
+``false``
 Description
-Download user avatars.
+Extract user posts with |ytdl|


-extractor.tiktok.user.module
-----------------------------
+extractor.tiktok.posts.module
+-----------------------------
 Type
 |Module|_
 Default
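The `tiktok-range` option documented in this hunk takes a string such as `"1-20"`, and the commit message states that the range filter is skipped when no ranges are given. Below is a rough sketch of how such a string could be interpreted. It is not the extractor's own parser: `parse_ranges` and `select` are hypothetical helpers, and only single indices and closed `first-last` ranges are assumed to be valid input.

```python
def parse_ranges(spec):
    """Turn "1-20" or "1,3,5-7" into a list of (start, stop) pairs.

    Hypothetical helper; open-ended ranges like "5-" are not handled.
    """
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        first, sep, last = part.partition("-")
        start = int(first)
        stop = int(last) if sep else start
        ranges.append((start, stop))
    return ranges


def select(posts, spec):
    """Keep the posts whose 1-based index falls inside any range."""
    ranges = parse_ranges(spec)
    if not ranges:
        # An empty option value means "no filtering at all".
        return list(posts)
    return [
        post
        for index, post in enumerate(posts, 1)
        if any(start <= index <= stop for start, stop in ranges)
    ]


posts = ["p1", "p2", "p3", "p4", "p5", "p6"]
print(select(posts, ""))
print(select(posts, "1-2"))
print(select(posts, "1,3,5-6"))
```

Indices are treated as 1-based here to match the `"1-20"` example value; that choice is an assumption as well.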
@@ -5878,20 +5922,25 @@ Description
 See `extractor.ytdl.module`_.


-extractor.tiktok.user.tiktok-range
-----------------------------------
+extractor.tiktok.user.include
+-----------------------------
 Type
-``string``
+* ``string``
+* ``list`` of ``strings``
 Default
-``""``
-Example
-``"1-20"``
+``["avatar", "posts"]``
 Description
-Range or playlist indices of ``tiktok`` user posts to extract.
-
-See
-`ytdl/playlist_items <https://github.com/yt-dlp/yt-dlp/blob/3042afb5fe342d3a00de76704cd7de611acc350e/yt_dlp/YoutubeDL.py#L289>`__
-for details.
+A (comma-separated) list of subcategories to include
+when processing a user profile.
+Supported Values
+* ``avatar``
+* ``posts``
+* ``reposts``
+* ``stories``
+* ``likes``
+* ``saved``
+Note
+It is possible to use ``"all"`` instead of listing all values separately.


 extractor.tumblr.avatar
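The `include` option described in this hunk accepts either a comma-separated string or a list, plus the shorthand `"all"`. A minimal sketch of normalising such a value follows. `normalize_include` is a hypothetical helper, not the project's implementation, and silently dropping unknown names is an assumption made here for brevity.

```python
# Supported values and the default, as listed in the documentation hunk.
SUBCATEGORIES = ("avatar", "posts", "reposts", "stories", "likes", "saved")
DEFAULT_INCLUDE = ("avatar", "posts")


def normalize_include(value):
    """Return the subcategories to process as a list of names."""
    if not value:
        return list(DEFAULT_INCLUDE)
    if isinstance(value, str):
        # "likes, saved" -> ["likes", "saved"]
        value = [item.strip() for item in value.split(",")]
    if "all" in value:
        return list(SUBCATEGORIES)
    # Assumption: names outside the supported set are ignored.
    return [name for name in value if name in SUBCATEGORIES]


print(normalize_include("likes, saved"))
print(normalize_include(["avatar", "stories"]))
print(normalize_include("all"))
```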
@@ -819,13 +819,18 @@
 "tiktok":
 {
 "audio" : true,
-"videos": true,
 "covers": false,
 "photos": true,
+"videos": true,
+"tiktok-range": "",
+
+"posts": {
+"order-posts": "desc",
+"ytdl" : false,
+"module": null
+},
 "user": {
-"avatar": true,
-"module": null,
-"tiktok-range": ""
+"include": ["avatar", "posts"]
 }
 },
 "tsumino":
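The `order-posts` setting in the default config above accepts `"asc"` or `"reverse"`, `"desc"`, and `"popular"`. The commit message indicates the extractor asks the API for a particular ordering. The sketch below swaps in a plain local sort purely to illustrate what each value means. The field names `create_time`, `play_count` and `pinned` are hypothetical, and ranking "popular" by play count is an assumption.

```python
def order_posts(posts, order="desc"):
    """Return posts sorted according to an order-posts style value."""
    if order in ("asc", "reverse"):
        # Oldest first.
        return sorted(posts, key=lambda post: post["create_time"])
    if order == "desc":
        # Newest first; the pinned flag is deliberately ignored so that
        # pinned posts do not jump ahead of newer non-pinned ones.
        return sorted(posts, key=lambda post: post["create_time"], reverse=True)
    if order == "popular":
        # Assumption: popularity is approximated by play count.
        return sorted(posts, key=lambda post: post["play_count"], reverse=True)
    raise ValueError("unsupported order-posts value: " + repr(order))


sample = [
    {"id": "a", "create_time": 30, "play_count": 5, "pinned": False},
    {"id": "b", "create_time": 10, "play_count": 90, "pinned": True},
    {"id": "c", "create_time": 20, "play_count": 40, "pinned": False},
]

for mode in ("asc", "desc", "popular"):
    print(mode, [post["id"] for post in order_posts(sample, mode)])
```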
@@ -1090,7 +1090,7 @@ Consider all listed sites to potentially be NSFW.
 <tr id="tiktok" title="tiktok">
 <td>TikTok</td>
 <td>https://www.tiktok.com/</td>
-<td>Posts, User Profiles, VM Posts</td>
+<td>Avatars, Followed Users (Stories Only), Likes, Posts, User Posts, Reposts, Saved Posts, Stories, User Profiles, VM Posts</td>
 <td><a href="https://github.com/mikf/gallery-dl#cookies">Cookies</a></td>
 </tr>
 <tr id="tmohentai" title="tmohentai">
File diff suppressed because it is too large
@@ -234,6 +234,7 @@ SUBCATEGORY_MAP = {
 "media" : "Media Files",
 "popular": "Popular Images",
 "recent" : "Recent Images",
+"saved" : "Saved Posts",
 "search" : "Search Results",
 "status" : "Images from Statuses",
 "tag" : "Tag Searches",
@@ -333,7 +334,6 @@ SUBCATEGORY_MAP = {
 },
 "instagram": {
 "posts": "",
-"saved": "Saved Posts",
 "tagged": "Tagged Posts",
 "stories-tray": "Stories Home Tray",
 },
@@ -422,7 +422,9 @@ SUBCATEGORY_MAP = {
 "asset": "Individual Assets",
 },
 "tiktok": {
+"posts": "User Posts",
 "vmpost": "VM Posts",
+"following": "Followed Users (Stories Only)",
 },
 "tumblr": {
 "day": "Days",
@@ -8,6 +8,9 @@ from gallery_dl.extractor import tiktok

 PATTERN = r"https://p1[69]-[^/?#.]+\.tiktokcdn[^/?#.]*\.com/[^/?#]+/\w+~.*\.jpe?g"
 PATTERN_WITH_AUDIO = r"(?:" + PATTERN + r"|https://v\d+m?\.tiktokcdn[^/?#.]*\.com/[^?#]+\?[^/?#]+)"
+VIDEO_PATTERN = r"https://v1[69]-webapp-prime.tiktok.com/video/tos/[^?#]+\?[^/?#]+"
+OLD_VIDEO_PATTERN = r"https://www.tiktok.com/aweme/v1/play/\?[^/?#]+"
+COMBINED_VIDEO_PATTERN = r"(?:" + VIDEO_PATTERN + r")|(?:" + OLD_VIDEO_PATTERN + r")"
 USER_PATTERN = r"(https://www.tiktok.com/@([\w_.-]+)/video/(\d+)|" + PATTERN + r")"

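To see what the new video patterns accept, the sketch below copies `VIDEO_PATTERN`, `OLD_VIDEO_PATTERN` and `COMBINED_VIDEO_PATTERN` from the hunk above and checks them against a few URLs. The sample URLs are invented for illustration and do not point at real media, and full-string matching via `re.fullmatch` is an assumption; the test runner may apply `#pattern` differently.

```python
import re

# Copied verbatim from the test module hunk above.
VIDEO_PATTERN = r"https://v1[69]-webapp-prime.tiktok.com/video/tos/[^?#]+\?[^/?#]+"
OLD_VIDEO_PATTERN = r"https://www.tiktok.com/aweme/v1/play/\?[^/?#]+"
COMBINED_VIDEO_PATTERN = r"(?:" + VIDEO_PATTERN + r")|(?:" + OLD_VIDEO_PATTERN + r")"


def is_video_url(url):
    """True if the whole URL matches either video URL form."""
    return re.fullmatch(COMBINED_VIDEO_PATTERN, url) is not None


# Hypothetical sample URLs, made up for this example.
samples = [
    "https://v16-webapp-prime.tiktok.com/video/tos/useast2a/example/?a=1&b=2",
    "https://www.tiktok.com/aweme/v1/play/?video_id=123",
    "https://www.tiktok.com/@example/video/123",
]

for url in samples:
    print(is_video_url(url), url)
```

Both alternatives require a non-empty query string, so a post page URL such as the third sample is rejected.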
@@ -40,7 +43,7 @@ __tests__ = (
 },

 {
-"#url" : "https://www.tiktok.com/@d4vinefem/photo/7449575367024626974",
+"#url" : "https://www.tiktok.com/@hullcity/photo/7557376330036153622",
 "#comment" : "/photo/ link: single photo",
 "#category" : ("", "tiktok", "post"),
 "#class" : tiktok.TiktokPostExtractor,
@@ -49,7 +52,7 @@ __tests__ = (
 },

 {
-"#url" : "https://www.tiktok.com/@d4vinefem/video/7449575367024626974",
+"#url" : "https://www.tiktok.com/@hullcity/video/7557376330036153622",
 "#comment" : "/video/ link: single photo",
 "#category" : ("", "tiktok", "post"),
 "#class" : tiktok.TiktokPostExtractor,
@@ -58,7 +61,7 @@ __tests__ = (
 },

 {
-"#url" : "https://www.tiktokv.com/share/video/7449575367024626974",
+"#url" : "https://www.tiktokv.com/share/video/7557376330036153622",
 "#comment" : "www.tiktokv.com link: single photo",
 "#category" : ("", "tiktok", "post"),
 "#class" : tiktok.TiktokPostExtractor,
@@ -67,7 +70,7 @@ __tests__ = (
 },

 {
-"#url" : "https://www.tiktok.com/@.mcfc.central/photo/7449701420934122785",
+"#url" : "https://www.tiktok.com/@hullcity/photo/7553302113757990166",
 "#comment" : "/photo/ link: few photos",
 "#category" : ("", "tiktok", "post"),
 "#class" : tiktok.TiktokPostExtractor,
@@ -76,7 +79,7 @@ __tests__ = (
 },

 {
-"#url" : "https://www.tiktok.com/@.mcfc.central/video/7449701420934122785",
+"#url" : "https://www.tiktok.com/@hullcity/video/7553302113757990166",
 "#comment" : "/video/ link: few photos",
 "#category" : ("", "tiktok", "post"),
 "#class" : tiktok.TiktokPostExtractor,
@@ -85,7 +88,7 @@ __tests__ = (
 },

 {
-"#url" : "https://www.tiktokv.com/share/video/7449701420934122785",
+"#url" : "https://www.tiktokv.com/share/video/7553302113757990166",
 "#comment" : "www.tiktokv.com link: few photos",
 "#category" : ("", "tiktok", "post"),
 "#class" : tiktok.TiktokPostExtractor,
@@ -94,12 +97,12 @@ __tests__ = (
 },

 {
-"#url" : "https://www.tiktok.com/@ughuwhguweghw/video/1",
-"#comment" : "deleted post",
-"#category" : ("", "tiktok", "post"),
-"#class" : tiktok.TiktokPostExtractor,
-"#options" : {"videos": False, "audio": False},
-"count" : 0,
+"#url" : "https://www.tiktok.com/@ughuwhguweghw/video/1",
+"#comment" : "deleted post",
+"#category" : ("", "tiktok", "post"),
+"#class" : tiktok.TiktokPostExtractor,
+"#options" : {"videos": False, "audio": False},
+"#count" : 0,
 },

 {
@@ -107,10 +110,19 @@ __tests__ = (
 "#comment" : "Video post",
 "#category" : ("", "tiktok", "post"),
 "#class" : tiktok.TiktokPostExtractor,
-"#results" : "ytdl:https://www.tiktok.com/@memezar/video/7449708266168274208",
+"#pattern" : COMBINED_VIDEO_PATTERN,
 "#options" : {"videos": True, "audio": True},
 },

+{
+"#url" : "https://www.tiktok.com/@memezar/video/7449708266168274208",
+"#comment" : "Video post (via yt-dlp)",
+"#category" : ("", "tiktok", "post"),
+"#class" : tiktok.TiktokPostExtractor,
+"#results" : "ytdl:https://www.tiktok.com/@memezar/video/7449708266168274208",
+"#options" : {"videos": "ytdl", "audio": True},
+},
+
 {
 "#url" : "https://www.tiktok.com/@memezar/video/7449708266168274208",
 "#comment" : "video post cover image",
@@ -126,7 +138,7 @@ __tests__ = (
 "#comment" : "Video post as a /photo/ link",
 "#category" : ("", "tiktok", "post"),
 "#class" : tiktok.TiktokPostExtractor,
-"#results" : "ytdl:https://www.tiktok.com/@memezar/video/7449708266168274208",
+"#pattern" : COMBINED_VIDEO_PATTERN,
 "#options" : {"videos": True, "audio": True},
 },

@@ -155,7 +167,7 @@ __tests__ = (
 "#comment" : "Video post as a share link",
 "#category" : ("", "tiktok", "post"),
 "#class" : tiktok.TiktokPostExtractor,
-"#results" : "ytdl:https://www.tiktok.com/@/video/7449708266168274208",
+"#pattern" : COMBINED_VIDEO_PATTERN,
 "#options" : {"videos": True},
 },

@@ -196,6 +208,7 @@ __tests__ = (
 "#comment" : "no 'author' (#8189)",
 "#class" : tiktok.TiktokPostExtractor,
 "#results" : "ytdl:https://www.tiktok.com/@veronicaperasso_1/video/7212008840433274118",
+"#options" : {"videos": "ytdl"},
 },

 {
@@ -260,9 +273,50 @@ __tests__ = (
 "#category" : ("", "tiktok", "user"),
 "#class" : tiktok.TiktokUserExtractor,
 "#pattern" : USER_PATTERN,
+"#count" : 11, # 10 posts + 1 avatar
 "#options" : {"videos": True, "audio": True, "tiktok-range": "1-10"},
 },

+# order-posts currently has no effect if logged-in cookies aren't used.
+
+# {
+# "#url" : "https://www.tiktok.com/@chillezy",
+# "#comment" : "User profile ascending order",
+# "#category" : ("", "tiktok", "user"),
+# "#class" : tiktok.TiktokUserExtractor,
+# "#results" : "https://www.tiktok.com/@chillezy/video/7112145009356344622",
+# "#options" : {"videos": True, "audio": True, "avatar": False, "tiktok-range": "1", "order-posts": "asc"},
+# },
+
+# {
+# "#url" : "https://www.tiktok.com/@chillezy",
+# "#comment" : "User profile popular order",
+# "#category" : ("", "tiktok", "user"),
+# "#class" : tiktok.TiktokUserExtractor,
+# "#results" : "https://www.tiktok.com/@chillezy/video/7240568259186019630",
+# "#options" : {"videos": True, "audio": True, "avatar": False, "tiktok-range": "1", "order-posts": "popular"},
+# },
+
+{
+"#url" : "https://www.tiktok.com/@chillezy",
+"#comment" : "User profile via yt-dlp",
+"#category" : ("", "tiktok", "user"),
+"#class" : tiktok.TiktokUserExtractor,
+"#pattern" : USER_PATTERN,
+"#count" : 11, # 10 posts + 1 avatar
+"#options" : {"videos": True, "audio": True, "tiktok-range": "1-10", "tiktok-user-extractor": "ytdl"},
+},
+
+{
+"#url" : "https://www.tiktok.com/@chillezy",
+"#comment" : "User profile without avatar",
+"#category" : ("", "tiktok", "user"),
+"#class" : tiktok.TiktokUserExtractor,
+"#pattern" : USER_PATTERN,
+"#count" : 10, # 10 posts
+"#options" : {"videos": True, "audio": True, "avatar": False, "tiktok-range": "1-10"},
+},
+
 {
 "#url" : "https://www.tiktok.com/@joeysc14/",
 "#comment" : "Public user profile with no content",
@@ -270,7 +324,37 @@ __tests__ = (
 "#class" : tiktok.TiktokUserExtractor,
 "#pattern" : PATTERN,
 "#options" : {"videos": False, "tiktok-range": "1"},
-"#count" : 1,
+"#count" : 1, # 1 avatar
 },

+{
+"#url" : "https://www.tiktok.com/@chillezy/avatar",
+"#class" : tiktok.TiktokAvatarExtractor,
+},
+
+{
+"#url" : "https://www.tiktok.com/@chillezy/posts",
+"#class" : tiktok.TiktokPostsExtractor,
+},
+
+{
+"#url" : "https://www.tiktok.com/@chillezy/reposts",
+"#class" : tiktok.TiktokRepostsExtractor,
+},
+
+{
+"#url" : "https://www.tiktok.com/@chillezy/stories",
+"#class" : tiktok.TiktokStoriesExtractor,
+},
+
+{
+"#url" : "https://www.tiktok.com/@chillezy/likes",
+"#class" : tiktok.TiktokLikesExtractor,
+},
+
+{
+"#url" : "https://www.tiktok.com/@chillezy/saved",
+"#class" : tiktok.TiktokSavedExtractor,
+},
+
 )
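The only-matching tests at the end of this hunk pair profile sub-URLs (`/avatar`, `/posts`, `/reposts`, `/stories`, `/likes`, `/saved`) with the split extractor classes. The extractor module's own URL patterns are not shown on this page, so the sketch below uses a hypothetical regular expression, `PROFILE_URL`, to illustrate how such URLs could be routed. Only the class names and the username character class come from the tests above.

```python
import re

# Hypothetical pattern; the real patterns live in the extractor module,
# whose diff is not shown here.
PROFILE_URL = re.compile(
    r"https://www\.tiktok\.com/@([\w_.-]+)"
    r"(?:/(avatar|posts|reposts|stories|likes|saved))?/?$"
)

# Class names as they appear in the tests; None stands for a bare profile.
EXTRACTORS = {
    None: "TiktokUserExtractor",
    "avatar": "TiktokAvatarExtractor",
    "posts": "TiktokPostsExtractor",
    "reposts": "TiktokRepostsExtractor",
    "stories": "TiktokStoriesExtractor",
    "likes": "TiktokLikesExtractor",
    "saved": "TiktokSavedExtractor",
}


def classify(url):
    """Return (username, extractor class name), or None if no match."""
    match = PROFILE_URL.match(url)
    if match is None:
        return None
    return match.group(1), EXTRACTORS[match.group(2)]


print(classify("https://www.tiktok.com/@chillezy/likes"))
print(classify("https://www.tiktok.com/@joeysc14/"))
print(classify("https://www.tiktok.com/@chillezy/video/123"))
```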