1911 Commits

Author SHA1 Message Date
Mike Fährmann
98eb857794 [pp:exec] use non-UNC path replacements (#8879)
provide '{_path_unc}' and '{_directory_unc}' replacement fields
2026-02-14 12:09:54 +01:00
Mike Fährmann
d99c8c1320 [manganelo] fix 'manga' extractor (#9059) 2026-02-14 09:17:48 +01:00
Mike Fährmann
f1da162d72 [common] include duration in 'wait()' output 2026-02-13 20:44:46 +01:00
Mike Fährmann
34e402a01d [koofr] improve subdirectory handling - re-add 'num' & 'count' 2026-02-12 21:37:18 +01:00
Mike Fährmann
eb4e44401b [util] implement 'build_duration_func_ex()' 2026-02-12 17:28:46 +01:00
Mike Fährmann
136b7d40b5 [pp:ugoira] fix processing '.gif' frames
use 'concat' demuxer to combine frames for mkvmerge
https://github.com/danbooru/danbooru/pull/6103
https://github.com/danbooru/danbooru/pull/6241
2026-02-11 16:42:22 +01:00
Mike Fährmann
04905ff7a2 [weebdex] fix 'chapter-reverse' (#9041)
fixes regression introduced in 56168fbc87
2026-02-11 09:15:56 +01:00
Mike Fährmann
448ec12b8b [tests/extractor] test 'extractor.find()' results 2026-02-10 20:54:54 +01:00
Mike Fährmann
102f8da294 [reddit] fix '/external-preview' embed downloads (#9037)
don't strip URL parameters
2026-02-10 20:45:51 +01:00
Mike Fährmann
ace8c50278 [imagefap] handle '/galleries?folderid=0' URLs (#9034) 2026-02-10 10:56:30 +01:00
Mike Fährmann
ce8d61df66 [imagefap] don't return anything for empty profiles (#9034) 2026-02-10 10:28:49 +01:00
Mike Fährmann
640d5f1621 [fikfap] improve URL patterns
use '[^/?#]+' for names
2026-02-10 07:56:39 +01:00
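As a rough illustration of the '[^/?#]+' character class named in the commit above: it matches one URL path segment and stops at the next '/', '?' or '#', so names containing dashes, dots or underscores are accepted. The pattern, host and function name below are hypothetical and are not taken from the fikfap extractor.

```python
import re

# Hypothetical pattern for illustration only; the extractor's real
# pattern is not shown in this log.
USER_PATTERN = re.compile(r"https?://(?:www\.)?example\.org/user/([^/?#]+)")


def extract_name(url):
    """Return the name segment of a matching URL, or None otherwise."""
    match = USER_PATTERN.match(url)
    return match.group(1) if match else None
```

The negated class avoids listing every allowed character in a name; anything that is not a path, query or fragment delimiter is kept.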
Mike Fährmann
52a5e39fc6 [reddit:user] fix user lookup when using sub view (#8228 #9032)
e.g. USER/submitted or USER/comments
fixes regression introduced in c16892a150
2026-02-09 18:57:00 +01:00
Mike Fährmann
b769dc76f4 [pornpics] fix 'search' extractor pagination (#9022)
make stop condition more lenient
2026-02-09 18:57:00 +01:00
wise-immersion
d77078d853 [fikfap] support main page post URLs (#9026)
* Update fikfap.py to allow for extracting a single post from the main page
    Current post extractor only works on links to posts
    on user pages but not on direct links to posts
* include 'singlepost' logic into existing 'post' extractor

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2026-02-09 18:54:33 +01:00
Mike Fährmann
d3c4328078 [imagefap:user] support multiple pages (#9016) 2026-02-08 11:49:11 +01:00
wise-immersion
a8636e75a1 [fikfap] add 'hashtag' extractor (#9018)
Added functionality to extract by hashtag and save to directory named after the hashtag.
2026-02-08 11:42:48 +01:00
wise-immersion
5d9b607158 [fikfap] allow for dash in usernames (#9019) 2026-02-08 11:07:00 +01:00
Mike Fährmann
2d64e76223 [job] implement 'follow' option (#8752)
Follow and process URLs found in the given format string result.
2026-02-07 21:47:17 +01:00
Mike Fährmann
c978fe18d4 [text] add 'extract_urls()' helper 2026-02-07 21:47:17 +01:00
Mike Fährmann
7a98a93a8e [common] only call 'skip()' & 'finalize()' when defined 2026-02-07 21:47:17 +01:00
Mike Fährmann
22b12a1798 [tests:job] test 'parent-metadata' / '_extractor' handling 2026-02-05 22:37:30 +01:00
Mike Fährmann
f046529f28 [tests:job] add tests for DataJob 'resolve' 2026-02-05 22:37:30 +01:00
Mike Fährmann
d3adfd603b [artstation] fix & update 'challenge' extractor 2026-02-05 22:37:10 +01:00
Mike Fährmann
04442e262e [artstation] download '/8k/' images (#9003) 2026-02-05 17:32:55 +01:00
Mike Fährmann
fdc59efdda [pixiv] fix errors when using metadata options for avatar/background (#9002)
2026-02-05 12:07:42 +01:00
Mike Fährmann
42407afb6d [xenforo] implement '"order-posts": "reaction"' (#8997) 2026-02-04 21:57:30 +01:00
Mike Fährmann
9958678af1 [simpcity] extract 'reddit' media embeds (#8994) 2026-02-04 11:50:07 +01:00
Mike Fährmann
9379397eec [simpcity] extract 'tiktok' media embeds (#8994) 2026-02-04 11:20:52 +01:00
Mike Fährmann
f0f9575406 [job] fix 'AttributeError' when enabling 'init' for non-DownloadJob
fixes bug in 56dcd00391
2026-02-03 19:00:45 +01:00
Mike Fährmann
0be3383110 [formatter] add 'q' & 'Q' conversions - URL-en/decode values 2026-02-03 17:35:05 +01:00
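A minimal sketch of what URL-encoding and URL-decoding a value can look like with Python's standard library. This is an assumption about the general technique only: the log does not state which of 'q' and 'Q' encodes and which decodes, nor which characters the formatter treats as safe, and the helper names here are hypothetical.

```python
from urllib.parse import quote, unquote


def url_encode(value):
    # Percent-encode every reserved character, including '/'
    # (safe="" is an assumption; the formatter may choose differently).
    return quote(str(value), safe="")


def url_decode(value):
    # Reverse percent-encoding.
    return unquote(str(value))
```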
Mike Fährmann
17e1d25784 [scrolller] add 'user' extractor (#8961) 2026-02-02 09:09:50 +01:00
Mike Fährmann
44e18f9b2f [tsumino] remove module
" Tsumino - The End
  We're shutting Tsumino down. "
2026-02-01 22:15:06 +01:00
Mike Fährmann
1286839037 [socialmediagirlsforum] add tests 2026-01-31 09:55:45 +01:00
bassberry
fd5f5611f6 [tiktok] extract subtitles and all cover types (#8805)
* Make sure that `img_id`, `audio_id` and `cover_id` fields are always available.
    The values are set '' where they are not applicable.
    Having `img_id` is necessary for the default `archive_fmt`, the other fields are handled for consistency.
* Allow downloading more than one cover.
    The previous behavior is kept as-is, but setting the "covers" option to "all" now grabs all available covers.
* Add support for downloading subtitles
    Allows filtering subtitles by source type (ASR, MT) and language.
* Ensure archive uniqueness for covers and subtitles.
* Update the URL test pattern to include the `image` extension.
    Although Tiktok may serve the covers with jpeg content, the file ending can be `.image`.
    The test before 0c14b164 failed because the asserted URL did not match all cover types, but the now used pattern needs the mentioned file ending.
* Add support for "creator_caption" subtitles in "LC" format.
    These subtitles have the keys "Format" set to "creator_caption" and "Source" to "LC".
* Add "LC" (Local Captions) as a subtitle source type in the documentation
* Code deduplication and renaming subtitle metadata
    Changed the item type from singular `subtitle` to `subtitles`.
    Removed the wrong descriptor `cover` from the subtitles fallback title.
* Refactor subtitle filtering
    The filter is now prepared in `_init` to prevent parsing the same config parameter for every item.
    The `_extract_subtitles` function will still extract if either filter (source or language) matches.
* Generate a `file_id` for subtitles
    Subtitles have multiple fields that determine the unique file, so these are simply concatenated.
    This is similar to the cover types, only with more variations.
* Added tests for subtitles
* fix docs entries
* fix '"covers": "all"'
* simplify some code
* Fix fallback title for subtitles
    Added the missing "f" to the f-string and added "subtitle" to the title.
    The resulting title will look like "TikTok video subtitle #1234567"
2026-01-30 21:01:06 +01:00
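The last item of the commit above (the missing "f" on the f-string) can be sketched as follows. The function name and the 'video_id' parameter are hypothetical; only the resulting title format comes from the commit message.

```python
def subtitle_title(video_id):
    # With the 'f' prefix the placeholder is interpolated; without it,
    # the literal text "{video_id}" would end up in the title.
    return f"TikTok video subtitle #{video_id}"


def broken_subtitle_title(video_id):
    # Same string without the 'f' prefix, shown for contrast.
    return "TikTok video subtitle #{video_id}"
```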
Mike Fährmann
3445c51ca4 [job] add 'output.jsonl' option (#8953) 2026-01-30 09:36:28 +01:00
Mike Fährmann
532ab7112e [discord] add 'server-search' extractor
requested on Discord

https://discord.com/channels/SERVER_ID/search?from=USER_ID
2026-01-30 07:58:14 +01:00
Mike Fährmann
56168fbc87 [weebdex] add 'lang' option, support query params (#8957)
for example '?order=asc&group=j0fsj3oem3&tlang=en'
2026-01-29 17:01:02 +01:00
Mike Fährmann
a3f164aa50 [weebdex] make metadata extraction non-fatal no2 (#8954)
9a102039fc
2026-01-28 19:48:38 +01:00
Mike Fährmann
feef91bf09 [exhentai] implement Multi-Page Viewer support (#2616 #5268) 2026-01-28 19:37:40 +01:00
Mike Fährmann
d9917ec630 [xenforo] improve 'attachment' extraction (#8947) 2026-01-28 11:57:17 +01:00
Mike Fährmann
aa8610c11c [imhentai] prevent exceptions for galleries without image data (#8951) 2026-01-28 10:40:22 +01:00
SubmarineScurvy
ef8f2869e7 [listal] add 'image' & 'people' extractors (#1589 #8921)
* listal extractor
* add listal to init
* fix flake8 & formatting & extractor names/subcategories

* remove 're' import
* remove 'datetime' import
* update & simplify extractors
* update supportedsites
* add tests

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2026-01-27 18:26:41 +01:00
Mike Fährmann
b67e3c15ff [xenforo] support 'titsintops.com' (#8945) 2026-01-27 10:31:26 +01:00
Mike Fährmann
f6ce8c8579 [mangataro] fix 'manga' extractor (#8930) 2026-01-27 10:03:33 +01:00
Mike Fährmann
9a102039fc [weebdex] make metadata extraction non-fatal (#8939) 2026-01-26 16:44:29 +01:00
Mike Fährmann
7784aed74e [kemono] prevent 'revisions' API requests when possible
posts from '/v1/{service}/user/{creator_id}/post/{post_id}' already
include their revisions and don't need an additional API request
2026-01-26 10:00:32 +01:00
Mike Fährmann
7ac9ad1cbf [kemono] fix possible 'AttributeError' for revisions (#8929)
some revisions have string values for 'file' and 'attachments'
instead of the regular dicts
2026-01-26 10:00:32 +01:00
Mike Fährmann
93bf4ccc18 merge #8928: [mangafreak] add support 2026-01-25 19:52:34 +01:00
Mike Fährmann
4e71e2f7e7 [mangafreak] update & fix
- fix manga and title extraction
- fix 'chapter_minor'
- extend test results
2026-01-25 19:49:56 +01:00