Mike Fährmann
44e18f9b2f
[tsumino] remove module
...
"Tsumino - The End
We're shutting Tsumino down."
2026-02-01 22:15:06 +01:00
Mike Fährmann
1286839037
[socialmediagirlsforum] add tests
2026-01-31 09:55:45 +01:00
bassberry
fd5f5611f6
[tiktok] extract subtitles and all cover types (#8805)
...
* Make sure that `img_id`, `audio_id` and `cover_id` fields are always available.
The values are set to '' where they are not applicable.
Having `img_id` is necessary for the default `archive_fmt`, the other fields are handled for consistency.
* Allow downloading more than one cover.
The previous behavior is kept as-is, but setting the "covers" option to "all" now grabs all available covers.
* Add support for downloading subtitles
Allows filtering subtitles by source type (ASR, MT) and language.
* Ensure archive uniqueness for covers and subtitles.
* Update the URL test pattern to include the `image` extension.
Although TikTok may serve the covers with JPEG content, the file ending can be `.image`.
The test before 0c14b164 failed because the asserted URL did not match all cover types, and the pattern now in use requires the `.image` file ending.
* Add support for "creator_caption" subtitles in "LC" format.
These subtitles have the keys "Format" set to "creator_caption" and "Source" to "LC".
* Add "LC" (Local Captions) as a subtitle source type in the documentation
* Code deduplication and renaming subtitle metadata
Changed the item type from singular `subtitle` to `subtitles`.
Removed the wrong descriptor `cover` from the subtitles fallback title.
* Refactor subtitle filtering
The filter is now prepared in `_init` to prevent parsing the same config parameter for every item.
The `_extract_subtitles` function will still extract if either filter (source or language) matches.
* Generate a `file_id` for subtitles
Subtitles have multiple fields that determine the unique file, so these are simply concatenated.
This is similar to the cover types, only with more variations.
* Added tests for subtitles
* fix docs entries
* fix '"covers": "all"'
* simplify some code
* Fix fallback title for subtitles
Added the missing "f" to the f-string and added "subtitle" to the title.
The resulting title will look like "TikTok video subtitle #1234567"
2026-01-30 21:01:06 +01:00
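The subtitle handling in fd5f5611f6 (a filter prepared once, a match on either source or language, and a `file_id` built by concatenating fields) can be pictured with the sketch below. This is not the extractor's code: `make_subtitle_filter`, `subtitle_file_id`, the "Language" key, and the sample values are hypothetical; only the "Source" and "Format" keys are named in the commit message.

```python
def make_subtitle_filter(sources, languages):
    """Build the filter once instead of parsing config for every item."""
    sources = frozenset(src.upper() for src in sources)
    languages = frozenset(lang.lower() for lang in languages)

    def matches(subtitle):
        # A subtitle passes if EITHER the source or the language matches.
        return (subtitle.get("Source", "").upper() in sources or
                subtitle.get("Language", "").lower() in languages)
    return matches


def subtitle_file_id(video_id, subtitle):
    """Concatenate the fields that together identify one subtitle file."""
    parts = (video_id,
             subtitle.get("Source", ""),
             subtitle.get("Language", ""),   # hypothetical key
             subtitle.get("Format", ""))
    return "_".join(str(part) for part in parts)


wanted = make_subtitle_filter(sources=["ASR"], languages=["en"])
sub = {"Source": "MT", "Language": "en", "Format": "webvtt"}
print(wanted(sub), subtitle_file_id("1234567", sub))
```

Building the closure once keeps the per-item work down to two set lookups.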
Mike Fährmann
3445c51ca4
[job] add 'output.jsonl' option (#8953)
2026-01-30 09:36:28 +01:00
Mike Fährmann
532ab7112e
[discord] add 'server-search' extractor
...
requested on Discord
https://discord.com/channels/SERVER_ID/search?from=USER_ID
2026-01-30 07:58:14 +01:00
Mike Fährmann
56168fbc87
[weebdex] add 'lang' option, support query params (#8957)
...
for example '?order=asc&group=j0fsj3oem3&tlang=en'
2026-01-29 17:01:02 +01:00
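The query string quoted in 56168fbc87 can be split into individual parameters with the standard library. The sketch below only shows the parsing step; the host and path in `url` are placeholders, and how the extractor uses the resulting values is not covered by the commit message.

```python
from urllib.parse import parse_qsl, urlsplit

# Placeholder host and path; only the query string comes from the commit.
url = "https://example.org/title/ID?order=asc&group=j0fsj3oem3&tlang=en"

# parse_qsl() returns (key, value) pairs in the order they appear.
params = dict(parse_qsl(urlsplit(url).query))
print(params)
# {'order': 'asc', 'group': 'j0fsj3oem3', 'tlang': 'en'}
```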
Mike Fährmann
a3f164aa50
[weebdex] make metadata extraction non-fatal no2 (#8954)
...
9a102039fc
2026-01-28 19:48:38 +01:00
Mike Fährmann
feef91bf09
[exhentai] implement Multi-Page Viewer support (#2616 #5268)
2026-01-28 19:37:40 +01:00
Mike Fährmann
d9917ec630
[xenforo] improve 'attachment' extraction (#8947)
2026-01-28 11:57:17 +01:00
Mike Fährmann
aa8610c11c
[imhentai] prevent exceptions for galleries without image data (#8951)
2026-01-28 10:40:22 +01:00
SubmarineScurvy
ef8f2869e7
[listal] add 'image' & 'people' extractors (#1589 #8921)
...
* listal extractor
* add listal to init
* fix flake8 & formatting & extractor names/subcategories
* remove 're' import
* remove 'datetime' import
* update & simplify extractors
* update supportedsites
* add tests
---------
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2026-01-27 18:26:41 +01:00
Mike Fährmann
b67e3c15ff
[xenforo] support 'titsintops.com' (#8945)
2026-01-27 10:31:26 +01:00
Mike Fährmann
f6ce8c8579
[mangataro] fix 'manga' extractor (#8930)
2026-01-27 10:03:33 +01:00
Mike Fährmann
9a102039fc
[weebdex] make metadata extraction non-fatal (#8939)
2026-01-26 16:44:29 +01:00
Mike Fährmann
7784aed74e
[kemono] prevent 'revisions' API requests when possible
...
posts from '/v1/{service}/user/{creator_id}/post/{post_id}' already
include their revisions and don't need an additional API request
2026-01-26 10:00:32 +01:00
Mike Fährmann
7ac9ad1cbf
[kemono] fix possible 'AttributeError' for revisions (#8929)
...
some revisions have string values for 'file' and 'attachments'
instead of the regular dicts
2026-01-26 10:00:32 +01:00
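The failure mode behind 7ac9ad1cbf is easy to reproduce: calling a dict method such as `.get()` on a string raises `AttributeError`. The sketch below shows one possible guard, assuming non-dict values should simply be skipped. That assumption, the helper name `iter_file_dicts`, and the sample data are all hypothetical; the actual fix may treat such values differently.

```python
def iter_file_dicts(revision):
    """Yield only the dict entries of a revision's 'file' and 'attachments'."""
    file = revision.get("file")
    if isinstance(file, dict):
        yield file
    attachments = revision.get("attachments")
    if isinstance(attachments, list):
        for attachment in attachments:
            if isinstance(attachment, dict):
                yield attachment


regular = {"file": {"path": "/a.png"}, "attachments": [{"path": "/b.png"}]}
odd = {"file": "{}", "attachments": "[]"}  # made-up string values

print([f["path"] for f in iter_file_dicts(regular)])  # ['/a.png', '/b.png']
print(list(iter_file_dicts(odd)))                     # []
```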
Mike Fährmann
93bf4ccc18
merge #8928: [mangafreak] add support
2026-01-25 19:52:34 +01:00
Mike Fährmann
4e71e2f7e7
[mangafreak] update & fix
...
- fix manga and title extraction
- fix 'chapter_minor'
- extend test results
2026-01-25 19:49:56 +01:00
Mike Fährmann
7026611f31
merge #8925: [mangatown] add support
2026-01-25 18:35:39 +01:00
Mike Fährmann
bf3ee5e9f7
[mangatown] fix & update
...
- use BASE_PATTERN
- fix manga, manga_id, chapter_id extraction
- fix & extend 'manga' metadata results
- extend test results
2026-01-25 18:32:17 +01:00
Duy Nguyen
8b0e8c656d
feat(mangafreak): add support for MangaFreak
...
Add chapter and manga extractors for ww2.mangafreak.me with support
for bonus chapters (e.g., 167e suffix).
2026-01-25 15:56:52 +01:00
Mike Fährmann
adca123646
[weibo:user] add 'subalbums' include (#8792)
2026-01-25 11:16:41 +01:00
Duy Nguyen
4d8f61ad76
[mangatown] add support
2026-01-25 00:02:36 +01:00
Mike Fährmann
291fb78995
[pp:mtime] fix '_mtime_meta' for invalid values (#8918)
...
fixes regression introduced in d57dc48dcd
also prevents previous _mtime_meta entries from affecting new files
2026-01-24 18:58:24 +01:00
Duy Nguyen
0b0bcb1640
feat(kaliscan): add extractor for kaliscan.me
...
Support chapter and manga extractors with metadata extraction.
2026-01-23 20:22:50 +01:00
Mike Fährmann
f869085476
[weebdex] add 'data-saver' option (#8914)
2026-01-23 09:22:41 +01:00
Mike Fährmann
72322deaee
[mangafire] generate 'vrf' tokens (#8400 #8906)
...
26fc9e9649
2026-01-23 09:16:58 +01:00
Mike Fährmann
fb0d639f68
[xenforo] add 'media-album' extractor (#8902)
2026-01-22 09:10:31 +01:00
Mike Fährmann
18fabb9605
[batoto] remove module (#8908)
...
"Bato.to has shut down."
There are mirror sites, but they are unscrapeable
due to heavily obfuscated HTML and JS
2026-01-21 20:33:08 +01:00
Mike Fährmann
9ca45aae73
[nitter] re-add instances
2026-01-21 20:32:58 +01:00
Mike Fährmann
774d885a86
[kemono:discord] extract 'archives' metadata (#8898)
...
4a74bc6e30
2026-01-20 17:58:36 +01:00
Mike Fährmann
efcbde7dcd
[kemono:discord] support server URLs with trailing '/'
2026-01-20 17:29:00 +01:00
Mike Fährmann
2b42766956
[turbo] add '/v/' URL test
2026-01-20 17:16:46 +01:00
brerk
e00c717b15
[turbo] update 'saint' extractors (#8893 #8896)
...
* Implement turbo.py & remove domain pattern from saints.py
* Remove leftover commented pattern from saints.py
* Make turbo.py comply with flake8
* Add album support
* Improved metadata extraction for albums and single files & created turbo.py tests using saints.py test
* Align turbo.py extractor with flake8 rules
* Fix #class name on turbo.py tests
* Fix #category test
* Fix #category test x2
* Fix #category tests
* Fix #category tests
* Fix TurboMediaExtractor self.groups unpacking
* update basic module formatting
* replace 'saint' with 'turbo' in modules list
* remove saint extractors and tests
* update & simplify 'media' extractor
* update & simplify 'album' extractor
* update tests
* update supportedsites
* update 'category-map' & 'config-map'
---------
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2026-01-19 22:20:13 +01:00
Mike Fährmann
cc5bfa6eb0
[xenforo] support 'celebforum.to' (#8902)
2026-01-19 16:04:33 +01:00
Mike Fährmann
09635352d0
[imagebam] raise 'NotFoundError' for deleted galleries
2026-01-19 11:19:35 +01:00
Mike Fährmann
8a481a5126
[tests/results] allow using exception names for '#exception'
2026-01-19 11:19:12 +01:00
Mike Fährmann
8c9ca609ea
[imagebam] raise 'NotFoundError' for deleted images (#8890)
2026-01-18 21:27:27 +01:00
Mike Fährmann
c23beee57c
[util] use functions for predicates
...
more lightweight and faster than classes
2026-01-18 20:32:36 +01:00
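The change in c23beee57c, swapping predicate classes for functions, can be illustrated with the two equivalent forms below. The names `RangeCheck` and `range_check`, the `(url, kwdict)` signature, and the "num" key are hypothetical and not taken from the `util` module; the sketch only contrasts a callable class instance with a closure.

```python
class RangeCheck:
    """Class-based predicate: state lives on an instance."""

    def __init__(self, lower, upper):
        self.lower = lower
        self.upper = upper

    def __call__(self, url, kwdict):
        return self.lower <= kwdict["num"] <= self.upper


def range_check(lower, upper):
    """Function-based predicate: state lives in the closure."""
    def predicate(url, kwdict):
        return lower <= kwdict["num"] <= upper
    return predicate


by_class = RangeCheck(2, 4)
by_function = range_check(2, 4)
item = {"num": 3}
print(by_class("https://example.org/3", item),
      by_function("https://example.org/3", item))
```

Both forms are called the same way, so call sites need no changes when one replaces the other.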
AngeredBacterium
d64ca94361
[saint] support alternate turbovid domain (#8888)
...
* Add alternate turbovid domain
* simplify regex pattern
* add tests
2026-01-16 09:37:52 +01:00
Stephon Parker
43387c535d
[thefap] add support (#8821 #8822)
...
* adding site support for thefap.com
* fixing typo in url tld
* improve & simplify 'model' extractor
* update 'post' extractor
* update docs/supportedsites
* add tests
---------
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2026-01-14 21:11:56 +01:00
Mike Fährmann
812482e53e
[xenforo] extract 'author_slug' metadata (#8785)
2026-01-13 21:57:04 +01:00
Mike Fährmann
c4040bb45b
[rule34xyz] support URLs with 'www' subdomain (#8875)
2026-01-13 12:06:45 +01:00
Mike Fährmann
f1fd83e87e
[xenforo] fix/improve 'bb*Wrapper' extraction (#8868)
2026-01-12 20:48:14 +01:00
Mike Fährmann
200ad21cbd
[scripts/generate_result] fix 'small()' for empty objects
2026-01-11 22:55:19 +01:00
Mike Fährmann
d8128fbd4c
[booth:item] support URLs with language codes
2026-01-11 22:17:36 +01:00
Mike Fährmann
d7c1c30c62
[booth] add 'category' extractor (#8867)
2026-01-11 22:17:01 +01:00
Mike Fährmann
a79a945494
[formatter] overload '.' operator
...
implement generic access of
* list items (L[1] -> L.1)
* dict values (D[key] -> D.key)
* object attributes (O.attr -> O.attr)
in standard format strings
2026-01-11 17:49:55 +01:00
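The generic '.' access described in a79a945494 can be pictured with a small sketch. It is not the formatter's implementation: `resolve` and the sample data are hypothetical and only illustrate how one dotted path can step through list items, dict values, and object attributes.

```python
from types import SimpleNamespace


def resolve(obj, path):
    """Follow a dotted path like 'tags.1' through nested values."""
    for key in path.split("."):
        if isinstance(obj, dict):
            obj = obj[key]             # D[key] -> D.key
        elif isinstance(obj, (list, tuple)):
            obj = obj[int(key)]        # L[1] -> L.1
        else:
            obj = getattr(obj, key)    # O.attr -> O.attr
    return obj


data = {
    "tags": ["art", "sketch"],
    "user": {"name": "alice"},
    "date": SimpleNamespace(year=2026),
}
print(resolve(data, "tags.1"))     # sketch
print(resolve(data, "user.name"))  # alice
print(resolve(data, "date.year"))  # 2026
```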
camellia2077
084a6d73e0
[bilibili] add support for Live Photo (video) downloads (#8860)
...
* bilibili: add support for live photo downloads
* fix: resolve flake8 linting errors (whitespace and line length)
* fix: resolve flake8 E302 and W293 linting errors
* fix: resolve flake8 W293 and E302 linting errors
* simplify syntax
* add 'livephoto' option
* add tests
2026-01-10 19:27:34 +01:00
Mike Fährmann
18b4c67c65
[ahottie:album] support multiple pages (#8862)
2026-01-10 18:08:13 +01:00