7752 Commits

Author SHA1 Message Date
Mike Fährmann
eaaa25b6e4 [job] enable all 'parent-…' options for parent extractors by default
- parent-directory
- parent-metadata
- parent-session
- parent-skip

- add general 'parent' option
2026-01-27 12:05:19 +01:00
Mike Fährmann
250fbd3294 [erome] mark as 'parent' extractor 2026-01-27 11:13:14 +01:00
Mike Fährmann
b67e3c15ff [xenforo] support 'titsintops.com' (#8945) 2026-01-27 10:31:26 +01:00
Mike Fährmann
105e2379d4 [pornhub] fix '400 Bad Request' when logged in (#8942)
extract 'token' from a different location
2026-01-27 10:04:24 +01:00
Mike Fährmann
f6ce8c8579 [mangataro] fix 'manga' extractor (#8930) 2026-01-27 10:03:33 +01:00
CasualYouTuber31
4fab8e0dd8 [tiktok] do not fail story extraction if user has no stories (#8938) 2026-01-26 16:50:50 +01:00
Mike Fährmann
9a102039fc [weebdex] make metadata extraction non-fatal (#8939) 2026-01-26 16:44:29 +01:00
Mike Fährmann
7784aed74e [kemono] prevent 'revisions' API requests when possible
posts from '/v1/{service}/user/{creator_id}/post/{post_id}' already
include their revisions and don't need an additional API request
2026-01-26 10:00:32 +01:00
Mike Fährmann
7ac9ad1cbf [kemono] fix possible 'AttributeError' for revisions (#8929)
some revisions have string values for 'file' and 'attachments'
instead of the regular dicts
2026-01-26 10:00:32 +01:00
CasualYouTuber31
702814654a [tiktok] solve JS challenges (#8850)
* [tiktok] First draft of a challenge resolver
* use stdlib sha256 implementation
* simplify 'resolve_challenge()' code
* set cookie domain and expires timestamp
* base64 -> binascii
* Avoid incorrect padding exceptions

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2026-01-26 09:55:53 +01:00
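
The "base64 -> binascii" and "Avoid incorrect padding exceptions" bullets above can be illustrated with a minimal sketch. This is not the tiktok extractor's code: the helper name `b64decode_lenient` is hypothetical, and it assumes the input is standard-alphabet Base64 whose trailing '=' padding may have been stripped.

```python
import binascii


def b64decode_lenient(text: str) -> bytes:
    """Decode Base64 text whose trailing '=' padding may be missing.

    Hypothetical helper for illustration only.
    """
    # Base64 encodes in groups of 4 characters; restore whatever
    # padding is needed to reach the next multiple of 4.
    padded = text + "=" * (-len(text) % 4)
    # binascii.a2b_base64() raises binascii.Error ("Incorrect padding")
    # when the padding is missing, which is what the line above avoids.
    return binascii.a2b_base64(padded)
```

Input that is already correctly padded passes through unchanged, since `-len(text) % 4` is 0 in that case.
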
CasualYouTuber31
d19d5c8b6e [tiktok] extract more story item list pages (#8932)
* [tiktok] extract more story item list pages
* [tiktok] invert hasMore logic
2026-01-26 08:57:23 +01:00
CasualYouTuber31
f80e294132 [tiktok] Fix account extraction (#8931)
Was inadvertently caused by recent changes to range predicates
Fixes regression introduced in c23beee57c
2026-01-26 08:54:42 +01:00
Mike Fährmann
93bf4ccc18 merge #8928: [mangafreak] add support 2026-01-25 19:52:34 +01:00
Mike Fährmann
4e71e2f7e7 [mangafreak] update & fix
- fix manga and title extraction
- fix 'chapter_minor'
- extend test results
2026-01-25 19:49:56 +01:00
Mike Fährmann
7026611f31 merge #8925: [mangatown] add support 2026-01-25 18:35:39 +01:00
Mike Fährmann
bf3ee5e9f7 [mangatown] fix & update
- use BASE_PATTERN
- fix manga, manga_id, chapter_id extraction
- fix & extend 'manga' metadata results
- extend test results
2026-01-25 18:32:17 +01:00
Duy Nguyen
58662f900a fix(mangafreak): fix image extraction and simplify code
- Fix image URL extraction pattern to handle img tags with id attribute
- Use self.groups pattern instead of custom __init__ methods
- Fix chapter list extraction to use correct table structure
2026-01-25 17:24:05 +01:00
Duy Nguyen
8b0e8c656d feat(mangafreak): add support for MangaFreak
Add chapter and manga extractors for ww2.mangafreak.me with support
for bonus chapters (e.g., 167e suffix).
2026-01-25 15:56:52 +01:00
Duy Nguyen
befa9b8a3e [mangatown] fix base url and simplify image extraction 2026-01-25 11:40:15 +01:00
Mike Fährmann
adca123646 [weibo:user] add 'subalbums' include (#8792) 2026-01-25 11:16:41 +01:00
Mike Fährmann
cd83be41c5 [common] allow Dispatch 'alt' extractors to use custom URLs 2026-01-25 11:15:30 +01:00
Mike Fährmann
37176da511 [hentaifoundry:user] use f-strings 2026-01-25 10:10:37 +01:00
Mike Fährmann
baafc64714 [weibo:album] fix "KeyError - 'pid'" (#8792)
add workaround for (sub)album items without 'pid' field
2026-01-25 09:23:46 +01:00
Duy Nguyen
9f2d5cbd5d docs: add mangatown to supported sites 2026-01-25 00:04:23 +01:00
Duy Nguyen
4d8f61ad76 [mangatown] add support 2026-01-25 00:02:36 +01:00
Mike Fährmann
4d1b5fc139 [xenforo] fix cookies check before login (#8919)
check for all (sub)domains and not only '.site.tld'
2026-01-24 21:18:31 +01:00
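
A rough sketch of the idea in the commit above, matching cookies set for any (sub)domain of a site rather than only the exact '.site.tld' form. The function name `domain_matches` and its arguments are assumptions made for illustration; this is not the xenforo extractor's actual check.

```python
def domain_matches(cookie_domain: str, site: str) -> bool:
    """Return True if a cookie's domain belongs to 'site' or a subdomain.

    Hypothetical helper for illustration only.
    """
    # Cookie domains may carry a leading dot ('.site.tld'); drop it
    # so that all spellings compare the same way.
    host = cookie_domain.lstrip(".").lower()
    site = site.lower()
    # Accept the site itself and any subdomain; requiring the "."
    # separator keeps 'othersite.tld' from matching 'site.tld'.
    return host == site or host.endswith("." + site)
```

With this check, '.site.tld', 'site.tld' and 'forum.site.tld' all count as belonging to 'site.tld', while 'othersite.tld' does not.
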
Mike Fährmann
93edff6872 [xenforo] improve error message extraction (#8919) 2026-01-24 20:58:24 +01:00
Mike Fährmann
291fb78995 [pp:mtime] fix '_mtime_meta' for invalid values (#8918)
fixes regression introduced in d57dc48dcd
also prevents previous _mtime_meta entries from affecting new files
2026-01-24 18:58:24 +01:00
Mike Fährmann
3836c2a99f release version 1.31.4 2026-01-24 12:01:35 +01:00
Mike Fährmann
1530778bfb merge #8917: [kaliscan] add support 2026-01-23 21:04:48 +01:00
Mike Fährmann
180b29197b [kaliscan] update/simplify 2026-01-23 20:58:34 +01:00
Duy Nguyen
5c71993e0b docs: add kaliscan to supported sites 2026-01-23 20:29:24 +01:00
Duy Nguyen
15a93795fc refactor(kaliscan): simplify code style
Remove unused variable, use ternary expressions, deduplicate strip calls.
2026-01-23 20:22:51 +01:00
Duy Nguyen
0b0bcb1640 feat(kaliscan): add extractor for kaliscan.me
Support chapter and manga extractors with metadata extraction.
2026-01-23 20:22:50 +01:00
Mike Fährmann
e93cfa3348 [twitter] implement '"ratelimit": "abort:N"' (#5251 #8864) 2026-01-23 19:54:28 +01:00
CasualYouTuber31
be9b4cf24f [tiktok] prefer "legacy" endpoint over the "newer" endpoint for user extraction (#8812 #8847) 2026-01-23 19:23:22 +01:00
CasualYouTuber31
a3d7af66a1 [tiktok] fix following extractor (#8849) 2026-01-23 19:19:27 +01:00
CasualYouTuber31
88a3153df8 [tiktok] download best quality videos (#8846)
* [tiktok] download best quality videos
* [tiktok] code formatting fix
* simplify sorting in '_extract_video_urls'
2026-01-23 19:09:53 +01:00
Mike Fährmann
f869085476 [weebdex] add 'data-saver' option (#8914) 2026-01-23 09:22:41 +01:00
Mike Fährmann
72322deaee [mangafire] generate 'vrf' tokens (#8400 #8906)
26fc9e9649
2026-01-23 09:16:58 +01:00
Mike Fährmann
396334e66e [hentainexus] improve 'rc4' performance
- & 255
- "".join()
2026-01-22 17:36:44 +01:00
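
The two bullets in the commit above name common Python micro-optimizations: masking with `& 255` and assembling output with `"".join()`. A generic RC4 sketch that uses both might look like the following. It is an illustration only, not the hentainexus extractor's code; the function name `rc4` and its signature are assumptions.

```python
def rc4(key: bytes, data: bytes) -> str:
    """Apply RC4 to 'data' and return the result as a str of code points 0-255.

    Hypothetical standalone sketch for illustration only.
    """
    # Key-scheduling algorithm: permute the 256-entry state with the key.
    state = list(range(256))
    j = 0
    key_len = len(key)
    for i in range(256):
        # '& 255' keeps the index in 0..255, equivalent to '% 256' here.
        j = (j + state[i] + key[i % key_len]) & 255
        state[i], state[j] = state[j], state[i]

    # Pseudo-random generation: XOR each input byte with a keystream byte.
    chars = []
    i = j = 0
    for byte in data:
        i = (i + 1) & 255
        j = (j + state[i]) & 255
        state[i], state[j] = state[j], state[i]
        chars.append(chr(byte ^ state[(state[i] + state[j]) & 255]))

    # Join once at the end instead of concatenating strings in the loop.
    return "".join(chars)
```

Because RC4 is symmetric, running the function again on the output (re-encoded as Latin-1 bytes) with the same key recovers the input.
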
Mike Fährmann
16a59140c5 [xenforo:media-album] extract 'album' metadata (#8902)
- add 'callback' argument to _pagination()
- generalize 'author' metadata collection
2026-01-22 15:04:25 +01:00
Mike Fährmann
fb0d639f68 [xenforo] add 'media-album' extractor (#8902) 2026-01-22 09:10:31 +01:00
Mike Fährmann
18fabb9605 [batoto] remove module (#8908)
"Bato.to has shut down."

There are mirror sites, but they are unscrapeable
due to heavily obfuscated HTML and JS
2026-01-21 20:33:08 +01:00
Mike Fährmann
4798ac4836 [common] implement 'parent-session' 2026-01-21 20:33:08 +01:00
Mike Fährmann
9ca45aae73 [nitter] re-add instances 2026-01-21 20:32:58 +01:00
Mike Fährmann
63df6423bf [nitter] use 'gallery-dl/<version>' User-Agent (#7045 #8130 #8409) 2026-01-21 18:07:47 +01:00
Mike Fährmann
78da7edde8 [common] add 'googlebot' User-Agent preset 2026-01-21 17:57:26 +01:00
Mike Fährmann
e2a17a58f0 [docker] build from 'python:3.14-alpine' 2026-01-21 17:32:48 +01:00
Mike Fährmann
6765f4c77e [kemono:discord] improve 'filename' parsing 2026-01-21 17:01:18 +01:00