4986 Commits

Author SHA1 Message Date
NecRaul
3a4e19d284 [archivedmoe] Simplify board extraction from url 2025-06-13 18:44:02 +04:00
NecRaul
a7aa18a8c1 [archivedmoe] remove unnecessary logging 2025-06-13 18:28:21 +04:00
NecRaul
8b2adeb41e [archivedmoe] simplify board URL redirection logic 2025-06-13 18:26:39 +04:00
NecRaul
05081dea2e Lint with flake8 2025-06-13 17:56:43 +04:00
NecRaul
223fe960a0 [archivedmoe] redirect URL changes (again)
Redirects to warosu.org instead of 4chan's cdn for certain boards
Redirects to archive.4plebs.org instead of 4chan's cdn for /tg/
Slices the filename only if it's redirecting to certain archives
2025-06-13 17:43:16 +04:00
Mike Fährmann
e2d104a110 [twitter] extract 'source_id' and 'source_user' metadata (#7470 #7640) 2025-06-12 18:59:22 +02:00
Mike Fährmann
06e2f2cd91 [twitter] restructure media data extraction 2025-06-12 18:53:15 +02:00
Mike Fährmann
56ea27c474 [blogger] move original/s0 URL code into a separate function 2025-06-12 17:07:56 +02:00
Mike Fährmann
16fc5e0d68 [batoto] fix downloading manga with alerts/notices (#7657)
and improve alert message extraction
2025-06-12 08:26:26 +02:00
Mike Fährmann
a14671992c [sexcom] prevent '.css' file downloads (#7632)
by detecting homepage redirects
and improve redirect handling in general
2025-06-11 22:32:08 +02:00
Mike Fährmann
0df083b208 [vk] prevent '404 Not Found' errors for file downloads
only strip query parameters when regex substitution applies
2025-06-11 22:32:08 +02:00
Mike Fährmann
d065452ba3 merge #7653: [archivedmoe] fix redirection issue (#7652) 2025-06-11 20:04:42 +02:00
Mike Fährmann
80599fa610 [vk] fix 'user' metadata extraction
add boolean 'group' field
2025-06-11 20:01:27 +02:00
NecRaul
e3df99dbb9 Apply mikf's diff regarding Archived.moe
Moved (and refactored) code into remote()
Added a check for fixup_timestamp
2025-06-11 21:51:03 +04:00
Mike Fährmann
85931185a6 [vk] add continuation message (#7650) 2025-06-11 18:07:39 +02:00
Mike Fährmann
8287a1b372 [vk] detect redirects to 'challenge' pages (#7650) 2025-06-11 18:02:14 +02:00
NecRaul
4370654532 Simplify remote_media_link assignment 2025-06-11 04:49:21 +04:00
NecRaul
cb74d0f2f3 Lint with flake8 2025-06-11 04:46:18 +04:00
NecRaul
96bb2b1630 Fix Archived.moe redirection issue
Unless the board is /b/ (in which case redirection works fine),
trim the filename portion of the URL until it is
13 characters long (epoch millis).
2025-06-11 04:42:03 +04:00
Mike Fährmann
b4aed5e2c9 [common] allow overriding 'user-agent' when 'browser' is used (#7647) 2025-06-10 22:05:28 +02:00
Mike Fährmann
8e698d1a64 [ytdl] set domain as subcategory when using Generic extractor (#6582)
https://github.com/mikf/gallery-dl/issues/6582#issuecomment-2959879730
2025-06-10 21:35:15 +02:00
Mike Fährmann
4cfddc144a [common] import 'datetime' class directly 2025-06-10 21:35:15 +02:00
Mike Fährmann
e68555defa [common] improve cookie-related logging messages 2025-06-10 21:34:27 +02:00
Mike Fährmann
511cf2363c [common] update expired cookie messages (#7644)
- prefix with 'cookies:'
- include domain
- include exact time when it expired
2025-06-09 18:48:04 +02:00
Mike Fährmann
5f41ac4257 [4archive] fix 'thread' extractor 2025-06-08 21:52:54 +02:00
Mike Fährmann
827eeca0bc [paheal] fix '404 Not Found' for tags with URL encoded characters (#7642) 2025-06-08 16:23:11 +02:00
Mike Fährmann
17d39c06e3 [exhentai] implement '"source": "metadata"' (#4902) 2025-06-08 12:57:23 +02:00
Mike Fährmann
967af5eede [exhentai] add 'limits-action' option (#6504)
https://github.com/mikf/gallery-dl/issues/6504#issuecomment-2949551532
2025-06-08 12:56:56 +02:00
Mike Fährmann
3b75b195c1 [exhentai] detect HTML downloads (#4798) 2025-06-07 22:06:53 +02:00
Mike Fährmann
27c48ad317 [exhentai] ensure file signature bytes aren't all zero (#4902) 2025-06-07 20:34:05 +02:00
Mike Fährmann
8227e21257 [deviantart:tiptap] fix TypeError when 'textAlign' is null (#7639) 2025-06-07 19:06:43 +02:00
Mike Fährmann
6e120f2551 [danbooru] fix Ugoira for instances without 'Ugoira:FrameMimeType' (#7630)

fixes regression introduced in 1866f8b97b
2025-06-07 07:47:03 +02:00
Mike Fährmann
3e423937d2 [misskey] implement 'include' option (#5347) 2025-06-06 20:52:03 +02:00
Mike Fährmann
5cd3f3977e [misskey] add 'info' extractor (#5347) 2025-06-06 20:21:52 +02:00
Mike Fährmann
ac09cac978 [misskey] add 'avatar' and 'background' extractors (#5347) 2025-06-06 20:14:05 +02:00
Mike Fährmann
9c4cef822e [komikcast] update domain to 'komikcast02.com' 2025-06-06 20:14:02 +02:00
Mike Fährmann
15f5e567ec [mangaread] fix 'manga_alt' metadata 2025-06-06 13:25:29 +02:00
Mike Fährmann
b5c88b3d3e replace standard library 're' uses with 'util.re()' 2025-06-06 13:24:52 +02:00
Mike Fährmann
8dace96af3 [twitter] simplify 'expand' & 'unique' init code 2025-06-05 15:33:47 +02:00
Mike Fährmann
72a01bc4d4 [common] use util.re_compile() in _dump_response 2025-06-05 15:24:22 +02:00
Mike Fährmann
d7d99d5606 [behance] fix '403 Forbidden' errors 2025-06-05 14:25:07 +02:00
Mike Fährmann
efd49aef73 allow using predefined Firefox/Chrome 'headers' & 'ciphers' 2025-06-05 14:24:38 +02:00
Mike Fährmann
1866f8b97b [danbooru] fix Ugoira conversions for posts without 'ZIP:ZipFileName'
get frame extension from 'Ugoira:FrameMimeType' instead

(#7630)
5919696271
2025-06-05 09:13:25 +02:00
Mike Fährmann
85124fe251 [common] add 'request_json()' convenience function 2025-06-05 09:13:25 +02:00
Mike Fährmann
a7bbccbd7b [common] add 'request_xml()' convenience function 2025-06-04 23:10:16 +02:00
Mike Fährmann
685836f6fd [dynastyscans] add 'anthology' extractor (#7627) 2025-06-04 21:23:49 +02:00
Mike Fährmann
b5334f5837 [everia] prevent redirect when fetching a post page 2025-06-04 11:09:40 +02:00
missionfloyd
72e1a4a0cb [everia] unquote URLs (#7620)
* [everia.club] unescape URLs

* add test
2025-06-04 09:38:06 +02:00
Mike Fährmann
3c6c40d4ed [nijie] fix file extraction (#7624)
ignore empty URLs / URLs with no 'src="'
2025-06-04 07:57:27 +02:00
Mike Fährmann
75b6c8f3d8 re-implement 'category-map' (#7612) 2025-06-04 07:57:27 +02:00