Dominik Prange
ff5f6fe70f
[boosty] added new direct message extractor
...
- formatting
- fixed linting formatting errors
- fixed E999 SyntaxError: invalid syntax
- fixed class naming
- fixed mandatory extractor.boosty.metadata as true requirement
- update
- apply changes
- add test
- update docs/supportedsites
- improve 'dialog' pagination logic
2025-02-23 18:14:59 +01:00
Mike Fährmann
18ed39c1cf
implement 'downloader' options per extractor category
...
by setting options inside 'http' or 'ytdl' inside extractor options
or inside subcategory options
{
    "extractor": {
        "mastodon": {
            "http": {
                "rate": "10k"
            }
        },
        "mastodon.social": {
            "http": {
                "rate": "100k"
            }
        }
    },
    "downloader": {
        "rate": "100m"
    }
}
Sets download speed to
- 100k for mastodon.social URLs
- 10k for mastodon sites in general
- 100m for all other sites
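
The lookup order implied by the configuration above can be sketched as follows. This is only an illustration: `resolve_downloader_option` and its `categories` argument are hypothetical names, not gallery-dl's actual implementation, and the global fallback is simplified to a flat 'downloader' section.

```python
def resolve_downloader_option(config, categories, scheme, key, default=None):
    """Return the most specific value for a downloader option.

    'categories' lists extractor config sections from most to least
    specific, e.g. ["mastodon.social", "mastodon"] (hypothetical order).
    """
    extractor_cfg = config.get("extractor", {})
    for name in categories:
        # look for e.g. extractor -> mastodon.social -> http -> rate
        section = extractor_cfg.get(name, {}).get(scheme, {})
        if key in section:
            return section[key]
    # nothing category-specific found: use the global downloader options
    return config.get("downloader", {}).get(key, default)


CONFIG = {
    "extractor": {
        "mastodon": {"http": {"rate": "10k"}},
        "mastodon.social": {"http": {"rate": "100k"}},
    },
    "downloader": {"rate": "100m"},
}
```

The first matching section wins, so a value set for a specific instance shadows the one set for the site family, which in turn shadows the global default.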
2025-02-22 10:08:59 +01:00
Mike Fährmann
4906541f7d
[generic] fix config lookups by subcategory
...
'subcategory' needs to be set before Extractor.__init__() runs
to be included in '_cfgpath'
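
Why the assignment order matters can be shown with a reduced stand-in. The classes below are simplified assumptions for illustration and do not mirror gallery-dl's real Extractor class; only the names 'subcategory' and '_cfgpath' come from the commit message.

```python
class Extractor:
    category = ""
    subcategory = ""

    def __init__(self):
        # the config path is computed once, from whatever the
        # attributes hold at the moment __init__() runs
        self._cfgpath = ("extractor", self.category, self.subcategory)


class LateExtractor(Extractor):
    category = "generic"

    def __init__(self, subcategory):
        Extractor.__init__(self)
        # too late: _cfgpath was already built with an empty subcategory
        self.subcategory = subcategory


class EarlyExtractor(Extractor):
    category = "generic"

    def __init__(self, subcategory):
        # set first, so the base initializer sees the final value
        self.subcategory = subcategory
        Extractor.__init__(self)
```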
2025-02-22 10:08:59 +01:00
Mike Fährmann
79dc04d87c
[subscribestar] fix 'post' extractor ( #6582 )
...
https://github.com/mikf/gallery-dl/issues/6582#issuecomment-2675939669
2025-02-22 10:08:59 +01:00
Mike Fährmann
57937c3b68
[newgrounds] provide 'comment_html' metadata ( #7038 )
2025-02-22 10:08:58 +01:00
Mike Fährmann
196aa263c2
[imhentai] improve pagination duplicate filtering
2025-02-20 20:53:35 +01:00
Mike Fährmann
52d4e1a100
[imhentai] inherit from BaseExtractor
...
combine all imhentai-like sites into one module
2025-02-19 22:14:52 +01:00
Mike Fährmann
7a11d02e7a
[reddit] restrict subreddit search results ( #7025 )
2025-02-19 20:05:48 +01:00
Mike Fährmann
d4c56b08d7
[hentaiera] add support ( #3046 #6952 #7020 )
2025-02-19 17:42:04 +01:00
Mike Fährmann
4396029d36
[furry34] add support ( #1078 #7018 )
2025-02-19 16:35:48 +01:00
Mike Fährmann
67937d33e3
[archive] fix NameError when SQLite database path doesn't exist
...
fixes regression introduced in 841bc9f6
2025-02-18 22:09:07 +01:00
Mike Fährmann
82493a6672
[hentairox] add support ( #7003 )
2025-02-18 21:45:30 +01:00
Luca Russo
95c446fcd1
[discord] add support ( #6836 )
...
* first commit
* add --
* skip video embeds
* fix typo
* removed ambiguity
* add category support
* code tweaks
* more reliable embed extraction
* handle 403 errors (testing done)
* added "parent_id" keyword
* added "parent", "parent_type" keywords
the extractor should be now ready to merge!
* removed unnecessary dict unpacking
* added empty text messages extraction
* added "channel_topic"
* even more metadata extraction
can now extract all embeds images & text, as well as server banners. also code is much better.
* added user avatar and banner
* better pagination
* fix regression
* minor tweaks
* Made requested changes
2025-02-18 18:45:39 +01:00
Mike Fährmann
fd4de02e67
[archive] support PostgreSQL archives for post processors ( #6152 )
2025-02-17 14:58:14 +01:00
Mike Fährmann
8daf496a22
[archive] add 'archive-table' option ( #6152 )
2025-02-17 11:41:13 +01:00
Mike Fährmann
dac0c4ac10
[docs] add 'psycopg' to optional dependencies
2025-02-17 10:59:15 +01:00
Mike Fährmann
841bc9f66f
[archive] implement support for PostgreSQL databases ( #6152 )
2025-02-16 17:56:52 +01:00
Mike Fährmann
b4eae65965
[imhentai] avoid unnecessary HTTP request
...
no need to fetch a gallery's '/view/' page when the main page contains
all the same data as well
2025-02-16 15:04:24 +01:00
Mike Fährmann
800cf5beb5
replace 'print()' with 'output.stderr_write("\n")'
2025-02-15 18:01:05 +01:00
Mike Fährmann
35307608f2
[dl:http] add 'sleep-429' option ( #6996 )
2025-02-15 17:42:03 +01:00
Mike Fährmann
046ebb5590
[imgur] replace AuthorizationError exception with logging message
2025-02-15 15:36:39 +01:00
Mike Fährmann
182b544217
[ytdl] support specifying filesystem paths as 'module' ( #6991 )
2025-02-14 19:58:25 +01:00
Mike Fährmann
7ae09c6b29
[imgur] add support for (hidden) personal posts ( #6990 )
...
https://imgur.com/user/me
https://imgur.com/user/me/hidden
2025-02-14 19:28:55 +01:00
Mike Fährmann
cd9fa1ef75
[bunkr] implement fast '--range' support ( #6985 )
2025-02-14 18:21:32 +01:00
Mike Fährmann
195b52284a
[tests] move 'e621:frontend' tests into regular results/e621.py
...
having both e621.py and E621.py in the same directory messes with
Windows
6e919a3695 (commitcomment-152557303)
2025-02-14 17:44:14 +01:00
Mike Fährmann
51f978e027
[weibo] add 'movies' option ( #6988 )
...
disable 'movie' downloads by default
2025-02-13 18:01:20 +01:00
Mike Fährmann
b8b541fded
[itaku] support gallery section URLs ( #6951 )
2025-02-13 14:29:36 +01:00
Mike Fährmann
1cf2870f81
[patreon] extract 'campaign' metadata ( #6989 )
2025-02-13 14:13:43 +01:00
Mike Fährmann
6420210b0f
[vsco] improve 'm3u8' handling
2025-02-12 20:44:43 +01:00
Mike Fährmann
f1f27eb2ab
[vsco] support '/video/' URLs ( #4295 #6973 )
...
requires yt-dlp/youtube-dl to handle m3u8 manifests
2025-02-12 19:12:00 +01:00
Mike Fährmann
d1a8142dcf
[bunkr] provide fallback URLs for 403 download links ( #6732 #6972 )
2025-02-12 19:12:00 +01:00
Mike Fährmann
55034d9638
[imhentai] add support ( #1660 #3046 #3824 #4338 #5936 )
2025-02-10 21:42:07 +01:00
Mike Fährmann
be77465e1b
[weebcentral] fix extracting wrong number of chapter pages ( #6966 )
...
When downloading multiple chapters at once, all chapters after the first
one would download only as many pages per chapter as the first one had,
due to reusing a cached/shared dict in the wrong way.
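
One way such a bug can arise is sketched below. The commit message does not describe the exact mechanism, so the functions and the 'count' key are hypothetical and only illustrate the general pattern of state leaking between chapters through a shared dict.

```python
def pages_shared(chapters):
    """Buggy pattern: one dict is reused for every chapter."""
    results = []
    cache = {}
    for name, count in chapters:
        # setdefault() keeps the value stored for the first chapter
        cache.setdefault("count", count)
        results.append((name, cache["count"]))
    return results


def pages_fresh(chapters):
    """Fixed pattern: every chapter gets its own dict."""
    results = []
    for name, count in chapters:
        data = {"count": count}
        results.append((name, data["count"]))
    return results
```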
2025-02-10 16:00:13 +01:00
Mike Fährmann
587205bf68
[pixiv] prevent exceptions during 'comments' extraction ( #6965 )
...
- wrap in try-except block
- do not attempt to fetch comments for 'sanity_level' works
2025-02-10 09:56:09 +01:00
Mike Fährmann
6c2b6d50cc
[patreon] support '/profile/creators' URLs
2025-02-09 15:52:54 +01:00
Mike Fährmann
3282025749
merge #6956 : [b4k] update domain to 'arch.b4k.dev' ( #6955 )
2025-02-09 11:13:53 +01:00
Mike Fährmann
23c4bc8ac5
[b4k] keep support for previous 'arch.b4k.co' domain
2025-02-09 11:11:38 +01:00
NecRaul
dae82f1519
[b4k] update domain to arch.b4k.dev
2025-02-09 01:28:23 +04:00
Mike Fährmann
93adc86dca
improve '\f' format string handling for --print
...
add a newline only for f-string / \fF format strings,
as it would break any of the others
2025-02-08 21:42:31 +01:00
Mike Fährmann
e2134b349d
replace '\f' in --print arguments with form feed character
...
to make it easier to use special type format strings on command-line
(#6938 )
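
A minimal sketch of the replacement described above, assuming a hypothetical helper name; the real code may handle this differently. A shell passes the two characters backslash and 'f' literally, so they are turned into a single form feed character before the format string is parsed.

```python
def unescape_form_feed(argument):
    """Replace a literal backslash + 'f' with a form feed character."""
    return argument.replace("\\f", "\f")
```

For example, the command-line argument `\fF {id}` would become a string starting with an actual form feed followed by `F {id}`.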
2025-02-07 19:37:33 +01:00
Mike Fährmann
28385bec7a
[bunkr] extract 'id_url' metadata ( #6935 )
...
and use it as 'id' alternative instead of 'name' in default archive IDs
2025-02-06 20:40:35 +01:00
Mike Fährmann
b9675ea764
[bunkr] update default archive ID format ( #6935 )
...
use 'name' when there is no proper 'id' value available.
2025-02-05 21:54:43 +01:00
Mike Fährmann
873cbf6b36
[docs] add more details to 'user-agent' and 'browser' docs ( #6917 )
2025-02-03 20:54:27 +01:00
Mike Fährmann
5807daa19a
[issuu] unescape HTML entities
2025-02-02 18:33:18 +01:00
Mike Fährmann
6c9b20fe45
[philomena] download 'full' URLs ( #6922 )
...
'view_url' URLs sometimes result in 404 errors
2025-02-02 18:23:46 +01:00
Mike Fährmann
4ab9237f1d
[philomena] fix 'date' values without UTC offset ( #6921 )
...
Some instances do not include a UTC offset or 'Z' in their datetime
values, e.g. 2024-03-14T13:46:46 compared to 2024-03-14T13:46:46Z
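
Handling both forms can be sketched like this. The function name is hypothetical, and treating values without an offset as UTC is an assumption made for this example; numeric offsets such as +01:00 are not handled here.

```python
from datetime import datetime, timezone


def parse_philomena_date(value):
    """Parse an ISO-like timestamp with or without a trailing 'Z'."""
    if value.endswith("Z"):
        value = value[:-1]
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")
    # assumption: timestamps without an offset are meant as UTC
    return parsed.replace(tzinfo=timezone.utc)
```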
2025-02-02 16:32:28 +01:00
Mike Fährmann
1a9138f25e
[aes] handle errors during 'Cryptodome' import ( #6906 )
2025-02-02 15:01:17 +01:00
Mike Fährmann
7c96c2368f
[subscribestar] detect and handle redirects ( #6916 )
2025-02-01 21:03:24 +01:00
Mike Fährmann
52ac3a7802
[release] build 'gallery-dl.exe' on Python 3.13 ( #6684 )
...
and rename the former Python 3.8 version to 'gallery-dl_x86.exe'.
Currently building with PyInstaller, as I wasn't able to get py2exe to
work in this environment, but the startup times are noticeably longer.
Considering switching to nuitka, maybe even for all standalone builds.
2025-02-01 19:58:51 +01:00
Mike Fährmann
ddb2c4d69d
[executables] fix SSLError when using HTTPAdapter ( #6393 )
...
always load certifi certificates instead of relying on
'load_default_certs()', which might load no certs at all
2025-01-31 20:36:41 +01:00