41 Commits

Author SHA1 Message Date
Mike Fährmann
4e2987e007 [path] implement conditional 'part-directory' (#8329) 2025-12-03 11:19:44 +01:00
Mike Fährmann
a097a373a9 simplify if statements by using walrus operators (#7671) 2025-07-22 20:57:54 +02:00
Mike Fährmann
b5c1bf3f59 [dl] improve invalid 'subcategory' value warning (#7103) 2025-03-04 16:17:53 +01:00
Mike Fährmann
4d2037f6c6 [dl] warn about invalid 'subcategory' values (#7103)
prevent fatal exception when collecting downloader options
2025-03-03 16:51:13 +01:00
Mike Fährmann
613f05afa3 fix cmdline arguments not overriding extractor-downloader options 2025-02-22 17:40:27 +01:00
Mike Fährmann
18ed39c1cf implement 'downloader' options per extractor category
by setting options inside 'http' or 'ytdl' inside extractor options
or inside subcategory options

{
    "extractor": {
        "mastodon": {
            "http": {
                "rate": "10k"
            }
        },
        "mastodon.social": {
            "http": {
                "rate": "100k"
            }
        }
    },
    "downloader": {
        "rate": "100m"
    }
}

Sets download speed to
- 100k for mastodon.social URLs
-  10k for mastodon sites in general
- 100m for all other sites
2025-02-22 10:08:59 +01:00
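A minimal sketch of how such a lookup could work, assuming the most specific extractor section wins and the global 'downloader' section is the last resort. The function name `resolve_option`, its parameters, and the exact lookup order are hypothetical and are not taken from gallery-dl's code.

```python
def resolve_option(config, names, scheme, key, default=None):
    """Return the most specific value for a downloader option.

    'names' lists extractor sections from most to least specific,
    e.g. ["mastodon.social", "mastodon"]. Assumed lookup order:
      1. extractor.<name>.<scheme>.<key> for each name in 'names'
      2. downloader.<scheme>.<key>
      3. downloader.<key>
    """
    extractor = config.get("extractor", {})
    downloader = config.get("downloader", {})

    sections = [extractor.get(name, {}).get(scheme, {}) for name in names]
    sections.append(downloader.get(scheme, {}))
    sections.append(downloader)

    for section in sections:
        if key in section:
            return section[key]
    return default


# Same structure as the configuration quoted in the commit message
CONFIG = {
    "extractor": {
        "mastodon": {"http": {"rate": "10k"}},
        "mastodon.social": {"http": {"rate": "100k"}},
    },
    "downloader": {"rate": "100m"},
}

social = resolve_option(CONFIG, ["mastodon.social", "mastodon"], "http", "rate")
other_instance = resolve_option(CONFIG, ["example.instance", "mastodon"], "http", "rate")
unrelated = resolve_option(CONFIG, ["example"], "http", "rate")
```

With this configuration the 'mastodon.social' section supplies the rate for that site, the 'mastodon' section covers every other instance, and everything else falls through to 'downloader.rate'.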
Mike Fährmann
47cf05c4ab refactor proxy handling code (#2357)
- allow gallery-dl proxy settings to overwrite environment proxies
- allow specifying different proxies for data extraction and download
  - add 'downloader.proxy' option
  - '-o extractor.proxy=PROXY_URL -o downloader.proxy=null'
    now has the same effect as youtube-dl's '--geo-verification-proxy'
2022-03-10 23:55:35 +01:00
Mike Fährmann
34929f673f readd 'session' to base downloader class (fixes #768) 2020-05-20 20:04:46 +02:00
Mike Fährmann
ece73b5b2a make 'path' and 'keywords' available in logging messages
Wrap all loggers used by job, extractor, downloader, and postprocessor
objects into a (custom) LoggerAdapter that provides access to the
underlying job, extractor, pathfmt, and kwdict objects and their
properties.

__init__() signatures for all downloader and postprocessor classes have
been changed to take the current Job object as their first argument,
instead of the current extractor or pathfmt.

(#574, #575)
2020-05-18 19:04:51 +02:00
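The wrapping described in this commit can be sketched with the standard library's `logging.LoggerAdapter`. The class `JobLoggerAdapter`, the stand-in `FakeJob`, and the attribute names on it are assumptions made for illustration. They do not reproduce gallery-dl's actual adapter.

```python
import logging


class JobLoggerAdapter(logging.LoggerAdapter):
    """Hypothetical adapter that exposes a job object to log records."""

    def __init__(self, logger, job):
        super().__init__(logger, {})
        self.job = job

    def process(self, msg, kwargs):
        # Attach the job and a few of its properties to every record so
        # that format strings such as "%(path)s" can refer to them.
        extra = kwargs.setdefault("extra", {})
        extra["job"] = self.job
        extra["extractor"] = getattr(self.job, "extractor", None)
        extra["path"] = getattr(self.job, "path", None)
        extra["keywords"] = getattr(self.job, "keywords", None)
        return msg, kwargs


class FakeJob:
    """Stand-in for a real job object; all attributes are made up."""
    extractor = "example-extractor"
    path = "/tmp/example/file.jpg"
    keywords = {"id": 123}


class ListHandler(logging.Handler):
    """Collects formatted messages in a list instead of printing them."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


handler = ListHandler()
handler.setFormatter(logging.Formatter("[%(name)s] %(path)s: %(message)s"))

base = logging.getLogger("download-sketch")
base.addHandler(handler)
base.setLevel(logging.INFO)
base.propagate = False

log = JobLoggerAdapter(base, FakeJob())
log.warning("HTTP error %s", 503)
```

Because the adapter only needs the job, a constructor that receives the job as its first argument can build its logger without being handed the extractor or path object separately.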
Mike Fährmann
200aea308a [downloader:common] enable 'job'/'extractor' for logging messages
(#574)
2020-01-12 21:41:16 +01:00
Mike Fährmann
f5604492c3 update interface of config functions 2019-11-24 00:42:28 +01:00
Mike Fährmann
179d112083 [downloader] overhaul http and text modules
Get rid of the modular structure and simplify/specialize those modules.
2019-06-19 22:56:11 +02:00
Mike Fährmann
c14d44e1bc [downloader:common] retry downloads on SSL errors (#130) 2018-12-14 16:33:04 +01:00
Mike Fährmann
655549df7c [downloader:ytdl] add several options
The "default" downloader options (rate, retries, timeout, verify) are
mapped to corresponding youtube-dl options.

downloader.ytdl.logging tells the downloader to pass youtube-dl's output
to a Logger object.

downloader.ytdl.raw-options allows passing arbitrary options to the
YoutubeDL constructor.
2018-10-20 18:26:49 +02:00
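A rough sketch of such a mapping, assuming the four default options translate to the YoutubeDL parameters 'ratelimit', 'retries', 'socket_timeout', and 'nocheckcertificate'. The helper names `parse_bytes` and `build_ytdl_options` are hypothetical, and applying 'raw-options' last is an assumption about precedence.

```python
def parse_bytes(value):
    """Convert values like 1024, "10k" or "1.5m" into a number of bytes."""
    if isinstance(value, int):
        return value
    suffixes = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    value = value.strip().lower()
    if value and value[-1] in suffixes:
        return int(float(value[:-1]) * suffixes[value[-1]])
    return int(value)


def build_ytdl_options(options):
    """Translate generic downloader options into a YoutubeDL params dict."""
    params = {}
    if options.get("rate") is not None:
        params["ratelimit"] = parse_bytes(options["rate"])
    if options.get("retries") is not None:
        params["retries"] = options["retries"]
    if options.get("timeout") is not None:
        params["socket_timeout"] = options["timeout"]
    if options.get("verify") is not None:
        # youtube-dl expresses this as the inverse flag
        params["nocheckcertificate"] = not options["verify"]
    # Assumption: raw options are applied last and may override the above
    params.update(options.get("raw-options") or {})
    return params


params = build_ytdl_options({
    "rate": "1m",
    "retries": 4,
    "timeout": 30.0,
    "verify": False,
    "raw-options": {"quiet": True},
})
```

The resulting dict could then be passed to the YoutubeDL constructor.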
Mike Fährmann
4a348990f4 adjust value resolution for retries/timeout/verify options
This change introduces 'extractor.*.retries/timeout/verify' options
as a general way to set these values for all HTTP requests.

'downloader.http.retries/timeout/verify' is a way to override these
options for file downloads only and will fall back to 'extractor.*.…'
values if they haven't been explicitly set.

Also: downloader classes now take an extractor object as first argument
instead of a requests.session.
2018-10-07 21:13:39 +02:00
Mike Fährmann
973cf98e88 fix download skip for files without extension 2018-06-27 17:16:07 +02:00
Mike Fährmann
821535b458 adjust PathFormat class 2018-06-06 20:17:17 +02:00
Mike Fährmann
1d54a8e07d fix logging output during downloads
from:
filename.ext[download][warning] ...

to:
filename.ext
[download][warning] ...
2018-03-01 18:43:43 +01:00
Mike Fährmann
915807dd77 log HTTP errors as warnings 2018-01-29 21:55:46 +01:00
Mike Fährmann
f94e3706a8 use logging module for error messages during downloads 2018-01-26 18:11:13 +01:00
Mike Fährmann
b837420291 fix minor urllist issues 2018-01-19 22:54:15 +01:00
Mike Fährmann
6174a5c4ef [download] adjust filename extension on filetype mismatch
(closes #63)
2018-01-17 18:37:06 +01:00
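One way to detect such a mismatch is to compare the first bytes of the downloaded data against known file signatures. The sketch below assumes that approach. The signature table, the alias table, and the function name `adjust_extension` are illustrative only.

```python
# Well-known leading byte sequences ("magic numbers") for a few image types
SIGNATURES = (
    (b"\xff\xd8\xff", "jpg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
)

# Extensions treated as equivalent to the detected one
ALIASES = {"jpeg": "jpg"}


def adjust_extension(filename, first_bytes):
    """Return 'filename' with its extension corrected to match the data."""
    detected = next(
        (ext for sig, ext in SIGNATURES if first_bytes.startswith(sig)), None)
    if detected is None:
        return filename  # unknown content: leave the name alone

    stem, dot, current = filename.rpartition(".")
    if not dot:
        return filename + "." + detected  # no extension at all

    current = current.lower()
    if ALIASES.get(current, current) == detected:
        return filename  # extension already matches
    return stem + "." + detected


png_header = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
fixed = adjust_extension("image.jpg", png_header)
```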
Mike Fährmann
ebe9b0a04c another attempt at downloader retry behavior
This commit changes the general behavior from
'Retry on every exception and abort on DownloadError' to
'Only retry on DownloadRetry exceptions and abort on every other one'

The previous version would have retried on several states which
would have no chance of ever succeeding (invalid URLs, etc.)
2017-12-07 15:31:14 +01:00
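The changed behavior can be sketched as a loop that catches only a dedicated retry exception and lets everything else propagate. `DownloadRetry` here is a local stand-in class, and `download_with_retries` is a hypothetical helper, not the project's code.

```python
class DownloadRetry(Exception):
    """Signals a temporary failure that is worth another attempt."""


def download_with_retries(attempt, retries=3):
    """Call 'attempt' until it succeeds or the retry budget is used up.

    Only DownloadRetry triggers another attempt. Any other exception
    (for example one caused by an invalid URL) aborts immediately,
    because repeating the call could never succeed.
    """
    for _ in range(retries + 1):
        try:
            return attempt()
        except DownloadRetry:
            continue
    return None  # all attempts failed with a retryable error


calls = {"flaky": 0, "broken": 0}


def flaky():
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise DownloadRetry("temporary failure")
    return "ok"


def broken():
    calls["broken"] += 1
    raise ValueError("invalid URL")


result = download_with_retries(flaky, retries=5)
```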
Mike Fährmann
79bcaa8726 improve downloader retry behavior
- only retry download on 5xx and 429 status codes
- immediately fail on other 4xx status codes
2017-11-10 21:46:18 +01:00
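Expressed as a predicate, the rule might look like the following. The function name `should_retry` is made up for this sketch.

```python
def should_retry(status_code):
    """Return True for responses that may succeed on a later attempt."""
    # 429 Too Many Requests is the one client error worth waiting out;
    # 5xx codes indicate a problem on the server side.
    return status_code == 429 or 500 <= status_code < 600


retryable = [code for code in (403, 404, 429, 500, 503) if should_retry(code)]
```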
Mike Fährmann
42e948584d fix downloader error handling
RequestException being a subclass of OSError caused all exceptions
during file downloads to be ignored/re-raised.
2017-11-07 15:23:07 +01:00
Mike Fährmann
707b15b586 create missing directories for 'part-directory'
also some code improvements regarding downloader config values
2017-10-27 12:22:45 +02:00
Mike Fährmann
caf26412dd add option to set alternate location of .part files (#29)
Note: The path set for 'downloader.*.part-directory' needs to point to an
already existing directory.
2017-10-26 00:16:48 +02:00
Mike Fährmann
9a41002b77 fix partial downloads for 'text:' URLs
Using a filesize in bytes as offset into a Python string is not
a good idea if said file contains non-ASCII characters.
2017-10-25 15:04:45 +02:00
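The problem is easy to reproduce: a file size counts bytes, while string indices count characters. A small illustration, assuming UTF-8 encoding and a hypothetical helper `remaining_bytes`:

```python
def remaining_bytes(text, bytes_already_written):
    """Return the part of 'text' that still has to be written, as bytes."""
    # Encode first, then slice: the offset is a byte count, not a
    # character count.
    return text.encode("utf-8")[bytes_already_written:]


text = "naïve café ☕ data"
data = text.encode("utf-8")

offset = 8                      # size of the partial file in bytes
wrong = text[offset:]           # slices characters: skips too much
right = remaining_bytes(text, offset)

char_count = len(text)
byte_count = len(data)
```

Here the string has fewer characters than bytes, so the character-based slice drops data that was never written to disk.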
Mike Fährmann
963670d73b add options to control usage of .part files (#29)
- '--no-part' command line option to disable them
- 'downloader.http.part' and 'downloader.text.part' config options

Disabling .part files restores the behaviour of the old downloader
implementation.
2017-10-24 23:33:44 +02:00
Mike Fährmann
b0353aa02d rewrite download modules (#29)
- use '.part' files during file-download
- implement continuation of incomplete downloads
- check if file size matches the one reported by server
2017-10-24 12:53:03 +02:00
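The three points above can be sketched without any network code by hiding the transfer behind a callable. The function `download_with_part` and its `fetch(offset)` parameter are hypothetical. A real HTTP implementation would typically send a Range request for the missing bytes instead.

```python
import os


def download_with_part(path, fetch, expected_size):
    """Download into 'path + ".part"' and rename once complete.

    'fetch(offset)' is a hypothetical callable returning the bytes of
    the file starting at 'offset'.
    """
    part = path + ".part"

    # Continue an incomplete download from where it stopped
    offset = os.path.getsize(part) if os.path.exists(part) else 0

    with open(part, "ab") as fp:
        fp.write(fetch(offset))

    # Compare against the size reported by the server
    if os.path.getsize(part) != expected_size:
        return False  # keep the .part file for a later attempt

    os.replace(part, path)
    return True
```

A call that returns False leaves the '.part' file in place, so the next call picks up at the current size of that file.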
Mike Fährmann
e2b5cd9918 change config-path for 'retries' and 'timeout' 2017-03-26 18:24:46 +02:00
Mike Fährmann
0b5076815d always delete incompletely downloaded files 2017-03-21 15:53:43 +01:00
Mike Fährmann
22910f9562 improve error handling of http file downloads
(#10)
2017-03-16 04:17:35 +01:00
Mike Fährmann
4f123b8513 code adjustments according to pep8 2017-01-30 19:40:15 +01:00
Mike Fährmann
3c1daef839 don't delete downloaded files in certain edge cases 2016-11-27 23:43:25 +01:00
Mike Fährmann
29692c5784 get extension from Content-Type header if not provided 2016-09-30 12:32:48 +02:00
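A minimal sketch of deriving an extension from the header, assuming a small hand-written lookup table. The table contents and the function name `extension_from_headers` are illustrative and not taken from the project.

```python
# Illustrative subset of MIME types; a real table would be larger
MIME_EXTENSIONS = {
    "image/jpeg": "jpg",
    "image/png": "png",
    "image/gif": "gif",
    "video/mp4": "mp4",
    "video/webm": "webm",
}


def extension_from_headers(headers, default=None):
    """Guess a filename extension from a response's Content-Type header."""
    content_type = headers.get("Content-Type", "")
    # Drop parameters such as "; charset=utf-8" and normalize case
    mime = content_type.partition(";")[0].strip().lower()
    return MIME_EXTENSIONS.get(mime, default)


ext = extension_from_headers({"Content-Type": "image/PNG; charset=binary"})
```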
Mike Fährmann
ecc6542fc8 change required parameter type to file-like objects 2015-12-21 22:46:49 +01:00
Mike Fährmann
a8c0b4531d fix issue with Ctrl+c on windows 2015-12-02 01:01:33 +01:00
Mike Fährmann
4b377ccc09 use output-module during downloads 2015-12-01 21:22:58 +01:00
Mike Fährmann
28fa7c53b4 docstrings and other small fixes for downloaders 2015-04-10 21:45:41 +02:00
Mike Fährmann
deef91eddc initial commit 2014-10-12 21:56:44 +02:00