Commit Graph

135 Commits

Author SHA1 Message Date
Mike Fährmann
67ec91cdbd [downloader:http] change '_http_retry' to accept a Python function
and rename '_http_retry_codes' to '_http_retry'

(#3569)
2023-03-09 23:30:15 +01:00
Mike Fährmann
8148c2a097 [downloader:ytdl] prevent exception on empty results
a7c7953107 (commitcomment-92042240)
2023-03-06 12:25:12 +01:00
Mike Fährmann
d16873941c [downloader:http] use 'time.monotonic()' 2023-01-31 15:32:12 +01:00
Mike Fährmann
ec9ff7640d merge #3535: [downloader:http] add signature checks for .blend, .obj, and .clip files 2023-01-16 15:09:10 +01:00
ClosedPort22
b6706b373a [downloader:http] add signature checks for some formats
also add the MIME type for .obj files
2023-01-15 23:40:55 +08:00
Mike Fährmann
c881548a27 add 'extractor.retry-codes' option (#3313)
do not retry 429 and 430 by default
2023-01-14 17:25:30 +01:00
Mike Fährmann
c0d7d2be35 [downloader:http] add 'validate' option 2023-01-11 15:37:40 +01:00
Mike Fährmann
80102fa367 [downloader:http] add 'retry-codes' option (#3313) 2022-12-01 11:08:23 +01:00
Mike Fährmann
b4253f69c9 [downloader:http] fix ZeroDivisionError (#3328)
ensure 'time_elapsed' only gets used as divisor
when it is greater than zero
2022-11-30 21:56:18 +01:00
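
Note on b4253f69c9: a minimal sketch of the kind of guard the message describes, not the actual patch. The function name `average_speed` and its parameters are hypothetical; only 'time_elapsed' comes from the commit message.

```python
import time


def average_speed(bytes_downloaded, time_start, now=None):
    """Return bytes per second, or 0.0 when no measurable time has passed."""
    if now is None:
        now = time.monotonic()
    time_elapsed = now - time_start
    # Divide only when the elapsed time is strictly positive; two clock
    # readings taken in quick succession can be identical.
    if time_elapsed > 0:
        return bytes_downloaded / time_elapsed
    return 0.0
```

Returning 0.0 is one possible fallback; skipping the speed report entirely would work as well.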
Mike Fährmann
f87cfa5f66 [downloader:http] add signature check for .mp4 files 2022-11-16 21:45:26 +01:00
Mike Fährmann
a4ff20cf16 [downloader:http] fix issues from inaccurate 'time.sleep()'
(#3143)

Reverts part of c59b98c8 by going back to using a global timer
instead of a per-chunk one.

Reintroduces the issue of ignoring rate limits after
suspending and resuming the process.
2022-11-10 13:24:02 +01:00
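
Note on a4ff20cf16: a rough sketch of rate limiting against a single global timer, assuming the limit is expressed in bytes per second. All names here (`throttle_delay`, `download`, `rate`) are illustrative and not taken from the gallery-dl source.

```python
import time


def throttle_delay(bytes_total, rate, time_start, now):
    """Seconds to sleep so the average speed since 'time_start' stays <= rate."""
    time_expected = bytes_total / rate   # how long the transfer should have taken
    time_elapsed = now - time_start      # how long it has actually taken
    return max(0.0, time_expected - time_elapsed)


def download(chunks, rate, sleep=time.sleep, clock=time.monotonic):
    """Consume 'chunks' while keeping the overall average speed below 'rate'."""
    time_start = clock()                 # one timer for the whole download
    bytes_total = 0
    for chunk in chunks:
        bytes_total += len(chunk)
        delay = throttle_delay(bytes_total, rate, time_start, clock())
        if delay > 0:
            # If this sleep overshoots, the next delay shrinks accordingly,
            # because it is measured against the same global start time.
            sleep(delay)
    return bytes_total
```

Because the delay is derived from the average since the start, a long pause (for example a suspended process) makes 'time_elapsed' large and the computed delay zero until the average catches up, which matches the trade-off described in the message.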
Mike Fährmann
550f90ab56 delay enabling .part files when 'http-metadata' is set
otherwise 'build_path' gets called before all metadata is collected
2022-11-09 13:23:52 +01:00
Mike Fährmann
8124c16a50 split 'build_path' from 'set_filename' and 'set_extension'
Do not automatically build a new path
when setting file metadata or updating its extension.
2022-11-08 17:03:24 +01:00
Mike Fährmann
39d9c362e4 include 'http-metadata' in '-K' output 2022-11-07 16:33:26 +01:00
Mike Fährmann
870e6a48a0 implement 'http-metadata' option
or at least attempt to.
2022-11-05 18:29:29 +01:00
Mike Fährmann
bca9f965e5 [downloader:http] add 'chunk-size' option (#3143)
and double the previous default from 16384 (2**14) to 32768 (2**15)
2022-11-02 16:50:26 +01:00
Mike Fährmann
0059e2bfe7 [downloader:http] add MIME type and signature for .avif files 2022-11-01 17:25:21 +01:00
Mike Fährmann
f687e64513 [downloader:http] refactor file signature checks
use functions/lambdas instead of startswith()
2022-11-01 17:09:13 +01:00
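
Note on f687e64513: a small sketch of signature checks written as functions instead of plain startswith() prefixes. The table and the helper `detect_extension` are hypothetical; the byte patterns are the well-known magic numbers for these formats.

```python
# Each check receives the first bytes of a file and returns True on a match.
SIGNATURE_CHECKS = {
    "jpg": lambda s: s[0:3] == b"\xFF\xD8\xFF",
    "png": lambda s: s[0:8] == b"\x89PNG\r\n\x1A\n",
    "gif": lambda s: s[0:6] in (b"GIF87a", b"GIF89a"),
    # RIFF containers share their first four bytes; the form type at
    # offset 8 tells them apart, which a single prefix test cannot express.
    "webp": lambda s: s[0:4] == b"RIFF" and s[8:12] == b"WEBP",
    "wav": lambda s: s[0:4] == b"RIFF" and s[8:12] == b"WAVE",
}


def detect_extension(header):
    """Return the first extension whose check accepts 'header', else None."""
    for ext, check in SIGNATURE_CHECKS.items():
        if check(header):
            return ext
    return None
```

A callable per format allows checks at arbitrary offsets or against several alternatives, at the cost of a slightly less declarative table.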
Mike Fährmann
1aae9f2b71 [downloader:ytdl] update _set_outtmpl() (fixes #2692)
bf1824b391
2022-06-20 11:32:02 +02:00
Mike Fährmann
c0c1277c5f [downloader:http] support sending POST data (#2433)
by setting the '_http_data' metadata field for a file

needed in addition to be3492776b
to download files with POST requests
2022-03-23 21:48:38 +01:00
Mike Fährmann
be3492776b [downloader:http] support using a different method than GET (#2433)
by setting the '_http_method' metadata field for a file
2022-03-20 10:09:05 +01:00
Mike Fährmann
47cf05c4ab refactor proxy handling code (#2357)
- allow gallery-dl proxy settings to overwrite environment proxies
- allow specifying different proxies for data extraction and download
  - add 'downloader.proxy' option
  - '-o extractor.proxy=PROXY_URL -o downloader.proxy=null'
    now has the same effect as youtube-dl's '--geo-verification-proxy'
2022-03-10 23:55:35 +01:00
Mike Fährmann
c0fddcefc5 [downloader:ytdl] make ImportErrors non-fatal (#2273) 2022-02-08 19:30:29 +01:00
Mike Fährmann
ebd3d5c1cc [bunkr] fix .mp4 downloads (closes #2239) 2022-01-28 23:21:16 +01:00
Mike Fährmann
f4e3cee6ac use yt-dlp by default (#1850, #2028) 2021-11-29 18:24:26 +01:00
Mike Fährmann
19403a7fff [downloader:ytdl] prevent crash in '_progress_hook()' (#1680)
'speed' is not guaranteed to be defined or convertible to 'int'
2021-11-12 18:54:04 +01:00
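
Note on 19403a7fff: a defensive sketch for reading 'speed' from a progress dictionary, assuming the hook receives a dict in which the key may be missing, None, or otherwise not convertible. The helper name `progress_speed` is hypothetical.

```python
def progress_speed(info):
    """Return info['speed'] as an int, or 0 if it is missing or unusable."""
    speed = info.get("speed")
    try:
        return int(speed)
    except (TypeError, ValueError, OverflowError):
        # TypeError: None or another non-numeric object
        # ValueError: a non-numeric string or NaN
        # OverflowError: an infinite float
        return 0
```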
Mike Fährmann
efa178cc91 [ytdl] implement parsing ytdl command-line options (#1680)
- adds 'config-file' and 'cmdline-args' options
  for both ytdl downloader and extractor
- create 'ytdl' helper module, which combines YoutubeDL creation
  and option parsing.
- most likely a buggy mess due to incompatibilities between the
  original youtube-dl and yt-dlp.
2021-11-07 02:44:11 +01:00
Mike Fährmann
232ab626a7 [downloader:ytdl] prevent crash in '_progress_hook()'
https://github.com/mikf/gallery-dl/discussions/1964#discussioncomment-1516702
2021-10-21 22:57:04 +02:00
Mike Fährmann
d0761454b1 implement a download progress indicator (#1519) 2021-09-28 22:48:58 +02:00
Mike Fährmann
b5b1cf22b7 [downloader:http] reorder HTTP header sources
so that any header can be overwritten by a user, except Range
2021-08-05 23:01:54 +02:00
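
Note on b5b1cf22b7: one way such an ordering could look, sketched with plain dictionaries. The function `build_headers` and its parameters are assumptions for illustration; the point is only that later sources overwrite earlier ones and Range is set last.

```python
def build_headers(defaults, file_headers=None, user_headers=None, offset=0):
    """Merge header sources so user settings win, except for Range."""
    headers = dict(defaults)
    if file_headers:
        headers.update(file_headers)     # per-file headers
    if user_headers:
        headers.update(user_headers)     # user configuration overrides both
    if offset:
        # Applied last, so a resumed download always requests the
        # correct byte range regardless of user-supplied headers.
        headers["Range"] = "bytes={}-".format(offset)
    return headers
```

Plain dict keys are case-sensitive, so this sketch treats "range" and "Range" as different headers; a real implementation may need to normalize names.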
Mike Fährmann
f5b097165e [ytdl] transfer YoutubeDL objects to downloader (#1680)
allows specifying downloader-specific options per subcategory
but overwrites all downloader.ytdl settings
2021-07-16 15:40:54 +02:00
Mike Fährmann
fc19010808 [downloader:ytdl] fix 'outtmpl' setting for yt_dlp (#1680)
yt_dlp supports multiple outtmpl settings for different file types and
uses its 'outtmpl_dict' for that.
2021-07-16 15:05:16 +02:00
Mike Fährmann
e622e004f0 [ytdl] improve module imports (#1680)
Apply 'extractor.ytdl.module' for every URL, not just the first.
2021-07-14 03:08:00 +02:00
Mike Fährmann
36ac2197db [ytdl] add extractor for sites supported by youtube-dl
(#1680, #878)

Can be used by prefixing any URL with 'ytdl:',
or by setting 'extractor.ytdl.enabled' to 'true'.
2021-07-10 20:55:47 +02:00
Mike Fährmann
221015e586 [downloader:http] disable filename extension changes for ugoira
(#1507)
2021-04-27 01:29:09 +02:00
Mike Fährmann
1a38fae785 add option to use different youtube-dl modules (fixes #1330)
by setting the 'downloader.ytdl.module' value. For example

{
    "downloader": {
        "ytdl": {
            "module": "yt_dlp"
        }
    }
}

or '-o module=yt_dlp'
2021-03-01 03:10:42 +01:00
Mike Fährmann
8821dceb79 use __import__() to dynamically load modules 2021-03-01 01:27:02 +01:00
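
Note on 8821dceb79: a short sketch of loading a module by name with the built-in __import__(). The helper `import_first` and the idea of trying several candidate names are assumptions, not a description of the actual code.

```python
def import_first(names):
    """Return the first module in 'names' that can be imported."""
    for name in names:
        try:
            # For a dotted name such as "a.b", __import__() returns the
            # top-level package "a"; plain names return the module itself.
            return __import__(name)
        except ImportError:
            continue
    raise ImportError(
        "none of these modules could be imported: " + ", ".join(names))
```

With this helper, a call such as import_first(["yt_dlp", "youtube_dl"]) would return whichever of the two is installed, trying them in that order.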
Mike Fährmann
cf5fa75d4c add 'browser' option (#1117)
- change default user agent to Firefox ESR 78 on Windows 10
- remove 'ciphers' option
2021-02-26 13:41:27 +01:00
Mike Fährmann
560277394e [downloader:http] add 'headers' option (#1322) 2021-02-21 19:13:39 +01:00
Mike Fährmann
a228bb3a5f [downloader:http] support callbacks to validate responses 2021-01-29 22:15:21 +01:00
Mike Fährmann
0594821fcd [downloader:http] add MIME type and signature for .ico files
(closes #1211)
2021-01-01 16:07:33 +01:00
Mike Fährmann
476d563ec2 [downloader:http] add MIME type and signature for .swf files 2020-12-11 14:21:04 +01:00
Mike Fährmann
fe0265c7a5 [downloader:http] small improvements to file signature list
- specify multiple entries for gif, mp3, zip
- add entries for pdf
2020-12-08 21:20:18 +01:00
Mike Fährmann
1a4b61f7eb [downloader:http] fix issues with chunked transfer encoding
(fixes #1144)
2020-11-30 01:10:45 +01:00
Mike Fährmann
536c088462 [downloader:http] improve 'adjust-extensions' (#776)
Check file headers against a list of file signatures before
downloading the whole file and writing it to disk.

The file signature check needs some improvements (*),
but it produces usable results for the most part.

(*)
- 'webp', 'wav', and others start with 'RIFF'
- 'svg' uses the same "signature" as all XML documents
- 'webm' has the same signature as 'mkv' files
- only 'mp3' files in an ID3v2 container get recognized
2020-11-29 20:55:35 +01:00
Mike Fährmann
f6fd449b59 reduce wait time growth rate from exponential to linear
Waiting for 2**N seconds after each error grows too fast.
Simply waiting N seconds seems far more reasonable.
2020-09-06 22:38:25 +02:00
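
Note on f6fd449b59: the two growth rates side by side, as a minimal sketch. The function names are hypothetical, and 'tries' stands for N in the message above.

```python
def wait_exponential(tries):
    """Previous behaviour: wait 2**N seconds after the N-th error."""
    return 2 ** tries


def wait_linear(tries):
    """New behaviour: wait N seconds after the N-th error."""
    return tries


def total_wait(wait, max_tries):
    """Total seconds spent waiting across errors 1..max_tries."""
    return sum(wait(n) for n in range(1, max_tries + 1))
```

Over ten consecutive errors the exponential scheme adds up to 2046 seconds of waiting, the linear one to 55.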
Mike Fährmann
ac3036ef56 add 'filesize-min' and 'filesize-max' options (closes #780) 2020-09-03 18:21:04 +02:00
Mike Fährmann
34929f673f re-add 'session' to base downloader class (fixes #768) 2020-05-20 20:04:46 +02:00
Mike Fährmann
ece73b5b2a make 'path' and 'keywords' available in logging messages
Wrap all loggers used by job, extractor, downloader, and postprocessor
objects into a (custom) LoggerAdapter that provides access to the
underlying job, extractor, pathfmt, and kwdict objects and their
properties.

__init__() signatures for all downloader and postprocessor classes have
been changed to take the current Job object as their first argument,
instead of the current extractor or pathfmt.

(#574, #575)
2020-05-18 19:04:51 +02:00
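
Note on ece73b5b2a: a minimal sketch of the general idea using the standard library's logging.LoggerAdapter, not the custom adapter the commit introduces. The helper `make_job_logger`, the 'job' key, and the attributes 'category' and 'path' are placeholders chosen for illustration.

```python
import io
import logging
from types import SimpleNamespace


def make_job_logger(name, job, stream):
    """Return a logger whose records carry 'job' for use in format strings."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    # With style="{" the format string may access attributes of 'job'.
    handler.setFormatter(logging.Formatter(
        "[{job.category}] {job.path}: {message}", style="{"))
    logger.addHandler(handler)
    # LoggerAdapter passes its dict as 'extra' with every logging call,
    # which makes 'job' an attribute of each LogRecord.
    return logging.LoggerAdapter(logger, {"job": job})


stream = io.StringIO()
job = SimpleNamespace(category="example", path="/tmp/file.jpg")
log = make_job_logger("sketch.downloader", job, stream)
log.info("download complete")
```

Passing the whole job object, rather than individual strings, means the format string always reads the object's current attribute values when a record is formatted.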
Mike Fährmann
f8661c6578 [downloader:ytdl] fix file extensions when merging into mkv 2020-05-13 22:35:33 +02:00