Commit Graph

97 Commits

Author SHA1 Message Date
ClosedPort22
5448268d5c [downloader:http] close connection when file already exists (#3748) 2023-08-08 23:35:43 +08:00
Mike Fährmann
c182094ebf merge #3748: [downloader:http] add 'consume-content' option 2023-04-26 23:03:18 +02:00
ClosedPort22
6f4a843fba [downloader:http] release connection before logging messages
This allows connections to be properly released when using 'actions'
feature.
2023-04-24 23:59:36 +08:00
Mike Fährmann
2edcdee32f [downloader:http] add MIME type and signature for .heic files
(#3915)
https://github.com/strukturag/libheif/issues/83
2023-04-15 17:09:22 +02:00
ClosedPort22
775d2ac999 [downloader:http] improve error logging when releasing connection 2023-03-31 20:08:38 +08:00
ClosedPort22
1a977f0f62 [downloader:http] handle exceptions in 'validate'
This isn't strictly necessary for 'exhentai.py', but it improves
efficiency when the adapter is reused
2023-03-23 19:57:13 +08:00
ClosedPort22
fcaeaf539c [downloader:http] handle exceptions while consuming content 2023-03-11 21:36:37 +08:00
Mike Fährmann
67ec91cdbd [downloader:http] change '_http_retry' to accept a Python function
and rename '_http_retry_codes' to '_http_retry'

(#3569)
2023-03-09 23:30:15 +01:00
ClosedPort22
df77271438 [downloader:http] add 'consume-content' option
* fix connection not being released when the response is neither
  successful nor retried
* add the ability to consume the HTTP response body instead of closing
  the connection

reference:

https://docs.python-requests.org/en/latest/user/advanced/#body-content-workflow
2023-03-09 21:07:10 +08:00
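
The two ways of releasing a streamed connection named in this commit (consume the body, or close the connection) can be sketched as below. This is a minimal illustration, not the project's actual code: `release_connection` and its parameters are hypothetical, and the response object is only assumed to offer `iter_content()` and `close()`, as `requests.Response` does.

```python
def release_connection(response, consume_content=False, chunk_size=32768):
    """Release the connection behind a streamed response (hypothetical helper)."""
    if consume_content:
        try:
            # Reading the body to the end lets the connection
            # return to the pool so it can be reused.
            for _ in response.iter_content(chunk_size):
                pass
            return "consumed"
        except Exception:
            # Assumption: if the body cannot be read, fall back to closing.
            pass
    # Closing discards the connection instead of reusing it.
    response.close()
    return "closed"
```

Consuming trades extra transfer for a reusable connection, which tends to pay off only when the unwanted response body is small.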
Mike Fährmann
d16873941c [downloader:http] use 'time.monotonic()' 2023-01-31 15:32:12 +01:00
Mike Fährmann
ec9ff7640d merge #3535: [downloader:http] add signature checks for .blend, .obj, and .clip files 2023-01-16 15:09:10 +01:00
ClosedPort22
b6706b373a [downloader:http] add signature checks for some formats
also add the MIME type for .obj files
2023-01-15 23:40:55 +08:00
Mike Fährmann
c881548a27 add 'extractor.retry-codes' option (#3313)
do not retry 429 and 430 by default
2023-01-14 17:25:30 +01:00
Mike Fährmann
c0d7d2be35 [downloader:http] add 'validate' option 2023-01-11 15:37:40 +01:00
Mike Fährmann
80102fa367 [downloader:http] add 'retry-codes' option (#3313) 2022-12-01 11:08:23 +01:00
Mike Fährmann
b4253f69c9 [downloader:http] fix ZeroDivisionError (#3328)
ensure 'time_elapsed' only gets used as a divisor
when it is greater than zero
2022-11-30 21:56:18 +01:00
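
The guard described in this commit might look roughly like the sketch below. The function name `average_speed` and the choice to return 0.0 when no time has elapsed are assumptions made for illustration.

```python
def average_speed(bytes_downloaded, time_elapsed):
    """Return bytes per second, or 0.0 if no measurable time has passed."""
    # Only divide when the elapsed time is strictly positive; a coarse
    # clock can report 0 for a very fast download.
    if time_elapsed > 0:
        return bytes_downloaded / time_elapsed
    return 0.0
```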
Mike Fährmann
f87cfa5f66 [downloader:http] add signature check for .mp4 files 2022-11-16 21:45:26 +01:00
Mike Fährmann
a4ff20cf16 [downloader:http] fix issues from inaccurate 'time.sleep()'
(#3143)

Reverts part of c59b98c8 by going back to using a global timer
instead of a per-chunk one.

Reintroduces the issue of ignoring rate limits after
suspending and resuming the process.
2022-11-10 13:24:02 +01:00
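
A rough sketch of rate limiting with a global timer, as opposed to a per-chunk one. The helper name `throttle_delay` and its parameters are hypothetical. The idea is to compare the total bytes received so far with the total time since the download started, so that an oversleep on one chunk is compensated on the next.

```python
def throttle_delay(bytes_downloaded, rate, time_elapsed):
    """Return how long to sleep so the overall average stays at 'rate'.

    bytes_downloaded: total bytes received since the download started
    rate:             target speed in bytes per second
    time_elapsed:     seconds since the download started
    """
    # Time the download should have taken so far at the target rate.
    time_expected = bytes_downloaded / rate
    # Sleep only if we are ahead of schedule.
    return max(0.0, time_expected - time_elapsed)
```

With this approach, a long pause (for example, suspending the process) makes `time_elapsed` large, so the delay stays at zero until the average catches up. That matches the reintroduced issue the commit message mentions.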
Mike Fährmann
550f90ab56 delay enabling .part files when 'http-metadata' is set
otherwise 'build_path' gets called before all metadata is collected
2022-11-09 13:23:52 +01:00
Mike Fährmann
8124c16a50 split 'build_path' from 'set_filename' and 'set_extension'
Do not automatically build a new path
when setting file metadata or updating its extension.
2022-11-08 17:03:24 +01:00
Mike Fährmann
39d9c362e4 include 'http-metadata' in '-K' output 2022-11-07 16:33:26 +01:00
Mike Fährmann
870e6a48a0 implement 'http-metadata' option
or at least attempt to.
2022-11-05 18:29:29 +01:00
Mike Fährmann
bca9f965e5 [downloader:http] add 'chunk-size' option (#3143)
and double the previous default from 16384 (2**14) to 32768 (2**15)
2022-11-02 16:50:26 +01:00
Mike Fährmann
0059e2bfe7 [downloader:http] add MIME type and signature for .avif files 2022-11-01 17:25:21 +01:00
Mike Fährmann
f687e64513 [downloader:http] refactor file signature checks
use functions/lambdas instead of startswith()
2022-11-01 17:09:13 +01:00
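
To illustrate why functions can be more expressive than a plain prefix comparison, here is a small sketch. The table `SIGNATURE_CHECKS` and the function `detect_extension` are hypothetical names; the byte patterns are the well-known magic numbers of each format. A predicate can inspect bytes at any offset, which a single `startswith()` call cannot.

```python
# Each entry maps an extension to a predicate over the first bytes of a file.
SIGNATURE_CHECKS = {
    "png": lambda s: s[0:8] == b"\x89PNG\r\n\x1a\n",
    "jpg": lambda s: s[0:3] == b"\xff\xd8\xff",
    # Both formats are RIFF containers; the type tag sits at bytes 8-11.
    "webp": lambda s: s[0:4] == b"RIFF" and s[8:12] == b"WEBP",
    "wav": lambda s: s[0:4] == b"RIFF" and s[8:12] == b"WAVE",
}


def detect_extension(header):
    """Return the first extension whose check accepts 'header', else None."""
    for ext, check in SIGNATURE_CHECKS.items():
        if check(header):
            return ext
    return None
```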
Mike Fährmann
c0c1277c5f [downloader:http] support sending POST data (#2433)
by setting the '_http_data' metadata field for a file

needed in addition to be3492776b
to download files with POST requests
2022-03-23 21:48:38 +01:00
Mike Fährmann
be3492776b [downloader:http] support using a different method than GET (#2433)
by setting the '_http_method' metadata field for a file
2022-03-20 10:09:05 +01:00
Mike Fährmann
47cf05c4ab refactor proxy handling code (#2357)
- allow gallery-dl proxy settings to overwrite environment proxies
- allow specifying different proxies for data extraction and download
  - add 'downloader.proxy' option
  - '-o extractor.proxy=PROXY_URL -o downloader.proxy=null'
    now has the same effect as youtube-dl's '--geo-verification-proxy'
2022-03-10 23:55:35 +01:00
Mike Fährmann
ebd3d5c1cc [bunkr] fix .mp4 downloads (closes #2239) 2022-01-28 23:21:16 +01:00
Mike Fährmann
d0761454b1 implement a download progress indicator (#1519) 2021-09-28 22:48:58 +02:00
Mike Fährmann
b5b1cf22b7 [downloader:http] reorder HTTP header sources
so that any header can be overwritten by a user, except Range
2021-08-05 23:01:54 +02:00
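
One way to picture the ordering this commit describes: later sources overwrite earlier ones, and Range is applied last. This is a sketch under assumptions. The function `build_headers`, its parameters, and the `Accept` default are illustrative and not taken from the commit.

```python
def build_headers(file_headers, user_headers, resume_offset=0):
    """Merge header sources; later sources win, Range is set last."""
    headers = {"Accept": "*/*"}      # assumed built-in default
    headers.update(file_headers)     # headers attached to a single file
    headers.update(user_headers)     # user-configured headers win so far
    if resume_offset:
        # Applied after the user's headers, so it cannot be overwritten.
        headers["Range"] = "bytes={}-".format(resume_offset)
    return headers
```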
Mike Fährmann
221015e586 [downloader:http] disable filename extension changes for ugoira
(#1507)
2021-04-27 01:29:09 +02:00
Mike Fährmann
cf5fa75d4c add 'browser' option (#1117)
- change default user agent to Firefox ESR 78 on Windows 10
- remove 'ciphers' option
2021-02-26 13:41:27 +01:00
Mike Fährmann
560277394e [downloader:http] add 'headers' option (#1322) 2021-02-21 19:13:39 +01:00
Mike Fährmann
a228bb3a5f [downloader:http] support callbacks to validate responses 2021-01-29 22:15:21 +01:00
Mike Fährmann
0594821fcd [downloader:http] add MIME type and signature for .ico files
(closes #1211)
2021-01-01 16:07:33 +01:00
Mike Fährmann
476d563ec2 [downloader:http] add MIME type and signature for .swf files 2020-12-11 14:21:04 +01:00
Mike Fährmann
fe0265c7a5 [downloader.http] small improvements to file signature list
- specify multiple entries for gif, mp3, zip
- add entries for pdf
2020-12-08 21:20:18 +01:00
Mike Fährmann
1a4b61f7eb [downloader:http] fix issues with chunked transfer encoding
(fixes #1144)
2020-11-30 01:10:45 +01:00
Mike Fährmann
536c088462 [downloader:http] improve 'adjust-extensions' (#776)
Check file headers against a list of file signatures before
downloading the whole file and writing it to disk.

The file signature check needs some improvements (*),
but it produces usable results for the most part.

(*)
- 'webp', 'wav', and others start with 'RIFF'
- 'svg' uses the same "signature" as all XML documents
- 'webm' has the same signature as 'mkv' files
- only 'mp3' files in an ID3v2 container get recognized
2020-11-29 20:55:35 +01:00
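
The prefix-based check and the limitations listed in this commit can be illustrated with a short sketch. The table `SIGNATURES` and the function `matching_extensions` are hypothetical; the byte prefixes are the standard magic numbers for each format.

```python
# (extension, prefix) pairs compared against the first bytes of a response.
SIGNATURES = (
    ("jpg", b"\xff\xd8\xff"),
    ("png", b"\x89PNG\r\n\x1a\n"),
    ("gif", b"GIF8"),    # covers both GIF87a and GIF89a
    ("webp", b"RIFF"),   # ambiguous: any RIFF container matches
    ("wav", b"RIFF"),
    ("mp3", b"ID3"),     # only matches files that begin with an ID3v2 tag
)


def matching_extensions(header):
    """Return every extension whose prefix matches the start of 'header'."""
    return [ext for ext, sig in SIGNATURES if header.startswith(sig)]
```

A WAV header yields both 'webp' and 'wav' here, and an MP3 stream without an ID3v2 tag yields nothing, which mirrors the caveats above.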
Mike Fährmann
f6fd449b59 reduce wait time growth rate from exponential to linear
Waiting for 2**N seconds after each error grows too fast.
Simply waiting N seconds seems far more reasonable.
2020-09-06 22:38:25 +02:00
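
The difference between the two growth rates is easy to see in a small sketch. The function name `wait_seconds` and its `linear` flag are hypothetical.

```python
def wait_seconds(tries, linear=True):
    """Seconds to wait after the N-th consecutive error."""
    # Linear: N seconds. Exponential: 2**N seconds.
    return tries if linear else 2 ** tries


linear = [wait_seconds(n) for n in range(1, 6)]
exponential = [wait_seconds(n, linear=False) for n in range(1, 6)]
print(linear)       # [1, 2, 3, 4, 5]
print(exponential)  # [2, 4, 8, 16, 32]
```

After five errors, the linear schedule has waited 15 seconds in total, while the exponential one has waited 62.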
Mike Fährmann
ac3036ef56 add 'filesize-min' and 'filesize-max' options (closes #780) 2020-09-03 18:21:04 +02:00
Mike Fährmann
34929f673f re-add 'session' to base downloader class (fixes #768) 2020-05-20 20:04:46 +02:00
Mike Fährmann
ece73b5b2a make 'path' and 'keywords' available in logging messages
Wrap all loggers used by job, extractor, downloader, and postprocessor
objects into a (custom) LoggerAdapter that provides access to the
underlying job, extractor, pathfmt, and kwdict objects and their
properties.

__init__() signatures for all downloader and postprocessor classes have
been changed to take the current Job object as their first argument,
instead of the current extractor or pathfmt.

(#574, #575)
2020-05-18 19:04:51 +02:00
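
The wrapping idea can be sketched with the standard library's `logging.LoggerAdapter`. The commit describes a custom adapter, so this is only an approximation: the `Job` stand-in, the `JobLoggerAdapter` class, and the `demo` function are all hypothetical.

```python
import io
import logging


class Job:
    """Hypothetical stand-in for a job object."""

    def __init__(self, path, keywords):
        self.path = path
        self.keywords = keywords


class JobLoggerAdapter(logging.LoggerAdapter):
    """Expose the current job's 'path' and 'keywords' to log formatters."""

    def __init__(self, logger, job):
        super().__init__(logger, {})
        self.job = job

    def process(self, msg, kwargs):
        # Read the values at log time so they reflect the job's current state.
        kwargs["extra"] = {
            "path": self.job.path,
            "keywords": self.job.keywords,
        }
        return msg, kwargs


def demo():
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("[%(path)s] %(message)s"))
    logger = logging.getLogger("sketch.downloader")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)

    job = Job("/tmp/a.jpg", {"id": 1})
    log = JobLoggerAdapter(logger, job)
    log.info("first")
    job.path = "/tmp/b.jpg"
    log.info("second")

    logger.removeHandler(handler)
    return stream.getvalue()
```

Passing the job object to the adapter, instead of fixed values, is what lets a format string pick up whatever file is being processed at the moment a message is logged.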
Mike Fährmann
19a7afdd9b [downloader:http] add MIME types for .psd files (closes #714) 2020-04-29 23:01:42 +02:00
Mike Fährmann
38bc6430d3 [downloader:http] don't overwrite existing '_mtime' fields 2020-04-10 23:08:03 +02:00
Mike Fährmann
115fd2c6f2 "fix" incomplete MIME types (#632)
e-/exhentai's original image downloads currently send
incomplete/invalid Content-Type headers, "jpg" instead
of "image/jpg" etc, since the last update.
(https://forums.e-hentai.org/index.php?showtopic=236113)

This change prepends any Content-Type value missing a
media type specification with "image/", transforming it
into a valid MIME type.

(A global solution to a local problem, but it shouldn't
 cause any issues anywhere else)
2020-03-03 21:21:57 +01:00
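
The transformation described in this commit can be sketched as follows. The function name `normalize_content_type` is hypothetical, and stripping parameters such as `charset` is an added assumption.

```python
def normalize_content_type(value):
    """Turn an incomplete Content-Type such as 'jpg' into 'image/jpg'."""
    # Drop parameters like '; charset=utf-8' (assumption for this sketch).
    mtype = value.partition(";")[0].strip()
    # A valid MIME type has the form 'type/subtype'; if the type part is
    # missing, assume the file is an image.
    if "/" not in mtype:
        mtype = "image/" + mtype
    return mtype
```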
Mike Fährmann
adcd7cb24a [downloader:http] add another MIME type for '.rar' files (#628) 2020-03-01 20:42:13 +01:00
Mike Fährmann
380b693fad [downloader:http] add more MIME types for '.bmp' files (#621) 2020-02-23 16:51:04 +01:00
Mike Fährmann
760b9b4db4 add remove_file() and remove_directory() helpers
these functions call os.unlink() or os.rmdir()
while catching and suppressing potential OSErrors
2020-01-18 00:21:26 +01:00
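
Following the description in this commit, the two helpers might look roughly like this. The function names come from the commit message; the bodies are a sketch based on that description.

```python
import os


def remove_file(path):
    """Delete a file; ignore errors such as the file not existing."""
    try:
        os.unlink(path)
    except OSError:
        pass


def remove_directory(path):
    """Delete an empty directory; ignore errors such as it not being empty."""
    try:
        os.rmdir(path)
    except OSError:
        pass
```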