61 Commits

Author SHA1 Message Date
Mike Fährmann
dd358b4564 improve cookie handling during logins 2019-01-30 17:09:32 +01:00
Mike Fährmann
06cbf5f9c4 implement 'chapter-reverse' option (#149)
Setting it to `true` will start with the latest chapter instead of the
first one.
2019-01-07 18:22:33 +01:00
Mike Fährmann
9a98b6769d use extractor.request for API calls (#130)
... at least for OAuth1.0 based APIs (flickr, smugmug, tumblr)
2018-12-04 21:29:06 +01:00
Mike Fährmann
b828473aa3 retry HTTP requests for more exception classes 2018-11-19 15:49:13 +01:00
Mike Fährmann
c47482b110 smaller changes, missing docs, etc.
- make 'netrc' extractor-specific
- rename 'downloader.enable' to 'enabled'
- document 'downloader.ytdl.format'
- consistent newlines in configuration.rst
2018-11-16 18:18:07 +01:00
Mike Fährmann
2fa28a2609 update default user-agent string (closes #122) 2018-11-11 10:07:10 +01:00
Mike Fährmann
c9861ca812 adjust message for status_code based exceptions
from: 5xx HTTP Error: Reason
to  : 5xx: Reason

The "HTTP Error" part was in there to emulate Requests' error messages
from response.raise_for_status(), but it reads a lot better without.
2018-10-18 15:09:49 +02:00
Mike Fährmann
4a348990f4 adjust value resolution for retries/timeout/verify options
This change introduces 'extractor.*.retries/timeout/verify' options
as a general way to set these values for all HTTP requests.

'downloader.http.retries/timeout/verify' is a way to override these
options for file downloads only and will fall back to 'extractor.*.…'
values if they haven't been explicitly set.

Also: downloader classes now take an extractor object as first argument
instead of a requests.session.
2018-10-07 21:13:39 +02:00
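The fallback described in 4a348990f4 can be sketched as below. This is only an illustration under stated assumptions, not the project's code: the nested-dict layout of `CONFIG` and the helpers `extractor_option` and `downloader_option` are hypothetical, and only the option names (retries, timeout, verify) come from the commit message.

```python
# Hypothetical configuration layout; only the option names are from the commit message.
CONFIG = {
    "extractor": {"retries": 5, "timeout": 30.0, "verify": True},
    "downloader": {"http": {"timeout": 120.0}},
}

_MISSING = object()  # sentinel, so that explicit False/None values still count as "set"


def extractor_option(config, name, default=None):
    """Look up a general option that applies to all HTTP requests."""
    return config.get("extractor", {}).get(name, default)


def downloader_option(config, name, default=None):
    """Look up a download-only override, falling back to the extractor value."""
    value = config.get("downloader", {}).get("http", {}).get(name, _MISSING)
    if value is _MISSING:
        return extractor_option(config, name, default)
    return value


print(downloader_option(CONFIG, "timeout"))  # explicitly set for downloads
print(downloader_option(CONFIG, "retries"))  # falls back to the extractor value
```

Using a sentinel instead of `None` keeps an explicitly configured `verify: false` from being mistaken for "not set".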
Mike Fährmann
f647f5d9c3 use 'verify' option for regular HTTP requests 2018-10-06 16:38:43 +02:00
Mike Fährmann
68d6033a5d use 'retries' and 'timeout' options for regular HTTP requests 2018-08-02 16:11:54 +02:00
Mike Fährmann
017188d268 improve extractor.request()
Replace the 'fatal' parameter with 'expect', which is a list/range
of HTTP status codes >= 400 that should also be accepted.
2018-06-18 16:29:56 +02:00
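A minimal sketch of what an 'expect' parameter like the one in 017188d268 might look like. The function `check_status` and the exception `HttpError` are hypothetical names chosen for illustration; the commit message only states that 'expect' is a list or range of status codes >= 400 that should also be accepted.

```python
class HttpError(Exception):
    """Hypothetical exception raised for status codes that are not accepted."""


def check_status(status_code, expect=()):
    """Accept codes below 400 plus any code listed in `expect`.

    `expect` may be any container supporting `in`, e.g. a tuple or a range.
    """
    if status_code < 400 or status_code in expect:
        return True
    raise HttpError(f"{status_code}: unexpected status")


check_status(200)                          # always accepted
check_status(404, expect=(404,))           # accepted because it is listed
check_status(451, expect=range(400, 500))  # accepted via a range
```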
Mike Fährmann
2d17a9e07f improve extractor.request()
- better retry behavior
- exponential back-off
- removed 'allow_empty' argument
2018-04-23 18:45:59 +02:00
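The "exponential back-off" item in 2d17a9e07f refers to a general retry technique, which might be sketched as follows. This is not the project's implementation: `request_with_retries`, its parameters, and the choice of `ConnectionError` as the retried exception are assumptions made for the example.

```python
import time


def request_with_retries(send, retries=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds or the retry budget is used up.

    The wait before retry number n (counting from 0) is base_delay * 2**n,
    so the delays grow as 1, 2, 4, ... times base_delay.
    `sleep` is injectable so the function can be exercised without waiting.
    """
    attempt = 0
    while True:
        try:
            return send()
        except ConnectionError:
            if attempt >= retries:
                raise  # out of retries: let the caller see the error
            sleep(base_delay * 2 ** attempt)
            attempt += 1
```

Doubling the delay keeps early retries quick while avoiding a steady stream of requests against a server that is still failing.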
Mike Fährmann
8704d850bf add explicit proxy support (#76)
- '--proxy' as command-line argument
- 'extractor.*.proxy' as config option
2018-02-19 18:45:06 +01:00
Mike Fährmann
179bcdd349 adjust archive-ids 2018-02-13 04:50:45 +01:00
Mike Fährmann
3cec533c28 Merge branch 'archive' 2018-02-12 18:07:58 +01:00
Mike Fährmann
5b3c34aa96 use generic chapter-extractor in more modules 2018-02-07 12:36:39 +01:00
Mike Fährmann
7a412f5c32 implement generic manga-chapter extractor 2018-02-04 22:02:04 +01:00
Mike Fährmann
84a52a9256 add DownloadArchive class 2018-01-30 15:23:23 +01:00
Mike Fährmann
cc0c2cca57 [reddit] add extractor for reddit-hosted images (closes #68) 2018-01-14 18:55:42 +01:00
Mike Fährmann
e6814aebe2 add 'extractor.*.user-agent' config option 2017-11-15 14:01:33 +01:00
Mike Fährmann
baf8094868 improve Extractor.request()'s retry behavior 2017-11-13 20:37:11 +01:00
Mike Fährmann
16783e327f [common] fix UnboundLocalError in Extractor.request() 2017-10-20 18:51:06 +02:00
Mike Fährmann
9aecc67841 [common] explicitly handle HTTP status code 429 2017-10-14 21:37:59 +02:00
Mike Fährmann
b319f4bab3 smaller code and text changes 2017-10-01 18:23:40 +02:00
Mike Fährmann
26a866e7d8 implement (sub)category-transfer between extractors (#41)
ImageFap- and all Manga-Extractors will transfer their (sub)category
values to other extractors instantiated by them, which will in turn
allow those to use options set for their parents.

Example:
ImagefapGalleryExtractors will use options set under
extractor.imagefap.user, if (and only if) they have been instantiated by
an ImagefapUserExtractor; and options from extractor.imagefap.gallery
otherwise.
2017-09-26 21:05:11 +02:00
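The (sub)category transfer described in 26a866e7d8 could look roughly like the sketch below. The class names, the `parent` argument, the `config` method signature, and the `CONFIG` contents are all hypothetical; only the idea that a child extractor takes over its parent's category and subcategory for option lookup comes from the commit message.

```python
# Hypothetical option tree; the keys mirror extractor.imagefap.user / .gallery.
CONFIG = {
    "extractor": {
        "imagefap": {
            "user": {"directory": "by-user"},
            "gallery": {"directory": "by-gallery"},
        }
    }
}


class Extractor:
    category = ""
    subcategory = ""

    def __init__(self, parent=None):
        # When instantiated by another extractor, adopt its (sub)category
        # so that option lookups resolve against the parent's section.
        if parent is not None:
            self.category = parent.category
            self.subcategory = parent.subcategory

    def config(self, key, default=None):
        section = CONFIG["extractor"].get(self.category, {})
        return section.get(self.subcategory, {}).get(key, default)


class UserExtractor(Extractor):
    category, subcategory = "imagefap", "user"


class GalleryExtractor(Extractor):
    category, subcategory = "imagefap", "gallery"


standalone = GalleryExtractor()
delegated = GalleryExtractor(parent=UserExtractor())
print(standalone.config("directory"))  # looked up under extractor.imagefap.gallery
print(delegated.config("directory"))   # looked up under extractor.imagefap.user
```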
Mike Fährmann
9c138dfc1f [common] detect empty HTTP response bodies 2017-09-26 16:49:58 +02:00
Mike Fährmann
deb2e803ba simplify MangaExtractor class 2017-09-24 16:05:43 +02:00
Mike Fährmann
0dedbe759c enable '--chapter-filter'
The same filter infrastructure that can be applied to image URLs now
also works for manga chapters and other delegated URLs.

TODO: actually provide any metadata (currently only deviantart and
imagefap are supported).
2017-09-12 16:19:00 +02:00
Mike Fährmann
be30fb2f98 add common config category for boorus and foolslide 2017-08-29 22:42:48 +02:00
Mike Fährmann
915a0137de improve 'extractor.request'
- add 'fatal' argument
- improve internal logic and flow
- raise known exception on error
- update exception hierarchy
2017-08-05 16:11:46 +02:00
Mike Fährmann
7aa9fa796a code cleanup and fixes 2017-07-25 14:59:41 +02:00
Mike Fährmann
55f048d02b ignore case of cookiejar magic strings 2017-07-24 18:33:42 +02:00
Mike Fährmann
808f67ba7d use 'cookiedomain' for cookies set by object-config-values
otherwise these cookies would not be picked up by the
_check_cookies() method.
2017-07-22 15:43:35 +02:00
Mike Fährmann
0610ae5000 skip login if cookies are present 2017-07-17 10:33:36 +02:00
Mike Fährmann
726c6f01ae allow 'cookies' config option to be a dictionary 2017-07-07 18:01:46 +02:00
Mike Fährmann
a804a42e23 add '--cookies' command-line option 2017-07-03 15:02:19 +02:00
Mike Fährmann
d3b04076f7 add .netrc support (#22)
Use the '--netrc' cmdline option or set the 'netrc' config option
to 'true' to enable the use of .netrc authentication data.

The 'machine' names for the .netrc info are the lowercase extractor
names (or categories): batoto, exhentai, nijie, pixiv, seiga.
2017-06-24 12:17:26 +02:00
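As a rough illustration of the .netrc lookup described in d3b04076f7, the sketch below uses Python's standard `netrc` module with an extractor category as the 'machine' name. The helper `lookup_credentials`, the temporary file, and the example login data are made up for the demonstration; how the project itself reads the file is not shown in the commit message.

```python
import netrc
import os
import tempfile


def lookup_credentials(machine, path):
    """Return (login, password) for `machine` from the netrc file at `path`.

    Returns None if the file has no entry (and no default) for that machine.
    """
    entry = netrc.netrc(path).authenticators(machine)
    if entry is None:
        return None
    login, _account, password = entry
    return login, password


# Demonstration with a throwaway file instead of the real ~/.netrc.
with tempfile.NamedTemporaryFile("w", suffix=".netrc", delete=False) as fp:
    fp.write("machine pixiv login alice password s3cret\n")
    demo_path = fp.name
try:
    demo = lookup_credentials("pixiv", demo_path)
    missing = lookup_credentials("seiga", demo_path)
finally:
    os.unlink(demo_path)
```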
Mike Fährmann
c184e47ee3 put common directory- and filename formats in base classes 2017-05-30 12:10:16 +02:00
Mike Fährmann
f226417420 simplify code by using a MangaExtractor base class 2017-05-20 11:27:43 +02:00
Mike Fährmann
4b967fa189 implement and use extractor.config() method 2017-04-25 17:12:48 +02:00
Mike Fährmann
f782282f97 add logger objects to extractors 2017-03-07 23:50:19 +01:00
Mike Fährmann
7a9d66fbce implement basic way to tell extractors to skip ahead 2017-03-03 17:26:50 +01:00
Mike Fährmann
0b59d9f8c7 disable urllib3's InsecureConnectionWarning 2017-02-11 21:21:57 +01:00
Mike Fährmann
37d4d07d9b compatibility fixes to make a standalone exe work 2017-01-23 00:07:36 +01:00
Mike Fährmann
cc0b4f2661 [yomanga] add chapter extractor 2017-01-13 00:03:12 +01:00
Mike Fährmann
ad4b02508f trying to understand travis-ci unit test failures
- added some debug output via logging module
- unit tests work on my machine (tm)
2017-01-12 22:35:42 +01:00
Mike Fährmann
e6d26f0476 don't overwrite a response's encoding with None 2016-10-30 20:38:22 +01:00
Mike Fährmann
f0f7306db6 re-raise async exceptions in main thread 2016-07-24 22:16:59 +02:00
Mike Fährmann
000df8d1fa add 'encoding' argument for Extractor.request 2016-07-12 12:06:17 +02:00
Mike Fährmann
81dcfbec90 initial support for extractor-subcategories 2015-11-30 00:30:02 +01:00