gallery-dl

Author	SHA1	Message	Date
Mike Fährmann	8bf3cdd82b	implement logging options. Standard logging to stderr, logfiles, and unsupported URL files (which are now handled through the logging module) can now be configured by setting their respective option keys (log, logfile, unsupportedfile) to a dict and specifying the following options: - format: format string for logging messages; available keys: see [1]; default: "[{name}][{levelname}] {message}" - format-date: format string for {asctime} fields in logging messages; available keys: see [2]; default: "%Y-%m-%d %H:%M:%S" - level: the lowercase levelname until which the logger should activate; available levels are debug, info, warning, error, exception; default: "info" - path: path of the file to be written to - mode: 'mode' argument when opening the specified file; can be either "w" to truncate the file or "a" to append to it (see [3]). If 'output.log', '.logfile', or '.unsupportedfile' is a string, it will be interpreted, as before, as the filepath (or as format string for .log) [1] https://docs.python.org/3/library/logging.html#logrecord-attributes [2] https://docs.python.org/3/library/time.html#time.strftime [3] https://docs.python.org/3/library/functions.html#open	2018-05-01 17:54:52 +02:00
Mike Fährmann	95392554ee	use text.urljoin()	2018-04-26 17:00:26 +02:00
Mike Fährmann	2721417dd8	Merge branch 'master' into 1.4-dev	2018-04-24 11:33:02 +02:00
Mike Fährmann	c6d5154fc3	fix flake8 errors, ignore W504 pycodestyle 2.4.0 enforces some new style guidelines	2018-04-24 11:25:32 +02:00
Mike Fährmann	2d17a9e07f	improve extractor.request() - better retry behavior - exponential back-off - removed 'allow_empty' argument	2018-04-23 18:45:59 +02:00
Mike Fährmann	80521ae1f6	[deviantart] improve API error handling The previous implementation would retry requests with 4xx status codes in an infinite loop, which is especially a problem when querying non-existent users or groups. These are now properly handled with a NotFoundError exception.	2018-04-23 10:10:43 +02:00
Mike Fährmann	e54b43be08	[mangadex] add title info for chapter extractors	2018-04-22 16:20:04 +02:00
Mike Fährmann	f471161920	Merge branch 'master' into 1.4-dev	2018-04-21 12:15:40 +02:00
Mike Fährmann	a2020c736e	release version 1.3.4	2018-04-20 18:42:09 +02:00
Mike Fährmann	eb37fbf0e8	[hentaifoundry] improve extractor - use common base class - better pagination - respect '.../page/<num>' - implement skip() / --range support - get YII_CSRF_TOKEN from cookies	2018-04-20 18:26:23 +02:00
Mike Fährmann	80bead739d	[oauth] require custom client-* values for pinterest	2018-04-20 15:31:05 +02:00
Mike Fährmann	cc36f88586	rename safe_int to parse_int; move parse_* to text module	2018-04-20 14:53:21 +02:00
Mike Fährmann	ff643793bd	improve and document cloudflare bypass code	2018-04-19 21:32:10 +02:00
Mike Fährmann	10cc59f3b5	fix extractor names	2018-04-18 18:12:57 +02:00
Mike Fährmann	b1325d4d2c	fix extractor docstrings	2018-04-18 18:03:43 +02:00
Mike Fährmann	df7e18399e	[luscious] fix image order	2018-04-17 17:32:21 +02:00
Mike Fährmann	d10579edb5	[pinterest] improve PinterestAPI code; remove OAuth mentions on another note: access_tokens have been set to only allow for 10 requests per hour (from 200 yesterday)	2018-04-17 17:12:42 +02:00
Mike Fährmann	4bd182c107	[pinterest] implement `oauth:pinterest` (#83) Pinterest access tokens are rate limited at 200 requests per hour (or maybe per 2 or 3 hours?) so having just one access token for all users isn't going to work in the long run.	2018-04-16 20:03:28 +02:00
Mike Fährmann	9651f3fce0	[pinterest] improve error messages (#83)	2018-04-16 19:36:54 +02:00
Mike Fährmann	dbe250f7e5	[pinterest] update access_token (#83)	2018-04-16 09:46:45 +02:00
Mike Fährmann	dd49127408	[spectrumnexus] remove module Site stopped hosting manga scans (http://view.thespectrum.net/)	2018-04-16 09:45:07 +02:00
Mike Fährmann	5c487300ee	improve 'parse_query()' and add tests - another irrelevant micro-optimization ! - use urllib.parse.parse_qsl directly instead of parse_qs, which just packs the results of parse_qsl in a different data structure - reduced memory requirements since no additional dict and lists are created	2018-04-15 19:05:29 +02:00
Mike Fährmann	728c64a3fb	[tumblr] rename 'offset' to 'num' and adjust formats Trying to somehow emulate Tumblr filenames is a bad idea ...	2018-04-15 18:58:32 +02:00
Mike Fährmann	4ffa94f634	remove 'shorten_path()' and 'shorten_filename()'	2018-04-15 18:44:13 +02:00
Mike Fährmann	27eab4e467	rewrite text tests and improve functions - test more edge cases - consistently return an empty string for invalid arguments - remove the ungreedy-flag in 'remove_html()'	2018-04-15 18:13:46 +02:00
Mike Fährmann	e3f2bd4087	add tests for 'text.clean_xml()' and improve it	2018-04-14 22:07:01 +02:00
Mike Fährmann	6d8b191ea7	improve 'parse_query()' and add tests - another irrelevant micro-optimization ! - use urllib.parse.parse_qsl directly instead of parse_qs, which just packs the results of parse_qsl in a different data structure - reduced memory requirements since no additional dict and lists are created	2018-04-13 19:21:32 +02:00
Mike Fährmann	51ea699083	add 'abort()' as function to filter expressions calling 'abort()' in a filter aborts the current extractor run in a cleaner way than using something like 1/0, which causes an error message to be printed	2018-04-12 17:07:12 +02:00
Mike Fährmann	6bd857a319	[tumblr] handle rate limits / 429 errors - wait for the hourly limit to reset - abort upon exceeding the daily limit (it doesn't seem useful to potentially wait for several hours)	2018-04-12 16:25:20 +02:00
Mike Fährmann	7073ab7707	[komikcast] update regex to only match manga pages The 'readerarea' section now includes some (shady) external Javascript file, which got matched as well.	2018-04-11 15:48:17 +02:00
Mike Fährmann	a1fa4b43b0	Revert "[tumblr] add option to sort photosets by upload order" This reverts commit `4a26ae32df`.	2018-04-09 16:08:08 +02:00
Mike Fährmann	48a83a89e9	[loveisover] remove module archive.loveisover.me was shut down on 2018-03-29; https://www.archiveteam.org/index.php?title=4chan#archive.loveisover.me	2018-04-09 16:05:15 +02:00
Mike Fährmann	564e12ca8f	replace 'imgyt' with 'imxto' https://img.yt/ wasn't available for a couple of days, but has now re-emerged as https://imx.to/ with a new web-interface. Links to older images still work (see tests).	2018-04-09 15:53:20 +02:00
Mike Fährmann	1b80fa82a9	[imgur] update URL pattern and tests	2018-04-08 21:06:21 +02:00
Mike Fährmann	4a26ae32df	[tumblr] add option to sort photosets by upload order	2018-04-07 15:57:55 +02:00
Mike Fährmann	6b72be8ee6	[tumblr] add 'hash' keyword 'hash' is the middle part of the filename in a tumblr image URL. For example an image with '.../tumblr_p6tgemp1NZ1wgha4yo1_250.png' as its URL would have 'p6tgemp1NZ1wgha4yo1' as hash.	2018-04-07 15:54:30 +02:00
Mike Fährmann	ffc0c67701	release version 1.3.3	2018-04-06 15:45:45 +02:00
Mike Fährmann	d11fcf4804	smaller changes and fixes - fix the cloudflare challenge result if the last decimal places are zero (JS's toFixed() removes trailing zeroes) - fix downloading of kissmanga chapter-pages hosted on blogspot (accessing blogspot with "kissmanga.com" as referrer yields a 401) - disable certificate validation for 'mangahere' tests - update flickr test result	2018-04-06 15:30:09 +02:00
Mike Fährmann	f6c95dccf9	[cloudflare] fix bypass procedure Cloudflare challenges, at least for kissmanga and readcomiconline, now use slightly different Javascript expressions. Instead of a single value per expression, they now have a numerator and a denominator of a fractional value, which in the end gets truncated to 10 decimal places.	2018-04-05 20:28:04 +02:00
Mike Fährmann	759ba26fb0	[luscious] proper image order for picture albums ... and (try) to start with the first image instead of somewhere in the middle of an album.	2018-04-05 18:12:01 +02:00
Mike Fährmann	68e9fbee16	[tumblr] check all 4 keys/secrets before using OAuth it was possible to cause a crash by setting api-key or -secret to null. (this commit also slightly improves the blog-cache implementation)	2018-04-05 15:42:23 +02:00
Mike Fährmann	4810d446bb	remove the obsolete safeprint() and error() functions - safeprint() was used to print values which might have caused a UnicodeEncodeError, but that is no longer necessary (`0381ae5`) - errors are now handled via logging output (`f94e370`)	2018-04-05 13:10:33 +02:00
Mike Fährmann	0381ae5318	replace error handlers for stdout and co. Python3.5 and lower throw a UnicodeEncodeError when trying to print non-encodable characters when not using 'utf-8' as encoding. Setting their error handlers to 'replace' should help.	2018-04-04 17:30:42 +02:00
Mike Fährmann	f8168c693e	[tumblr] avoid calls to '/blog/.../info' The same information returned by the 'blog/.../info' API endpoint is also included in the result of every 'blog/.../posts' call.	2018-04-04 14:15:24 +02:00
Mike Fährmann	64d7c85b55	[exhentai] improve metadata - add 'width', 'height' and 'size' (in bytes) for each image - change the former 'size' and 'size_units' into 'gallery_size'	2018-04-03 18:59:53 +02:00
Mike Fährmann	64b22e0fc1	[pawoo] update URL pattern adds support for 'https://pawoo.net/@.../media'	2018-04-02 13:00:59 +02:00
Mike Fährmann	7b562907c3	[nijie] add favorites extractor adds support for 'https://nijie.info/user_like_illust_view.php?id=...'	2018-03-31 18:54:25 +02:00
Mike Fährmann	445db75955	[nijie] improve extraction and metadata - add 'title' and 'description' - split 'artist_id' into 'user_id' and 'artist_id' - 'user_id' is the ID of the user from which the image entry originates - 'artist_id' is the ID of the actual image artist - improve pagination and URL patterns	2018-03-31 18:48:41 +02:00
Mike Fährmann	a112e3f2a0	[nijie] add doujin extractor adds support for "https://nijie.info/members_dojin.php?id=<artist_id>"	2018-03-31 18:17:41 +02:00
Mike Fährmann	f39153b6e9	[nhentai] add extractor for search results	2018-03-28 17:21:44 +02:00
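
The logging options described in commit 8bf3cdd82b map fairly directly onto Python's standard `logging` module. The following is a minimal sketch of that mapping, not the project's actual code: the helper name `build_file_handler`, the `DEFAULTS` and `LEVELS` tables, and the default mode of "w" are assumptions made for illustration. The "exception" level is left out because `logging` has no such level constant and the commit message does not say how it is mapped.

```python
import logging

# Defaults for format, format-date and level are quoted from the commit
# message; the default for "mode" is an assumption of this sketch.
DEFAULTS = {
    "format": "[{name}][{levelname}] {message}",
    "format-date": "%Y-%m-%d %H:%M:%S",
    "level": "info",
    "mode": "w",
}

# Lowercase level names as listed in the commit message ("exception" omitted).
LEVELS = {
    "debug": logging.DEBUG,
    "info": logging.INFO,
    "warning": logging.WARNING,
    "error": logging.ERROR,
}


def build_file_handler(options):
    """Hypothetical helper: build a FileHandler from an options dict.

    A plain string is treated as the file path, mirroring the
    backwards-compatible behavior the commit message describes.
    """
    if isinstance(options, str):
        options = {"path": options}
    opts = {**DEFAULTS, **options}

    handler = logging.FileHandler(
        opts["path"], mode=opts["mode"], encoding="utf-8")
    handler.setLevel(LEVELS[opts["level"]])
    # style="{" enables str.format-style fields such as {levelname}
    handler.setFormatter(
        logging.Formatter(opts["format"], opts["format-date"], style="{"))
    return handler


if __name__ == "__main__":
    logger = logging.getLogger("example")
    logger.setLevel(logging.DEBUG)
    logger.addHandler(build_file_handler(
        {"path": "example.log", "level": "debug", "mode": "a"}))
    logger.debug("written to example.log")
```

Passing `"a"` as the mode appends to an existing file, while `"w"` truncates it each time the handler is created.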


1155 Commits
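
The `parse_query()` change in commits 5c487300ee and 6d8b191ea7 switches from `urllib.parse.parse_qs` to `urllib.parse.parse_qsl`. A rough sketch of that idea follows; the function body is an illustration under assumptions, not the project's code. In particular, returning an empty dict for non-string input and keeping only the last value of a repeated key are assumptions of this sketch.

```python
from urllib.parse import parse_qsl


def parse_query(qs):
    """Parse a query string into a flat dict of key-value pairs.

    parse_qsl() yields (key, value) tuples, so dict() can consume them
    directly without the intermediate dict of lists that parse_qs() builds.
    """
    if not isinstance(qs, str):
        return {}  # assumption: invalid input yields an empty dict
    return dict(parse_qsl(qs))


if __name__ == "__main__":
    print(parse_query("id=123&page=2"))  # {'id': '123', 'page': '2'}
```

Because `dict()` keeps the last pair for a repeated key, `parse_query("a=1&a=2")` gives `{"a": "2"}` in this sketch, whereas `parse_qs` would collect both values in a list.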