Commit Graph

939 Commits

Author SHA1 Message Date
Mike Fährmann
a718c6c6cd implement 'util.parse_bytes()' 2017-12-02 01:24:49 +01:00
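
A byte-size parser of this kind might look like the sketch below. It is not taken from the repository: the signature, the accepted suffixes, and the fallback value are all assumptions made for illustration.

```python
def parse_bytes(value, default=0, suffixes="bkmgtp"):
    """Convert a string such as '500k' or '1.5M' to a number of bytes.

    Illustrative sketch only; the suffix set and the fallback behaviour
    are assumptions, not the project's actual implementation.
    """
    try:
        last = value[-1].lower()
    except (TypeError, IndexError):
        return default  # None, an empty string, or a non-string input

    if last in suffixes:
        # 'b' -> 1024**0, 'k' -> 1024**1, 'm' -> 1024**2, ...
        multiplier = 1024 ** suffixes.index(last)
        number = value[:-1]
    else:
        multiplier = 1
        number = value

    try:
        return round(float(number) * multiplier)
    except (ValueError, OverflowError):
        return default  # not a finite number
```

With these assumptions, `parse_bytes("500k")` gives 512000 and unparsable input falls back to `default`.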
Mike Fährmann
038e3b3369 [kissmanga] handle "AreYouHuman" redirects (#51) 2017-12-01 15:22:50 +01:00
Mike Fährmann
2b9a783fc7 [khinsider] fix extraction 2017-12-01 14:00:37 +01:00
Mike Fährmann
3dc1169736 use own mapping before relying on the 'mimetypes' module 2017-12-01 13:50:31 +01:00
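
The idea of consulting a project-owned table before falling back to the standard library could be sketched as follows. The table contents and the function name `extension_for` are hypothetical; only `mimetypes.guess_extension()` is a real standard-library call.

```python
import mimetypes

# Hypothetical table of preferred extensions; the entries are illustrative.
PREFERRED_EXTENSIONS = {
    "image/jpeg": "jpg",
    "image/png": "png",
    "image/gif": "gif",
    "video/mp4": "mp4",
}


def extension_for(mimetype):
    """Return a filename extension (without dot) for a MIME type, or None."""
    # Drop parameters such as '; charset=utf-8' and normalise the case.
    mimetype = mimetype.partition(";")[0].strip().lower()

    # 1) The project's own mapping wins, so results stay predictable.
    ext = PREFERRED_EXTENSIONS.get(mimetype)
    if ext is not None:
        return ext

    # 2) Fall back to the standard library, which returns '.ext' or None.
    guessed = mimetypes.guess_extension(mimetype)
    return guessed[1:] if guessed else None
```

An own mapping keeps common types independent of whatever the platform's MIME database happens to return.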
Mike Fährmann
214972bc9a [gelbooru] use manual extraction
... to compensate for their disabled API.
(https://gelbooru.com/index.php?page=forum&s=view&id=3875)

This also adds an extractor for image-pools.
2017-11-29 20:48:17 +01:00
Mike Fährmann
55c64cad4b [khinsider] fix filename extension and test-pattern 2017-11-28 19:35:47 +01:00
Mike Fährmann
c0bcf8e343 release version 1.0.2 2017-11-24 17:24:39 +01:00
Mike Fährmann
28bf25f37d update CHANGELOG 2017-11-24 17:00:45 +01:00
Mike Fährmann
b14de6ffc2 [tumblr] small improvements
- don't transform inline GIF URLs
- set 'type' parameter for API calls if there is only
  one post type selected
2017-11-24 16:51:07 +01:00
Mike Fährmann
9296a26eae [tumblr] add warning messages 2017-11-23 16:12:07 +01:00
Mike Fährmann
65c1c53eb8 [khinsider] fix extraction 2017-11-23 15:33:49 +01:00
Mike Fährmann
12de658937 [tumblr] add options to control extraction behavior (#48)
- posts   : list of post-types to inspect
- inline  : scan post bodies for inline images
- external: follow external links
2017-11-23 15:32:54 +01:00
Mike Fährmann
077f8c12be [tumblr] original video URLs + continuous offset 2017-11-20 20:51:02 +01:00
Mike Fährmann
8eb12ebeae [tumblr] support more post/media types (#48)
This adds support for audio and video posts (most videos are shared
from youtube/instagram which isn't supported -> youtube-dl),
as well as link posts and image-search inside of text posts.

Most of this is just WIP and will need some sort of improvement
and options to enable/disable different media types etc.
2017-11-18 23:11:32 +01:00
Mike Fährmann
6c9da67581 apply selection options (filter, range) when using '-j' 2017-11-18 17:35:57 +01:00
Mike Fährmann
b8cdd42cab [senmanga] fix extraction (again)
this is basically a re-revert of 2ace5c7
2017-11-18 17:23:32 +01:00
Mike Fährmann
e6814aebe2 add 'extractor.*.user-agent' config option 2017-11-15 14:01:33 +01:00
Mike Fährmann
6913eeaa40 [powermanga] replace manga extractor unit test
My Hero Academia is gone
2017-11-15 14:01:24 +01:00
Mike Fährmann
7e0d9257a7 [hbrowse] fix manga extraction 2017-11-15 13:59:50 +01:00
Mike Fährmann
3c576d10c0 [seiga] better metadata + 'skip()' support 2017-11-15 13:58:35 +01:00
Mike Fährmann
f72318e593 [seiga] support more than 200 images
Due to API restrictions and/or missing knowledge about and
documentation of API usage, it was only possible to retrieve the
latest 200 images of a niconico seiga user with said API.

The new approach manually visits each HTML page and gets its
information from there.
2017-11-13 20:46:24 +01:00
Mike Fährmann
baf8094868 improve Extractor.request()'s retry behavior 2017-11-13 20:37:11 +01:00
Mike Fährmann
2457b71633 skip tests on 5xx status codes 2017-11-12 20:51:12 +01:00
Mike Fährmann
7e7b64162b [batoto] handle error 10031 2017-11-12 20:49:37 +01:00
Mike Fährmann
79bcaa8726 improve downloader retry behavior
- only retry download on 5xx and 429 status codes
- immediately fail on 4xx status codes
2017-11-10 21:46:18 +01:00
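
The retry rule described in this commit can be sketched as below. The `fetch` callable and the `max_tries` parameter are hypothetical stand-ins for the real downloader, which is not shown on this page.

```python
def should_retry(status_code):
    """Retry only on 429 (Too Many Requests) and 5xx server errors."""
    return status_code == 429 or 500 <= status_code < 600


def download(fetch, max_tries=5):
    """Call fetch() until it succeeds, fails permanently, or tries run out.

    `fetch` is a hypothetical callable returning (status_code, body).
    Returns the body on success and None otherwise.
    """
    for _ in range(max_tries):
        status, body = fetch()
        if status < 400:
            return body          # success
        if not should_retry(status):
            return None          # other 4xx: fail immediately
    return None                  # retries exhausted
```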
Mike Fährmann
5ee8ca0319 release version 1.0.1 2017-11-10 08:54:33 +01:00
Mike Fährmann
2c1adda784 update release.sh script
- update CHANGELOG on new releases
  - change issue references to actual links
  - replace "Unreleased" with new version and date
- fix filenames of old Windows executables

[no ci]
2017-11-08 17:47:52 +01:00
Mike Fährmann
e913e5ec77 add a CHANGELOG
This is basically just a copy&paste from the Releases page, but it
has the benefits of (1) better visibility and (2) "forcing" me to
write a changelog section before releasing a new version and not
several days after.
2017-11-08 16:54:13 +01:00
Mike Fährmann
42e948584d fix downloader error handling
RequestException being a subclass of OSError caused all exceptions
during file downloads to be ignored/re-raised.
2017-11-07 15:23:07 +01:00
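
One way such a bug can arise is handler ordering: an `except OSError` clause placed first also catches every subclass. The sketch below uses a local stand-in class instead of importing `requests`, and the two `classify_*` functions are hypothetical, not the project's code.

```python
class RequestException(OSError):
    """Stand-in mirroring the fact that requests' RequestException
    derives from IOError, an alias of OSError in Python 3."""


def classify_buggy(exc):
    try:
        raise exc
    except OSError:
        # Catches RequestException as well, because it is a subclass.
        return "filesystem"
    except RequestException:
        return "network"  # never reached


def classify_fixed(exc):
    try:
        raise exc
    except RequestException:
        # The more specific exception has to be handled first.
        return "network"
    except OSError:
        return "filesystem"
```

Here `classify_buggy(RequestException())` reports a filesystem error, while `classify_fixed` tells the two cases apart.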
Mike Fährmann
92027f67f9 use consistent names for URL constants
root := <scheme>://<host>
base_url := <root>/<common path>
2017-11-06 20:56:49 +01:00
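
As an illustration of the naming convention, a class following it might look like this. The class name, host, and path are invented for the example.

```python
from urllib.parse import urlsplit


class ExampleExtractor:
    """Hypothetical extractor; host and path are placeholders."""

    # root := <scheme>://<host>
    root = "https://example.org"
    # base_url := <root>/<common path>
    base_url = root + "/gallery/"

    def page_url(self, page_id):
        """Build a full URL from the shared base path."""
        return self.base_url + str(page_id)


# 'root' carries no path component, 'base_url' extends it.
assert urlsplit(ExampleExtractor.root).path == ""
assert ExampleExtractor.base_url.startswith(ExampleExtractor.root + "/")
```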
Mike Fährmann
69cbc0619f [mangastream] fix 'next-page' URLs (fixes #49) 2017-11-04 11:50:40 +01:00
Mike Fährmann
980fd3616d [tumblr] use API v2 (#48) 2017-11-03 22:16:57 +01:00
Mike Fährmann
d6bed9f36f [tumblr] prevent premature exit to get all images (fixes #48) 2017-11-03 14:59:31 +01:00
Mike Fährmann
305da540c3 [mangahere] fix metadata extraction 2017-11-03 14:54:46 +01:00
Mike Fährmann
2d0cfb33e1 [xvideos] add user profile extractor (#45) 2017-11-02 17:28:35 +01:00
Mike Fährmann
a393e6e538 [xvideos] add gallery extractor (#45) 2017-11-02 15:36:53 +01:00
Mike Fährmann
3a8a0c1f35 [imgbox] rewrite / fix extraction (closes #47) 2017-11-01 13:01:59 +01:00
Mike Fährmann
f97207a8e6 release version 1.0.0 2017-10-27 16:22:51 +02:00
Mike Fährmann
a4bc5a3491 update setup.py and README.rst 2017-10-27 16:08:57 +02:00
Mike Fährmann
707b15b586 create missing directories for 'part-directory'
also some code improvements regarding downloader config values
2017-10-27 12:22:45 +02:00
Mike Fährmann
035ef655f1 [imagefap] update unit tests
old gallery/image has been deleted
2017-10-27 12:22:16 +02:00
Mike Fährmann
caf26412dd add option to set alternate location of .part files (#29)
Note: The path set for 'downloader.*.part-directory' needs to point to an
already existing directory.
2017-10-26 00:16:48 +02:00
Mike Fährmann
ea8ca4cfa4 add 'util.expand_path()' 2017-10-26 00:04:28 +02:00
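
A helper with this name plausibly expands both `~` and environment variables. The sketch below is an assumption about its behaviour, built only from standard-library calls.

```python
import os


def expand_path(path):
    """Expand '~' and environment variables in a path (assumed behaviour)."""
    if not path:
        return path
    # '~/dir' -> '/home/user/dir', then '$VAR/dir' -> '<value of VAR>/dir'
    return os.path.expandvars(os.path.expanduser(path))
```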
Mike Fährmann
9a41002b77 fix partial downloads for 'text:' URLs
Using a filesize in bytes as offset into a Python string is not
a good idea if said file contains non-ASCII characters.
2017-10-25 15:04:45 +02:00
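
The problem is that a size on disk counts bytes, while string indexing counts characters. The sketch below shows the mismatch, assuming UTF-8 encoding; the function name `remaining_data` is hypothetical.

```python
def remaining_data(text, bytes_on_disk, encoding="utf-8"):
    """Return the bytes still to be written for a partially saved text.

    The offset is applied to the encoded bytes, not to the string.
    """
    return text.encode(encoding)[bytes_on_disk:]


text = "héllo wörld"
bytes_on_disk = 3  # 'h' is 1 byte and 'é' is 2 bytes in UTF-8

# Wrong: treats the byte count as a character index and skips one 'l'.
wrong = text[bytes_on_disk:]

# Right: slice the encoded data.
right = remaining_data(text, bytes_on_disk)

assert wrong == "lo wörld"
assert right == "llo wörld".encode("utf-8")
```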
Mike Fährmann
239d7afea7 [hosturimage] fix extraction of larger images 2017-10-25 12:56:16 +02:00
Mike Fährmann
27c026543f re-enable download unit tests 2017-10-25 12:55:36 +02:00
Mike Fährmann
963670d73b add options to control usage of .part files (#29)
- '--no-part' command line option to disable them
- 'downloader.http.part' and 'downloader.text.part' config options

Disabling .part files restores the behaviour of the old downloader
implementation.
2017-10-24 23:33:44 +02:00
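
The two option names suggest a nested configuration. The sketch below models it as a plain dictionary with a dotted-key lookup; the structure, the `lookup` helper, and the default of `True` are assumptions for illustration, not the project's configuration code.

```python
# Hypothetical in-memory configuration: .part files are disabled for
# HTTP downloads and left enabled for 'text:' URLs.
CONFIG = {
    "downloader": {
        "http": {"part": False},
        "text": {"part": True},
    }
}


def lookup(config, dotted_key, default=None):
    """Resolve a key such as 'downloader.http.part' in nested dicts."""
    node = config
    for key in dotted_key.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


# Assumed default: use .part files unless the option says otherwise.
use_part_http = lookup(CONFIG, "downloader.http.part", True)
use_part_text = lookup(CONFIG, "downloader.text.part", True)
```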
Mike Fährmann
158e60ee89 [3dbooru] enable download continuation
behoimi.org doesn't respect 'Range' headers and doesn't report
'Content-Length' for compressed content encodings.
2017-10-24 13:05:31 +02:00
Mike Fährmann
b0353aa02d rewrite download modules (#29)
- use '.part' files during file-download
- implement continuation of incomplete downloads
- check if file size matches the one reported by server
2017-10-24 12:53:03 +02:00
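
The three points of this commit can be sketched without any network code. In the example below, `read_from` is a hypothetical callable that yields byte chunks starting at a given offset; over HTTP, that offset would typically be sent in a `Range: bytes=<offset>-` request header. None of the names come from the repository.

```python
import os
import tempfile


def download(read_from, path, expected_size):
    """Write to '<path>.part', resume if it exists, verify, then rename."""
    part = path + ".part"

    # Continue an incomplete download from the size already on disk.
    offset = os.path.getsize(part) if os.path.exists(part) else 0

    with open(part, "ab") as fp:
        for chunk in read_from(offset):
            fp.write(chunk)

    # Check the final size against the one reported by the server.
    if os.path.getsize(part) != expected_size:
        return False

    os.replace(part, path)  # only complete files get the final name
    return True


def demo():
    """Resume a download whose first 6 bytes are already in the .part file."""
    data = b"hello world"
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "file.bin")
        with open(path + ".part", "wb") as fp:
            fp.write(data[:6])
        ok = download(lambda offset: [data[offset:]], path, len(data))
        with open(path, "rb") as fp:
            return ok, fp.read(), os.path.exists(path + ".part")
```

Because the final filename only appears after the size check, an interrupted run leaves a `.part` file behind rather than a truncated file under the real name.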
Mike Fährmann
c4fcdf2691 Revert "[senmanga] fix extraction and download"
This reverts commit 2ace5c7b3c.
2017-10-24 00:22:05 +02:00