Leonardo Taccari
a8d2dde8b2
[slideshare] Add a new extractor for slideshare.net (#54)
2017-12-13 17:38:29 +01:00
Mike Fährmann
19a6ae57b2
[sankaku] add pool extractor
2017-12-12 19:45:10 +01:00
Mike Fährmann
e52f0cc1ed
[sankaku] add post extractor
2017-12-12 18:20:15 +01:00
Mike Fährmann
595593a35e
[sankaku] rewrite
...
- better code structure and extensibility
- better metadata
2017-12-12 18:09:45 +01:00
Mike Fährmann
a3924d2072
[sankaku] fix swf extraction (closes #52)
2017-12-07 15:45:43 +01:00
Mike Fährmann
291369eab2
various smaller changes/additions
2017-12-06 21:45:56 +01:00
Mike Fährmann
300346ecdf
[mangazuki] remove extractors
...
This site has been in "rebuild"-mode for a fairly long time and the
current extractor code isn't going to work for the new version either.
2017-12-04 13:36:04 +01:00
Mike Fährmann
d275b1d9a3
[khinsider] fix extraction
...
... again
2017-12-04 12:42:06 +01:00
Mike Fährmann
6b8e3003df
[hentai2read] ensure consistent extraction results
2017-12-03 02:34:35 +01:00
Mike Fährmann
a1980b16f3
[gelbooru] various improvements
...
- better metadata for pools
- map ratings to s/q/e like other boorus do
- skip() support
2017-12-03 01:41:30 +01:00
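Sketch for the "map ratings to s/q/e" item in commit a1980b16f3 above. The commit only states the goal; the assumption here is that the site reports full rating words ("safe", "questionable", "explicit"), and `map_rating` is a hypothetical helper name, not the extractor's actual code.

```python
# Hypothetical helper: assumes ratings arrive as full words.
RATING_MAP = {
    "safe": "s",
    "questionable": "q",
    "explicit": "e",
}


def map_rating(rating):
    """Return the one-letter rating used by other boorus (s/q/e).

    Unknown values are passed through unchanged instead of being guessed.
    """
    return RATING_MAP.get(rating.strip().lower(), rating)


print(map_rating("Questionable"))
```

Passing unknown values through keeps the metadata honest if the site ever introduces a rating outside these three.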
Mike Fährmann
93482a1f88
implement 'util.advance()'
2017-12-03 01:38:24 +01:00
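Sketch for commit 93482a1f88 above, which only names 'util.advance()'. The signature and behavior shown here (skip the first `num` items of an iterable and return the rest) are assumptions, chosen because they fit the 'skip()' support mentioned in neighboring commits.

```python
import itertools


def advance(iterable, num):
    """Skip the first 'num' items of 'iterable'; return the remaining iterator."""
    iterator = iter(iterable)
    # islice(iterator, num, num) consumes 'num' items and yields nothing,
    # so next() with a default just drives the consumption.
    next(itertools.islice(iterator, num, num), None)
    return iterator


remaining = advance(range(10), 3)
print(list(remaining))
```

If `num` exceeds the number of available items, the returned iterator is simply exhausted and no exception is raised.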
Mike Fährmann
038e3b3369
[kissmanga] handle "AreYouHuman" redirects (#51)
2017-12-01 15:22:50 +01:00
Mike Fährmann
2b9a783fc7
[khinsider] fix extraction
2017-12-01 14:00:37 +01:00
Mike Fährmann
214972bc9a
[gelbooru] use manual extraction
...
... to compensate for their disabled API.
(https://gelbooru.com/index.php?page=forum&s=view&id=3875)
This also adds an extractor for image-pools.
2017-11-29 20:48:17 +01:00
Mike Fährmann
55c64cad4b
[khinsider] fix filename extension and test-pattern
2017-11-28 19:35:47 +01:00
Mike Fährmann
b14de6ffc2
[tumblr] small improvements
...
- don't transform inline GIF URLs
- set 'type' parameter for API calls if there is only
  one post type selected
2017-11-24 16:51:07 +01:00
Mike Fährmann
9296a26eae
[tumblr] add warning messages
2017-11-23 16:12:07 +01:00
Mike Fährmann
65c1c53eb8
[khinsider] fix extraction
2017-11-23 15:33:49 +01:00
Mike Fährmann
12de658937
[tumblr] add options to control extraction behavior (#48)
...
- posts : list of post-types to inspect
- inline : scan post bodies for inline images
- external: follow external links
2017-11-23 15:32:54 +01:00
Mike Fährmann
077f8c12be
[tumblr] original video URLs + continuous offset
2017-11-20 20:51:02 +01:00
Mike Fährmann
8eb12ebeae
[tumblr] support more post/media types (#48)
...
This adds support for audio and video posts (most videos are shared
from YouTube or Instagram, which are not supported here -> youtube-dl),
as well as link posts and image search inside text posts.
Most of this is still WIP and will need further improvement, plus
options to enable or disable the different media types.
2017-11-18 23:11:32 +01:00
Mike Fährmann
b8cdd42cab
[senmanga] fix extraction (again)
...
this is basically a re-revert of 2ace5c7
2017-11-18 17:23:32 +01:00
Mike Fährmann
e6814aebe2
add 'extractor.*.user-agent' config option
2017-11-15 14:01:33 +01:00
Mike Fährmann
6913eeaa40
[powermanga] replace manga extractor unit test
...
My Hero Academia is gone
2017-11-15 14:01:24 +01:00
Mike Fährmann
7e0d9257a7
[hbrowse] fix manga extraction
2017-11-15 13:59:50 +01:00
Mike Fährmann
3c576d10c0
[seiga] better metadata + 'skip()' support
2017-11-15 13:58:35 +01:00
Mike Fährmann
f72318e593
[seiga] support more than 200 images
...
Because of API restrictions, and because the API's usage is poorly
documented and not fully understood, only the latest 200 images of a
niconico seiga user could be retrieved through it.
The new approach visits each HTML page and extracts the information
from there.
2017-11-13 20:46:24 +01:00
Mike Fährmann
baf8094868
improve Extractor.request()'s retry behavior
2017-11-13 20:37:11 +01:00
Mike Fährmann
7e7b64162b
[batoto] handle error 10031
2017-11-12 20:49:37 +01:00
Mike Fährmann
92027f67f9
use consistent names for URL constants
...
root := <scheme>://<host>
base_url := <root>/<common path>
2017-11-06 20:56:49 +01:00
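Sketch of the naming convention from commit 92027f67f9 above. `ExampleExtractor`, `root_of`, the host, and the path are hypothetical placeholders; only the two definitions (root and base_url) come from the commit message.

```python
from urllib.parse import urlsplit


class ExampleExtractor:
    # root := <scheme>://<host>
    root = "https://example.org"
    # base_url := <root>/<common path>
    base_url = root + "/api/v1"


def root_of(url):
    """Return the '<scheme>://<host>' part of a URL."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


print(root_of(ExampleExtractor.base_url + "/posts"))
```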
Mike Fährmann
69cbc0619f
[mangastream] fix 'next-page' URLs (fixes #49)
2017-11-04 11:50:40 +01:00
Mike Fährmann
980fd3616d
[tumblr] use API v2 (#48)
2017-11-03 22:16:57 +01:00
Mike Fährmann
d6bed9f36f
[tumblr] prevent premature exit to get all images (fixes #48)
2017-11-03 14:59:31 +01:00
Mike Fährmann
305da540c3
[mangahere] fix metadata extraction
2017-11-03 14:54:46 +01:00
Mike Fährmann
2d0cfb33e1
[xvideos] add user profile extractor (#45)
2017-11-02 17:28:35 +01:00
Mike Fährmann
a393e6e538
[xvideos] add gallery extractor (#45)
2017-11-02 15:36:53 +01:00
Mike Fährmann
3a8a0c1f35
[imgbox] rewrite / fix extraction (closes #47)
2017-11-01 13:01:59 +01:00
Mike Fährmann
035ef655f1
[imagefap] update unit tests
...
old gallery/image has been deleted
2017-10-27 12:22:16 +02:00
Mike Fährmann
239d7afea7
[hosturimage] fix extraction of larger images
2017-10-25 12:56:16 +02:00
Mike Fährmann
158e60ee89
[3dbooru] enable download continuation
...
behoimi.org doesn't respect 'Range' headers and doesn't report
'Content-Length' for compressed content encodings.
2017-10-24 13:05:31 +02:00
Mike Fährmann
c4fcdf2691
Revert "[senmanga] fix extraction and download"
...
This reverts commit 2ace5c7b3c.
2017-10-24 00:22:05 +02:00
Mike Fährmann
81a7788b40
replace space characters in unit test URLs
2017-10-23 17:00:53 +02:00
Mike Fährmann
bf82181359
[jaiminisbox] fix extraction
2017-10-22 13:26:09 +02:00
Mike Fährmann
16783e327f
[common] fix UnboundLocalError in Extractor.request()
2017-10-20 18:51:06 +02:00
Mike Fährmann
2ace5c7b3c
[senmanga] fix extraction and download
2017-10-19 18:25:31 +02:00
Mike Fährmann
4d8387f93b
[pixiv] support mobile URLs (https://touch.pixiv.net/)
2017-10-17 16:49:42 +02:00
Mike Fährmann
ab2bf0b0dd
[deviantart] replace collection unittest
2017-10-17 15:58:16 +02:00
Mike Fährmann
289d6b65d2
[danbooru] extend and improve URL regex
...
- add support for danbooru mirrors:
  - hijiribe.donmai.us
  - sonohara.donmai.us
  - todo: actually use these domains instead of redirecting everything
    to danbooru itself
- improve handling of query string parameters
2017-10-16 21:21:19 +02:00
Mike Fährmann
5fa42336a2
[sankaku] add warning for unauthenticated users
...
also improve URL pattern and add missing options to default config file
2017-10-16 21:21:08 +02:00
Mike Fährmann
6af921a952
[sankaku] rewrite/improve (fixes #44)
...
- add wait-time between HTTP requests similar to exhentai
- add 'wait-min' and 'wait-max' options
- increase retry-count for HTTP requests to 10
- implement user authentication (non-authenticated users can only view
  images up to page 25)
- implement 'skip()' functionality (only works up to page 50)
- implement image-retrieval for pages >= 51
- fix issue with multiple tags
2017-10-14 23:01:33 +02:00
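Sketch for the 'wait-min' / 'wait-max' options from commit 6af921a952 above. The commit names the options but not their semantics; this assumes a uniformly random delay between the two values before each HTTP request. The function name, the default values, and the injectable `sleep` parameter are hypothetical.

```python
import random
import time


def wait_between_requests(wait_min=2.0, wait_max=4.0, sleep=time.sleep):
    """Sleep for a random duration in [wait_min, wait_max] seconds.

    'sleep' is injectable so the delay can be observed without waiting.
    Returns the chosen delay.
    """
    delay = random.uniform(wait_min, wait_max)
    sleep(delay)
    return delay


# Example: record the delay instead of actually sleeping.
recorded = []
chosen = wait_between_requests(0.5, 1.0, sleep=recorded.append)
print(chosen, recorded)
```

Randomizing the delay, rather than using a fixed interval, makes the request pattern less regular; whether the extractor does exactly this is not stated in the commit.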