128 Commits

Author SHA1 Message Date
Mike Fährmann
28cd78aae0 [kissmanga] extend chapter-string regex (closes #58) 2017-12-24 22:53:10 +01:00
Mike Fährmann
fc7d165c97 [deviantart] add support for OAuth2 authentication
Some user galleries [*] require you to be either logged in or
authenticated via OAuth2 to access their deviations.

[*] e.g. https://polinaegorussia.deviantart.com/gallery/

--------------

known issue:
A deviantart 'refresh_token' can only be used once and gets updated
whenever it is used to request a new 'access_token', so storing its
initial value in a config file and reusing it again and again is not
possible.
2017-12-18 01:16:46 +01:00
Mike Fährmann
0a9a07a6e1 [slideshare] improve metadata; flake8
- added 'views' and 'published' keywords
- fixed longer titles and descriptions
2017-12-13 21:16:49 +01:00
Mike Fährmann
291369eab2 various smaller changes/additions 2017-12-06 21:45:56 +01:00
Mike Fährmann
300346ecdf [mangazuki] remove extractors
This site has been in "rebuild"-mode for a fairly long time and the
current extractor code isn't going to work for the new version either.
2017-12-04 13:36:04 +01:00
Mike Fährmann
93482a1f88 implement 'util.advance()' 2017-12-03 01:38:24 +01:00
Mike Fährmann
a718c6c6cd implement 'util.parse_bytes()' 2017-12-02 01:24:49 +01:00
Mike Fährmann
214972bc9a [gelbooru] use manual extraction
... to compensate for their disabled API.
(https://gelbooru.com/index.php?page=forum&s=view&id=3875)

This also adds an extractor for image-pools.
2017-11-29 20:48:17 +01:00
Mike Fährmann
b14de6ffc2 [tumblr] small improvements
- don't transform inline GIF URLs
- set 'type' parameter for API calls if there is only
  one post type selected
2017-11-24 16:51:07 +01:00
Mike Fährmann
b8cdd42cab [senmanga] fix extraction (again)
this is basically a re-revert of 2ace5c7
2017-11-18 17:23:32 +01:00
Mike Fährmann
6913eeaa40 [powermanga] replace manga extractor unit test
My Hero Academia is gone
2017-11-15 14:01:24 +01:00
Mike Fährmann
f72318e593 [seiga] support more than 200 images
Due to API restrictions and/or missing knowledge about and
documentation of API usage, it was only possible to retrieve the
latest 200 images of a niconico seiga user with said API.

The new approach manually visits each HTML page and gets its
information from there.
2017-11-13 20:46:24 +01:00
Mike Fährmann
2457b71633 skip tests on 5xx status codes 2017-11-12 20:51:12 +01:00
Mike Fährmann
305da540c3 [mangahere] fix metadata extraction 2017-11-03 14:54:46 +01:00
Mike Fährmann
035ef655f1 [imagefap] update unit tests
old gallery/image has been deleted
2017-10-27 12:22:16 +02:00
Mike Fährmann
caf26412dd add option to set alternate location of .part files (#29)
Note: The path set for 'downloader.*.part-directory' needs to point to an
already existing directory.
2017-10-26 00:16:48 +02:00
Mike Fährmann
27c026543f re-enable download unit tests 2017-10-25 12:55:36 +02:00
Mike Fährmann
b0353aa02d rewrite download modules (#29)
- use '.part' files during file-download
- implement continuation of incomplete downloads
- check if file size matches the one reported by server
2017-10-24 12:53:03 +02:00
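
The three points of the download rewrite above can be sketched roughly as follows. This is only an illustration, not the project's downloader code: the helper names `resume_headers` and `finish_download` are hypothetical, and resuming via an HTTP `Range` header is an assumption about how continuation could be done.

```python
import os


def resume_headers(part_path):
    """Return HTTP headers that ask the server to skip bytes already on disk.

    Hypothetical helper: an existing, non-empty '.part' file is treated as
    an incomplete download that can be continued.
    """
    try:
        offset = os.path.getsize(part_path)
    except OSError:  # no '.part' file yet
        offset = 0
    return {"Range": "bytes={}-".format(offset)} if offset else {}


def finish_download(part_path, final_path, expected_size):
    """Rename the '.part' file once its size matches the reported size."""
    if os.path.getsize(part_path) != expected_size:
        return False  # incomplete; keep the '.part' file for a later retry
    os.replace(part_path, final_path)
    return True
```

Keeping the `.part` file on a size mismatch is what makes a later continuation possible in this sketch.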
Mike Fährmann
6af921a952 [sankaku] rewrite/improve (fixes #44)
- add wait-time between HTTP requests similar to exhentai
- add 'wait-min' and 'wait-max' options
- increase retry-count for HTTP requests to 10
- implement user authentication (non-authenticated users can only view
  images up to page 25)
- implement 'skip()' functionality (only works up to page 50)
- implement image-retrieval for pages >= 51
- fix issue with multiple tags
2017-10-14 23:01:33 +02:00
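
A wait-time between requests bounded by 'wait-min' and 'wait-max' might look like the sketch below. The generator name `throttled` and the choice of a uniformly random delay between the two bounds are assumptions for illustration; the commit message does not say how the delay is picked.

```python
import random
import time


def throttled(items, wait_min, wait_max, sleep=time.sleep):
    """Yield items, pausing between consecutive items.

    Assumption: the pause is drawn uniformly from [wait_min, wait_max]
    seconds. 'sleep' is injectable so the behaviour can be tested
    without actually waiting.
    """
    first = True
    for item in items:
        if not first:
            sleep(random.uniform(wait_min, wait_max))
        first = False
        yield item
```

Each yielded item would stand for one HTTP request, so no delay is spent before the first request.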
Mike Fährmann
75d3a1f72f [deviantart] always download original images
Deviation-objects returned by the DeviantArt API don't always contain
the URL and metadata of the original image ([1]). Getting this
information requires an additional API call [2], which is indicated by
the 'is_downloadable' and 'download_filesize' metadata within a
deviation-object.

[1] https://myria-moon.deviantart.com/art/Aime-Moi-part-en-vadrouille-261986576
[2] https://www.deviantart.com/developers/http/v1/20160316/deviation_download/bed6982b88949bdb08b52cd6763fcafd
2017-10-07 13:07:34 +02:00
Mike Fährmann
8e6a767109 [util] restructure formatter for better exception propagation 2017-10-06 17:10:35 +02:00
Mike Fährmann
0386503c80 fix (sub)category-transfer for DownloadJob instances (#41)
... and extend "parent" parameters to TestJob- and DataJob-classes
as well.
2017-10-06 15:38:35 +02:00
Mike Fährmann
41adb99e9c [pawoo] fix extraction
- changed access_token
- use account-search instead of general search
2017-10-02 18:33:52 +02:00
Mike Fährmann
b319f4bab3 smaller code and text changes 2017-10-01 18:23:40 +02:00
Mike Fährmann
c1f0afe4c6 add custom string formatter class 2017-09-28 17:12:39 +02:00
Mike Fährmann
85a2b2ae59 [khinsider] fix extraction 2017-09-28 11:47:26 +02:00
Mike Fährmann
8e14714c2b [imgspice] fix extraction 2017-09-26 21:04:48 +02:00
Mike Fährmann
a85f06d2d1 [foolslide] restructure; convert suitable values to int 2017-09-24 16:57:47 +02:00
Mike Fährmann
9fc1d0c901 implement and use 'util.safe_int()'
same as Python's 'int()', except it doesn't raise any exceptions and
accepts a default value
2017-09-24 15:59:25 +02:00
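
A minimal sketch of a function matching the description above. The default of `0` and the exact set of caught exceptions are assumptions, not taken from the commit.

```python
def safe_int(value, default=0):
    """Convert 'value' to int; return 'default' if the conversion fails."""
    try:
        return int(value)
    except (ValueError, TypeError, OverflowError):
        # ValueError: "abc", TypeError: None, OverflowError: float("inf")
        return default
```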
Mike Fährmann
a9e7145651 [hbrowse] extract hmanga metadata & general maintenance 2017-09-20 16:25:25 +02:00
Mike Fährmann
84d4450410 [fallenangels] extract manga metadata 2017-09-15 20:51:40 +02:00
Mike Fährmann
f32b1a0292 [imgyt] fix extraction 2017-09-14 15:04:32 +02:00
Mike Fährmann
31cd5b1c1d [luscious] detect high-load responses 2017-09-12 15:46:21 +02:00
Mike Fährmann
81877bb5f6 add '-K' as shortcut for '--list-keywords' 2017-09-09 18:48:28 +02:00
Mike Fährmann
9b21d3f13c add '--filter' command-line option
This allows for image filtering via Python expressions by the same
metadata that is also used to build filenames (--list-keywords).

The usually shunned eval() function is used to evaluate
filter-expressions, but it seemed quite appropriate in this case and
shouldn't introduce any new security issues, as any attacker that could do
> gallery-dl --filter "delete-everything()" ...
could as well do
> python -c "delete-everything()"
2017-09-08 17:52:00 +02:00
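
The idea of filtering by an eval()-ed Python expression over the filename metadata can be sketched as follows. The function name `build_filter` and the metadata keys `width` and `extension` are hypothetical; this is not the project's implementation.

```python
def build_filter(expression):
    """Compile a filter expression once and return a predicate.

    The expression is evaluated with eval(), using a copy of the metadata
    dict as its namespace, so metadata keys appear as plain variable names.
    As noted in the commit message, this runs arbitrary Python code.
    """
    code = compile(expression, "<filter>", "eval")

    def check(metadata):
        return bool(eval(code, dict(metadata)))

    return check


# Example with made-up metadata keys
keep = build_filter("width >= 1000 and extension == 'png'")
selected = [
    meta for meta in (
        {"width": 1200, "extension": "png"},
        {"width": 800, "extension": "png"},
    )
    if keep(meta)
]
```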
Mike Fährmann
31731cbefe update unittests for util.py 2017-09-06 17:57:19 +02:00
Mike Fährmann
f98e3e8002 [luscious] fix tag extraction 2017-09-01 16:29:52 +02:00
Mike Fährmann
65997d835b replace popular/ranking tests with older ones
Metadata of several year old lists shouldn't change as much as it
would for newer ones, which makes metadata-comparisons of the output
of build_testresult_db.py easier.
2017-08-31 15:09:18 +02:00
Mike Fährmann
c0755a4d5e [exhentai] revert login-method to its old version (#37)
Additional cookies don't seem to help and have to be manually set
anyway. The older method is more likely to succeed, so I'd rather
use this one.
2017-08-29 22:10:38 +02:00
Mike Fährmann
3ee39ffd93 [exhentai] update login procedure (#37)
This new version behaves pretty much exactly like a browser would and
caches all cookies sent to it and not just "ipb_member_id" and
"ipb_pass_hash".
2017-08-28 21:03:32 +02:00
Mike Fährmann
07214f4007 [booru] place subcategories into base classes 2017-08-26 22:27:55 +02:00
Mike Fährmann
47bcf53ec1 implement support for additional unit test result types
- "pattern" matches all resulting URLs against the given regex
- "count" allows specifying the number of returned URLs
2017-08-25 22:01:14 +02:00
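
How the two result types could be checked against a list of extracted URLs, as a rough sketch. The function name `check_result` is hypothetical, and anchoring the regex at the start of each URL with `re.match` is an assumption.

```python
import re


def check_result(urls, pattern=None, count=None):
    """Return True if 'urls' satisfies the given 'pattern' and 'count'."""
    if pattern is not None:
        regex = re.compile(pattern)
        # "pattern": every resulting URL has to match the regex
        if not all(regex.match(url) for url in urls):
            return False
    if count is not None and len(urls) != count:
        # "count": the number of returned URLs has to be exact
        return False
    return True
```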
Mike Fährmann
c7ec103e15 [batoto] fix extraction of chapter URLs 2017-08-25 16:34:42 +02:00
Mike Fährmann
f7cdfd4c25 add a simplified version of 'parse_qs'
This version only returns a dict of plain string to string key-value
pairs and ignores multiple values for the same query variable.
2017-08-24 20:55:58 +02:00
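
A simplified parser along these lines could be built on the standard library's `parse_qsl`. The name `parse_query_simple` is hypothetical, and keeping the first value of a repeated variable (rather than the last) is an assumption.

```python
from urllib.parse import parse_qsl


def parse_query_simple(qs):
    """Parse a query string into a flat dict of str -> str.

    Unlike urllib.parse.parse_qs, values are plain strings instead of
    lists; for repeated variables only the first value is kept.
    """
    result = {}
    for key, value in parse_qsl(qs):
        result.setdefault(key, value)
    return result
```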
Mike Fährmann
d70c66c516 fix "text:" downloader 2017-08-16 12:11:47 +02:00
Mike Fährmann
9bf9d64ad8 update unittests for util.py 2017-08-13 14:31:22 +02:00
Mike Fährmann
d74a635e41 [util] update 'default' values and improve test coverage
for 'code_to_language()' and 'language_to_code()'
2017-08-08 19:22:04 +02:00
Mike Fährmann
abd7c559cd [yonkouprod] remove module
Every manga chapter on this site has been removed.
2017-08-07 18:32:14 +02:00
Mike Fährmann
852e7acd31 [twitter] ignore "Promoted Tweets" 2017-08-06 13:43:08 +02:00
Mike Fährmann
6950708e52 [hentaicdn] use HTTPS 2017-08-02 18:31:21 +02:00