736 Commits

Author SHA1 Message Date
Mike Fährmann
c45770331a use 'str.partition()'
The (r)partition method is always faster than split() or any other
method that has been replaced in this commit.
2017-08-21 18:29:50 +02:00
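As an illustration of the kind of replacement the commit above describes, here is a minimal sketch of splitting on a separator with `str.partition()` and `str.rpartition()`. The helper names `split_once` and `extension` are hypothetical and do not come from the repository; the sketch shows the idiom only and makes no claim about speed.

```python
def split_once(text, sep):
    """Split 'text' at the first occurrence of 'sep' (hypothetical helper).

    str.partition() always returns a 3-tuple (head, sep, tail); when 'sep'
    is not found, head is the whole string and tail is empty.
    """
    head, _, tail = text.partition(sep)
    return head, tail


def extension(filename):
    """Return the text after the last '.' or '' if there is none (hypothetical helper)."""
    # rpartition() searches from the right; without a '.' in the input
    # it returns ('', '', filename), so the separator slot is checked.
    _, dot, ext = filename.rpartition(".")
    return ext if dot else ""
```

Because both methods always return exactly three items, the result can be unpacked directly without checking how many pieces were produced.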
Mike Fährmann
017a72f448 [pixiv] improve input validation 2017-08-21 17:53:27 +02:00
Mike Fährmann
dcf42c5e89 [pixiv] add extractor for ranking lists 2017-08-20 20:21:52 +02:00
Mike Fährmann
4ea82ea556 [warosu] add thread extractor 2017-08-18 19:54:07 +02:00
Mike Fährmann
6078ec5908 restructure the output of --help
Using argument groups is a definite improvement over how things looked
previously, but general group membership of individual items might be
a thing to reconsider.
2017-08-16 19:56:50 +02:00
Mike Fährmann
9aa95fba8c [deviantart] adapt download URLs to use https
Even though DeviantArt is "completely switching over to HTTPS"[1],
every URL contained in an API response is still using HTTP

[1] https://danlev.deviantart.com/journal/DeviantArt-Is-Switching-To-HTTPS-697996906
2017-08-16 12:17:50 +02:00
Mike Fährmann
d70c66c516 fix "text:" downloader 2017-08-16 12:11:47 +02:00
Mike Fährmann
f7de048980 add additional debug output 2017-08-13 20:35:44 +02:00
Mike Fährmann
9bf9d64ad8 update unittests for util.py 2017-08-13 14:31:22 +02:00
Mike Fährmann
02e89700fc [foolfuuka] ensure sorted posts 2017-08-13 14:29:26 +02:00
Mike Fährmann
8bcf88bff7 [flickr] fix extraction
This issue was only noticeable with older Python versions, as these
don't exhibit a consistent ordering of dict keys.
2017-08-12 21:41:10 +02:00
Mike Fährmann
e3bfb8325a fix circular dependency
- util.py imported config.py and vice versa
- Python < 3.5 doesn't like this
2017-08-12 21:32:24 +02:00
Mike Fährmann
004456d5d5 properly update the config-dictionary
When using 2 or more config files, the values of the second would
improperly overwrite nested dictionaries of the first one.
The new method properly combines these nested dictionaries as well.
2017-08-12 20:07:27 +02:00
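The behaviour described in the commit above can be sketched as a recursive dictionary merge. This is an assumption about the general approach, not the repository's actual code; the function name `combine_dict` and the sample keys are used for illustration only.

```python
def combine_dict(a, b):
    """Recursively merge 'b' into 'a' and return 'a' (illustrative sketch).

    Values from 'b' win on conflicts, but when both sides hold a dict
    under the same key, the two dicts are merged instead of replaced.
    """
    for key, value in b.items():
        if isinstance(value, dict) and isinstance(a.get(key), dict):
            combine_dict(a[key], value)
        else:
            a[key] = value
    return a


# Two config files that both define values below "extractor"
first = {"extractor": {"pixiv": {"username": "u"}, "timeout": 30}}
second = {"extractor": {"pixiv": {"password": "p"}}}
merged = combine_dict(first, second)
```

With a plain `dict.update()`, the `"extractor"` entry of the second file would replace the first one entirely and `"timeout"` and `"username"` would be lost; the recursive version keeps them.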
Mike Fährmann
ae2d61e5b3 handle format string exceptions separately 2017-08-11 21:48:37 +02:00
Mike Fährmann
3c9f190757 extend output of --list-keywords 2017-08-10 17:36:21 +02:00
Mike Fährmann
cfa479fab5 update error message for unspecified exceptions
- ask user to report unexpected errors, which usually indicate
  extractor failure
- handle OSErrors separately (permissions, disk full, etc)
- revert 30eef52
2017-08-10 16:35:46 +02:00
Mike Fährmann
7e936e9c06 [luscious] simplify and remove dead code 2017-08-08 19:26:13 +02:00
Mike Fährmann
d74a635e41 [util] update 'default' values and improve test coverage
for 'code_to_language()' and 'language_to_code()'
2017-08-08 19:22:04 +02:00
Mike Fährmann
0245a0ba5f fix extraction and update test results
- fixes for hbrowse, imgyt, imgcandy, hosturimage
- test updates for deviantart, gfycat
2017-08-08 19:11:13 +02:00
Mike Fährmann
abd7c559cd [yonkouprod] remove module
Every manga chapter on this site has been removed.
2017-08-07 18:32:14 +02:00
Mike Fährmann
da7219ba74 [kisscomic] remove module
Image links on this site are dead.
2017-08-07 18:28:35 +02:00
Mike Fährmann
852e7acd31 [twitter] ignore "Promoted Tweets" 2017-08-06 13:43:08 +02:00
Mike Fährmann
915a0137de improve 'extractor.request'
- add 'fatal' argument
- improve internal logic and flow
- raise known exception on error
- update exception hierarchy
2017-08-05 16:11:46 +02:00
rachmadani haryono
dcd573806e chg: dev: fix error (#32)
* fix: dev: error

* fix: dev: AttributeError when getting artist

* fix: dev: typo on luscious parser
2017-08-04 15:01:10 +02:00
Mike Fährmann
c4713404c8 [directlink] improve URL pattern 2017-08-02 21:06:49 +02:00
Mike Fährmann
d443822fdb [luscious] get correct image URLs (fixes #33)
Instead of using thumbnail URLs and modifying them, the extractor now
goes through every single image-page and gets its download URL from
there.
2017-08-02 19:58:13 +02:00
Mike Fährmann
6950708e52 [hentaicdn] use HTTPS 2017-08-02 18:31:21 +02:00
Mike Fährmann
4f1e6c109f [deviantart] remove 'invalid escape sequence' warning
- use r"\w" or "\\w" instead of "\w"
2017-07-27 20:50:33 +02:00
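The two spellings named in the commit above produce the same pattern string. A minimal sketch, with variable names and the sample input chosen for illustration:

```python
import re

# A raw string keeps the backslash as-is ...
pattern_raw = r"\w+"
# ... and so does an explicitly escaped backslash in a normal string.
pattern_escaped = "\\w+"

# Both are the three characters: backslash, 'w', '+'
same = pattern_raw == pattern_escaped

# Either form can be handed to the re module unchanged.
match = re.match(pattern_raw, "deviantart_42 rest")
word = match.group(0)
```

Writing `"\w"` in a normal string relies on Python passing the unrecognized escape through unchanged, which is what newer Python versions warn about.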
Mike Fährmann
c864be479e [directlink] update URL pattern & PEP 8
- combine some file extensions
- don't match '.je'
- line length < 80
2017-07-27 20:46:15 +02:00
H R X N
45f9d64c23 Update directlink.py with additional file exts. (#30)
Add WebP, still not that common, but it's increasing.
Add 3rd JPEG variant (https://en.wikipedia.org/wiki/JPEG#JPEG_filename_extensions)
Never seen JFIF in the wild, would probably be overkill.
Extend Ogg formats (https://en.wikipedia.org/wiki/Ogg; https://wiki.xiph.org/MIME_Types_and_File_Extensions)
2017-07-27 20:40:00 +02:00
Mike Fährmann
4357966a70 [kissmanga] make URL pattern case-insensitive (fixes #28) 2017-07-26 10:36:59 +02:00
Mike Fährmann
493bd235cf workaround for missing 'assert_called_once' method
this method was introduced in Python 3.6, but calling it still
works (i.e. it doesn't cause the test to fail) on Python 3.3/3.4
2017-07-26 10:33:15 +02:00
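One possible shape for such a workaround is to check `call_count` directly instead of relying on the mock's own assertion method. This is a sketch under that assumption; the helper name `assert_called_once` as a free function is hypothetical and not taken from the repository's tests.

```python
from unittest.mock import Mock


def assert_called_once(mock):
    """Fail unless 'mock' was called exactly once (hypothetical helper).

    Uses Mock.call_count, so it does not depend on the mock object
    providing an 'assert_called_once' method of its own.
    """
    if mock.call_count != 1:
        raise AssertionError(
            "expected exactly one call, got {}".format(mock.call_count))


callback = Mock()
callback("first")
assert_called_once(callback)  # passes: exactly one call so far
```

The explicit check fails loudly on every Python version, whereas an auto-created mock attribute can be called without verifying anything.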
Mike Fährmann
7aa9fa796a code cleanup and fixes 2017-07-25 14:59:41 +02:00
Mike Fährmann
f08af03845 Merge branch 'cookies' 2017-07-25 14:04:53 +02:00
Mike Fährmann
55f048d02b ignore case of cookiejar magic strings 2017-07-24 18:33:42 +02:00
Mike Fährmann
de68cf84a8 release version 0.9.1 2017-07-24 11:36:21 +02:00
Mike Fährmann
f53bf1a323 [thebarchive] add thread extractor 2017-07-23 15:45:17 +02:00
Mike Fährmann
b8cf434bb0 [rebeccablacktech] add thread extractor 2017-07-23 15:41:56 +02:00
Mike Fährmann
808f67ba7d use 'cookiedomain' for cookies set by object-config-values
otherwise these cookies would not be picked up by the
_check_cookies() method.
2017-07-22 15:43:35 +02:00
Mike Fährmann
390eeded4c [mangazuki] support 'raws.…' subdomain 2017-07-21 16:25:56 +02:00
Mike Fährmann
4a60f6068a [mangazuki] add manga extractor 2017-07-20 16:02:09 +02:00
Mike Fährmann
394241cd6f [2chan] fix extraction 2017-07-20 15:01:47 +02:00
Mike Fährmann
a13eb6010f [fallenangels] fix extraction of chapter URLs 2017-07-20 14:58:47 +02:00
Mike Fährmann
1cb1d2e0a3 [mangazuki] add chapter extractor 2017-07-19 17:20:03 +02:00
Mike Fährmann
2f2e363c97 [imgur] use /a/<key>/all as album-url 2017-07-18 21:06:31 +02:00
Mike Fährmann
1cec03c9c6 [imgur] fix extraction of large albums 2017-07-18 12:42:19 +02:00
Mike Fährmann
0610ae5000 skip login if cookies are present 2017-07-17 10:33:36 +02:00
Mike Fährmann
f105782435 [fireden] add thread extractor 2017-07-15 14:51:58 +02:00
Mike Fährmann
c93f7d7496 [archiveofsins] add thread extractor 2017-07-15 13:23:04 +02:00
Mike Fährmann
96e13604da [archivedmoe] add thread extractor 2017-07-14 13:25:53 +02:00