Commit Graph

829 Commits

Author SHA1 Message Date
Mike Fährmann
b81d068a6d [flickr] add favorites extractor (#16) 2017-06-02 16:35:04 +02:00
Mike Fährmann
c921b4f32a code cleanup and fixing tests 2017-06-02 09:10:58 +02:00
Mike Fährmann
72f1c6f87a [flickr] add support for flic.kr/p/... URLs
Example:
    https://flic.kr/p/FPVo9U
2017-06-02 09:01:35 +02:00
Mike Fährmann
93e5d8cba3 [flickr] add album extractor 2017-05-31 17:31:51 +02:00
Mike Fährmann
659c65dbb0 [flickr] add image extractor 2017-05-30 17:43:02 +02:00
Mike Fährmann
b6fffa9e26 [directlink] update filename format and metadata 2017-05-30 17:33:09 +02:00
Mike Fährmann
c184e47ee3 put common directory- and filename formats in base classes 2017-05-30 12:10:16 +02:00
Mike Fährmann
bce51e90a5 [reddit] support sorting options and sub-options (#15)
Example:
    https://www.reddit.com/r/<subreddit>/top/?sort=top&t=month
    (the 'sort=top' parameter is irrelevant and can be omitted)
2017-05-29 12:45:35 +02:00
Mike Fährmann
5f45ce2930 [gfycat] add "format" config key to select a video format
Possible values:
    - one of "mp4" (default), "webm", "gif", "webp", "mjpg"

If the selected format is not available, "mp4", "webm" and "gif"
(in that order) will be tried instead, until an available format
is found.
2017-05-29 12:16:37 +02:00
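
Note on 5f45ce2930: the fallback order described in this commit message can be sketched roughly as below. This is only an illustration, not the project's actual code; the function name `select_format` and the `available` argument (the set of formats a given clip offers) are hypothetical.

```python
# Hypothetical sketch of the fallback order from the commit message.
FALLBACK_FORMATS = ("mp4", "webm", "gif")


def select_format(available, preferred="mp4"):
    """Return the preferred format if available, else the first fallback.

    Returns None when none of the candidates is available.
    """
    # Try the configured format first, then "mp4", "webm", "gif" in order.
    candidates = [preferred]
    candidates += [fmt for fmt in FALLBACK_FORMATS if fmt != preferred]
    for fmt in candidates:
        if fmt in available:
            return fmt
    return None
```

For example, with `preferred="webp"` and only "webm" and "gif" on offer, this sketch would pick "webm".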
Mike Fährmann
011659ced5 [imgur] add "mp4" config key to decide between GIF and MP4
possible values:
    - false   : always use GIF
    - true    : use MP4 if "prefer_video" flag is set,
                GIF otherwise (default)
    - "always": always use MP4
2017-05-29 08:48:07 +02:00
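
Note on 011659ced5: the three values of the "mp4" key can be read as a small decision table. The sketch below assumes a hypothetical helper `use_mp4` that receives the configured value and the image's "prefer_video" flag; it is not taken from the imgur module itself.

```python
def use_mp4(mp4_option, prefer_video):
    """Decide between MP4 (True) and GIF (False) for one image.

    mp4_option   -- configured value: False, True or "always"
    prefer_video -- the "prefer_video" flag of the image
    """
    # "always" must be checked first because any non-empty string is truthy.
    if mp4_option == "always":
        return True
    if mp4_option:
        # true (default): follow the "prefer_video" flag
        return bool(prefer_video)
    # false: always use GIF
    return False
```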
Mike Fährmann
48ccee2505 [gfycat] add image extractor 2017-05-28 17:09:54 +02:00
Mike Fährmann
bf452a8516 [imgur] choose .mp4 over .gif if available 2017-05-27 11:49:29 +02:00
Mike Fährmann
f79320e35b fix tests 2017-05-27 11:47:15 +02:00
Mike Fährmann
67791e1b36 [imgur] improve and add image extractor 2017-05-26 22:30:09 +02:00
Mike Fährmann
99b72130ee [reddit] enable recursion (#15)
reddit extractors now recursively visit other submissions/posts
linked to in the initial set of submissions.
This behaviour can be configured via the 'extractor.reddit.recursion'
key in the configuration file or by `-o recursion=<value>`.

Example:
{"extractor": {
  "reddit": {
   "recursion": <value>
}}}

Possible values:
* -1 - infinite recursion (don't do this)
*  0 - recursion is disabled (default)
*  1 and higher - maximum recursion level
2017-05-26 17:01:27 +02:00
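
Note on 99b72130ee: one way to picture the 'recursion' values is a depth-limited traversal of linked submissions. The sketch below is an assumption-laden illustration, not the extractor's implementation: `collect_submissions` and the `links` mapping (submission ID to the IDs it links to) are hypothetical, and the `seen` set is added here so that even -1 terminates on cyclic links.

```python
from collections import deque


def collect_submissions(start_ids, links, recursion=0):
    """Return submission IDs in visiting order.

    recursion -- -1: no depth limit, 0: disabled, N >= 1: maximum level
    """
    seen = set()
    order = []
    queue = deque((sid, 0) for sid in start_ids)
    while queue:
        sid, depth = queue.popleft()
        if sid in seen:
            continue  # never visit the same submission twice
        seen.add(sid)
        order.append(sid)
        # Follow links only while the maximum level has not been reached.
        if recursion < 0 or depth < recursion:
            queue.extend((nxt, depth + 1) for nxt in links.get(sid, ()))
    return order
```

With `links = {"a": ["b"], "b": ["c"], "c": ["a"]}`, a value of 0 visits only "a", 1 visits "a" and "b", and -1 visits all three.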
Mike Fährmann
691c4dd709 support direct image links 2017-05-24 12:51:18 +02:00
Mike Fährmann
d2dceb35b7 implement context-manager to blacklist extractors 2017-05-24 12:42:37 +02:00
Mike Fährmann
e425243b1e [reddit] some small fixes
- filter or complete some URLs
- remove the 'nofollow:' scheme before printing URLs
- (#15)
2017-05-23 11:48:00 +02:00
Mike Fährmann
a22892f494 [reddit] add subreddit- and submission-extractor
- these extractors scan submissions and their comments for
  (external) URLs and defer them to other extractors
- (#15)
2017-05-23 09:38:50 +02:00
Mike Fährmann
832a4a8ee9 [fallenangels] add manga extractor 2017-05-21 10:37:38 +02:00
Mike Fährmann
f226417420 simplify code by using a MangaExtractor base class 2017-05-20 11:27:43 +02:00
Mike Fährmann
2974d782a3 [yomanga] remove module
site has been shut down
2017-05-20 11:18:44 +02:00
Mike Fährmann
cbb4323f66 add setup.cfg to configure flake8 2017-05-19 19:22:39 +02:00
Mike Fährmann
232fe2dd08 improve the test extractor 2017-05-19 14:04:52 +02:00
Mike Fährmann
b0131ea402 [fallenangels] support this site's Vietnamese version
- https://truyen.fascans.com/
2017-05-18 15:22:25 +02:00
Mike Fährmann
b6b214f7e9 [deviantart] fix headers for custom-style journals
example: http://shimoda7.deviantart.com/journal/Temporary-absence-231936282
2017-05-15 15:58:06 +02:00
Mike Fährmann
e9a2738257 [deviantart] support images on top of journal entries
example: http://raxnae.deviantart.com/art/Kami-s-Journal-679482236
2017-05-13 21:42:29 +02:00
Mike Fährmann
92597f46d4 [deviantart] add title to journals 2017-05-13 15:36:52 +02:00
Mike Fährmann
107d29ad8a improve handling of text:... URLs
- don't require // after the colon
- open output files in text mode
2017-05-12 14:10:25 +02:00
Mike Fährmann
677c8ced11 [deviantart] add "journal" extractor
(#14)
2017-05-10 17:21:33 +02:00
Mike Fährmann
e5f79ae839 [deviantart] add support for all media types
- this includes
  - images
  - videos
  - flash-animations
  - journals

- also renamed some of the extractors
  - User  -> Gallery
  - Image -> Deviation
2017-05-10 16:45:45 +02:00
Mike Fährmann
9f1c83297f [pinterest] allow URLs with any TLD 2017-05-08 15:08:39 +02:00
Mike Fährmann
b3b92ac243 [deviantart] support "All" favorites and add "mature" option
- since there is apparently no actual way to get the "All" favorites
  listing via API, corresponding URLs (.../favourites/?catpath=/) will
  be handled by yielding all deviations from all favorite collections of
  that user

- the "mature" config key works on a per extractor basis (like "username"
  or "password"). values can be the strings "true" or "false", or the
  booleans true or false.

- (#14)
2017-05-06 21:26:27 +02:00
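
Note on b3b92ac243: since the "mature" key accepts both the strings "true"/"false" and the booleans true/false, a reader of the config value has to normalize it somehow. A minimal sketch, assuming a hypothetical helper `parse_mature` (rejecting any other value is a choice made here, not something the commit states):

```python
def parse_mature(value):
    """Normalize the "mature" config value to a Python bool."""
    if isinstance(value, bool):
        return value
    if value == "true":
        return True
    if value == "false":
        return False
    # Assumption: anything else is treated as a configuration error.
    raise ValueError("invalid value for 'mature': {!r}".format(value))
```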
Mike Fährmann
7376ad7f3d [deviantart] turn the "Mature Content Filter" off
(#14)
2017-05-06 14:56:41 +02:00
Mike Fährmann
cfbf79d788 [pixiv] fix login 2017-05-05 10:38:22 +02:00
Mike Fährmann
85a46ed700 [booru] fix issue with multiple tags 2017-05-04 11:58:51 +02:00
Mike Fährmann
fc9223c072 add '--abort-on-skip' option and ability to control skip behavior
the 'skip' config option controls skipping behavior:
    true    - skip download if the file already exists (default)
    false   - download and overwrite the file even if it exists
    "abort" - abort extractor run if a download would be skipped
              (same as '--abort-on-skip')
2017-05-03 15:26:04 +02:00
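
Note on fc9223c072: the three 'skip' values amount to the decision sketched below. The names `decide` and `AbortRun` are hypothetical and only illustrate the behavior listed in the commit message; how the real code signals an abort is not stated there.

```python
class AbortRun(Exception):
    """Hypothetical signal that the extractor run should stop."""


def decide(file_exists, skip=True):
    """Return "download" or "skip" for one file, or raise AbortRun.

    skip -- True (default), False or "abort"
    """
    if not file_exists:
        return "download"
    # "abort" must be tested before the truthiness check below.
    if skip == "abort":
        raise AbortRun("a download would be skipped")
    if skip:
        return "skip"
    # skip is false: download again and overwrite the existing file
    return "download"
```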
Mike Fährmann
d948ba1322 [readcomics] remove module
- site has been unavailable for two weeks
- (#12)
2017-05-01 11:44:12 +02:00
Mike Fährmann
a610b35a0d [mangashare] remove module
this site has been unavailable for at least two months
2017-05-01 11:06:38 +02:00
Mike Fährmann
4e8587bad4 [pixiv] add support for https://i.pximg.net URLs 2017-04-30 22:54:49 +02:00
Mike Fährmann
e41efbd2d9 [kissmanga] fix edge-case 2017-04-30 11:02:32 +02:00
Mike Fährmann
ffd72424bf [kissmanga] another attempt at getting the AES key 2017-04-29 15:58:33 +02:00
Mike Fährmann
af56887a47 [exhentai] fall back to e-hentai if no username is given 2017-04-28 15:59:56 +02:00
Mike Fährmann
4b967fa189 implement and use extractor.config() method 2017-04-25 17:12:48 +02:00
Mike Fährmann
82ab1fca07 [seiga] reduce cache maxage to one week 2017-04-24 15:25:20 +02:00
Mike Fährmann
ec48d25afc [pawoo] fix extraction results 2017-04-22 11:14:20 +02:00
Mike Fährmann
244ab75cad [kissmanga] update AES key retrieval 2017-04-21 20:36:47 +02:00
Chen John L
a5485a46cb fixed the module for pixhost 2017-04-21 19:54:10 +08:00
Mike Fährmann
13dc5d72bc update some extractors to use https 2017-04-20 13:32:40 +02:00
Mike Fährmann
342371086b [pawoo] add extractors for accounts and statuses
https://pawoo.net is a Mastodon[1] instance hosted by Pixiv
[1] https://github.com/tootsuite/mastodon
2017-04-19 10:17:43 +02:00