755 Commits

Author SHA1 Message Date
Mike Fährmann
8704d850bf add explicit proxy support (#76)
- '--proxy' as command-line argument
- 'extractor.*.proxy' as config option
2018-02-19 18:45:06 +01:00
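As a rough illustration of the two entry points named in this commit, the sketch below resolves a proxy setting from a command-line value or a config value and expands a single URL into a per-scheme mapping of the kind HTTP client libraries commonly accept. The function names, the nested config layout, and the rule that the command-line value wins are assumptions for illustration, not gallery-dl's actual code.

```python
def resolve_proxy(cli_value, config):
    """Return the proxy setting, preferring the command-line value (assumed precedence)."""
    if cli_value is not None:
        return cli_value
    # Hypothetical nested config layout: {"extractor": {"proxy": ...}}
    return config.get("extractor", {}).get("proxy")


def build_proxies(proxy):
    """Expand a proxy setting into a scheme -> URL mapping."""
    if proxy is None:
        return {}
    if isinstance(proxy, str):
        # One URL is used for both plain and TLS traffic.
        return {"http": proxy, "https": proxy}
    # Otherwise assume a ready-made mapping of scheme -> proxy URL.
    return dict(proxy)
```

With these helpers, `build_proxies(resolve_proxy(None, config))` would fall back to the config value whenever no command-line argument is given.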
Mike Fährmann
367b963d37 [pixiv] fix ugoira extraction ... again (#78)
Some animations are not available for mobile devices, so we
pretend to be a desktop browser when requesting the ugoira page.
2018-02-19 16:50:12 +01:00
Mike Fährmann
b79f1f2ca7 [pixiv] fix ugoira extraction (closes #78) 2018-02-19 08:51:09 +01:00
Mike Fährmann
d122203be1 [mangastream] fix extraction 2018-02-17 22:40:16 +01:00
Mike Fährmann
179bcdd349 adjust archive-ids 2018-02-13 04:50:45 +01:00
Mike Fährmann
3cec533c28 Merge branch 'archive' 2018-02-12 18:07:58 +01:00
Mike Fährmann
20af86b2ea add more extractor tests
for mangastream, reddit and imgur
2018-02-12 17:07:18 +01:00
Mike Fährmann
7e0207bcf4 [imgur] strip trailing '?1' from 'ext' 2018-02-10 21:33:40 +01:00
Mike Fährmann
cf147dfee9 [hentai2read] fix manga extraction
- site changed its HTML structure
2018-02-09 22:24:34 +01:00
Mike Fährmann
f5f2d29f56 [nijie] fix dojin extraction
- correctly extract artist_id
- set extension to "jpg" if it was empty and let filetype checks do
  the rest
2018-02-09 22:06:26 +01:00
Mike Fährmann
d38bf2f54c [tumblr] recognize /image/... URLs
xyz.tumblr.com/image/123 refers to the same images
as xyz.tumblr.com/post/123.
2018-02-08 23:08:14 +01:00
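The equivalence described in this commit can be sketched with a single regular expression that accepts both path forms. The pattern and the helper name below are hypothetical and only show the idea; the extractor's real pattern may differ.

```python
import re

# Hypothetical pattern: accepts both /post/<id> and /image/<id> on a blog subdomain.
POST_PATTERN = re.compile(
    r"(?:https?://)?([\w-]+)\.tumblr\.com/(?:post|image)/(\d+)"
)


def parse_post_url(url):
    """Return (blog, post_id) for a matching URL, or None."""
    match = POST_PATTERN.match(url)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

Both `xyz.tumblr.com/image/123` and `xyz.tumblr.com/post/123` then resolve to the same blog name and post ID.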
Mike Fährmann
5b3c34aa96 use generic chapter-extractor in more modules 2018-02-07 12:36:39 +01:00
Mike Fährmann
7b5ba69951 [hentaihere] ensure consistent extraction results
sometimes there is a random space before the next <a>
2018-02-05 15:26:25 +01:00
Mike Fährmann
377b78b3c9 [hentai2read] fix manga name extraction 2018-02-04 22:12:24 +01:00
Mike Fährmann
54c36a8a34 [subapics] add chapter- and manga-extractor (#70) 2018-02-04 22:02:10 +01:00
Mike Fährmann
2dd3aeeeae [komikcast] add chapter- and manga-extractor (#70) 2018-02-04 22:02:10 +01:00
Mike Fährmann
7a412f5c32 implement generic manga-chapter extractor 2018-02-04 22:02:04 +01:00
Mike Fährmann
6a07e38366 implement extractor.add() and .add_module()
... as a public and non-hacky way to add (external) extractors to
gallery-dl's pool and make them available for extractor.find()
2018-02-02 00:01:41 +01:00
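A minimal sketch of what such a registry could look like, assuming each extractor class carries a `pattern` attribute with a regular expression. The names `add`, `add_module`, and `find` mirror the commit message, but the bodies, the `_registry` list, and `ExampleExtractor` are illustrative assumptions rather than the project's implementation.

```python
import re

_registry = []  # hypothetical module-level pool of extractor classes


def add(cls):
    """Register one extractor class and return it (usable as a decorator)."""
    _registry.append(cls)
    return cls


def add_module(module):
    """Register every class in a module that defines a 'pattern' attribute."""
    added = []
    for obj in vars(module).values():
        if isinstance(obj, type) and hasattr(obj, "pattern"):
            added.append(add(obj))
    return added


def find(url):
    """Return an instance of the first registered extractor matching the URL."""
    for cls in _registry:
        match = re.match(cls.pattern, url)
        if match:
            return cls(match)
    return None


@add
class ExampleExtractor:
    # Hypothetical external extractor used only for this illustration.
    pattern = r"https?://example\.org/gallery/(\d+)"

    def __init__(self, match):
        self.gallery_id = match.group(1)
```

Returning the class from `add` keeps it usable as a decorator, so external code can register an extractor at definition time.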
Mike Fährmann
34873dbd90 set 'archive_fmt' values
These are going to be used to create a unique id for each image.
2018-02-01 15:30:49 +01:00
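One way to picture how an 'archive_fmt' value could yield a per-image ID is a format string filled in from the image's metadata. The function name, the category prefix, the example format string, and the metadata keys below are all assumptions made for this sketch.

```python
def archive_id(category, archive_fmt, metadata):
    """Build a per-image archive key from a format string and image metadata."""
    # Prefixing the category (an assumption here) keeps IDs from
    # different sites from colliding.
    return category + archive_fmt.format_map(metadata)


def is_new(seen, key):
    """Record the key and report whether it was seen for the first time."""
    if key in seen:
        return False
    seen.add(key)
    return True
```

A caller could skip a download whenever `is_new` returns `False` for the key built from that image's metadata.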
Mike Fährmann
a34cebc253 [luscious] jump to first image if cover does not link to it 2018-01-30 22:39:01 +01:00
Mike Fährmann
84a52a9256 add DownloadArchive class 2018-01-30 15:23:23 +01:00
Mike Fährmann
619387cbb1 update extractor unittest results 2018-01-28 18:29:05 +01:00
Mike Fährmann
db91cf871c document message identifiers 2018-01-23 21:38:30 +01:00
Mike Fährmann
0dd48d644f update test results
nothing broke, but things got updated or changed
2018-01-23 21:38:29 +01:00
Mike Fährmann
1e93955170 [batoto] remove module
Site officially shut down on 2018.01.18
2018-01-23 21:37:32 +01:00
Mike Fährmann
76509a6d3c [imgur] update test results 2018-01-20 18:49:29 +01:00
Mike Fährmann
9fccd7b783 [tumblr] provide fallback URLs (#64)
Each image now produces 3 URLs:
- amazonaws.com _raw (or _1280 for older images)
- amazonaws.com _500
- media.tumblr.com (URL returned by API)
2018-01-19 23:12:15 +01:00
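The fallback behaviour this commit describes amounts to trying candidate URLs in order until one succeeds. The sketch below assumes a caller-supplied `fetch` callable that raises `OSError` on failure; both the function name and that error convention are hypothetical.

```python
def download_first_available(urls, fetch):
    """Try each URL in order; return (url, data) for the first one that succeeds."""
    for url in urls:
        try:
            return url, fetch(url)
        except OSError:
            # This candidate failed; fall through to the next one.
            continue
    # Every candidate failed (or the list was empty).
    return None, None
```

Ordering the list from highest to lowest quality means the best available variant is kept without any extra bookkeeping.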
Mike Fährmann
9d69401391 initial support for multiple URLs per image 2018-01-17 22:08:19 +01:00
Mike Fährmann
91ed147cef [oauth] use custom key/secret values during oauth:… 2018-01-16 17:39:46 +01:00
Mike Fährmann
421a9740a3 [tumblr] add 'tumblr:' to force Tumblr extractor (#71) 2018-01-15 18:27:58 +01:00
Mike Fährmann
40d35c87bc [paheal] add tag- and post-extractors (closes #69) 2018-01-15 16:39:05 +01:00
Mike Fährmann
cc0c2cca57 [reddit] add extractor for reddit-hosted images (closes #68) 2018-01-14 18:55:42 +01:00
Mike Fährmann
f10ffc0839 update extractor blacklist to also allow classes 2018-01-14 18:47:22 +01:00
Mike Fährmann
35e09869d1 [mangapark] fix image URLs and use HTTPS 2018-01-12 14:59:49 +01:00
Mike Fährmann
9a049bdf51 [tumblr] add 'likes' extractor (#65) 2018-01-12 14:56:01 +01:00
Mike Fährmann
67d4462d26 [batoto] rudimentary Cloudflare bypass 2018-01-11 18:49:19 +01:00
Mike Fährmann
29d75fc3fa [tumblr] add support for OAuth authentication (#65) 2018-01-11 14:11:37 +01:00
Mike Fährmann
4edb25346e [slideshare] support mobile URLs (closes #67) 2018-01-10 14:15:00 +01:00
Mike Fährmann
e420a28bbc fix cookie tests 2018-01-09 21:43:52 +01:00
Mike Fährmann
b33efc99a4 [idolcomplex] add support for idol.sankakucomplex.com 2018-01-09 17:54:37 +01:00
Mike Fährmann
75b2e84b6d [tumblr] use s3.amazonaws.com for image URLs (#64) 2018-01-09 15:13:00 +01:00
Mike Fährmann
5b094328b5 [puremashiro] add chapter- and manga-extractor (closes #66)
Also adds support for region subtags in language codes (e.g. en-us)
2018-01-07 21:50:43 +01:00
Mike Fährmann
974e73bdbb [booru] smaller code adjustments 2018-01-06 17:48:49 +01:00
Mike Fährmann
03b8a548cb [tumblr] change reblogs default value to true (#61) 2018-01-06 15:52:08 +01:00
Mike Fährmann
d235f68f59 [tumblr] add option to filter reblogged posts (#61)
Reblogs are ignored by default, but can be included by setting
'extractor.tumblr.reblogs' to 'true'.
2018-01-05 13:05:57 +01:00
Mike Fährmann
a794fffc6d [batoto] extend chapter-string regex (closes #60)
Non-numeric chapter indices exist after all ...
2018-01-05 12:53:50 +01:00
Mike Fährmann
1219ebb7f5 [danbooru] use alternate subdomains; support safebooru 2018-01-04 00:51:04 +01:00
Mike Fährmann
9e8a84ab6c [booru] rewrite using Mixin classes (#59)
- improved code structure
- improved URL patterns
- better pagination to work around page limits on
  - Danbooru
  - e621
  - 3dbooru
2018-01-04 00:01:39 +01:00
Mike Fährmann
0876541e43 [seiga] update tests 2017-12-30 19:19:36 +01:00
Mike Fährmann
88bb0798fd delay initialization of PathFormat objects
This allows the DeviantArt group-check to be moved inside the
Extractor.items() method which in turn allows for better exception
handling.

As a new general rule:
Never raise exceptions during extractor initialization.
2017-12-29 22:15:57 +01:00
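The rule stated in this commit can be illustrated with a small sketch in which the constructor only stores its arguments and every check that might fail runs inside `items()`. The class names, the `AccessDenied` exception, and the `run` helper are hypothetical and stand in for the real extractor and job code.

```python
class AccessDenied(Exception):
    """Hypothetical error for resources the user may not view."""


class GroupExtractor:
    def __init__(self, name):
        # Only store arguments here; no network access, no validation that can fail.
        self.name = name

    def items(self):
        # Checks that may fail run once iteration starts, so the caller
        # can wrap a single loop in one try/except.
        if not self.name:
            raise AccessDenied("empty group name")
        for index in range(1, 4):
            yield "{}/{}".format(self.name, index)


def run(extractor):
    """Consume an extractor and convert its errors into a return value."""
    try:
        return list(extractor.items())
    except AccessDenied as exc:
        return "error: {}".format(exc)
```

Because construction never raises, code that creates extractors needs no error handling of its own; all failures surface in the one place that consumes `items()`.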