1859 Commits

Author SHA1 Message Date
Mike Fährmann
f117e32910 [danbooru] restore 'popular' functionality 2020-02-29 23:37:53 +01:00
Mike Fährmann
39b48d665b [hiperdex] use proper name for 'chapter_minor' 2020-02-29 00:18:54 +01:00
Mike Fährmann
8fbbaa54ff [bcy] fix partial image URLs (#613)
Images from new posts can have incomplete/partial URLs (1)
without any filename extension when fetching their data from
'/apiv3/user/selfPosts', so now all data gets taken from
'/item/detail/ID' pages.

It is currently unknown how to get the non-watermarked original version
of these images, or if that is possible at all. (2)
Images with a watermark will have their 'filter' metadata field set to
"watermark". For original images this field is an empty string "".

Enabling the 'noop' option will, in addition to the watermarked version,
yield the '~noop.image' filter version (3),
where 'filter' is set to "noop".

(1) "https://img-bcy-qn.pstatp.com/banciyuan/3ccdff22479c4060aadc86718209b281"
(2) "https://p1-bcy.byteimg.com/img/banciyuan/3ccdff22479c4060aadc86718209b281~tplv-banciyuan-logo-v3:wqnpnLLlhZLlpKfprZTnjotfCuWNiuasoeWFgyAtIEFDR-eIseWlveiAheekvuWMug==.image"
(3) "https://p1-bcy.byteimg.com/img/banciyuan/3ccdff22479c4060aadc86718209b281~noop.image"
2020-02-28 22:57:10 +01:00
Mike Fährmann
86c00f9e66 [danbooru] move extractor logic from booru.py 2020-02-28 22:53:45 +01:00
Mike Fährmann
1d4a369ea2 update extractor test results 2020-02-27 22:15:40 +01:00
Mike Fährmann
7625912b31 [piczel] improve and update
- fix tag names
- fix a bug in _pagination()
- parse datetime in 'created_at' as 'date'
- rewrite main loop
- replace user profile test
2020-02-27 22:13:12 +01:00
Mike Fährmann
913b8333cc write DeviantArt refresh-tokens to cache (#616)
Writing the token is currently disabled by default and must be
enabled with 'extractor.oauth.cache'.

'extractor.deviantart.refresh-token' must be set to '"cache"'
to use the cached token.
2020-02-25 22:55:11 +01:00
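
The two settings named in the commit above could be combined as in the following sketch. It assumes that dotted option names such as 'extractor.oauth.cache' map to nested keys in a JSON configuration file, and that a boolean true is what enables the cache; neither detail is stated in the commit message itself.

```python
import json

# Assumption: dotted option names correspond to nested JSON objects.
config = {
    "extractor": {
        # enable writing OAuth tokens to the cache (disabled by default)
        "oauth": {"cache": True},
        # tell the DeviantArt extractor to use the cached refresh-token
        "deviantart": {"refresh-token": "cache"},
    }
}

# Render the fragment as it might appear in a configuration file.
config_text = json.dumps(config, indent=4)
print(config_text)
```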
Mike Fährmann
2a4f227e08 warn about expired cookies 2020-02-25 00:34:42 +01:00
Mike Fährmann
4e361b3008 add tests for specific datetime values 2020-02-23 16:48:30 +01:00
Mike Fährmann
80ecb99089 [hitomi] fix extraction 2020-02-22 22:07:21 +01:00
Mike Fährmann
247c9e1416 [vsco] update gallery URL pattern 2020-02-22 21:39:31 +01:00
Mike Fährmann
19ae6f3fc4 update test results
- twitter:

    Don't test the whole kwdict, only the actual content, since the
    keyword hash changes whenever that user changes his display name.

- khinsider:

    Download host changed
2020-02-22 03:25:32 +01:00
Mike Fährmann
cc5079c844 [hiperdex] add chapter and manga extractors (closes #606) 2020-02-22 03:09:29 +01:00
Mike Fährmann
64bdec8430 [deviantart] check availability of intermediary URLs (fixes #609) 2020-02-21 03:10:53 +01:00
Mike Fährmann
5607dd3646 [hitomi] follow multiple redirects 2020-02-20 18:22:13 +01:00
Mike Fährmann
765b2a0527 [hentaihand] add extractors (closes #605) 2020-02-19 21:55:47 +01:00
Mike Fährmann
d94215d119 [tumblr] replace '-' with ' ' in tag searches (fixes #611)
To search for tags with actual minus signs in them
(there shouldn't be too many), manually replace those
with URL-encoded minus characters ('-' -> '%2d')
before passing them to gallery-dl:

https://s679874.tumblr.com/tagged/tag-with-minus
 ->
https://s679874.tumblr.com/tagged/tag%2dwith%2dminus
2020-02-17 23:29:13 +01:00
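
The manual replacement described in the commit above can be scripted. This is a minimal sketch; the helper name encode_minus_in_tag is hypothetical and not part of gallery-dl, and it only touches the part of the URL after '/tagged/'.

```python
def encode_minus_in_tag(url: str) -> str:
    """Percent-encode '-' in the tag part of a '/tagged/<tag>' URL."""
    head, sep, tag = url.partition("/tagged/")
    if not sep:
        return url  # not a tag search URL; leave it unchanged
    # '-' -> '%2d', so the minus sign is not turned into a space
    return head + sep + tag.replace("-", "%2d")


encoded = encode_minus_in_tag("https://s679874.tumblr.com/tagged/tag-with-minus")
print(encoded)
```

Restricting the replacement to the tag keeps hyphens in the host name or elsewhere in the URL intact.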
Mike Fährmann
e6cd49e78b update extractor test results 2020-02-16 21:48:46 +01:00
Mike Fährmann
5d9437b398 [vsco] skip "invalid" entities 2020-02-15 23:49:44 +01:00
Mike Fährmann
650f2b6d58 [furaffinity] accept sfw.furaffinity.net URLs (closes #608)
Just as an alias for regular URLs with no extra content filtering.
2020-02-15 22:47:12 +01:00
Mike Fährmann
74e684e828 [twitter] change default value for 'videos' to 'true'
Every other 'videos' option defaulted to 'true', except Twitter.
2020-02-14 01:03:42 +01:00
Mike Fährmann
c7cf9dd111 [furaffinity] support classic layout (#284) 2020-02-12 21:39:43 +01:00
Mike Fährmann
138135c190 [furaffinity] add extractors (#284) 2020-02-11 19:51:24 +01:00
Mike Fährmann
b9c574bd1d [patreon] log skipped files (#590) 2020-02-11 19:01:07 +01:00
Mike Fährmann
80ea9104b8 [8kun] adjust URL pattern 2020-02-11 19:00:13 +01:00
Mike Fährmann
ce26070231 [pixiv] reduce calls to '/user/detail' 2020-02-09 13:54:58 +01:00
Mike Fährmann
da0d5f6092 [oauth] add 'port' option (#604) 2020-02-09 13:45:44 +01:00
Mike Fährmann
719b63d0ca [bcy] add user and post extractors (#592) 2020-02-09 02:37:14 +01:00
Mike Fährmann
6426e3efc7 [khinsider] fix and improve metadata extraction 2020-02-07 18:20:38 +01:00
Mike Fährmann
b7eb6cecbb [pixiv] handle tags at the end of new bookmark URLs 2020-02-06 23:42:13 +01:00
Mike Fährmann
109f6c8685 [patreon] filter duplicate files per post (#590) 2020-02-05 23:38:24 +01:00
Mike Fährmann
b38cf59711 [sexcom] fix image URLs & parse 'date' fields 2020-02-04 22:52:00 +01:00
Mike Fährmann
1f4c9c5f9d [8kun] add thread and board extractors (closes #582) 2020-02-04 22:50:31 +01:00
Mike Fährmann
facc5daa6d [twitter] force old login page layout (fixes #584, fixes #598) 2020-02-02 17:24:53 +01:00
Mike Fährmann
d1de7dc296 [hitomi] implement workaround for "broken" redirects
Some galleries redirect to a new "version" with a different gallery id.
This new version might not be available any more, but the /reader/
page for the original gallery id can still work.
2020-02-02 17:24:23 +01:00
Mike Fährmann
40fe062851 [pixiv] fix user id for bookmarks API calls (closes #596) 2020-02-01 01:48:46 +01:00
Mike Fährmann
91aaaf1a9e [pixiv] add 'rating' metadata field (#595)
A human-friendlier representation of 'x_restrict'
2020-02-01 01:36:06 +01:00
Mike Fährmann
dff33b260c [reddit] add 'videos' option 2020-01-31 23:45:02 +01:00
Mike Fährmann
2ad43618cc [piczel] fix extraction 2020-01-31 15:46:21 +01:00
Mike Fährmann
cf7a67d67f [yaplog] remove module
Yaplog! ended its service on 2020-01-31
2020-01-31 12:56:54 +01:00
Mike Fährmann
e0dd073ce0 [twitter] replace embedded tweet test
the old one was deleted
2020-01-31 12:51:55 +01:00
Mike Fährmann
ec36df4851 [deviantart] fix video extraction from 'extended_fetch' results
DeviantArt is now serving videos from wixmp servers (1), instead of
the former film00.deviantart.net (2), even though those URLs are still
functional.

They seem to also have re-encoded those videos. The 10 MB 1080p video
from (2) is now only available in 720p at ~20 MB (with a higher
bitrate, but still …). Other videos are still available in 1080p, but
not this one for some reason.

(Changing the '720p' in (1) to '1080p' doesn't work.)

(1) https://wixmp-ed30a86b8c4ca887773594c2.wixmp.com/v/mp4/9feaa2c9-1baf-4fc2-84f7-f3384b34cefe/d5gxnb5-282a2e9a-b552-40ff-8542-b3c5eed823f5.720p.a837d7cec12c41be8ca2ee53152cea3a.mp4
(2) https://film00.deviantart.net/4c1d/v/mp4/2012/279/d/1/_video____brushes_i_use_in_paint_tool_sai_by_chi_u-d5gxnb5.mp4
2020-01-30 18:02:21 +01:00
Mike Fährmann
48be2266ed [deviantart] better error message for 'extended_fetch' (#585) 2020-01-30 15:25:33 +01:00
Mike Fährmann
71851a6241 [pixiv] update URLs of followed users to the new format 2020-01-30 15:17:42 +01:00
Mike Fährmann
d086f30b42 [reddit] restore archive keys for i.redd.it images 2020-01-29 22:12:55 +01:00
Mike Fährmann
56f1c96168 implement 'parent-directory' option (#551) 2020-01-29 18:32:37 +01:00
Mike Fährmann
ae07f92f7e [reddit] rewrite extractor logic (closes #551)
Handle images and videos hosted on Reddit "natively",
allowing them to use reddit-specific metadata to build directory
and file names.
2020-01-29 17:57:25 +01:00
Mike Fährmann
2852691d78 [paheal] replace test URL
searching for 'k-on' doesn't yield any results anymore
2020-01-27 22:19:41 +01:00
Mike Fährmann
2a9be48511 improve util.load/save_cookiestxt() and add tests
- take a file object as argument instead of a filename
- accept whitespace before comments ("   # comment")
- map expiration "0" to None and not the number 0
2020-01-25 23:02:15 +01:00
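
The three behaviors listed in the commit above can be illustrated with a simplified parser. This is only a sketch, not gallery-dl's util.load_cookiestxt(): the function name parse_cookiestxt, the returned dict layout, and the assumption of seven tab-separated fields per line (the common Netscape cookies.txt layout) are all choices made for this example.

```python
import io


def parse_cookiestxt(fp):
    """Parse cookies.txt lines from a file object (simplified sketch)."""
    cookies = []
    for line in fp:
        line = line.lstrip()  # accept whitespace before comments
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != 7:
            continue  # not a well-formed cookie line
        domain, _flag, path, secure, expires, name, value = fields
        cookies.append({
            "domain": domain,
            "path": path,
            "secure": secure == "TRUE",
            # expiration "0" becomes None, not the number 0
            "expires": None if expires == "0" else int(expires),
            "name": name,
            "value": value,
        })
    return cookies


# Taking a file object means in-memory data works as well as a real file.
sample = io.StringIO(
    "# Netscape HTTP Cookie File\n"
    "   # indented comment\n"
    ".example.org\tTRUE\t/\tFALSE\t0\tsession\tabc\n"
    ".example.org\tTRUE\t/\tTRUE\t1600000000\ttoken\txyz\n"
)
cookies = parse_cookiestxt(sample)
```

Accepting a file object rather than a filename is also what makes this kind of function easy to test, since no temporary file is needed.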
Mike Fährmann
e35c2ea1a6 [weibo] use youtube-dl to download from m3u8 manifests 2020-01-24 23:39:34 +01:00