1270 Commits

Author SHA1 Message Date
Mike Fährmann
a50e9faf0e [newgrounds] recognize direct links 2019-01-25 16:35:12 +01:00
Mike Fährmann
c5559fa07d [photobucket] improve subalbum extraction (#117)
The former implementation would produce a complete list of all subalbums
for each (sub)album extraction. This would for example result in a
level 2 subalbum getting "extracted" twice: once through the root-album
(level 0) and once through its parent album on level 1.

In the current implementation, only the next level of subalbums is
returned; each of those subalbums then handles its own next level
recursively.
2019-01-22 21:44:05 +01:00
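The traversal described in this commit message can be sketched roughly as follows. This is only an illustration: the `ALBUMS` tree and the `direct_subalbums` and `extract` helpers are hypothetical names and are not taken from the photobucket extractor itself.

```python
# Hypothetical album tree: album name -> names of its direct subalbums.
ALBUMS = {
    "root": ["a", "b"],
    "a": ["a1"],
    "b": [],
    "a1": [],
}


def direct_subalbums(album):
    """Return only the next level of subalbums for the given album."""
    return ALBUMS[album]


def extract(album, visited=None):
    """Visit an album, then let each direct subalbum handle its own level."""
    if visited is None:
        visited = []
    visited.append(album)
    for sub in direct_subalbums(album):
        extract(sub, visited)
    return visited


order = extract("root")
```

Because each album only reports its direct children, the level 2 album `a1` is reached once, through its parent `a`, and not a second time through `root`.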
Mike Fährmann
ecad69100a [photobucket] add 'image' extractor (#117) 2019-01-22 17:24:43 +01:00
Mike Fährmann
b50b30f1c9 [photobucket] download subalbums (#117) 2019-01-22 14:05:18 +01:00
Mike Fährmann
d19bac71be [photobucket] add 'album' extractor (#117) 2019-01-20 16:19:13 +01:00
Mike Fährmann
78b5f29a00 [sankaku] unescape tags 2019-01-20 16:18:13 +01:00
Mike Fährmann
9b8ac12eed [behance] enable 'categorytransfer' for collections (#157) 2019-01-19 20:02:20 +01:00
Mike Fährmann
217a0687ef [behance] add 'collection' extractor (closes #157) 2019-01-19 18:11:20 +01:00
Mike Fährmann
b8fed34548 add generalized extractors for Mastodon instances (#144)
Extractors for Mastodon instances can now be dynamically generated,
based on the instance names in the 'extractor.mastodon.*' config path.

Example:
{
    "extractor": {
        "mastodon": {
            "pawoo.net": { ... },
            "mastodon.xyz": { ... },
            "tabletop.social": { ... },
            ...
        }
    }
}

Each entry requires an 'access-token' value, which can be generated with
'gallery-dl oauth:mastodon:<instance URL>'.
An 'access-token' (as well as a 'client-id' and 'client-secret') for
pawoo.net is always available, but can be overwritten as necessary.
2019-01-19 14:28:59 +01:00
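One way such dynamic generation could look is sketched below. It assumes a hypothetical `MastodonExtractor` base class and a hypothetical `generate_extractors` helper; the real class layout, naming scheme, and option handling in gallery-dl may differ.

```python
class MastodonExtractor:
    """Hypothetical base class; stands in for the real extractor base."""
    instance = None
    access_token = None


def generate_extractors(config):
    """Create one extractor subclass per instance in 'extractor.mastodon.*'."""
    classes = {}
    instances = config.get("extractor", {}).get("mastodon", {})
    for instance, options in instances.items():
        # Illustrative naming scheme: "pawoo.net" -> "MastodonPawooNetExtractor"
        suffix = "".join(part.capitalize() for part in instance.split("."))
        name = "Mastodon" + suffix + "Extractor"
        classes[instance] = type(name, (MastodonExtractor,), {
            "instance": instance,
            "access_token": options.get("access-token"),
        })
    return classes


config = {
    "extractor": {
        "mastodon": {
            "pawoo.net": {"access-token": "TOKEN-A"},
            "mastodon.xyz": {"access-token": "TOKEN-B"},
        }
    }
}
extractors = generate_extractors(config)
```

The token values here are placeholders; real values would come from the OAuth step mentioned in the commit message.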
Mike Fährmann
66460337f1 [mangapark] fix extraction 2019-01-17 21:24:53 +01:00
Mike Fährmann
2ffc105887 [exhentai] extract tag metadata 2019-01-15 18:08:17 +01:00
Mike Fährmann
0fb98d1d79 [hbrowse] extract tag metadata 2019-01-15 18:08:10 +01:00
Mike Fährmann
9bbbadd93a [hbrowse] use HTTPS 2019-01-15 18:07:39 +01:00
Mike Fährmann
2fbf072723 [newgrounds] ensure consistent tag order
... plus some code restructuring
2019-01-14 16:14:19 +01:00
Mike Fährmann
d7a4739cf6 [hbrowse] print error message if site is down
... instead of crashing with a meaningless exception
2019-01-14 15:44:23 +01:00
Mike Fährmann
98c6520384 [pinterest] update root URL of API calls 2019-01-14 15:22:04 +01:00
Mike Fährmann
751e535948 [nhentai] fix extraction (closes #156)
Use JSON embedded in webpage since API endpoints have been disabled
2019-01-14 07:57:50 +01:00
Mike Fährmann
89df37a173 [artstation] use a separate dict for each asset (#154)
Using the same base-dict for each asset of a project causes unwanted
side effects like re-using image filename extensions for videos,
resulting in errors with the youtube-dl downloader.
2019-01-11 12:26:12 +01:00
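The side effect mentioned in this commit message can be reproduced with a minimal sketch. The function names and the `type` / `extension` keys are assumptions made for illustration; they are not the artstation extractor's actual data layout.

```python
def assets_shared(project, assets):
    """Problematic variant: every asset reuses the same base dict."""
    results = []
    for asset in assets:
        project.update(asset)      # mutates the shared dict in place
        results.append(project)    # every entry is the same object
    return results


def assets_separate(project, assets):
    """Fixed variant: each asset gets its own copy of the base dict."""
    results = []
    for asset in assets:
        data = project.copy()      # fresh dict per asset
        data.update(asset)
        results.append(data)
    return results


assets = [
    {"type": "image", "extension": "jpg"},
    {"type": "video"},             # no extension known for the video
]

shared = assets_shared({"title": "demo"}, assets)
separate = assets_separate({"title": "demo"}, assets)
```

In the shared variant the video entry still carries the image's `extension` value, while the separate variant leaves it unset so a downloader can determine it on its own.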
Mike Fährmann
1734a6c879 [reactor] detect "circular" redirects (#148) 2019-01-09 14:59:15 +01:00
Mike Fährmann
e53cdfd6a8 update build_supportedsites.py 2019-01-09 14:58:35 +01:00
Mike Fährmann
1e4d351ad3 [danbooru] add authentication support (closes #151)
... via HTTP Basic Auth with username and "password".

The password value in this case is not the account password itself,
but the "api_key" found in your user profile.
2019-01-09 14:19:07 +01:00
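For reference, an HTTP Basic Auth header built from a username and an API key could be sketched like this. The `basic_auth_header` helper is hypothetical and uses only the standard library; it does not show how gallery-dl itself passes the credentials.

```python
import base64


def basic_auth_header(username, api_key):
    """Build an 'Authorization' header with the API key in the password slot."""
    credentials = "{}:{}".format(username, api_key).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": "Basic " + token}


# Placeholder credentials for illustration only.
header = basic_auth_header("user", "key")
```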
Mike Fährmann
06cbf5f9c4 implement 'chapter-reverse' option (#149)
Setting it to `true` will start with the latest chapter instead of the
first one.
2019-01-07 18:22:33 +01:00
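The effect of the option can be pictured with a small sketch. It assumes the chapters are available as a list ordered from first to latest; `order_chapters` and its `chapter_reverse` parameter are hypothetical names for illustration.

```python
def order_chapters(chapters, chapter_reverse=False):
    """Return chapters latest-first if chapter_reverse is set, else unchanged."""
    if chapter_reverse:
        return list(reversed(chapters))
    return list(chapters)


chapters = ["ch1", "ch2", "ch3"]
default_order = order_chapters(chapters)
reversed_order = order_chapters(chapters, chapter_reverse=True)
```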
Mike Fährmann
e95b24f056 [reactor] add wait-min & -max options (#148) 2019-01-07 18:04:47 +01:00
Mike Fährmann
8e01cf0ef8 [reactor] generalize extractors (#148)
- support *.reactor.cc domains
- combine joyreactor and pornreactor modules
2019-01-07 17:06:47 +01:00
Mike Fährmann
1737d7f576 [joyreactor] fix and improve pagination (#148) 2019-01-03 22:13:38 +01:00
Mike Fährmann
8753627ef4 [joyreactor] improve error handling for faulty JSON (#148)
- remove all ASCII escape codes, not just \n and \r
- ignore faulty posts instead of letting the exception propagate
2019-01-03 16:31:25 +01:00
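The two points in this commit message might look roughly like the sketch below. It assumes that "ASCII escape codes" refers to control characters in the range 0x00 to 0x1f and that each post arrives as a separate JSON string; `parse_posts` is a hypothetical helper, not the extractor's real function.

```python
import json
import re

# ASCII control characters (0x00-0x1f), which includes \n and \r.
CONTROL_CHARS = re.compile(r"[\x00-\x1f]")


def parse_posts(raw_posts):
    """Parse each JSON string, skipping entries that are still invalid."""
    posts = []
    for raw in raw_posts:
        try:
            posts.append(json.loads(CONTROL_CHARS.sub("", raw)))
        except ValueError:
            # Faulty post: ignore it instead of letting the error propagate.
            continue
    return posts


raw_posts = [
    '{"id": 1, "text": "a\nb"}',   # raw newline inside a JSON string
    '{"id": 2',                    # truncated, cannot be repaired
    '{"id": 3}',
]
posts = parse_posts(raw_posts)
```

`json.JSONDecodeError` is a subclass of `ValueError`, so catching `ValueError` covers the decoding failures here.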
Mike Fährmann
a36f52a730 [joyreactor] add extractor for search results (#148) 2019-01-03 16:25:56 +01:00
Mike Fährmann
a303efb597 [mangadex] handle manga pages without chapters 2019-01-03 16:22:12 +01:00
Mike Fährmann
0afa913de4 [tumblr] add tests for hidden and private blogs (#145)
Hidden / dashboard-only blogs are pretty straightforward and "only"
require a valid 'access-token' and 'access-token-secret' for the given
'api-key' and 'api-secret', so that signed OAuth1.0 requests are possible.

Private / password-protected blogs, on the other hand, are a bit
cumbersome. In addition to a valid 'access-token' and
'access-token-secret', they also require the account belonging to those
tokens to be a member of the blog itself. Knowing the password and
entering it in the website isn't enough to access a blog through the
API. Following a private blog is also impossible, so that option can't
work either.
2019-01-03 16:12:24 +01:00
Mike Fährmann
fa7fa2f8ff [deviantart] update tests 2019-01-01 15:39:34 +01:00
Mike Fährmann
b7b5456a32 [kissmanga] use HTTPS 2018-12-30 14:04:46 +01:00
Mike Fährmann
259123732f [readcomiconline] improve comic-page parsing 2018-12-30 13:19:23 +01:00
Mike Fährmann
4ab0960083 [reddit] add metadata to extracted URLs 2018-12-29 17:52:43 +01:00
Mike Fährmann
2f4f60de33 [tumblr] add tests for each post type 2018-12-27 22:41:42 +01:00
Mike Fährmann
98314aa04c [mangapark] detect non-existent chapters 2018-12-27 21:41:50 +01:00
Mike Fährmann
6c71e9cf5d [deviantart] add separate 'sta.sh' extractor (#113)
- supports multiple stashed deviations per page
- explicitly mentions sta.sh support on supportedsites.rst
2018-12-26 18:56:57 +01:00
Mike Fährmann
f9ace0f4a3 [mangapark] fix manga extraction ... again 2018-12-26 18:56:57 +01:00
Mike Fährmann
28f9539551 [tumblr] change default values for post types and inline media 2018-12-26 18:55:59 +01:00
Mike Fährmann
5be95034ba [tumblr] add option to download avatars (#137) 2018-12-26 14:29:30 +01:00
Mike Fährmann
7471933d5f use extractor.request for all other API calls
- deviantart
- pawoo
- pixiv
- reddit
2018-12-22 14:42:23 +01:00
Mike Fährmann
995844c915 [instagram] relax test pattern even more 2018-12-22 14:25:55 +01:00
Mike Fährmann
2e5f82e59e [tumblr] don't follow 'external' Tumblr URLs (#139) 2018-12-22 14:05:43 +01:00
Mike Fährmann
0c9762f00e [mangapark] fix extraction 2018-12-22 13:52:48 +01:00
Mike Fährmann
c9ef5ed364 [luscious] ensure URLs have a scheme 2018-12-21 17:56:51 +01:00
Mike Fährmann
851ee9f89f [sensescans] replace tests
the old ones got removed
2018-12-21 16:05:07 +01:00
Mike Fährmann
0be7ee3106 [hitomi] fix image subdomains (closes #142)
galleries with an ID ending in 1 need some special treatment
2018-12-14 16:15:06 +01:00
Mike Fährmann
fe96835d25 [kissmanga] add fallback for chapter-string parsing (#20) 2018-12-14 16:08:36 +01:00
Mike Fährmann
4d73cc785d update test results 2018-12-14 16:07:32 +01:00
Mike Fährmann
049a9575c4 [tumblr] fix inline extraction #2
Using only the "comment" field isn't enough ...

[ci skip]
2018-12-11 21:57:20 +01:00
Mike Fährmann
f6bf66f72c [pixiv] create directory for each "work" item (#136) 2018-12-11 20:37:47 +01:00