265 Commits

Author SHA1 Message Date
David Hoppenbrouwers
b17e2dcf93 [wallpapercave] add extractor for images (#2205) 2022-02-11 23:44:51 +01:00
Thomas Jost
a7de819aca [lightroom] add Lightroom gallery extractor (#2263) 2022-02-11 21:30:59 +01:00
Mike Fährmann
563bd0ecf4 [danbooru] inherit from BaseExtractor
- merge danbooru and e621 code
- support booru.allthefallen.moe (closes #2283)
- remove support for old e621 tag search URLs
2022-02-11 21:01:51 +01:00
enormous-muscles
55326377d8 Add Kohlchan extractor (#2251) 2022-02-04 23:22:17 +01:00
Vrihub
96fcff182c generic extractor (#735)
* Generic extractor, see issue #683

* Fix failed test_names test, no subcategory needed

* Prefix directory_fmt with "generic"

* Relax regex (would break some urls)

* Flake8 compliance

* pattern: don't require a scheme

This fixes a bug when we force the generic extractor on urls without a
scheme (that are allowed by all other extractors).

* Fix using g: and r: on urls without http(s) scheme

Almost all extractors accept urls without an initial http(s) scheme.

Many extractors also allow for generic subdomains in their "pattern"
variable; some of them implement this with the regex character class
"[^.]+" (everything but a dot).

This leads to a problem when the extractor is given a url starting
with g: or r: (to force using the generic or recursive extractor)
and without the http(s) scheme: e.g. with "r:foobar.tumblr.com"
the "r:" is wrongly considered part of the subdomain.

This commit fixes the bug, replacing the too generic "[^.]+" with the
more specific "[\w-]+" (letters, digits and "-", the only characters
allowed in domain names), which is already used by some extractors.

* Relax imageurl_pattern_ext: allow relative urls

* First round of small suggested changes

* Support image urls starting with "//"

* self.baseurl: remove trailing slash

* Relax regexp (didn't catch some image urls)

* Some fixes and cleanup

* Fix domain pattern; option to enable extractor

Fixed the domain section for "pattern", to pass "test_add" and
"test_add_module" tests.
Added the "enabled" configuration option (default False) to enable the
generic extractor. Using "g(eneric):URL" forces using the extractor.
2021-12-29 22:39:29 +01:00
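
The subdomain fix described in 96fcff182c can be reproduced in isolation. The sketch below is only an illustration: the two patterns and the `subdomain` helper are hypothetical and are not copied from any extractor. It shows why "[^.]+" swallows the "r:" prefix while "[\w-]+" rejects it.

```python
import re

# Hypothetical patterns modelled on the commit message above;
# they are NOT taken from an actual extractor.
LOOSE = re.compile(r"(?:https?://)?([^.]+)\.tumblr\.com")
STRICT = re.compile(r"(?:https?://)?([\w-]+)\.tumblr\.com")


def subdomain(pattern, url):
    """Return the captured subdomain, or None if the URL does not match."""
    match = pattern.match(url)
    return match.group(1) if match else None


url = "r:foobar.tumblr.com"

# "[^.]+" accepts any character except a dot, so "r:" becomes part of
# the captured subdomain.
print(subdomain(LOOSE, url))    # r:foobar

# "[\w-]+" stops at ":", so the prefixed URL no longer matches and can be
# handled by whatever processes the "r:" prefix instead.
print(subdomain(STRICT, url))   # None

# URLs without a prefix still match, with or without a scheme.
print(subdomain(STRICT, "foobar.tumblr.com"))          # foobar
print(subdomain(STRICT, "https://foobar.tumblr.com"))  # foobar
```

Note that in Python `\w` also matches the underscore, so "[\w-]+" is slightly broader than "letters, digits and -".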
Mike Fährmann
882c614281 add album extractor for lolisafe/chibisafe instances
- support bunkr.is (closes #2038)
- support zz.ht    (closes #2105)
2021-12-21 19:24:17 +01:00
Mike Fährmann
299bd2f1f5 [rule34us] add 'tag' and 'post' extractors (#1527) 2021-12-14 00:27:46 +01:00
Mike Fährmann
37c9dedee1 [seisoparty] remove module 2021-11-09 22:41:04 +01:00
Alice
bfd7401b1e [skeb] add 'user' and 'post' extractors (#1031) (#1971)
* Create skeb.py

* Update __init__.py

* Update supportedsites.py

* Update supportedsites.md

* Update supportedsites.py

* Update skeb.py
2021-10-26 20:00:41 +02:00
Mike Fährmann
918fc9974d [picarto] add 'gallery' extractor (closes #1931) 2021-10-13 01:22:10 +02:00
Mike Fährmann
e4684c5cb9 [desktopography] simplify (#1740) 2021-09-17 20:09:24 +02:00
Giacomo Rossetto
4a7d7899ff Implement desktopography extractor (#1740) 2021-09-17 19:59:51 +02:00
Mike Fährmann
20ee091289 [429chan] add 'thread' and 'board' extractors (closes #1773) 2021-08-21 22:46:22 +02:00
enormous-muscles
975e1ac6e2 Add Wikieat extractor (#1699)
* Add Wikieat extractor

* Add Wikieat extractor to extractor list
2021-08-12 15:13:20 +02:00
Mike Fährmann
da7297c0b9 [comicvine] add extractor (closes #1712) 2021-07-23 16:17:06 +02:00
Mike Fährmann
e4788fa663 [bbc] add 'gallery' and 'programme' extractors (closes #1706) 2021-07-22 20:37:05 +02:00
Mike Fährmann
36ac2197db [ytdl] add extractor for sites supported by youtube-dl
(#1680, #878)

Can be used by prefixing any URL with 'ytdl:',
or by setting 'extractor.ytdl.enabled' to 'true'.
2021-07-10 20:55:47 +02:00
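
As a rough illustration of the option named in 36ac2197db, the sketch below expands a dotted option path into the nested layout that the config examples elsewhere in this log use. It reads the option as the dotted path extractor.ytdl.enabled, which is an assumption; `option_to_config` is a hypothetical helper, not part of gallery-dl.

```python
import json


def option_to_config(path, value):
    """Expand a dotted option path into nested dictionaries (hypothetical helper)."""
    config = value
    # Wrap the value from the innermost key outwards
    for key in reversed(path.split(".")):
        config = {key: config}
    return config


config = option_to_config("extractor.ytdl.enabled", True)

# Print the fragment as it might appear in a JSON config file
print(json.dumps(config, indent=4))
```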
Mike Fährmann
267bbf5996 [mangasee] add 'chapter' and 'manga' extractors 2021-06-27 02:03:03 +02:00
Mike Fährmann
f74cf52e2b [seisoparty] add 'user' and 'post' extractors (#1635) 2021-06-25 18:40:11 +02:00
thatfuckingbird
e47952ac14 add extractors for fantia and fanbox (#1459)
* add extractors for fantia and fanbox

* appease linter

* make docstrings unique

* [fantia] refactor post extraction

* [fantia] capitalize

* [fantia] improve regex pattern

* code style

* capitalize

* [fanbox] use BASE_PATTERN for url regexes

* [fanbox] refactor metadata and post extraction

* [fanbox] improve url base pattern

* [fanbox] accept creator page links ending with /posts

* [fanbox] more tests

* [fantia] improved pagination

* [fanbox] misc. code logic improvements

* [fantia] finish restructuring pagination code

* [fanbox] avoid making a request for each individual post when processing a creator page

* [fanbox] support embedded videos

* [fanbox] fix errors

* [fanbox] document extractor.fanbox.videos

* [fanbox] handle "article" and "entry" post types, all embeds

* [fanbox] fix downloading of embedded fanbox posts
2021-04-25 19:39:13 +02:00
Hans Christian Gunawan
334d690687 [hentaicosplays] Add extractor (#1473) 2021-04-18 20:28:00 +02:00
Mike Fährmann
78d7ee3ef4 [yuki] remove module for yuki.la 2021-04-12 21:42:32 +02:00
FollieHiyuki
e3b9f88540 Add manganelo extractor (#1415) 2021-04-02 21:01:31 +02:00
Mike Fährmann
5aa30c3669 [tapas] add 'series' and 'episode' extractors (#692) 2021-03-27 18:28:16 +01:00
Mike Fährmann
62cfee4d28 [vk] initial support for albums (#474) 2021-03-23 19:02:16 +01:00
Mike Fährmann
fcdda6128c [mangastream] remove module 2021-03-16 23:52:36 +01:00
Mike Fährmann
c677ea19dd [mangareader] remove module 2021-03-16 23:48:55 +01:00
Mike Fährmann
71523aaab6 [architizer] add 'project' extractor (#1369) 2021-03-16 03:24:29 +01:00
Mike Fährmann
466966bf83 [hentaicafe] remove module 2021-03-14 17:19:57 +01:00
Mike Fährmann
97641cd151 [hentainexus] remove module 2021-03-14 17:19:57 +01:00
Mike Fährmann
c485d0a956 [philomena] add generalized extractors for philomena sites
(closes #1379)
2021-03-14 17:19:57 +01:00
Seonghyeon Cho
665499924d Support naver webtoon (#1331)
* Support naver webtoon (WIP)

* Apply patch

* Change filename format

* Fill test results

* Fill test result
2021-03-03 15:21:13 +01:00
topozorra
a9119da4d4 support tumblrgallery.xyz (#1298)
* support `tumblrgallery.xyz`

* fix format issues

* Refactor and add post and search page support

* Fix warnings

* A few improvements

* Better file names

* Fix linting errors

* move id closer to the beginning of the file name

Co-authored-by: topozorra <none>
2021-03-03 15:20:47 +01:00
Mike Fährmann
8821dceb79 use __import__() to dynamically load modules 2021-03-01 01:27:02 +01:00
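
For readers unfamiliar with the technique named in 8821dceb79, here is a minimal sketch of loading a module by name with `__import__()`. It uses standard-library modules so it runs anywhere; `load_module` is a hypothetical helper and does not reproduce the project's actual call.

```python
import sys


def load_module(package, name):
    """Import 'package.name' from strings and return the submodule object."""
    qualified = package + "." + name
    # For a dotted name, __import__() returns the top-level package,
    # so the submodule itself is looked up in sys.modules afterwards.
    __import__(qualified)
    return sys.modules[qualified]


# Stand-in for an extractor module chosen at runtime
json_decoder = load_module("json", "decoder")
print(json_decoder.__name__)  # json.decoder
```

Choosing the module from a string at runtime means a module is imported only when it is actually requested.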
loragja
7b5ee922b7 cyberdrop extractor (#1328)
* create cyberdrop extractor

* add cyberdrop to list of extractors

* fix formatting

* change class name from CyberdropExtractor to CyberdropAlbumExtractor

* add cyberdrop to list of supported sites

* attempt to clean up diff of supportedsites.rst

* replace regex with functions from text library
2021-02-21 20:42:45 +01:00
Mike Fährmann
595bdaa4be add extractors for gelbooru v0.1 sites
- support https://illusioncards.booru.org/  (closes #426)
- support https://the-collection.booru.org/ (closes #767)
- support https://allgirl.booru.org/
- closes #234, closes #473, closes #1238

To get gallery-dl to recognize other sites running Gelbooru v0.1
(most sites on booru.org), add one or more entries to the
'gelbooru_v01' block in your config file. For example:

{
    "extractor": {
        "gelbooru_v01": {
            "rozenmaidenbooru": {"root": "http://rm.booru.org"},
            "drawfriendsbooru": {"root": "http://drawfriends.booru.org"}
        }
    }
}
2021-02-17 02:36:27 +01:00
Mike Fährmann
08d7934c6e move extractors from booru.py into their own gelbooru_v02 module 2021-02-17 00:26:24 +01:00
Mike Fährmann
ae530f6365 [erome] add extractors for albums, users, searches (closes #409) 2021-02-07 22:58:19 +01:00
Mike Fährmann
7ca3bf7cb0 [pillowfort] add 'user' and 'post' extractors (#846) 2021-01-25 15:03:22 +01:00
Federico Ravasio
25297815bc [photovogue] added portfolio extractor (#1253) 2021-01-22 19:36:13 +01:00
Mike Fährmann
534194bf92 [unsplash] add extractors (#1197)
for
- single photos  (/photos/ID)
- user profiles  (/@USER)
- user likes     (/@USER/likes)
- search results (/s/photos/SEARCH)
2021-01-19 02:23:39 +01:00
Mike Fährmann
e07dfc4fe5 [kemonoparty] add 'user' and 'post' extractors (#1216) 2021-01-11 22:17:08 +01:00
Mike Fährmann
fa8ee6eac4 [derpibooru] add search and gallery extractors (#862) 2021-01-07 18:05:32 +01:00
Mike Fährmann
212ae0c399 [mangapanda] remove module
site now redirects to mangareader.net
2020-12-20 17:42:15 +01:00
Mike Fährmann
a3a863fc13 [booru] add generalized extractors for *booru sites
similar to cc15fbe7
2020-12-08 18:34:30 +01:00
Mike Fährmann
cc15fbe71a [moebooru] add generalized extractors for moebooru sites
- add support for sakugabooru.com (closes #1136)
- add support for lolibooru.moe   (closes #1050)

This allows users to dynamically add support for moebooru/myimouto
based sites by adding an entry to their config file
(like for foolslide, foolfuuka, etc)

For example:
{
    "extractor": {
        "moebooru": {
            "new-site-1": {"root": "https://site1.net"},
            "new-site-2": {"root": "https://www.site2.moe"}
        }
    }
}
2020-12-01 22:27:18 +01:00
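
One way to picture how config entries like the ones in cc15fbe71a could be turned into URL matchers is sketched below. This is an assumption-laden illustration, not gallery-dl's implementation: `build_instance_patterns` and `find_instance` are hypothetical names, and only the config layout comes from the commit message.

```python
import re

# Config layout as shown in the commit message above
config = {
    "extractor": {
        "moebooru": {
            "new-site-1": {"root": "https://site1.net"},
            "new-site-2": {"root": "https://www.site2.moe"},
        }
    }
}


def build_instance_patterns(instances):
    """Map each instance name to a regex matching URLs under its root."""
    patterns = {}
    for name, info in instances.items():
        # Drop the scheme and an optional "www." so both URL forms match
        domain = re.sub(r"^https?://(?:www\.)?", "", info["root"])
        patterns[name] = re.compile(
            r"(?:https?://)?(?:www\.)?" + re.escape(domain) + r"(?:/|$)")
    return patterns


def find_instance(url, patterns):
    """Return the name of the first instance whose pattern matches the URL."""
    for name, pattern in patterns.items():
        if pattern.match(url):
            return name
    return None


patterns = build_instance_patterns(config["extractor"]["moebooru"])
print(find_instance("https://site1.net/post?tags=example", patterns))  # new-site-1
print(find_instance("site2.moe/post", patterns))                       # new-site-2
print(find_instance("https://example.org/", patterns))                 # None
```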
Mike Fährmann
350b1afe1c speed up _list_classes() after iterating over all modules once 2020-10-26 22:18:15 +01:00
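
The commit message for 350b1afe1c does not say how the speed-up works. One plausible reading is a cache that is filled during the first pass over all modules; the sketch below shows that idea under that assumption. `MODULE_NAMES`, `list_classes`, and `import_count` are hypothetical, and standard-library modules stand in for extractor modules.

```python
import importlib

# Hypothetical stand-in for a list of extractor module names
MODULE_NAMES = ["json.decoder", "json.encoder"]

_cache = []                      # classes discovered so far
_remaining = list(MODULE_NAMES)  # modules not yet imported
import_count = 0                 # only here to make the effect observable


def list_classes():
    """Yield classes from all modules, importing each module at most once."""
    global import_count
    # Serve everything discovered so far without importing anything
    yield from _cache
    # Import the remaining modules lazily and remember their classes
    while _remaining:
        module = importlib.import_module(_remaining.pop(0))
        import_count += 1
        classes = [
            obj for obj in vars(module).values()
            if isinstance(obj, type) and obj.__module__ == module.__name__
        ]
        _cache.extend(classes)
        yield from classes


first = list(list_classes())   # imports every module once
second = list(list_classes())  # served entirely from the cache
print(import_count == len(MODULE_NAMES), first == second)
```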
Mike Fährmann
c874071f5a [kissmanga] remove module 2020-10-04 22:46:41 +02:00
Zanny
ebb7737b9b Weasyl Extractor (#977)
* weasyl extractor

* @kattjevfel suggested changes

* @mikf changes
2020-09-25 15:18:21 +02:00
choeronline
05b9ac8d37 [myhentaigallery] add extractor (#1001)
* adds support for myhentaigallery

* fixes linting issues in myhentaigallery extractor
2020-09-17 17:32:54 +02:00