Commit Graph

3005 Commits

Author SHA1 Message Date
Christian Paul
41fbc20020 [webtoons]: Add cookie rstagGDPR_DE=true (#1431) 2021-04-07 21:42:55 +02:00
Mike Fährmann
583bee7725 release version 1.17.2 2021-04-02 21:16:44 +02:00
FollieHiyuki
e3b9f88540 Add manganelo extractor (#1415) 2021-04-02 21:01:31 +02:00
Mike Fährmann
fd858eed7b [twitter] add 'user_likes' metadata field for liked tweets
i.e. the 'screen_name' of the user whose liked tweets get extracted.

Ideally this would replace 'user' or at least be in the same format,
but that would break backwards compatibility or be impossible/too
complicated thanks to API result differences.

(#1421)
2021-04-02 03:41:41 +02:00
Mike Fährmann
8d124a3766 [twitter] rename variables 2021-04-02 02:49:53 +02:00
Mike Fährmann
105f3c9666 [twitter] add extractor for direct image links (closes #1417) 2021-04-02 02:45:23 +02:00
Mike Fährmann
ec3d5d58a8 [vk] improve extractor (#474)
- fetch all photos
- add 'metadata' option
- fix extracting photos without '?' in URL
2021-04-01 14:35:56 +02:00
Mike Fährmann
ebd142e2a8 [twitter] don't use youtube-dl for cards when videos are disabled
(#1416)
2021-04-01 14:26:08 +02:00
Mike Fährmann
d5aad999dc [tapas] implement login with username & password (#692) 2021-03-30 01:45:28 +02:00
Mike Fährmann
e9ec91c811 [exhentai] improve image limits check
- check if current image is the '509 Bandwidth Exceeded' notification
  (https://ehgt.org/g/509.gif or https://exhentai.org/img/509.gif)
- remove 'limits' option
2021-03-29 19:01:13 +02:00
Mike Fährmann
387fe415d5 unescape items in text.split_html() 2021-03-29 02:12:29 +02:00
Mike Fährmann
36291176bc [pinterest] add 'search' extractor (#1411) 2021-03-29 01:41:28 +02:00
Mike Fährmann
058cc47e9b [bcy] improve pagination 2021-03-28 23:08:26 +02:00
Mike Fährmann
ddd48ceee5 update extractor test results 2021-03-28 23:06:44 +02:00
Mike Fährmann
1a540fbe00 [komikcast] fix extraction 2021-03-28 21:18:58 +02:00
Mike Fährmann
78fd63b8f0 remove 'text.clean_xml()'
was not used anywhere
2021-03-28 04:05:16 +02:00
Mike Fährmann
8553b218d9 replace calls to 'os.path.splitext()' with 'str.rpartition()'
Makes functions that used it more than twice as fast
and we can get rid of an import as well.
2021-03-28 04:01:27 +02:00
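
A minimal sketch of the kind of replacement this commit message describes, not the project's actual code. The helper names `ext_splitext` and `ext_rpartition` are hypothetical and only compare the two approaches for getting a filename extension.

```python
import os.path


def ext_splitext(filename):
    """Extension via os.path.splitext(), without the leading dot."""
    return os.path.splitext(filename)[1][1:]


def ext_rpartition(filename):
    """Extension via str.rpartition(); needs no import."""
    name, dot, ext = filename.rpartition(".")
    # Without any dot, rpartition() returns ("", "", filename),
    # so the separator tells us whether an extension was found.
    return ext if dot else ""


# Both helpers agree on ordinary filenames.
for filename in ("image.jpg", "archive.tar.gz", "README"):
    assert ext_splitext(filename) == ext_rpartition(filename)
```

The two are not equivalent in every case: `os.path.splitext()` ignores a leading dot (as in `.bashrc`) and never looks inside directory names, while `str.rpartition(".")` splits at the last dot wherever it appears.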
Mike Fährmann
0a9af56e3c build executables on GitHub Actions with Python 3.8
Python 3.9 is incompatible with Windows 7, so using a lower
Python version might allow those files to run on Windows 7.
2021-03-27 18:31:15 +01:00
Mike Fährmann
5aa30c3669 [tapas] add 'series' and 'episode' extractors (#692) 2021-03-27 18:28:16 +01:00
Mike Fährmann
ccfa5a8694 [twitter] better error message when logging in with 2FA (#1409) 2021-03-27 18:26:37 +01:00
Mike Fährmann
214ecf62ce [deviantart] fix arguments for search/popular results (#1408) 2021-03-27 18:26:10 +01:00
Magnus Boman
522d0a834c [aryion] Unescape paths too (#1414)
Without this you'll get paths like this:
  - Starcross - Ch. 2 &quot;The Ins and Outs of Sarah&quot;

This commit changes it to:
  - Starcross - Ch. 2 "The Ins and Outs of Sarah"
2021-03-27 18:25:38 +01:00
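
The effect described in this commit can be illustrated with Python's standard `html.unescape()`. This is only a sketch: whether the extractor calls this function directly or goes through a project helper is an assumption, and the sample string mirrors the example from the message.

```python
import html

# Path component as it might appear in the page source,
# with HTML entities in place of the quotation marks
escaped = "Starcross - Ch. 2 &quot;The Ins and Outs of Sarah&quot;"

# Convert entities back to the characters they stand for
unescaped = html.unescape(escaped)
print(unescaped)
```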
beesdotjson
5ad615f0db fix PixivFavoriteExtractor regex (#1405)
* fix PixivFavoriteExtractor regex

* do not use lookbehind
2021-03-25 14:59:33 +01:00
Mike Fährmann
62cfee4d28 [vk] initial support for albums (#474) 2021-03-23 19:02:16 +01:00
Mike Fährmann
0e601de67b [sankaku] simplify 'pool' tags (#1388)
normalize 'tags' and 'artist_tags' to a string-list
2021-03-23 18:45:45 +01:00
Mike Fährmann
d085ade9d5 [sankaku] add 'tag_string' metadata field (#1388)
The 'join()'ed version of 'tags'.
Handling lists in format strings isn't properly supported yet.
2021-03-23 15:42:13 +01:00
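
A short sketch of why a pre-joined string can be convenient next to a list field. The tag values and the space separator are made up for illustration, and plain `str.format()` stands in for the project's own format strings, which may behave differently.

```python
# Hypothetical sample data; not taken from any real post
tags = ["long_hair", "blue_eyes", "smile"]

# The 'join()'ed version of the list (separator is an assumption)
tag_string = " ".join(tags)

# Formatting the list directly yields its Python repr ...
as_list = "{tags}".format(tags=tags)

# ... while the joined string can be used as-is
as_string = "{tag_string}".format(tag_string=tag_string)
```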
Mike Fährmann
2dffd231b7 [sankaku] add enumeration index for books (#1388) 2021-03-23 15:32:54 +01:00
Mike Fährmann
139fb84108 [deviantart] fix username for 'watch' results (#794)
before it'd use "/" as username
2021-03-22 22:14:21 +01:00
Mike Fährmann
91c2e15da9 [deviantart] add support for posts from watched users (#794) 2021-03-22 19:25:04 +01:00
Mike Fährmann
03c20d8c8e [deviantart] update 'watch' URL pattern (#794) 2021-03-21 22:48:06 +01:00
Mike Fährmann
2846235669 [twitter] allow specifying a custom format for user results
(#1337)
2021-03-21 22:26:26 +01:00
Mike Fährmann
bf241811dd allow '_extractor' fields to be None or empty 2021-03-20 01:19:31 +01:00
Mike Fährmann
dc23cfd684 [deviantart] use fallback for /intermediary/ URLs
instead of checking availability with HEAD requests
2021-03-20 00:10:53 +01:00
Mike Fährmann
15daa62842 release version 1.17.1 2021-03-19 19:14:04 +01:00
Mike Fährmann
b0438c8f99 Revert "[deviantart] extend 'extra' option"
This reverts commit 5ad2b9c82b, 5c32a7bf58, and 83f465faca.

(#1387, #1356)
2021-03-19 16:24:23 +01:00
Mike Fährmann
58b93635ee [architizer] add 'firm' extractor (#1369) 2021-03-19 01:31:34 +01:00
Mike Fährmann
204523611c [imgclick] use 'http://' for image URLs
The TLS certificate for main.imgclick.net is invalid.
2021-03-19 01:30:49 +01:00
Mike Fährmann
0725cfde4f [tests] pin Ubuntu version to still be able to use Python 3.4 2021-03-18 16:20:05 +01:00
Mike Fährmann
0b55f5ad84 [imgur] fix/improve rate limit handling (#1386)
- also wait-and-retry on 429 status codes
- use infinite loop instead of recursive calls
- 'extractor.sleep()' -> 'extractor.wait()'
2021-03-18 15:45:26 +01:00
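
The first two points above (wait-and-retry on 429, a loop instead of recursive calls) can be sketched as follows. This is an illustration under assumptions, not the extractor's code: `request_with_retry`, the `fetch` callable, and the fixed `delay` are all hypothetical.

```python
import time


def request_with_retry(fetch, wait=time.sleep, delay=60.0):
    """Call fetch() until it returns a status other than 429.

    'fetch' is a hypothetical callable returning (status_code, body).
    """
    while True:  # loop instead of recursion: the call stack does not grow
        status, body = fetch()
        if status != 429:
            return status, body
        wait(delay)  # rate limited: wait, then try again


# Usage with a fake endpoint that is rate limited twice before succeeding;
# 'waits.append' replaces time.sleep so the example finishes immediately.
responses = iter([(429, ""), (429, ""), (200, "ok")])
waits = []
result = request_with_retry(lambda: next(responses), wait=waits.append, delay=1.0)
```

With a recursive retry, every rate-limited attempt adds a stack frame, so a long outage could eventually hit the recursion limit; the loop avoids that.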
Mike Fährmann
69ca4e29f1 [deviantart] add 'watch' extractor (#794) 2021-03-17 22:50:02 +01:00
Mike Fährmann
fcdda6128c [mangastream] remove module 2021-03-16 23:52:36 +01:00
Mike Fährmann
c677ea19dd [mangareader] remove module 2021-03-16 23:48:55 +01:00
Mike Fährmann
71523aaab6 [architizer] add 'project' extractor (#1369) 2021-03-16 03:24:29 +01:00
Mike Fährmann
3378b39719 [twitter] implement 'users' option (#1337) 2021-03-16 00:51:05 +01:00
Mike Fährmann
847e9b0ed7 [philomena] support post URLs without '/images/'
e.g. 'derpibooru.org/1'
2021-03-14 18:26:39 +01:00
Mike Fährmann
466966bf83 [hentaicafe] remove module 2021-03-14 17:19:57 +01:00
Mike Fährmann
97641cd151 [hentainexus] remove module 2021-03-14 17:19:57 +01:00
Mike Fährmann
23641742a3 improve 'parent-directory' (#1364)
Allow forwarding metadata from the top-level extractor to all children
if 'parent-directory' is enabled for all extractors along the way.

For example 'reddit' -> 'gfycat' -> 'redgifs'
2021-03-14 17:19:57 +01:00
Mike Fährmann
c485d0a956 [philomena] add generalized extractors for philomena sites
(closes #1379)
2021-03-14 17:19:57 +01:00
Mike Fährmann
6be7df53da [hentaifox] improve metadata extraction (fixes #1378) 2021-03-14 17:19:56 +01:00