2004 Commits

Author SHA1 Message Date
Mike Fährmann
3af9350648 [twitter] update API calls
- use 'https://twitter.com/i/api' for all requests
  except '/guest/activate.json'
- update (default) URL parameters
- update GraphQL endpoints
2020-12-28 22:05:48 +01:00
Mike Fährmann
b656b829db [twitter] fix login with username & password
It is no longer possible to get an 'authenticity_token' from Twitter's
JavaScript-free login form, which was disabled a few days ago.

Generating a random 16 byte hex string client-side and sending that as
a cookie alongside the regular login form works just as well.
2020-12-28 16:10:19 +01:00
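The client-side token described in b656b829db could look roughly like the sketch below. It is not the project's actual code: the function names and the cookie name `example_token` are hypothetical, and it assumes "16 byte" refers to 16 random bytes, which encode to 32 hex characters.

```python
import secrets


def generate_token(size=16):
    """Return `size` random bytes encoded as a hex string (2 * size chars)."""
    return secrets.token_hex(size)


def build_login_cookies(cookie_name="example_token"):
    """Create a fresh token and the cookie dict that would carry it.

    `cookie_name` is a placeholder; the real cookie name is not stated
    in the commit message.
    """
    token = generate_token()
    return {cookie_name: token}, token
```

The returned token would then be submitted together with the regular login form fields.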
Mike Fährmann
912eea29bc update extractor test results 2020-12-27 17:41:08 +01:00
Mike Fährmann
47a7a51944 [sankaku] fix 'invalid_token' detection 2020-12-27 02:31:01 +01:00
Mike Fährmann
ba5df84f7e [keenspot] improve redirect handling
Previously, every request used http:// and was
redirected to the https:// version where supported.

Now the redirect only happens once during the first request.
2020-12-26 21:38:40 +01:00
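One way to picture the behaviour described in ba5df84f7e is to resolve the final root URL once and reuse it afterwards. The sketch below is not the extractor's code: `RootResolver` and the injected `probe` callable (standing in for a real HTTP request that follows redirects and returns the final URL) are hypothetical.

```python
class RootResolver:
    """Resolve a site's root URL once and cache the result."""

    def __init__(self, host, probe):
        self.host = host
        # hypothetical: callable(url) -> final URL after redirects
        self.probe = probe
        self._root = None

    def root(self):
        # Only the very first call triggers a probe (and thus a redirect).
        if self._root is None:
            self._root = self.probe("http://" + self.host).rstrip("/")
        return self._root

    def url(self, path):
        """Build a URL for `path` using the cached root."""
        return self.root() + path
```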
Mike Fährmann
d781e6ac44 [e621] return pool posts in order (closes #1195)
… and add a 'num' enumeration index.

A bit more code than the PR version, but it prints some helpful messages
and doesn't call 'metadata()' twice.
2020-12-26 19:00:29 +01:00
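The ordering and 'num' index from d781e6ac44 can be illustrated with a small sketch. It assumes that a pool lists its post IDs in the intended order and that the posts themselves arrive unordered; the function name and the handling of missing posts are assumptions, not the extractor's implementation.

```python
def order_pool_posts(post_ids, posts):
    """Return `posts` in the order given by `post_ids`, adding a 1-based 'num'."""
    by_id = {post["id"]: post for post in posts}
    ordered = []
    for num, post_id in enumerate(post_ids, 1):
        post = by_id.get(post_id)
        if post is None:
            # assumption: posts missing from the results are skipped
            continue
        ordered.append({**post, "num": num})
    return ordered
```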
Mike Fährmann
e7d446a8f7 [danbooru] slight code refactoring 2020-12-25 22:06:25 +01:00
Mike Fährmann
e41e2be2f9 [booru] split '_prepare_post()' 2020-12-24 01:13:54 +01:00
Mike Fährmann
53222445d5 [hentaicafe] simplify default filenames 2020-12-23 01:03:08 +01:00
Mike Fährmann
712c792fbe [hentaicafe] prefer title of /hc.fyi/ pages (closes #1106) 2020-12-23 01:01:15 +01:00
Mike Fährmann
2c4d4a75db [mangadex] respect 'chapter-reverse' settings (closes #1194)
The extractor in question doesn't inherit from MangaExtractor
and therefore didn't do this automatically.
2020-12-22 15:08:10 +01:00
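What respecting 'chapter-reverse' amounts to can be sketched as follows. The option name comes from 2c4d4a75db; the function name and the plain-dict config are hypothetical simplifications.

```python
def order_chapters(chapters, config):
    """Return chapters reversed when the 'chapter-reverse' option is set."""
    if config.get("chapter-reverse", False):
        return list(reversed(chapters))
    return list(chapters)
```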
Mike Fährmann
3bd08acc8f [pixiv] output debug message on failed login attempt
(#1192)
2020-12-22 14:59:31 +01:00
Mike Fährmann
b58e605dc7 raise error when a required username or password is missing
do not try to log in as 'None' (#1192)
2020-12-22 14:40:18 +01:00
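The check described in b58e605dc7 could be sketched like this. The exception class and function name are hypothetical stand-ins, not the project's actual API.

```python
class MissingCredentialsError(Exception):
    """Raised when a login is attempted without username or password."""


def require_credentials(username, password):
    """Return the credentials unchanged, or raise if either one is missing."""
    if not username or not password:
        raise MissingCredentialsError("username and password are required")
    return username, password
```

Failing early like this avoids sending a login request with the literal string 'None' as the username.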
Mike Fährmann
b233531aaa [sankaku] use '/posts' endpoint for single posts 2020-12-22 02:44:40 +01:00
Mike Fährmann
459a0af4f8 [sankaku] add support for sankaku.app URLs (closes #1193) 2020-12-22 01:57:53 +01:00
Mike Fährmann
371e9ca6df [pinterest] implement video support (closes #1189) 2020-12-21 16:09:06 +01:00
Mike Fährmann
537742c0ee [sankaku] normalize 'created_at' metadata (closes #1190) 2020-12-21 02:06:29 +01:00
Mike Fährmann
ae6748996a [pornhub] update tests 2020-12-21 02:06:28 +01:00
Mike Fährmann
bf629a2818 [instagram] add 'include' option (closes #1180)
Split the functionality of the old 'user' extractor into separate
'posts' and 'highlights' extractors, which respond to virtual URLs
('/<user>/posts' and '/<user>/highlights')
2020-12-21 02:06:28 +01:00
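The mapping from an 'include' value to the virtual URLs named in bf629a2818 might look like the sketch below. The function name, the accepted value formats (list or comma-separated string), and the error handling are assumptions for illustration.

```python
def include_urls(user, include=("posts",), root="https://www.instagram.com"):
    """Build the virtual URLs for the requested parts of a user profile."""
    if isinstance(include, str):
        # assumption: a comma-separated string is accepted as well as a list
        include = include.split(",")
    known = ("posts", "highlights")
    urls = []
    for name in include:
        name = name.strip()
        if name not in known:
            raise ValueError(f"unknown include value: {name!r}")
        urls.append(f"{root}/{user}/{name}")
    return urls
```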
Mike Fährmann
78061658ea [booru] reduce exceptions caught during _prepare_post()
don't catch HttpErrors etc.
2020-12-21 02:05:59 +01:00
Mike Fährmann
212ae0c399 [mangapanda] remove module
site now redirects to mangareader.net
2020-12-20 17:42:15 +01:00
Mike Fährmann
337b118e25 [instagram] warn about private profiles (#1187) 2020-12-19 22:32:28 +01:00
Mike Fährmann
465015f75a [sankaku] reimplement login support (#1176, #1182) 2020-12-17 16:12:59 +01:00
Mike Fährmann
8d2e4e5f13 [booru] improve error handling
e.g. for posts without a valid 'file_url' (#1176)
2020-12-17 01:16:45 +01:00
Mike Fährmann
1d753542c2 [hentainexus] fix extraction (fixes #1166) 2020-12-12 20:30:51 +01:00
Mike Fährmann
a00b60fbe7 [twitter] update 'x-csrf-token' header (fixes #1170)
Twitter started using a bigger (80 instead of 16 bytes) CSRF token for
logged-in users, and expects it to be used as the 'x-csrf-token' header
when sent via the 'ct0' cookie.

Generating an 80 byte token ourselves doesn't work, and Twitter will
still insist on using its own.
2020-12-11 13:46:58 +01:00
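The rule from a00b60fbe7 (reuse whatever token the 'ct0' cookie holds instead of generating a large one) can be sketched as follows. The function name is hypothetical, and the fallback of generating a 16-byte token when no cookie exists is an assumption based on the earlier login commit, not something this commit states.

```python
import secrets


def csrf_headers(cookies):
    """Return the 'x-csrf-token' header matching the 'ct0' cookie."""
    token = cookies.get("ct0")
    if token is None:
        # assumption: without a server-issued cookie, a self-generated
        # 16-byte token is used and stored as the cookie
        token = secrets.token_hex(16)
        cookies["ct0"] = token
    return {"x-csrf-token": token}
```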
Mike Fährmann
b88c97b873 [instagram] add 'cursor' option (#1149)
To enable at least 'some' way to continue downloading from the middle
of a user profile listing.
2020-12-11 13:46:58 +01:00
Mike Fährmann
0d406c8daf [common] restrict values used in 'generate_extractors()' 2020-12-11 13:46:47 +01:00
Mike Fährmann
b2c55f0a72 [sankaku] remove login support
The old login method for 'https://chan.sankakucomplex.com/user/login'
and the cookies it produces have no effect on the results from
'beta.sankakucomplex.com'.
2020-12-08 21:05:47 +01:00
Mike Fährmann
7f3d811d7b [moebooru] inherit from BooruExtractor 2020-12-08 18:34:56 +01:00
Mike Fährmann
a3a863fc13 [booru] add generalized extractors for *booru sites
similar to cc15fbe7
2020-12-08 18:34:30 +01:00
Mike Fährmann
5f23441e12 [piczel] update API URLs 2020-12-07 15:56:32 +01:00
Mike Fährmann
47114339a2 [webtoons] update 'ageGate' cookie 2020-12-07 14:56:32 +01:00
Mike Fährmann
4225f12783 [nozomi] handle empty 'date' fields (fixes #1163) 2020-12-07 00:08:53 +01:00
Mike Fährmann
2b93515ee0 [instagram] reimplement support for stories (#1149) 2020-12-06 21:32:10 +01:00
Mike Fährmann
ecdea799dd [sankaku] use 'beta.sankakucomplex.com' API endpoints 2020-12-05 22:08:58 +01:00
Mike Fährmann
b3ecc89a9a [instagram] use double quotes for strings when possible 2020-12-05 19:33:42 +01:00
Mike Fährmann
76285eb60d [instagram] reimplement support for story highlights (#1149) 2020-12-05 19:13:00 +01:00
Mike Fährmann
8ca7f54750 rename '_request_…' variables
- remove '_' at the beginning
- _request_last -> request_timestamp
2020-12-05 00:09:15 +01:00
Mike Fährmann
15a122aff3 [instagram] update 'X-IG-WWW-Claim' headers 2020-12-04 20:58:34 +01:00
Mike Fährmann
e5d81bdc7b [mangadex] handle 'external' chapters (closes #1154) 2020-12-04 20:56:30 +01:00
Mike Fährmann
447488fb18 [instagram] rewrite
(#1113, #1122, #1128, #1130, #1149)

Rely on the results of GraphQL queries instead of requesting data
for each post separately via '/p/<shortcode>/?__a=1'.

This might result in some missing metadata, and there might be some
issues for '/channel/' and '/saved/' URLs, but at least downloading
from the regular post listings should work without issues and without
getting users blocked/banned.

TODO: reimplement support for stories
2020-12-03 14:30:59 +01:00
Mike Fährmann
cc15fbe71a [moebooru] add generalized extractors for moebooru sites
- add support for sakugabooru.com (closes #1136)
- add support for lolibooru.moe   (closes #1050)

This allows users to dynamically add support for moebooru/myimouto
based sites by adding an entry to their config file
(like for foolslide, foolfuuka, etc)

For example:
{
    "extractor": {
        "moebooru": {
            "new-site-1": {"root": "https://site1.net"},
            "new-site-2": {"root": "https://www.site2.moe"}
        }
    }
}
2020-12-01 22:27:18 +01:00
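How such a config entry could be turned into a list of sites is sketched below. It only reads the JSON structure shown in cc15fbe71a; the function name is hypothetical and this is not how the project itself loads its configuration.

```python
import json

CONFIG_TEXT = """
{
    "extractor": {
        "moebooru": {
            "new-site-1": {"root": "https://site1.net"},
            "new-site-2": {"root": "https://www.site2.moe"}
        }
    }
}
"""


def moebooru_sites(config_text):
    """Map each configured site name to its root URL."""
    config = json.loads(config_text)
    section = config.get("extractor", {}).get("moebooru", {})
    return {
        name: entry["root"].rstrip("/")
        for name, entry in section.items()
        # skip plain option values that are not site entries
        if isinstance(entry, dict) and "root" in entry
    }
```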
Mike Fährmann
43120407cc [paheal] create directory for each post (closes #1147) 2020-12-01 12:14:55 +01:00
Mike Fährmann
63e61a0932 [twitter] update image URL format (#1145)
use
'/<name>?format=<fmt>&name=<size>'
instead of the potentially deprecated
'/<name>.<fmt>:<size>'

but keep all of them as fallback URLs
2020-12-01 11:53:51 +01:00
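The URL scheme from 63e61a0932 can be sketched as a generator that yields the new query-string form first and the older suffix form afterwards as fallbacks. The function name, the size names, and the exact fallback order are assumptions.

```python
def image_urls(base, fmt, sizes=("orig", "large", "medium", "small")):
    """Yield candidate URLs for one image, preferred format first.

    `base` is the image URL without extension, e.g. '.../media/<name>'.
    """
    # new format: '/<name>?format=<fmt>&name=<size>'
    for size in sizes:
        yield f"{base}?format={fmt}&name={size}"
    # old format: '/<name>.<fmt>:<size>'
    for size in sizes:
        yield f"{base}.{fmt}:{size}"
```

A downloader would try the first URL and move on to the next one only if the request fails.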
Mike Fährmann
ae6a1d5fbc [mangoxo] fix extraction 2 2020-11-27 13:55:30 +01:00
Mike Fährmann
f6a684bc37 [hentainexus] update data decoding procedure (#1125) 2020-11-25 11:26:26 +01:00
Mike Fährmann
c57a918f4a [e621] implement delay via '_request_interval_min' 2020-11-25 00:19:32 +01:00
Mike Fährmann
93ce7466e2 [2chan] skip external links 2020-11-24 16:41:47 +01:00
Mike Fährmann
b214e89b5c [mangoxo] fix extraction 2020-11-24 12:50:46 +01:00