Commit Graph

68 Commits

Author SHA1 Message Date
Mike Fährmann
a383eca7f6 decouple extractor initialization
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().

This allows, for example, adjusting config access inside a Job
before most of it has already happened when calling 'extractor.find()'.
2023-07-25 22:16:16 +02:00
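The decoupling described in a383eca7f6 can be sketched roughly as follows. This is a minimal illustration under assumptions, not the project's actual code: the class name `SketchExtractor`, the `config` dictionary, and the dictionary standing in for a session are all hypothetical.

```python
class SketchExtractor:
    """Hypothetical extractor used only for illustration."""

    def __init__(self, url):
        # The constructor only records cheap state.
        self.url = url
        self.config = {}
        self.session = None
        self.initialized = False

    def initialize(self):
        """Do the actual setup (session, cookies, options) on demand."""
        if self.initialized:
            return
        # Stand-in for creating an HTTP session and loading cookies.
        self.session = {"cookies": self.config.get("cookies", {})}
        self.initialized = True


extractor = SketchExtractor("https://www.pinterest.com/pin/1/")
# A caller (for example a job) can still adjust configuration here ...
extractor.config["cookies"] = {"sid": "example"}
# ... before the separate initialization step picks it up.
extractor.initialize()
```

Because the setup no longer runs inside `__init__()`, anything that creates the object gets a window in which to change settings before they are read.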
Mike Fährmann
850df34c31 remove '&' from URL patterns part 2
follow-up on 968d3e8465
2023-05-03 20:26:25 +02:00
Mike Fährmann
4d415376d1 [pinterest] fix 'pin.it' extractor
it really was just the single '/' at the end of the url_shortener URL
2023-05-03 20:05:10 +02:00
Mike Fährmann
657b6a9100 [pinterest] update endpoint for related board pins 2023-05-03 18:41:09 +02:00
Mike Fährmann
0b93420a81 [pinterest] unescape search terms (#3621) 2023-02-15 15:44:20 +01:00
Mike Fährmann
5503ac4d5e replace json.dumps with direct calls to JSONEncoder.encode 2023-02-09 15:51:40 +01:00
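The idea behind 5503ac4d5e can be sketched with the standard library alone. In CPython, `json.dumps()` called with non-default arguments builds a new `JSONEncoder` on every call, so creating one encoder up front and reusing its `encode()` method avoids that repeated setup. The name `json_dumps`, the chosen options, and the sample payload are assumptions for this example.

```python
import json

# Build the encoder once and keep a reference to its bound encode() method.
# The separators and sort_keys settings are arbitrary choices for this sketch.
json_dumps = json.JSONEncoder(separators=(",", ":"), sort_keys=True).encode

# Reuse the same encoder for every payload.
payload = json_dumps({"options": {"pin_id": "123"}, "context": {}})
```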
Mike Fährmann
9116398c1c [pinterest] add 'domain' option (#3484)
use input URL domain by default
2023-01-04 17:20:14 +01:00
Mike Fährmann
294108c90a [pinterest] support 'All Pins' boards (#2855, #3484) 2023-01-03 19:11:20 +01:00
Mike Fährmann
311e9383af [pinterest] handle section pins with separate extractors (#2684) 2022-07-03 18:12:16 +02:00
Mike Fährmann
0b33435da5 [pinterest] support multiple files per pin (closes #1619, #2452) 2022-04-06 21:21:33 +02:00
Mike Fährmann
9c5d2d7af3 [pinterest] add extractor for created pins (#2452) 2022-04-01 16:59:58 +02:00
Mike Fährmann
9313d4dc10 [pinterest] do not force 'm3u8_native' for video downloads (#2436) 2022-03-21 10:11:51 +01:00
Mike Fährmann
36291176bc [pinterest] add 'search' extractor (#1411) 2021-03-29 01:41:28 +02:00
Mike Fährmann
780b6adb91 rename 'generate_csrf_token()' to just 'generate_token()'
and add a 'size' argument
2021-01-11 22:12:40 +01:00
Mike Fährmann
8a88025dc4 [pinterest] support generic user URLs (#1205)
i.e. https://www.pinterest.com/USERNAME

also renames 'BoardsExtractor' to 'UserExtractor'
2021-01-02 02:36:53 +01:00
Mike Fährmann
6cdbab07b5 [pinterest] add support for getting all boards of a user
(#1205)
2020-12-29 16:57:03 +01:00
Mike Fährmann
371e9ca6df [pinterest] implement video support (closes #1189) 2020-12-21 16:09:06 +01:00
Mike Fährmann
b8daabc3ca [pinterest] implement login support (closes #1055)
being logged in allows access to secret/protected boards
2020-10-15 15:14:18 +02:00
Mike Fährmann
26a967cbd4 [pinterest] match 'pinterest.co.uk' URLs (fixes #914) 2020-07-27 14:41:34 +02:00
Mike Fährmann
0e714b9a0e [pinterest] add 'section' extractor (#835) 2020-06-21 00:08:14 +02:00
Mike Fährmann
5ba90f72ca [pinterest] add support for sections (closes #835) 2020-06-16 14:41:05 +02:00
Mike Fährmann
32d7195d08 [pinterest] improve detection of invalid pin.it links 2020-01-18 21:06:44 +01:00
Mike Fährmann
1f2a69f3c5 add '_extractor' information to redirect results 2019-12-29 23:37:34 +01:00
Mike Fährmann
c4702ec9b6 simplify some logging calls 2019-12-10 21:30:08 +01:00
Mike Fährmann
da6789b2b0 disable unique archive id checks for some tests
- same image twice in a livedoor blog post
- unreliable results for related pinterest items
2019-11-10 17:04:51 +01:00
Mike Fährmann
4409d00141 embed error messages in StopExtraction exceptions 2019-10-28 16:39:49 +01:00
Mike Fährmann
fdec59f8e2 replace extractor.request() 'expect' argument
with
- 'fatal': allow 4xx status codes
- 'notfound': raise NotFoundError on 404
2019-07-05 00:42:16 +02:00
Mike Fährmann
4b1880fa5e propagate 'match' to base extractor constructor 2019-02-11 13:31:10 +01:00
Mike Fährmann
6284731107 simplify extractor constants
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
2019-02-08 13:45:40 +01:00
Mike Fährmann
6126615698 update URLs for supportedsites.rst 2019-01-30 16:18:22 +01:00
Mike Fährmann
98c6520384 [pinterest] update root URL of API calls 2019-01-14 15:22:04 +01:00
Mike Fährmann
40e30694f3 [pinterest] fix pin.it redirects 2018-12-02 19:38:50 +01:00
Mike Fährmann
7f6a0be982 adjust some tests 2018-11-15 22:50:04 +01:00
Mike Fährmann
3bdfc15be1 [pinterest] don't crash on pins without image info 2018-11-14 11:46:14 +01:00
Mike Fährmann
1532d1b690 fix 'range' tests and update a few test results 2018-10-08 23:53:58 +02:00
Mike Fährmann
d3f1eed2a6 [pinterest] improvements
- add stop condition for pin-related pins
- improve URL patterns
- make Pylint happy
2018-08-16 18:11:39 +02:00
Mike Fährmann
63fa0b2006 [pinterest] add extractors for related pins
Related pins can now be accessed by adding a "#related" fragment
to the end of a Pinterest URL, for example:
- https://www.pinterest.com/pin/858146903966145189/#related
- https://www.pinterest.com/g1952849/test-/#related

There are no explicit real URLs for related pins,
using an option to enable them results in "clunky" code,
and a custom "related:<URL>" scheme doesn't feel right either.
2018-08-15 21:49:45 +02:00
Mike Fährmann
a86f2bfc80 [pinterest] update not-found redirects 2018-08-07 12:13:19 +02:00
Mike Fährmann
b8c97d2295 use 'extractor.request()' for more HTTP requests 2018-06-25 23:40:59 +02:00
Mike Fährmann
017188d268 improve extractor.request()
Replace the 'fatal' parameter with 'expect', which is a list/range
of HTTP status codes >= 400 that should also be accepted.
2018-06-18 16:29:56 +02:00
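The 'expect' semantics described in 017188d268 can be illustrated with a small status check. This is only a sketch of the stated rule (status codes >= 400 are accepted when listed in 'expect'); the function name `check_status` and its signature are hypothetical and do not mirror the real `extractor.request()`.

```python
def check_status(status_code, expect=()):
    """Return True if a response with this status code should be accepted.

    Codes below 400 are always accepted; codes >= 400 are accepted
    only when they appear in `expect` (a list, tuple, or range).
    """
    return status_code < 400 or status_code in expect


ok = check_status(200)                                   # plain success
rejected = check_status(404)                             # error, not expected
allowed = check_status(404, expect=(404,))               # explicitly allowed
in_range = check_status(429, expect=range(400, 500))     # any 4xx allowed
still_bad = check_status(500, expect=range(400, 500))    # 5xx outside the range
```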
Mike Fährmann
e1e23165a0 [pinterest] catch JSON decode errors 2018-05-11 17:37:27 +02:00
Mike Fährmann
2ea0d1da42 [smugmug] improve API code; use data expansions 2018-04-30 18:22:44 +02:00
Mike Fährmann
2395d870dd [pinterest] unquote board and user names, better errors 2018-04-26 16:38:12 +02:00
Mike Fährmann
0f1e07f627 [pinterest] scrap OAuth implementation; code improvements
OAuth authentication isn't needed anymore and other tools
like Postman are better suited for this job anyway.
2018-04-25 16:04:30 +02:00
Mike Fährmann
55d4d23860 [pinterest] use Pinterest's "Web" API (#83)
no access tokens, no user credentials of any kind ...
2018-04-24 22:28:10 +02:00
Mike Fährmann
d10579edb5 [pinterest] improve PinterestAPI code; remove OAuth mentions
on another note: access_tokens have been set to only allow for
10 requests per hour (from 200 yesterday)
2018-04-17 17:12:42 +02:00
Mike Fährmann
4bd182c107 [pinterest] implement oauth:pinterest (#83)
Pinterest access tokens are rate limited at 200 requests per
hour (or maybe per 2 or 3 hours?) so having just one access token
for all users isn't going to work in the long run.
2018-04-16 20:03:28 +02:00
Mike Fährmann
9651f3fce0 [pinterest] improve error messages (#83) 2018-04-16 19:36:54 +02:00
Mike Fährmann
dbe250f7e5 [pinterest] update access_token (#83) 2018-04-16 09:46:45 +02:00
Mike Fährmann
e32fe1cdf1 [pinterest] cast IDs to int
... and update test results.

Image URLs changed from
https://s-media-cache-ak0.pinimg.com/... to
https://i.pinimg.com/...
2018-03-06 14:28:21 +01:00