Mike Fährmann
968597a302
yield 3-tuples for Message.Directory
...
adapt tuples to the same length and semantics as other messages
2025-12-05 21:39:52 +01:00
Mike Fährmann
d7c97d5a97
use f-strings when building 'pattern'
2025-10-20 21:23:11 +02:00
Mike Fährmann
8c62be343e
[output] add 'Logger.traceback()' helper
2025-10-14 18:44:29 +02:00
Mike Fährmann
a0b3e08f64
[tests/extractor] ensure Extractor classes match
2025-09-17 19:29:49 +02:00
Mike Fährmann
a097a373a9
simplify if statements by using walrus operators ( #7671 )
2025-07-22 20:57:54 +02:00
Mike Fährmann
2ccb9acf1a
[pinterest] support 'pin.it' board redirects ( #7805 )
2025-07-11 22:28:26 +02:00
Mike Fährmann
8e40ea2fe2
[pinterest] match board URLs with query strings ( #7805 )
2025-07-11 22:28:26 +02:00
Mike Fährmann
d8ef1d693f
rename 'StopExtraction' to 'AbortExtraction'
...
for cases where StopExtraction was used to report errors
2025-07-09 21:07:28 +02:00
Mike Fährmann
9dbe33b6de
replace old %-formatted and .format(…) strings with f-strings ( #7671 )
...
mostly using flynt
https://github.com/ikamensh/flynt
2025-06-29 17:50:19 +02:00
Mike Fährmann
41191bb60a
'match.group(N)' -> 'match[N]' ( #7671 )
...
2.5x faster
2025-06-18 13:05:58 +02:00
Mike Fährmann
9d3cf67f3e
[pinterest] remove excess whitespace from 'description' fields ( #4335 )
...
and 'closeup_unified_description' & 'closeup_description'
2025-06-13 13:11:18 +02:00
Mike Fährmann
e08ec7e083
update copyright notices
2025-06-13 00:03:41 +02:00
Mike Fährmann
f5b8c25559
[pinterest] ignore 'story_pin_product_sticker_block' blocks ( #7563 )
2025-05-22 18:42:39 +02:00
Mike Fährmann
88f1541a83
[common] add 'request_location()' convenience function
2025-04-19 16:45:05 +02:00
Mike Fährmann
c4d08b24e9
[pinterest] ignore 'story_pin_static_sticker_block' blocks ( #7251 )
2025-03-28 20:20:29 +01:00
Mike Fährmann
b8b943fc38
[pinterest] update API headers ( #6513 )
...
'BoardFeed' requests fail without 'X-Pinterest-PWS-Handler'
2024-11-22 08:41:10 +01:00
Mike Fährmann
ce90566c56
[pinterest] detect video/audio by block content ( #6421 )
...
story blocks from search/board results do not always contain a 'type'
2024-11-05 15:55:24 +01:00
Mike Fährmann
a9a9f3a180
[pinterest] support 'story_pin_music_block' blocks ( #6421 )
2024-11-05 15:55:24 +01:00
Mike Fährmann
5d984f35aa
[pinterest] support 'story' pins ( #6188 , #6078 , #4229 )
2024-10-19 17:47:31 +02:00
Mike Fährmann
5477ed181d
[pinterest] move file extraction into separate method
2024-10-18 20:55:20 +02:00
Mike Fährmann
1824267447
[dl:ytdl] implement explicit HLS/DASH handling
...
add '_ytdl_manifest' to specify a manifest type to process
2024-10-16 15:16:21 +02:00
Mike Fährmann
d7823b9f81
[pinterest] fix section URLs for boards with /?# in name ( #5104 )
2024-02-05 15:54:06 +01:00
blankie
375f2db4c2
[pinterest] add count metadata field
2023-12-28 01:07:04 +11:00
Mike Fährmann
75fa1a5553
[pinterest] remove login code
...
this has been broken since forever
and is still "protected" by an invisible recaptcha check
2023-12-20 20:59:18 +01:00
Mike Fährmann
57fc6fcf83
replace '24*3600' with '86400'
...
and generalize cache maxage values
2023-12-18 23:57:22 +01:00
Mike Fährmann
3ecb512722
send Referer headers by default
2023-09-19 00:02:04 +02:00
Mike Fährmann
a453335a9f
remove test results in extractor modules
...
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
a383eca7f6
decouple extractor initialization
...
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().
This allows, for example, adjusting config access inside a Job
before most of it already happened when calling 'extractor.find()'.
2023-07-25 22:16:16 +02:00
Mike Fährmann
850df34c31
remove '&' from URL patterns part 2
...
follow-up on 968d3e8465
2023-05-03 20:26:25 +02:00
Mike Fährmann
4d415376d1
[pinterest] fix 'pin.it' extractor
...
it really was just the single '/' at the end of the url_shortener URL
2023-05-03 20:05:10 +02:00
Mike Fährmann
657b6a9100
[pinterest] update endpoint for related board pins
2023-05-03 18:41:09 +02:00
Mike Fährmann
0b93420a81
[pinterest] unescape search terms ( #3621 )
2023-02-15 15:44:20 +01:00
Mike Fährmann
5503ac4d5e
replace json.dumps with direct calls to JSONEncoder.encode
2023-02-09 15:51:40 +01:00
Mike Fährmann
9116398c1c
[pinterest] add 'domain' option ( #3484 )
...
use input URL domain by default
2023-01-04 17:20:14 +01:00
Mike Fährmann
294108c90a
[pinterest] support 'All Pins' boards ( #2855 , #3484 )
2023-01-03 19:11:20 +01:00
Mike Fährmann
311e9383af
[pinterest] handle section pins with separate extractors ( #2684 )
2022-07-03 18:12:16 +02:00
Mike Fährmann
0b33435da5
[pinterest] support multiple files per pin ( closes #1619 , #2452 )
2022-04-06 21:21:33 +02:00
Mike Fährmann
9c5d2d7af3
[pinterest] add extractor for created pins ( #2452 )
2022-04-01 16:59:58 +02:00
Mike Fährmann
9313d4dc10
[pinterest] do not force 'm3u8_native' for video downloads ( #2436 )
2022-03-21 10:11:51 +01:00
Mike Fährmann
36291176bc
[pinterest] add 'search' extractor ( #1411 )
2021-03-29 01:41:28 +02:00
Mike Fährmann
780b6adb91
rename 'generate_csrf_token()' to just 'generate_token()'
...
and add a 'size' argument
2021-01-11 22:12:40 +01:00
Mike Fährmann
8a88025dc4
[pinterest] support generic user URLs ( #1205 )
...
i.e. https://www.pinterest.com/USERNAME
also renames 'BoardsExtractor' to 'UserExtractor'
2021-01-02 02:36:53 +01:00
Mike Fährmann
6cdbab07b5
[pinterest] add support for getting all boards of a user ( #1205 )
2020-12-29 16:57:03 +01:00
Mike Fährmann
371e9ca6df
[pinterest] implement video support ( closes #1189 )
2020-12-21 16:09:06 +01:00
Mike Fährmann
b8daabc3ca
[pinterest] implement login support ( closes #1055 )
...
being logged in allows access to secret/protected boards
2020-10-15 15:14:18 +02:00
Mike Fährmann
26a967cbd4
[pinterest] match 'pinterest.co.uk' URLs ( fixes #914 )
2020-07-27 14:41:34 +02:00
Mike Fährmann
0e714b9a0e
[pinterest] add 'section' extractor ( #835 )
2020-06-21 00:08:14 +02:00
Mike Fährmann
5ba90f72ca
[pinterest] add support for sections ( closes #835 )
2020-06-16 14:41:05 +02:00
Mike Fährmann
32d7195d08
[pinterest] improve detection of invalid pin.it links
2020-01-18 21:06:44 +01:00
Mike Fährmann
1f2a69f3c5
add '_extractor' information to redirect results
2019-12-29 23:37:34 +01:00