Commit Graph

40 Commits

Author SHA1 Message Date
Mike Fährmann
a383eca7f6 decouple extractor initialization
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().

This makes it possible, for example, to adjust config access inside
a Job, instead of most of it having already happened when calling
'extractor.find()'.
2023-07-25 22:16:16 +02:00
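
The split described in this commit can be illustrated with a minimal sketch. It is not the project's actual code: the 'Extractor' and 'Job' classes, the 'initialized' flag, and the config keys below are all hypothetical stand-ins that only show the idea of a cheap constructor plus a separate, repeatable 'initialize()' step.

```python
class Extractor:
    """Hypothetical extractor with two-phase initialization."""

    def __init__(self, url):
        # Cheap step: only remember the URL. No session, cookies,
        # or config lookups happen here.
        self.url = url
        self.initialized = False
        self.session = None
        self.cookies = None

    def initialize(self, config=None):
        # Expensive step, callable separately from __init__().
        # Calling it a second time is a no-op.
        if self.initialized:
            return
        config = config or {}
        user_agent = config.get("user-agent", "example-agent")
        self.session = {"headers": {"User-Agent": user_agent}}
        self.cookies = dict(config.get("cookies", {}))
        self.initialized = True


class Job:
    """Hypothetical job that adjusts config before the extractor reads it."""

    def __init__(self, extractor, config):
        self.extractor = extractor
        self.config = config

    def run(self):
        # The job decides when initialization happens, so any config
        # changes made before this call are visible to the extractor.
        self.extractor.initialize(self.config)
        return self.extractor.session["headers"]["User-Agent"]
```

With this layout, constructing an extractor has no side effects, and a job can still change its config between construction and the 'initialize()' call.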
Mike Fährmann
d97b8c2fba consistent cookie-related names
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
1d4db83d49 [weibo] fix end of cursor based pagination 2023-07-04 17:41:22 +02:00
Mike Fährmann
654267a335 [weibo] fix 'json' extension for some videos 2023-06-15 13:49:17 +02:00
Mike Fährmann
0a9aaa7a8d [weibo] prevent fatal exception due to missing video (#4150) 2023-06-08 22:22:43 +02:00
Mike Fährmann
6b6bb4be73 [weibo] require numeric IDs to have length >= 10 (#4059) 2023-05-14 18:45:37 +02:00
Mike Fährmann
72f1f16eb2 [weibo] support 'mix_media_info' entries (#3793) 2023-03-18 15:19:25 +01:00
Mike Fährmann
dd884b02ee replace json.loads with direct calls to JSONDecoder.decode 2023-02-09 15:22:00 +01:00
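
As a rough illustration of the replacement named in this commit, the sketch below builds one 'json.JSONDecoder' and reuses its bound 'decode' method. The name 'json_loads' and the sample payload are assumptions made for this example, not taken from the commit itself.

```python
import json

# Create a single decoder and keep a reference to its bound method,
# so each call goes straight to decode() without json.loads()'s
# argument handling.
_decoder = json.JSONDecoder()
json_loads = _decoder.decode  # hypothetical alias for this sketch

# Unlike json.loads(), decode() expects 'str' input; bytes would
# have to be decoded to text first.
data = json_loads('{"status": {"id": 1, "pics": []}}')
```

The practical difference to keep in mind is the input type: 'json.loads()' also accepts bytes, while 'JSONDecoder.decode()' raises a TypeError for them.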
Mike Fährmann
7e277d0f7d [weibo] add 'count' metadata field (#3305)
or '{status[count]}', as most metadata for weibo is inside 'status'
2022-11-30 11:36:46 +01:00
Mike Fährmann
c25905641e [weibo] fix bug with empty 'playback_list' (#3301) 2022-11-26 12:00:17 +01:00
Mike Fährmann
e3abab8629 [weibo] send 'Referer' headers (#3188) 2022-11-10 17:11:57 +01:00
Mike Fährmann
b0cb4a1b9c replace 'text.extract()' with 'text.extr()' where possible 2022-11-05 01:14:09 +01:00
Mike Fährmann
1c89ccb27d [weibo] prevent errors when paginating over album entries (#2817) 2022-08-11 12:22:14 +02:00
Mike Fährmann
0f5826e884 [weibo] prevent exception for missing 'playback_list' (#2792) 2022-07-30 16:49:08 +02:00
Mike Fährmann
c6a9bab019 update extractor test results 2022-07-12 15:49:22 +02:00
Mike Fährmann
539e3bbed9 [weibo] handle invalid/broken status objects 2022-07-12 15:49:09 +02:00
Mike Fährmann
6db77d4656 [weibo] support '?tabtype=video' listings (#2601) 2022-06-12 17:55:23 +02:00
Mike Fährmann
45c980daf0 [weibo] fix retweets (#2601) 2022-06-11 15:30:26 +02:00
Mike Fährmann
61cbf8318c [weibo] fix URLs generated by 'user' extractor (#2601) 2022-06-05 21:37:57 +02:00
Mike Fährmann
e59bcb8437 [weibo] ensure media URLs use https:// 2022-06-03 17:37:57 +02:00
Mike Fährmann
73f673e3ca [weibo] handle 'gif' pictures 2022-06-03 17:33:14 +02:00
Mike Fährmann
57508d3bb7 [weibo] support all different 'tabtype' listings (#686, #2601) 2022-06-03 16:36:22 +02:00
Mike Fährmann
7a9cba9c10 [weibo] add support for usernames in URLs (#1662) 2022-05-31 22:48:34 +02:00
Mike Fährmann
4bf5bc2403 [weibo] support 'livephoto' entries (#2146) 2022-05-31 15:35:24 +02:00
Mike Fährmann
a0692818af [weibo] switch to desktop API (#2601) 2022-05-31 12:46:35 +02:00
Mike Fährmann
afde76269c [weibo] fix infinite retries for deleted accounts (fixes #2521) 2022-04-27 20:23:11 +02:00
Mike Fährmann
e670dc518e [weibo] update pagination code (fixes #2244)
- send proper headers and query parameters
- use 'since_id' instead of page numbers
- set a 1-2 second delay between requests
2022-01-31 19:16:01 +01:00
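
The pagination changes listed in this commit (a 'since_id' cursor instead of page numbers, plus a 1-2 second delay between requests) can be sketched as follows. This is an illustration under stated assumptions rather than the extractor's real code: 'fetch' is a hypothetical callable, and the response keys 'list' and 'since_id' are assumed for the example.

```python
import random
import time


def paginate(fetch, delay=(1.0, 2.0), sleep=time.sleep):
    """Yield items from a cursor-based listing.

    'fetch' is a hypothetical callable that takes a dict of query
    parameters and returns a dict with a 'list' of items and a
    'since_id' cursor, which is empty or missing on the last page.
    """
    params = {}
    while True:
        data = fetch(params)
        yield from data.get("list", ())

        cursor = data.get("since_id")
        if not cursor:
            # No cursor means there is nothing left to request.
            return
        params["since_id"] = cursor

        # Wait a random 1-2 seconds before the next request.
        sleep(random.uniform(*delay))
```

Passing 'sleep' as a parameter keeps the delay out of the way when the generator is exercised with canned responses.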
Mike Fährmann
c80b18a477 [weibo] extend 'retweets' option (closes #1542)
Setting 'retweets' to "original" will use metadata from the
original posts, and not from the retweeted ones.
2021-05-27 23:09:42 +02:00
Mike Fährmann
73373c06ec [weibo] handle posts with more than 9 images (closes #926)
Responses from '/api/container/getIndex' don't list more than
9 images per 'status' object, but the embedded JSON from a
'/detail/<ID>' page does.
2020-10-06 18:16:08 +02:00
Mike Fährmann
c51fbd72ba update extractor test results 2020-07-13 22:57:48 +02:00
Mike Fährmann
7158bdd7c7 [weibo] improve extractor logic (#829) 2020-06-18 15:00:31 +02:00
Mike Fährmann
d5d90a0450 [weibo] add 'date' field to 'status' objects (#829) 2020-06-16 14:46:46 +02:00
Mike Fährmann
5e2974d699 [weibo] add 'videos' option 2020-04-30 00:00:30 +02:00
Mike Fährmann
699036ea0c [weibo] accept status URLs with non-numeric IDs (#664) 2020-03-31 22:46:50 +02:00
Mike Fährmann
e35c2ea1a6 [weibo] use youtube-dl to download from m3u8 manifests 2020-01-24 23:39:34 +01:00
Mike Fährmann
922b8a9595 [weibo] raise NotFoundError for unavailable/deleted statuses 2019-12-14 22:10:02 +01:00
Mike Fährmann
d1ea08c67d [weibo] fixes and improvements
- ignore unavailable videos (fixes #427)
- handle empty 'geo' fields
- consistent metadata fields for images and videos
2019-09-26 14:57:35 +02:00
Mike Fährmann
17c11393f5 [weibo] allow user-ids in status URLs 2019-03-30 18:38:58 +01:00
Mike Fährmann
973a720a7a [weibo] fix unit test URL patterns 2019-03-15 15:19:39 +01:00
Mike Fährmann
19860655a3 [weibo] add 'user' and 'status' extractors 2019-02-17 18:18:31 +01:00