14 Commits

Author SHA1 Message Date
Mike Fährmann
b5c88b3d3e replace standard library 're' uses with 'util.re()' 2025-06-06 13:24:52 +02:00
Mike Fährmann
d18f311fe2 [plurk] fix 'user' data extraction and make it non-fatal (#6742) 2025-01-06 20:27:37 +01:00
Mike Fährmann
1f9b16a70b replace static 'sleep-request' defaults with dynamic ones 2023-12-18 22:06:26 +01:00
Mike Fährmann
a453335a9f remove test results in extractor modules and add generic example URLs 2023-09-11 16:30:55 +02:00
Mike Fährmann
51301e0c31 replace remaining time.sleep() calls with Extractor.sleep() or request_interval 2023-02-23 00:35:37 +01:00
Mike Fährmann
dd884b02ee replace json.loads with direct calls to JSONDecoder.decode 2023-02-09 15:22:00 +01:00
Mike Fährmann
194803f3a7 [plurk] fix extraction (#2977) 2022-09-28 13:04:32 +02:00
Mike Fährmann
968d3e8465 remove '&' from URL patterns
'/?&#' -> '/?#' and '?&#' -> '?#'

According to https://www.ietf.org/rfc/rfc3986.txt, URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
2020-10-22 23:31:25 +02:00
Mike Fährmann
3918b69677 remove 'extractor.blacklist' context manager 2020-09-11 13:17:35 +02:00
Mike Fährmann
8759403f37 [plurk] add delay between comment requests 2019-12-01 01:03:31 +01:00
Mike Fährmann
2c332edaad [plurk] fix comment pagination 2019-11-27 19:39:56 +01:00
Mike Fährmann
de83ae4576 make 'method' argument of Extractor.request keyword-only 2019-11-05 17:28:09 +01:00
Mike Fährmann
70be494161 [plurk] add a 'comments' options (#212) 2019-04-14 22:12:46 +02:00
Mike Fährmann
0b2ff406f6 [plurk] add timeline- and post-extractors (#212) 2019-04-14 21:48:38 +02:00
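
The rationale given in commit 968d3e8465 (only "/", "?", and "#" delimit URL components, so "&" does not belong in the trailing character class) can be checked with Python's standard library. This is a minimal sketch: the example URL and the simplified pattern are hypothetical and are not taken from the extractor module.

```python
import re
from urllib.parse import urlsplit

# Hypothetical example URL, used only for illustration.
url = "https://www.plurk.com/p/abc123?a=1&b=2#top"

# urlsplit() separates components at '/', '?', and '#';
# '&' only separates parameters inside the query string.
parts = urlsplit(url)
print(parts.path)      # /p/abc123
print(parts.query)     # a=1&b=2
print(parts.fragment)  # top

# Simplified, hypothetical pattern that ends in the '[?#]' class
# instead of '[?&#]'; the real extractor pattern may differ.
pattern = re.compile(r"(?:https?://)?(?:www\.)?plurk\.com/p/(\w+)/?(?:$|[?#])")
match = pattern.match(url)
print(match.group(1))  # abc123
```

Because "&" cannot start a new URL component, dropping it from the class should not reject any well-formed URL that the older '[?&#]' form accepted.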