21 Commits

Author SHA1 Message Date
Mike Fährmann
9bf76c1352 replace 'util.re()' with 'text.re()'
remove unnecessary 'util' imports
2025-10-20 17:44:58 +02:00
Mike Fährmann
69f7cfdd0c [dt] replace 'datetime' imports 2025-10-16 11:42:42 +02:00
Mike Fährmann
f2a72d8d1e replace 'request(…).json()' with 'request_json(…)' 2025-06-29 17:50:19 +02:00
Mike Fährmann
9dbe33b6de replace old %-formatted and .format(…) strings with f-strings (#7671)
mostly using flynt
https://github.com/ikamensh/flynt
2025-06-29 17:50:19 +02:00
Mike Fährmann
41191bb60a 'match.group(N)' -> 'match[N]' (#7671)
2.5x faster
2025-06-18 13:05:58 +02:00
Mike Fährmann
e08ec7e083 update copyright notices 2025-06-13 00:03:41 +02:00
Mike Fährmann
811b665e33 remove @staticmethod decorators
There might have been a time when calling a static method was faster
than a regular method, but that is no longer the case. According to
micro-benchmarks, it is 70% slower in CPython 3.13 and it also makes
executing the code of a class definition slower.
2025-06-12 22:50:52 +02:00
Mike Fährmann
b5c88b3d3e replace standard library 're' uses with 'util.re()' 2025-06-06 13:24:52 +02:00
Mike Fährmann
d18f311fe2 [plurk] fix 'user' data extraction and make it non-fatal (#6742) 2025-01-06 20:27:37 +01:00
Mike Fährmann
1f9b16a70b replace static 'sleep-request' defaults with dynamic ones 2023-12-18 22:06:26 +01:00
Mike Fährmann
a453335a9f remove test results in extractor modules
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
51301e0c31 replace remaining time.sleep() calls
with Extractor.sleep() or request_interval
2023-02-23 00:35:37 +01:00
Mike Fährmann
dd884b02ee replace json.loads with direct calls to JSONDecoder.decode 2023-02-09 15:22:00 +01:00
Mike Fährmann
194803f3a7 [plurk] fix extraction (#2977) 2022-09-28 13:04:32 +02:00
Mike Fährmann
968d3e8465 remove '&' from URL patterns
'/?&#' -> '/?#' and '?&#' -> '?#'

According to https://www.ietf.org/rfc/rfc3986.txt, URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
2020-10-22 23:31:25 +02:00
Mike Fährmann
3918b69677 remove 'extractor.blacklist' context manager 2020-09-11 13:17:35 +02:00
Mike Fährmann
8759403f37 [plurk] add delay between comment requests 2019-12-01 01:03:31 +01:00
Mike Fährmann
2c332edaad [plurk] fix comment pagination 2019-11-27 19:39:56 +01:00
Mike Fährmann
de83ae4576 make 'method' argument of Extractor.request keyword-only 2019-11-05 17:28:09 +01:00
Mike Fährmann
70be494161 [plurk] add a 'comments' options (#212) 2019-04-14 22:12:46 +02:00
Mike Fährmann
0b2ff406f6 [plurk] add timeline- and post-extractors (#212) 2019-04-14 21:48:38 +02:00
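
Several of the mechanical replacements named in the commit messages above can be sketched in Python. This is a minimal illustration under stated assumptions, not code from the repository: the pattern, the example URL, and all variable names are hypothetical, and only standard-library calls are used.

```python
import json
import re

# Hypothetical pattern and input, chosen only for illustration.
pattern = re.compile(r"/p/(\w+)")
match = pattern.search("https://example.org/p/abc123")

# 'match.group(N)' -> 'match[N]'
# Match objects support indexing since Python 3.6; both forms
# return the same captured group.
assert match.group(1) == match[1]
post_id = match[1]

# %-formatted and .format(…) strings -> f-strings
old_percent = "post %s" % post_id
old_format = "post {}".format(post_id)
new_fstring = f"post {post_id}"

# json.loads(…) -> direct call to JSONDecoder.decode
# Binding the method once skips the per-call argument handling
# that json.loads() performs before delegating to a decoder.
decode = json.JSONDecoder().decode
data = decode('{"id": 1, "content": "hello"}')

print(new_fstring, data["id"])
```

The three string forms produce identical results, so a rewrite of this kind changes style and speed without changing behavior. Whether the speedups quoted in the commit messages hold elsewhere depends on the interpreter version and workload.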