Mike Fährmann
|
d7c97d5a97
|
use f-strings when building 'pattern'
|
2025-10-20 21:23:11 +02:00 |
|
Mike Fährmann
|
9bf76c1352
|
replace 'util.re()' with 'text.re()'
remove unnecessary 'util' imports
|
2025-10-20 17:44:58 +02:00 |
|
Mike Fährmann
|
c8fc790028
|
merge branch 'dt': move datetime utils into separate module
- use 'datetime.fromisoformat()' when possible (#7671)
- return a datetime-compatible object for invalid datetimes
(instead of a 'str' value)
|
2025-10-20 09:30:05 +02:00 |
|
Mike Fährmann
|
238d0973f7
|
[twitter] fix "KeyError - 'source_id'" with disabled 'transform' (#8429)
|
2025-10-18 19:39:24 +02:00 |
|
Mike Fährmann
|
085616e0a8
|
[dt] replace 'text.parse_datetime()' & 'text.parse_timestamp()'
|
2025-10-17 17:43:06 +02:00 |
|
Mike Fährmann
|
e42030a3a6
|
[twitter] fix 'KeyError' for "temporarily unavailable" users (#8423)
|
2025-10-16 15:50:48 +02:00 |
|
Mike Fährmann
|
8c62be343e
|
[output] add 'Logger.traceback()' helper
|
2025-10-14 18:44:29 +02:00 |
|
Mike Fährmann
|
c1d21e8cb9
|
[twitter] remove login support (#4202 #6029 #6040 #8362)
broken feature
|
2025-10-07 08:32:40 +02:00 |
|
24xyz
|
92be341711
|
[twitter] fix 'quote_id' of individual Tweets (#8284)
Fix 'quoted_by_id_str' to use parent tweet id
|
2025-09-24 19:50:12 +02:00 |
|
Mike Fährmann
|
5aa2124736
|
[twitter] fix all quoted Tweets being marked as 'deleted' (#8225)
due to "KeyError: 'screen_name'"
when trying to access the author's name
fixes regression introduced in 5747dbf00c
|
2025-09-16 10:08:32 +02:00 |
|
Mike Fährmann
|
05128ccf49
|
[twitter] add 'search-limit' option (#8173)
reduce default limit from 100 to 20
|
2025-09-13 10:30:58 +02:00 |
|
Mike Fährmann
|
f6fcba4040
|
[twitter] add 'search-stop' option (#8173)
and rename 'pagination-search' to 'search-pagination'
|
2025-09-09 10:14:43 +02:00 |
|
Mike Fährmann
|
d182749f45
|
[twitter] implement 'pagination-search' option (#8173)
|
2025-09-07 21:03:29 +02:00 |
|
Mike Fährmann
|
f94eedbe1d
|
[twitter] continue searches on empty response (#8173)
stop when receiving more than 3 empty responses in a row
|
2025-09-07 17:42:56 +02:00 |
|
Mike Fährmann
|
52c932add6
|
[twitter] prevent "KeyError: 'name'" in '_transform_user()' (#8154)
fixes regression introduced in 5747dbf00c
|
2025-09-01 20:52:05 +02:00 |
|
Mike Fährmann
|
8650a6bf39
|
[twitter] fix "KeyError: 'core'" when processing communities (#8141)
fixes regression introduced in 8252980264
|
2025-08-29 19:42:37 +02:00 |
|
Mike Fährmann
|
d251996d8e
|
[twitter] prevent exceptions in '_transform_community()' (#8134)
fixes regression introduced in 8252980264
|
2025-08-28 11:24:45 +02:00 |
|
Mike Fährmann
|
9bfde2f535
|
[twitter] simplify URL patterns with USER_PATTERN
|
2025-08-22 19:41:16 +02:00 |
|
Mike Fährmann
|
ff94f1dec5
|
[twitter:avatar] fix "KeyError: 'profile_image_url_https'" (#8087)
fixes regression introduced in 5747dbf00c
|
2025-08-21 05:58:33 +02:00 |
|
Mike Fährmann
|
a8b334e866
|
[twitter] add 'home' extractor (#7974)
|
2025-08-19 23:03:24 +02:00 |
|
Mike Fährmann
|
47150f3e8a
|
[twitter] add 'highlights' extractor (#7826)
|
2025-08-19 09:14:39 +02:00 |
|
Mike Fährmann
|
8252980264
|
[twitter] extract 'community' metadata (#7424)
update default download directories and archive IDs
for community extractors
|
2025-08-19 08:56:04 +02:00 |
|
Mike Fährmann
|
5747dbf00c
|
[twitter] update API endpoint query hashes & parameters
|
2025-08-18 21:50:10 +02:00 |
|
Mike Fährmann
|
c1abcb99de
|
[twitter] handle "KeyError: 'result'" for retweets (#8072)
|
2025-08-18 10:18:03 +02:00 |
|
Mike Fährmann
|
3b93184997
|
[twitter] fix potential 'UnboundLocalError' (#7932)
this happens with Tweets containing both images and video
when 'videos' are disabled.
|
2025-07-30 16:45:48 +02:00 |
|
Mike Fährmann
|
a097a373a9
|
simplify if statements by using walrus operators (#7671)
|
2025-07-22 20:57:54 +02:00 |
|
Mike Fährmann
|
d8ef1d693f
|
rename 'StopExtraction' to 'AbortExtraction'
for cases where StopExtraction was used to report errors
|
2025-07-09 21:07:28 +02:00 |
|
Mike Fährmann
|
cfafbc0675
|
[twitter] extract 'sensitive_flags' metadata (#2523)
a list of 'sensitive_media_warning' flags per file
and a combination of all file flags per Tweet
|
2025-07-09 12:39:23 +02:00 |
|
Mike Fährmann
|
9dbe33b6de
|
replace old %-formatted and .format(…) strings with f-strings (#7671)
mostly using flynt
https://github.com/ikamensh/flynt
|
2025-06-29 17:50:19 +02:00 |
|
Mike Fährmann
|
41191bb60a
|
'match.group(N)' -> 'match[N]' (#7671)
2.5x faster
|
2025-06-18 13:05:58 +02:00 |
|
Mike Fährmann
|
e08ec7e083
|
update copyright notices
|
2025-06-13 00:03:41 +02:00 |
|
Mike Fährmann
|
e2d104a110
|
[twitter] extract 'source_id' and 'source_user' metadata (#7470 #7640)
|
2025-06-12 18:59:22 +02:00 |
|
Mike Fährmann
|
06e2f2cd91
|
[twitter] restructure media data extraction
|
2025-06-12 18:53:15 +02:00 |
|
Mike Fährmann
|
8dace96af3
|
[twitter] simplify 'expand' & 'unique' init code
|
2025-06-05 15:33:47 +02:00 |
|
Mike Fährmann
|
e199396872
|
[common] simplify 'user' extractors by using 'Dispatch' mixin
|
2025-05-24 18:04:53 +02:00 |
|
Mike Fährmann
|
b97dc456b0
|
[twitter] import 'transaction_id' only when needed
|
2025-05-04 07:42:44 +02:00 |
|
Mike Fährmann
|
edc67983ed
|
[twitter] update 'x-csrf-token' header after ct init (#7467)
|
2025-05-03 12:55:31 +02:00 |
|
Mike Fährmann
|
771317b36c
|
[twitter:ctid] cache client transaction keys (#7382)
and 'ondemand.s.…a.js' responses
|
2025-05-03 12:50:00 +02:00 |
|
Mike Fährmann
|
e0913c95b2
|
[twitter] generate 'x-client-transaction-id' header values (#7382)
TODO: cache ClientTransaction state on disk
|
2025-05-02 12:10:05 +02:00 |
|
stephanelsmith
|
f0e7992674
|
[twitter] added 'followers' extractor
modeled after the 'following' extractor
- cleanup
- add test
|
2025-04-19 18:24:29 +02:00 |
|
Mike Fährmann
|
2798fb8a80
|
[twitter] update API endpoint query hashes (#7382 #7386)
and associated 'variables', 'features', and 'fieldToggles' parameters
|
2025-04-19 16:45:47 +02:00 |
|
Mike Fährmann
|
a859abf6a1
|
[twitter] prevent exception in '_extract_components()' (#7139)
|
2025-03-09 10:15:18 +01:00 |
|
Mike Fährmann
|
d2cad599f7
|
[twitter] support 'grok' cards content (#7040)
|
2025-02-25 20:47:31 +01:00 |
|
Mike Fährmann
|
64dc655ed6
|
[twitter] revert generated CSRF token length to 32 characters (#6895)
revert d9c4fcc7fa
|
2025-01-30 19:16:10 +01:00 |
|
Mike Fährmann
|
cb1a75eefc
|
[twitter] handle errors during file extraction (#6647)
|
2025-01-21 18:23:54 +01:00 |
|
Mike Fährmann
|
d9c4fcc7fa
|
[twitter] generate longer CSRF token values
|
2025-01-21 18:19:25 +01:00 |
|
Mike Fährmann
|
cfe24a9e31
|
[twitter] make 'source' metadata extraction non-fatal (#6472)
|
2024-11-14 18:59:01 +01:00 |
|
Mike Fährmann
|
e3fbd6825b
|
[twitter] remove cookies migration workaround
revert 141efc2ad3
|
2024-10-31 17:10:13 +01:00 |
|
Mike Fährmann
|
a120295632
|
[util] use minimal separators for 'json_dumps()'
|
2024-10-01 17:03:13 +02:00 |
|
Mike Fährmann
|
bd932b6860
|
[twitter] add 'info' as a possible 'include' value (#6114)
|
2024-08-31 17:04:22 +02:00 |
|
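Several of the refactoring commits listed above name standard Python idioms: 'match[N]' in place of 'match.group(N)' (41191bb60a), walrus operators in if statements (a097a373a9), 'datetime.fromisoformat()' (c8fc790028), and minimal separators for JSON output (a120295632). The following is a minimal sketch of those idioms in isolation. It is not code from the repository: the pattern TWEET_URL and the helpers tweet_id(), parse_iso() and dumps_compact() are hypothetical names chosen for illustration only.

```python
import json
import re
from datetime import datetime

# Hypothetical pattern for illustration; not taken from the project.
TWEET_URL = re.compile(r"https://x\.com/(\w+)/status/(\d+)")


def tweet_id(url):
    """Return the Tweet ID from a URL, or None if it does not match."""
    # Walrus operator: bind the match object and test it in one 'if'.
    if match := TWEET_URL.match(url):
        # match[2] is equivalent to match.group(2).
        return match[2]
    return None


def parse_iso(value):
    """Parse an ISO 8601 string with the standard library parser."""
    return datetime.fromisoformat(value)


def dumps_compact(obj):
    """Serialize to JSON with no whitespace after ',' and ':'."""
    return json.dumps(obj, separators=(",", ":"))
```

For example, tweet_id("https://x.com/user/status/12345") returns "12345", and dumps_compact({"a": [1, 2]}) returns '{"a":[1,2]}'. The walrus operator requires Python 3.8 or later.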