297 Commits

Author SHA1 Message Date
Mike Fährmann
c55955db03 [twitter] quick and dirty fix for /media changes (#4898) 2023-12-09 15:38:42 +01:00
Mike Fährmann
a4e6ea667b [twitter] retry API calls when their response contains errors (#4811) 2023-12-05 15:57:26 +01:00
Mike Fährmann
cf5702c843 [twitter] generalize "Login Required" error (#4734, #4324) 2023-12-05 15:13:58 +01:00
Mike Fährmann
7a0f145cbe [twitter] ignore promoted Tweets (#4790, #3894)
add 'ads' option in case someone actually wants to
download promoted content for whatever reason
2023-11-10 23:46:46 +01:00
thatfuckingbird
44d7964c09 [twitter] recognize fixupx.com URLs 2023-11-01 15:50:36 +01:00
Mike Fährmann
fd36eafe32 [twitter] restore truncated retweet text (#3430, #4690) 2023-10-27 23:26:21 +02:00
Mike Fährmann
218295a4c6 [twitter] fix avatars without 'date' information (#4696) 2023-10-27 17:58:02 +02:00
Mike Fährmann
31dbbffc0b [twitter] cache 'user_by_…' results (#4719) 2023-10-25 16:45:27 +02:00
Mike Fährmann
08bdde5aac merge #4619: [twitter] add 'sensitive' metadata field 2023-10-09 15:40:58 +02:00
Mike Fährmann
f3d6aaff13 [twitter] rename to 'sensitive'; use 'tget()' 2023-10-09 15:39:09 +02:00
Mike Fährmann
efaab4fbfa [twitter] fix crash due to missing 'source' (#4620)
regression caused by 06aaedde
2023-10-04 23:01:04 +02:00
Nahida
3438a3098d [twitter] add possible_sensitive field 2023-10-04 10:34:02 +08:00
Mike Fährmann
6178177227 [twitter] fix '_extractor' of following results (#4536)
regression from 20ed647f
2023-09-15 23:04:30 +02:00
Mike Fährmann
a453335a9f remove test results in extractor modules
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
4c0b3d5dc5 [twitter] fix crash when 'sortIndex' is None (#4499) 2023-09-04 18:28:43 +02:00
Mike Fährmann
06aaedded5 [twitter] extract 'source' metadata (#4459) 2023-08-28 16:31:57 +02:00
Mike Fährmann
e0829ff0fd [twitter] add 'date_original' metadata for retweets (#4337, #4443) 2023-08-23 23:58:11 +02:00
Mike Fährmann
2b88ad19e9 [twitter] accept 'x.com' URLs (#4452) 2023-08-21 19:47:07 +02:00
Mike Fährmann
089d1a4f67 [twitter] fix 'TweetWithVisibilityResults' (#4369) 2023-08-06 22:08:50 +02:00
Mike Fährmann
fb3f0453db [twitter] improve error messages for single Tweets (#4369)
also fixes '"quoted": false' not having any effect
2023-08-03 22:02:07 +02:00
Mike Fährmann
7fbc304ae9 [twitter] fix crash on private user (#4349) 2023-07-26 17:53:51 +02:00
Mike Fährmann
a383eca7f6 decouple extractor initialization
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().

This makes it possible, for example, to adjust config access inside a Job
before most of it has already happened when calling 'extractor.find()'.
2023-07-25 22:16:16 +02:00
Mike Fährmann
d97b8c2fba consistent cookie-related names
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
90231f2d5a [twitter] add 'tweet-endpoint' option (#4307)
use the newer TweetResultByRestId only for guests by default
2023-07-18 17:19:32 +02:00
Mike Fährmann
20ed647f6f [twitter] add 'user' extractor and 'include' option (#4275) 2023-07-18 16:42:55 +02:00
Mike Fährmann
86be197d11 [twitter] remove '/search/adaptive.json' 2023-07-18 15:45:37 +02:00
Mike Fährmann
0b08e2e8a8 merge #4287: [twitter] Fix following extractor not getting all users 2023-07-10 14:41:00 +02:00
Mike Fährmann
f6553ffd2f [twitter] simplify '_pagination_users'
- remove 'stop' variable
- call 'cursor.startswith()' only once
2023-07-10 14:39:09 +02:00
Mike Fährmann
a27dbe8c82 [twitter] use 'TweetResultByRestId' endpoint (#4250)
allows accessing single Tweets without login
2023-07-08 23:17:10 +02:00
Mike Fährmann
d3d639a159 [twitter] don't treat missing 'TimelineAddEntries' as fatal (#4278) 2023-07-08 22:49:34 +02:00
ActuallyKit
c321c773f2 make the code less ugly 2023-07-09 02:52:04 +07:00
ActuallyKit
a437a34bcf fix lint i guess? 2023-07-09 02:41:46 +07:00
ActuallyKit
6cbc434b54 Fix users pagination 2023-07-09 02:28:35 +07:00
Mike Fährmann
1bf9f52c99 [twitter] add 'ratelimit' option (#4251) 2023-07-04 18:17:32 +02:00
Mike Fährmann
f86fdf64a6 [twitter] use GraphQL search by default (#4264) 2023-07-04 17:55:22 +02:00
Mike Fährmann
c1cce4a80b [twitter] extend 'conversations' option (#4211) 2023-06-24 21:34:34 +02:00
Mike Fährmann
54cf1fa3e7 [twitter] use GraphQL search endpoint (#3942)
for guest users; selectable with 'search-endpoint' option.

adapted from 9c7b888ffa
2023-06-01 21:37:31 +02:00
Mike Fährmann
864a654b25 [twitter] update query hashes 2023-06-01 21:37:31 +02:00
Mike Fährmann
45cc7cee1a [twitter] better error message for guest searches (#3942) 2023-06-01 21:37:11 +02:00
Mike Fährmann
271f23d971 [twitter] extract 'conversation_id' metadata (#3839) 2023-06-01 15:31:52 +02:00
Mike Fährmann
d0184fddcf [twitter] optimize '_extract_twitpic()'
- use findall instead of finditer
- store URLs in a dict to discard duplicates
2023-05-25 15:18:49 +02:00
Mike Fährmann
3dc862c7fc merge #3796: [twitter] extract TwitPic URLs in text (#3792) 2023-05-25 14:59:07 +02:00
Mike Fährmann
1d505b39f8 [twitter] support 'profile-conversation' entries (#3938) 2023-04-21 15:08:50 +02:00
Mike Fährmann
f500b45b5e [twitter] improve 480bc34e
only check for double user assignment where necessary
2023-04-18 20:50:23 +02:00
Mike Fährmann
480bc34e54 [twitter] do not overwrite previously assigned users (#3922) 2023-04-16 17:30:43 +02:00
Mike Fährmann
f5a59c4170 [twitter] add 'date_bookmarked' metadata (#3816) 2023-04-06 20:16:25 +02:00
Mike Fährmann
1c1f6fdc80 [twitter] fix regression from 160335ad
Tweets from 'homeConversation' or 'conversationthread' entries do not
contain a 'sortIndex' field. Accessing it raises a KeyError and would
erroneously get them labeled as 'deleted'.
2023-04-06 19:22:48 +02:00
Mike Fährmann
160335ad44 [twitter] add 'date_liked' metadata for liked Tweets (#3816) 2023-04-06 18:33:45 +02:00
Mike Fährmann
6d850ce629 [twitter] calculate 'date' from Tweet IDs
20 times faster than parsing 'created_at'
2023-04-05 22:29:14 +02:00
Mike Fährmann
dbe06cdba1 [twitter] warn about 'withheld' Tweets and users (#3864) 2023-04-04 16:15:08 +02:00
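
Note on 6d850ce629 (calculate 'date' from Tweet IDs): the commit message does not show the implementation, so the following is only a minimal sketch of the general technique, not the project's actual code. It assumes the Tweet ID is a Snowflake ID, whose bits above the lowest 22 hold a millisecond offset from the Snowflake epoch (1288834974657 ms after the Unix epoch). The function name 'date_from_tweet_id' is hypothetical. IDs issued before Snowflake was introduced in November 2010 do not encode a timestamp, so this approach would not apply to them.

```python
from datetime import datetime, timedelta, timezone

# Snowflake epoch in milliseconds since the Unix epoch
# (2010-11-04 01:42:54.657 UTC).
SNOWFLAKE_EPOCH_MS = 1288834974657
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def date_from_tweet_id(tweet_id):
    """Derive the creation time from a Snowflake-style Tweet ID.

    Hypothetical helper: the lowest 22 bits (worker and sequence
    numbers) are shifted away, leaving the millisecond offset.
    """
    ms = (int(tweet_id) >> 22) + SNOWFLAKE_EPOCH_MS
    # timedelta arithmetic avoids float rounding of the milliseconds
    return UNIX_EPOCH + timedelta(milliseconds=ms)
```

This needs only a shift and an addition, with no string parsing, which is presumably where a speedup over parsing 'created_at' would come from.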