Commit Graph

279 Commits

Author SHA1 Message Date
Mike Fährmann
089d1a4f67 [twitter] fix 'TweetWithVisibilityResults' (#4369) 2023-08-06 22:08:50 +02:00
Mike Fährmann
fb3f0453db [twitter] improve error messages for single Tweets (#4369)
also fixes '"quoted": false' not having any effect
2023-08-03 22:02:07 +02:00
Mike Fährmann
7fbc304ae9 [twitter] fix crash on private user (#4349) 2023-07-26 17:53:51 +02:00
Mike Fährmann
a383eca7f6 decouple extractor initialization
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().

This makes it possible, for example, to adjust config access inside a Job
before most of it has already happened during the 'extractor.find()' call.
2023-07-25 22:16:16 +02:00
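The two-phase pattern described in a383eca7f6 might look roughly like the sketch below. The class layout, attribute names, and the plain-dict config are hypothetical and only illustrate the idea; they are not taken from the project's code.

```python
class Extractor:
    """Hypothetical extractor illustrating deferred initialization."""

    def __init__(self, url):
        # The constructor only records its arguments: no I/O, no config reads.
        self.url = url
        self.config = {}
        self.session = None
        self.cookies = None
        self._initialized = False

    def initialize(self):
        # The actual setup happens here and can be triggered later,
        # after the caller has had a chance to adjust self.config.
        if self._initialized:
            return
        user_agent = self.config.get("user-agent", "default")
        self.session = {"headers": {"User-Agent": user_agent}}
        self.cookies = dict(self.config.get("cookies", {}))
        self._initialized = True


extr = Extractor("https://example.org/post/1")
extr.config["user-agent"] = "custom-agent"  # adjusted before initialization
extr.initialize()
```

Because nothing is read from the config until 'initialize()' runs, a caller can create the object, change settings, and only then trigger the setup.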
Mike Fährmann
d97b8c2fba consistent cookie-related names
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
90231f2d5a [twitter] add 'tweet-endpoint' option (#4307)
use the newer TweetResultByRestId only for guests by default
2023-07-18 17:19:32 +02:00
Mike Fährmann
20ed647f6f [twitter] add 'user' extractor and 'include' option (#4275) 2023-07-18 16:42:55 +02:00
Mike Fährmann
86be197d11 [twitter] remove '/search/adaptive.json' 2023-07-18 15:45:37 +02:00
Mike Fährmann
0b08e2e8a8 merge #4287: [twitter] Fix following extractor not getting all users 2023-07-10 14:41:00 +02:00
Mike Fährmann
f6553ffd2f [twitter] simplify '_pagination_users'
- remove 'stop' variable
- call 'cursor.startswith()' only once
2023-07-10 14:39:09 +02:00
Mike Fährmann
a27dbe8c82 [twitter] use 'TweetResultByRestId' endpoint (#4250)
allows accessing single Tweets without login
2023-07-08 23:17:10 +02:00
Mike Fährmann
d3d639a159 [twitter] don't treat missing 'TimelineAddEntries' as fatal (#4278) 2023-07-08 22:49:34 +02:00
ActuallyKit
c321c773f2 make the code less ugly 2023-07-09 02:52:04 +07:00
ActuallyKit
a437a34bcf fix lint i guess? 2023-07-09 02:41:46 +07:00
ActuallyKit
6cbc434b54 Fix users pagination 2023-07-09 02:28:35 +07:00
Mike Fährmann
1bf9f52c99 [twitter] add 'ratelimit' option (#4251) 2023-07-04 18:17:32 +02:00
Mike Fährmann
f86fdf64a6 [twitter] use GraphQL search by default (#4264) 2023-07-04 17:55:22 +02:00
Mike Fährmann
c1cce4a80b [twitter] extend 'conversations' option (#4211) 2023-06-24 21:34:34 +02:00
Mike Fährmann
54cf1fa3e7 [twitter] use GraphQL search endpoint (#3942)
for guest users; selectable with 'search-endpoint' option.

adapted from 9c7b888ffa
2023-06-01 21:37:31 +02:00
Mike Fährmann
864a654b25 [twitter] update query hashes 2023-06-01 21:37:31 +02:00
Mike Fährmann
45cc7cee1a [twitter] better error message for guest searches (#3942) 2023-06-01 21:37:11 +02:00
Mike Fährmann
271f23d971 [twitter] extract 'conversation_id' metadata (#3839) 2023-06-01 15:31:52 +02:00
Mike Fährmann
d0184fddcf [twitter] optimize '_extract_twitpic()'
- use findall instead of finditer
- store URLs in a dict to discard duplicates
2023-05-25 15:18:49 +02:00
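The two points in d0184fddcf can be illustrated with a small sketch. The regular expression and the function name are assumptions made for this example; the project's actual pattern may differ.

```python
import re

# Hypothetical pattern for this illustration only.
TWITPIC = re.compile(r"https?://twitpic\.com/\w+")


def extract_twitpic_urls(text):
    # findall() returns all matched strings as a list in one call,
    # so no match objects have to be unpacked one by one.
    matches = TWITPIC.findall(text)
    # dict keys are unique and keep insertion order (Python 3.7+),
    # so duplicates are dropped while first-seen order is preserved.
    return list(dict.fromkeys(matches))
```

A dict is used here instead of a set because a set would not keep the order in which the URLs appeared in the text.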
Mike Fährmann
3dc862c7fc merge #3796: [twitter] extract TwitPic URLs in text (#3792) 2023-05-25 14:59:07 +02:00
Mike Fährmann
1d505b39f8 [twitter] support 'profile-conversation' entries (#3938) 2023-04-21 15:08:50 +02:00
Mike Fährmann
f500b45b5e [twitter] improve 480bc34e
only check for double user assignment where necessary
2023-04-18 20:50:23 +02:00
Mike Fährmann
480bc34e54 [twitter] do not overwrite previously assigned users (#3922) 2023-04-16 17:30:43 +02:00
Mike Fährmann
f5a59c4170 [twitter] add 'date_bookmarked' metadata (#3816) 2023-04-06 20:16:25 +02:00
Mike Fährmann
1c1f6fdc80 [twitter] fix regression from 160335ad
Tweets from 'homeConversation' or 'conversationthread' entries do not
contain a 'sortIndex' field. Accessing it raises a KeyError and would
erroneously get them labeled as 'deleted'.
2023-04-06 19:22:48 +02:00
Mike Fährmann
160335ad44 [twitter] add 'date_liked' metadata for liked Tweets (#3816) 2023-04-06 18:33:45 +02:00
Mike Fährmann
6d850ce629 [twitter] calculate 'date' from Tweet IDs
20 times faster than parsing 'created_at'
2023-04-05 22:29:14 +02:00
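As a sketch of the idea behind 6d850ce629: snowflake-style Tweet IDs carry a millisecond timestamp in the bits above the lowest 22, counted from a fixed epoch. The function name below is hypothetical, and the sketch assumes the ID is a snowflake ID (IDs issued before the snowflake scheme do not encode a time).

```python
from datetime import datetime, timezone

# Snowflake epoch in milliseconds: 2010-11-04 01:42:54.657 UTC
TWITTER_EPOCH_MS = 1288834974657


def tweet_date(tweet_id):
    # Shift away the lowest 22 bits; what remains is the number of
    # milliseconds since the snowflake epoch.
    timestamp_ms = (tweet_id >> 22) + TWITTER_EPOCH_MS
    # Whole seconds are used here; sub-second precision is dropped.
    return datetime.fromtimestamp(timestamp_ms // 1000, tz=timezone.utc)
```

This needs one shift, one addition, and one division, whereas parsing 'created_at' means parsing a formatted date string.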
Mike Fährmann
dbe06cdba1 [twitter] warn about 'withheld' Tweets and users (#3864) 2023-04-04 16:15:08 +02:00
Mike Fährmann
3cc1dd1572 [twitter] update query hashes 2023-04-03 23:20:20 +02:00
Mike Fährmann
3846ce0de5 [twitter] update to bookmark timeline v2 (#3859) 2023-04-03 22:46:12 +02:00
Mike Fährmann
e6cb92864a [twitter] allow setting custom features per API endpoint 2023-04-03 16:18:31 +02:00
Amer Jazaerli
bebbff6578 fix: graphql_timeline_v2_bookmark_timeline cannot be null
twitter: 400 Bad Request (The following features cannot be null: graphql_timeline_v2_bookmark_timeline)
2023-03-31 00:06:49 +02:00
Mike Fährmann
197882cf12 [twitter] add 'hashtag' extractor (#3783) 2023-03-22 22:20:40 +01:00
ClosedPort22
d4fb4ff47f [twitter] extract TwitPic URLs in text (#3792)
also ignore previously seen URLs
2023-03-18 21:19:24 +08:00
Mike Fährmann
2bb937014f [twitter] fall back to legacy /media endpoint when not logged in 2023-03-17 20:54:35 +01:00
Mike Fährmann
b68094d326 [twitter] support 'note_tweet's 2023-03-17 19:36:07 +01:00
Mike Fährmann
3dcabc97ed [twitter] update API endpoints and parameters 2023-03-17 19:25:53 +01:00
Mike Fährmann
9037128315 [twitter] fix some 'original' retweets not downloading (#3744) 2023-03-08 18:33:19 +01:00
Mike Fährmann
dd884b02ee replace json.loads with direct calls to JSONDecoder.decode 2023-02-09 15:22:00 +01:00
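A minimal sketch of what dd884b02ee describes, using only the standard library. The names '_decode' and 'parse' are hypothetical and chosen for this example.

```python
import json

# One decoder instance is created up front and its bound method is reused,
# which skips the per-call argument handling done by json.loads().
_decode = json.JSONDecoder().decode


def parse(payload):
    # Unlike json.loads(), JSONDecoder.decode() accepts only str,
    # so bytes input would have to be decoded by the caller first.
    return _decode(payload)
```

For str input the result is the same as with json.loads() called without extra keyword arguments.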
Mike Fährmann
1ae48a54f8 [twitter] add 'transform' option 2023-02-02 22:01:36 +01:00
ClosedPort22
ab58c375b4 [twitter] fix search (#3536)
- partially revert 18fe4b334d
- properly search for cursor when processing 'replaceEntry'
2023-01-20 14:12:25 +08:00
Mike Fährmann
9683d79bb7 [twitter] "fix" search pagination (#3536, #3534)
- properly process instructions
- do not expect a predetermined instruction order
2023-01-16 14:58:30 +01:00
Mike Fährmann
4fec848858 [twitter] use "browser": "firefox" by default (#3522)
and reenable TLS 1.2 ciphers
2023-01-15 22:11:04 +01:00
Mike Fährmann
78937564fd [twitter] fix login after 32b03433 2023-01-15 22:10:21 +01:00
Mike Fährmann
32b0343334 [twitter] refresh guest tokens (#3445, #3458) 2023-01-13 22:19:25 +01:00
Mike Fährmann
26c3292538 [twitter] disable TLS 1.2 ciphers by default (#3522) 2023-01-13 16:05:43 +01:00