Mike Fährmann
0fd959a2a7
[twitter] support '/with_replies' URLs (closes #1833)
2021-09-10 20:44:26 +02:00
Mike Fährmann
6651da27e9
[twitter] fix 'url' extraction for users without 'expanded_url'
...
(#1532, #1787)
2021-08-27 18:41:16 +02:00
Mike Fährmann
ae78d95a5f
[twitter] fix issue when filtering quote tweets (#1792)
...
When a user quotes their own Tweet and that Tweet gets filtered by
'"quoted": false', it could also get filtered when it appeared later
as a regular Tweet.
2021-08-25 20:04:22 +02:00
Mike Fährmann
0817f468ef
[twitter] expand t.co links in user descriptions (#1532, #1787)
2021-08-23 23:34:59 +02:00
Mike Fährmann
7c0ae88185
[twitter] add 'url' to user objects (#1532, #1787)
2021-08-23 22:51:35 +02:00
Mike Fährmann
5919dc5b5a
[twitter] slightly improve '_transform_user()'
2021-08-23 22:28:09 +02:00
Mike Fährmann
6b56b3ebe1
[twitter] report API errors as generic StopExtraction exceptions
...
prevents duplicate logging messages for nonexistent users
(#1759)
2021-08-21 22:46:22 +02:00
Mike Fährmann
c866fcba48
[twitter] fix 'logout' (#1719)
...
delete 'auth_token' cookie and cookies.txt path
2021-08-16 01:36:34 +02:00
Mike Fährmann
52984f7e22
[twitter] add option to log out when blocked (#1719)
2021-08-12 19:11:41 +02:00
Mike Fährmann
e5a93e113f
[twitter] extend 'replies' option (#1254)
...
Allow setting 'replies' to '"self"' to only download from self-replies.
2021-08-10 22:14:00 +02:00
Mike Fährmann
229498b8aa
[twitter] warn about suspended accounts etc. (closes #1759)
2021-08-09 02:58:27 +02:00
Mike Fährmann
414bdc95a3
[twitter] set 'retweet_id' for original retweets (#1481)
2021-07-02 21:50:37 +02:00
Mike Fährmann
5323c1c73a
[twitter] ensure guest tokens are returned as string (#1665)
2021-07-01 14:35:53 +02:00
Mike Fährmann
035562bd11
[twitter] remove old-style URLs from image fallback lists
2021-06-28 16:25:24 +02:00
Mike Fährmann
a751afdfb3
[twitter] change some defaults
...
- 'retweets' option: true -> false
- 'quoted' option : true -> false
i.e. disable downloading tweets from other users' timelines by default
- search directory:
'["{category}", "Search", "{search}"]' ->
'["{category}", "{user[name]}"]'
i.e. change it to the same as other twitter extractors (#1308)
2021-06-11 21:26:11 +02:00
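The search directory change in the commit above can be illustrated with a minimal sketch. It only assumes that each path segment behaves like a Python format string; the 'expand()' helper and the metadata values are hypothetical and not taken from the project.

```python
# Illustrative only: expand the old and new search directory formats
# with plain str.format(). The metadata values below are made up.
OLD_SEARCH_DIRECTORY = ["{category}", "Search", "{search}"]
NEW_SEARCH_DIRECTORY = ["{category}", "{user[name]}"]


def expand(directory_fmt, metadata):
    """Fill every path segment with values from 'metadata' (hypothetical helper)."""
    return [segment.format(**metadata) for segment in directory_fmt]


metadata = {
    "category": "twitter",
    "search": "from:example",
    "user": {"name": "example"},
}

old_path = expand(OLD_SEARCH_DIRECTORY, metadata)
new_path = expand(NEW_SEARCH_DIRECTORY, metadata)
```

With the new default, files found through a search end up in the same per-user directory that the other twitter extractors use.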
Mike Fährmann
b5affc62aa
[twitter] rename 'text-only' to 'text-tweets' (#570)
2021-05-22 21:41:12 +02:00
Mike Fährmann
724ca61f36
[twitter] add 'text-only' option (#570)
2021-05-22 17:01:49 +02:00
Mike Fährmann
394fbb5f56
[twitter] strip useless t.co links (#1532)
...
The 'full_text' of Tweets with media content usually ends with a t.co
link to itself. This commit removes those.
2021-05-17 00:20:29 +02:00
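A rough sketch of the idea behind the commit above, assuming the link always sits at the very end of 'full_text'. The helper name and the regular expression are assumptions; this version drops any trailing t.co link and does not check that the link actually points back to the Tweet.

```python
import re

# Assumption: a trailing t.co link looks like "https://t.co/<word characters>"
# and is the last thing in the text, optionally surrounded by whitespace.
_TRAILING_TCO = re.compile(r"\s*https://t\.co/\w+\s*$")


def strip_trailing_tco(full_text):
    """Remove a single t.co link from the end of 'full_text' (hypothetical helper)."""
    return _TRAILING_TCO.sub("", full_text)
```

Links in the middle of the text are left alone, since only the end of the string is matched.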
Mike Fährmann
41457dbb1b
[twitter] resolve t.co URLs in 'content' (#1532)
2021-05-15 18:52:37 +02:00
Mike Fährmann
17b0ccb071
[twitter] add missing retweet media entities (fixes #1555)
...
from the original tweets
2021-05-14 22:51:01 +02:00
Mike Fährmann
fd858eed7b
[twitter] add 'user_likes' metadata field for liked tweets
...
i.e. the 'screen_name' of the user whose liked tweets get extracted.
Ideally this would replace 'user' or at least be in the same format,
but that would break backwards compatibility or be impossible/too
complicated thanks to API result differences.
(#1421)
2021-04-02 03:41:41 +02:00
Mike Fährmann
8d124a3766
[twitter] rename variables
2021-04-02 02:49:53 +02:00
Mike Fährmann
105f3c9666
[twitter] add extractor for direct image links (closes #1417)
2021-04-02 02:45:23 +02:00
Mike Fährmann
ebd142e2a8
[twitter] don't use youtube-dl for cards when videos are disabled
...
(#1416)
2021-04-01 14:26:08 +02:00
Mike Fährmann
ccfa5a8694
[twitter] better error message when logging in with 2FA (#1409)
2021-03-27 18:26:37 +01:00
Mike Fährmann
2846235669
[twitter] allow specifying a custom format for user results
...
(#1337)
2021-03-21 22:26:26 +01:00
Mike Fährmann
3378b39719
[twitter] implement 'users' option (#1337)
2021-03-16 00:51:05 +01:00
Mike Fährmann
5d69e437d0
[twitter] add option to download all media from a conversation
...
(fixes #1319)
2021-02-26 13:50:46 +01:00
Mike Fährmann
de0656941b
[twitter] add extractor for followed users (#1337)
...
https://twitter.com/USER/following or
https://twitter.com/id:USERID/following
2021-02-22 18:22:01 +01:00
Mike Fährmann
5542a11c46
[twitter] update GraphQL endpoints
2021-02-20 02:09:17 +01:00
Mike Fährmann
24e8e398e0
[twitter] skip login if 'auth_token' cookie is present
2021-01-25 15:03:59 +01:00
Mike Fährmann
95e5911895
[twitter] match '/i/user/ID' URLs
2021-01-20 00:33:57 +01:00
Mike Fährmann
069b113cbf
[twitter] improve and fix retry after hitting rate limit
...
- replace recursive call with infinite loop
- fix function arguments for recursive call
2021-01-19 23:50:07 +01:00
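The change from a recursive call to a loop, as described in the commit above, might look roughly like the following sketch. All names ('call_with_retry', 'request', 'wait_seconds') and the response shape are hypothetical; the real code's signature is not shown in this log.

```python
import time


def call_with_retry(request, wait_seconds, sleep=time.sleep):
    """Call request() until it stops reporting a rate limit.

    'request' returns a dict with a 'status' key, and 'wait_seconds'
    maps a rate-limited response to a delay. Both are assumptions
    made for this sketch.
    """
    while True:
        response = request()
        if response.get("status") != 429:
            return response
        # Rate limit hit: wait, then let the loop issue the same call again.
        # Nothing has to be passed along, unlike with a recursive retry.
        sleep(wait_seconds(response))
```

A loop keeps the call stack flat regardless of how often the limit is hit, and there are no arguments to forward incorrectly on each retry.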
Mike Fährmann
780b6adb91
rename 'generate_csrf_token()' to just 'generate_token()'
...
and add a 'size' argument
2021-01-11 22:12:40 +01:00
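A minimal sketch of what a 'generate_token()' function with a 'size' argument could look like. Only the function name and the existence of a 'size' argument come from the commit above; the body, the default of 16 and the use of os.urandom() are assumptions.

```python
import binascii
import os


def generate_token(size=16):
    """Return 'size' random bytes encoded as a hex string.

    Sketch only: the default value and the random source are assumed.
    """
    return binascii.hexlify(os.urandom(size)).decode()
```

Note that the result has twice as many characters as 'size', because every byte is written as two hex digits.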
Mike Fährmann
25074aec47
[twitter] fetch media from pinned tweets (#1203)
2020-12-29 16:27:43 +01:00
Mike Fährmann
2475176d99
[twitter] fetch tweets from 'homeConversation' entries
...
When logged in, some entries returned by Twitter's API are so-called
'homeConversation's (they would be regular tweet entries otherwise).
Those weren't picked up before, which resulted in missing files compared
to accessing a timeline as a guest.
('/media' timelines and search results were not affected.)
2020-12-29 00:42:46 +01:00
Mike Fährmann
3af9350648
[twitter] update API calls
...
- use 'https://twitter.com/i/api' for all requests
except '/guest/activate.json'
- update (default) URL parameters
- update GraphQL endpoints
2020-12-28 22:05:48 +01:00
Mike Fährmann
b656b829db
[twitter] fix login with username & password
...
It is no longer possible to get an 'authenticity_token' from Twitter's
JavaScript-free login form, which was disabled a few days ago.
Generating a random 16 byte hex string client-side and sending that as
a cookie alongside the regular login form works just as well.
2020-12-28 16:10:19 +01:00
Mike Fährmann
a00b60fbe7
[twitter] update 'x-csrf-token' header (fixes #1170)
...
Twitter started using a bigger (80 instead of 16 bytes) CSRF token for
logged-in users, and expects those to be used as 'x-csrf-token' header
when sent via 'ct0' cookie.
Generating an 80 byte token ourselves doesn't work, and Twitter will
still insist on using its own.
2020-12-11 13:46:58 +01:00
Mike Fährmann
63e61a0932
[twitter] update image URL format (#1145)
...
use
'/<name>?format=<fmt>&name=<size>'
instead of the potentially deprecated
'/<name>.<fmt>:<size>'
but keep all of them as fallback URLs
2020-12-01 11:53:51 +01:00
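The two URL shapes named in the commit above can be sketched as follows. The function name, the 'base' parameter and the single default size are hypothetical; only the two templates come from the commit message.

```python
def image_urls(base, name, fmt, size="orig"):
    """Return the new-style URL first, followed by the old-style fallback.

    Hypothetical helper: 'base' is whatever host and path prefix the
    media file lives under.
    """
    return [
        # preferred form: '/<name>?format=<fmt>&name=<size>'
        "{}/{}?format={}&name={}".format(base, name, fmt, size),
        # older form kept as a fallback: '/<name>.<fmt>:<size>'
        "{}/{}.{}:{}".format(base, name, fmt, size),
    ]
```

A downloader would try the first entry and only move on to the next one if that request fails.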
Mike Fährmann
ddfb4fd07a
[twitter] use 'https://twitter.com/i/api/' for logged in users
...
Doesn't seem to make a difference from what I can tell,
i.e. downloaded files are the same, but the website does it.
2020-11-16 11:26:37 +01:00
Mike Fährmann
de0c57886d
[twitter] add 'list-members' extractor (closes #1096)
2020-11-13 06:47:45 +01:00
Mike Fährmann
41d4968866
[twitter] add 'list' extractor (#1096)
2020-11-05 22:55:38 +01:00
Mike Fährmann
5d10520f4c
[twitter] update GraphQL endpoint & fix width/height entries
2020-11-05 22:53:29 +01:00
Mike Fährmann
968d3e8465
remove '&' from URL patterns
...
'/?&#' -> '/?#' and '?&#' -> '?#'
According to https://www.ietf.org/rfc/rfc3986.txt, URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
2020-10-22 23:31:25 +02:00
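The effect of dropping '&' from such character classes can be seen in a small sketch. The URL and both patterns below are made-up examples on example.org, not patterns from the project.

```python
import re

# Old style: '&' is treated as if it ended a path segment.
OLD_PATTERN = re.compile(r"https://example\.org/user/([^/?&#]+)")
# New style: only '/', '?' and '#' delimit components, as in RFC 3986.
NEW_PATTERN = re.compile(r"https://example\.org/user/([^/?#]+)")

url = "https://example.org/user/a&b?page=2#top"

old_name = OLD_PATTERN.match(url).group(1)
new_name = NEW_PATTERN.match(url).group(1)
```

Here the old pattern stops at the '&' and captures only 'a', while the new one captures the whole path segment 'a&b'.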
Mike Fährmann
1686dc1757
[twitter] support media from Cards (#1005, #937)
...
Can be enabled with 'extractor.twitter.cards', but for now disabled by
default because cards can redirect to rather large videos from YouTube
or Twitch.
2020-10-22 21:33:53 +02:00
Mike Fährmann
a3ca2f6080
update fallback URL handling
...
remove Message.Urllist and use a '_fallback' field inside a kwdict
2020-10-16 01:09:55 +02:00
Mike Fährmann
1b1cf01d0d
add a general 'generate_csrf_token()' function
2020-10-15 15:14:18 +02:00
Mike Fährmann
844502cad5
update extractor test results
2020-10-03 19:24:19 +02:00
Mike Fährmann
430b6d6e2e
[twitter] extend 'retweets' option (closes #1026)
...
Setting 'retweets' to '"original"' will use metadata from the
original retweeted Tweets, and not from the Retweet entry.
2020-09-28 23:03:35 +02:00