Commit Graph

226 Commits

Author SHA1 Message Date
Mike Fährmann
30a31836e7 merge #3449: [twitter] force HTTPS for TwitPic URLs 2023-01-05 14:57:03 +01:00
Mike Fährmann
e18482e9ae [twitter] improve 'http' -> 'https' replacement 2023-01-05 14:55:55 +01:00
Mike Fährmann
4fd6da474f merge #3473: [twitter] fix crash when using 'expand' and 'syndication' 2023-01-05 14:19:47 +01:00
Mike Fährmann
6933727b45 merge #3483: [twitter] implement 'syndication=extended' 2023-01-04 17:36:17 +01:00
Mike Fährmann
ed2d715019 fix 'keywords' in extractor tests (#3491) 2023-01-03 15:14:23 +01:00
ClosedPort22
6853b14be3 [twitter] apply suggestions from code review
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2023-01-02 21:03:01 +08:00
ClosedPort22
7c8eab8d52 [twitter] implement 'syndication=extended'
to be able to fetch extended user metadata
2022-12-30 20:48:41 +08:00
ClosedPort22
be3286206a [twitter] assume 'conversation_id' when using syndication
not possible to expand replies at the moment
2022-12-30 13:57:37 +08:00
ClosedPort22
ce8dbb1ccc [twitter] fix crash when using 'expand' and 'syndication'
caused by KeyError: 'conversation_id_str'
2022-12-30 12:45:44 +08:00
ClosedPort22
38786a9593 [twitter] refactor extraction of TwitPic URLs
flattening
2022-12-27 12:23:12 +08:00
ClosedPort22
3eb352fcb0 [twitter] force HTTPS for TwitPic URLs 2022-12-23 18:16:34 +08:00
Mike Fährmann
90a9c0790f [twitter] update 'search' pagination (#544)
Only stop when list of all returned Tweets is empty
instead of when no valid Tweet was found.
2022-12-14 19:56:59 +01:00
Mike Fährmann
3082544fff misc fixes
- fix typo (#3399)
- remove double assignment
- [bunkr] update things I forgot in 6b6f886d
- [soundgasm] adjust 'archive_fmt' (#3388)
2022-12-14 13:30:27 +01:00
Mike Fährmann
cd931e1139 update extractor test results 2022-12-08 18:58:29 +01:00
Mike Fährmann
0e75358af8 [twitter] fix using user IDs for suspended accounts 2022-11-26 12:02:05 +01:00
Mike Fährmann
a24dcbe802 [twitter] fix login (#3220)
Using an email as 'username' seems to no longer be possible,
as Twitter will always additionally ask for username or phone number
when providing an email address as 'username'.
2022-11-19 23:11:37 +01:00
Mike Fährmann
08fd1ff835 [twitter] add 'avatar' and 'background' extractors (#349, #3023) 2022-11-18 23:06:22 +01:00
Mike Fährmann
6c153750fa [nitter] add extractors for Nitter instances (#2696) 2022-11-15 11:44:16 +01:00
Mike Fährmann
15cd114c9c [twitter] update bookmarks pagination (#3172)
Do not stop when there aren't any tweets in a batch,
but only when the same cursor value appears twice in a row.
2022-11-09 20:40:51 +01:00
Mike Fährmann
b0cb4a1b9c replace 'text.extract()' with 'text.extr()' where possible 2022-11-05 01:14:09 +01:00
thatfuckingbird
9d3f86dbcd [twitter] update URL for syndication API (#3160)
Twitter changed the URL format to access tweet data through their syndication API.
2022-11-04 17:49:45 +01:00
Mike Fährmann
e99a9b2aff [twitter] improve 'cards-blacklist' (#2875)
allow blacklisting domains and 'name:domain',
where 'domain' depends on a card's 'vanity_url' value
2022-09-17 17:46:34 +02:00
Mike Fährmann
aaf6992bae [twitter] fix new-style '/card_img/' URLs 2022-09-17 17:45:09 +02:00
Mike Fährmann
40baa77630 [twitter] provide proper 'date' for syndication results (#2920) 2022-09-17 14:11:43 +02:00
Mike Fährmann
4d78ca89db [twitter] add 'cards-blacklist' option (#2875) 2022-08-31 10:28:25 +02:00
Mike Fährmann
4d7cb0bf56 [twitter] general support for unified cards (#2875)
just removing the 'type' check seems to work
2022-08-31 10:25:27 +02:00
Mike Fährmann
7ddfff957c [twitter] support "image_website" unified cards (#2875) 2022-08-30 18:16:10 +02:00
Mike Fährmann
69995d789b Revert "[twitter] use '{author[name]' in default directory names"
This reverts commit 9ad3cdc5d8.
2022-08-27 15:11:59 +02:00
Mike Fährmann
6ba72b6bc6 [twitter] ignore invalid user entries (#2850) 2022-08-26 17:57:17 +02:00
Mike Fährmann
264f1336ad [twitter] unescape '+' in search queries (#2226)
... and do not raise exception if searched user does not exist
2022-08-17 22:20:26 +02:00
Mike Fährmann
9ad3cdc5d8 [twitter] use '{author[name]' in default directory names
with the changes to 'user' (749802c7),
'{user[name]' with enabled retweets / quote tweets
would put a lot of them in a wrong directory
2022-08-12 11:36:55 +02:00
Mike Fährmann
81a37d21d3 [twitter] simplify 'user' assignment 2022-07-29 20:26:22 +02:00
Mike Fährmann
8a70b94245 [twitter] implement constant 'user' for tweet URLs 2022-07-29 19:44:29 +02:00
Mike Fährmann
1540d0e695 [twitter] use filter:links (#2766) 2022-07-27 12:17:43 +02:00
Mike Fährmann
8d0801ad8e [twitter] fall back to unfiltered search (#2766) 2022-07-27 12:16:53 +02:00
Mike Fährmann
77bdd8fe0f [twitter] implement constant 'user' for 'from:…' searches 2022-07-17 19:14:32 +02:00
Mike Fährmann
a267a05a3f [twitter] update 'quote_id' and 'quote_by'
- 'quote_id' is now non-null for quoted Tweets and has the ID of the
  quoting Tweet, instead of the other way round as before
- 'quote_by' is now the 'screen_name' of the quoting user
  (it previously held the value that 'quote_id' holds now)
2022-07-17 18:50:21 +02:00
Mike Fährmann
749802c7bd [twitter] update 'user' and 'author' fields
- 'author' is always the user who authored a tweet
- 'user' is always the user specified in the input URL
  or equal to 'author' when the former is not given
2022-07-17 17:04:24 +02:00
blankie
836402bf58 [twitter] unescape content (#2756) (#2757)
Fixes #2756
2022-07-13 19:45:14 +02:00
Mike Fährmann
b590774f67 [twitter] add 'count' metadata field (#2741) 2022-07-10 14:37:04 +02:00
Mike Fährmann
1d14928bd9 [twitter] ignore previously seen Tweets (#2712)
occurs primarily for /with_replies results when logged in
2022-07-03 16:13:53 +02:00
Mike Fährmann
4b2a0a0eda [twitter] implement 'strategy' option (#2712)
to be able to better control what Tweets get used and returned
for twitter.com/USER URLs.
2022-07-03 14:29:15 +02:00
Mike Fährmann
7b073bf9ef Revert "[twitter] improve strategy for user URLs (#2665)"
'user_tweets_and_replies' was a mistake
2022-06-28 20:38:56 +02:00
Mike Fährmann
d6c6c8a4a0 [twitter] improve '"replies": "self"' (#2665)
If a username is given in the input URL,
only download from replies by that user.
2022-06-13 19:21:32 +02:00
Mike Fährmann
9c8d895d19 [twitter] implement 'csrf' option (#2676) 2022-06-13 18:36:39 +02:00
Mike Fährmann
08db8435f1 [twitter] fix pagination for conversation tweets
a relic from the switch to GraphQL API
2022-06-13 16:27:30 +02:00
Mike Fährmann
1da3ccf608 [twitter] implement 'expand' option (#2665) 2022-06-12 17:26:51 +02:00
Mike Fährmann
0add1fc090 [twitter] improve strategy for user URLs (#2665)
- use '/with_replies' when appropriate
- consider 'text-tweets'
- build search query as necessary
2022-06-12 17:24:53 +02:00
thatfuckingbird
da0696e1f5 recognize vxtwitter URLs (#2621) 2022-05-25 17:01:58 +02:00
Mike Fährmann
dcb580240d [twitter] extract alt texts as 'description' (closes #2617) 2022-05-24 12:37:38 +02:00