18 Commits

Author SHA1 Message Date
Mike Fährmann
a453335a9f remove test results in extractor modules
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
f856987297 [subscribestar] fix preview detection (#4468)
and show a warning message when posts contain previews
2023-09-04 22:21:14 +02:00
Mike Fährmann
d97b8c2fba consistent cookie-related names
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
dd884b02ee replace json.loads with direct calls to JSONDecoder.decode
2023-02-09 15:22:00 +01:00
Mike Fährmann
b0cb4a1b9c replace 'text.extract()' with 'text.extr()' where possible
2022-11-05 01:14:09 +01:00
Mike Fährmann
541a61d344 [subscribestar] fix 'date' metadata (#2642)
Handle instances where the actual datetime information
is preceded by "Updated on "
2022-06-04 12:24:08 +02:00
Mike Fährmann
d50a1ec2cc [subscribestar] unescape attachment URLs (fixes #2370)
2022-03-09 19:06:04 +01:00
Mike Fährmann
522782c09d [subscribestar] emit metadata for posts without media (#1569)
2021-11-18 23:42:17 +01:00
Mike Fährmann
1c8aaf9318 [subscribestar] add 'num' enumeration index (closes #2040)
2021-11-18 23:38:41 +01:00
Mike Fährmann
21c2da454f update extractor test results
2021-07-04 22:00:32 +02:00
Mike Fährmann
d09bc5bd34 [subscribestar] improve attachment filenames (#1609)
2021-06-10 17:09:13 +02:00
Mike Fährmann
968d3e8465 remove '&' from URL patterns
'/?&#' -> '/?#' and '?&#' -> '?#'

According to https://www.ietf.org/rfc/rfc3986.txt, URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
2020-10-22 23:31:25 +02:00
Mike Fährmann
69e4871005 update extractor test results
- sensescans: replace 404d chapters
- mangapark: replace 404d chapters
- subscribestar: update test for attached files
2020-08-28 22:32:32 +02:00
Mike Fährmann
0d84d3af55 [subscribestar] extract attached media files (#852)
2020-08-03 22:02:42 +02:00
Mike Fährmann
e50c75628c [subscribestar] update 'date' parsing
2020-07-24 22:27:36 +02:00
Mike Fährmann
d5fcffcced [subscribestar] add login capabilities (#852)
2020-07-17 22:18:01 +02:00
Mike Fährmann
f5c9f1d066 [subscribestar] use current date instead of hard-coded '2020' (#852)
2020-07-09 22:12:39 +02:00
Mike Fährmann
821524e4ee [subscribestar] add 'user' and 'post' extractors (#852)
2020-07-03 21:08:47 +02:00
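
The reasoning in commit 968d3e8465 ('/?&#' -> '/?#') can be illustrated with a small sketch. The patterns below are hypothetical and written only for this example; they are not copied from the repository. They assume that the strings in the commit message refer to regular-expression character classes that delimit the tail of a URL. Per RFC 3986, only "/", "?" and "#" delimit URL components, so "&" directly after a path segment does not start a new component.

```python
import re

# Hypothetical base pattern, for illustration only (not taken from the repository).
BASE = r"(?:https?://)?(?:www\.)?subscribestar\.com/(\w+)"

# Before the change: '&' is accepted as if it were a component delimiter.
OLD_TAIL = r"(?:[/?&#].*)?$"
# After the change: only '/', '?' and '#' are accepted, as in RFC 3986.
NEW_TAIL = r"(?:[/?#].*)?$"

old_pattern = re.compile(BASE + OLD_TAIL)
new_pattern = re.compile(BASE + NEW_TAIL)


def matches(pattern, url):
    """Return True if the compiled pattern matches the whole URL."""
    return pattern.match(url) is not None


if __name__ == "__main__":
    for url in (
        "https://subscribestar.com/user",
        "https://subscribestar.com/user?page=2",
        "https://subscribestar.com/user#top",
        "https://subscribestar.com/user&x=1",
    ):
        print(url, matches(old_pattern, url), matches(new_pattern, url))
```

Under these assumptions, the only input that the two patterns treat differently is the last one, where "&" follows the path segment directly: the old tail accepts it, the new tail rejects it.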