29 Commits

Author SHA1 Message Date
Mike Fährmann
b0cb4a1b9c replace 'text.extract()' with 'text.extr()' where possible 2022-11-05 01:14:09 +01:00
Mike Fährmann
7e385ed63e [foolfuuka] update domains
- remove nyafuu
- add rozenarcana (https://archive.alice.al/)
- add tokyochronos (https://www.tokyochronos.net)
2022-08-26 17:57:17 +02:00
Mike Fährmann
2dc57637cf [foolfuuka] remove archive.wakarimasen.moe 2022-07-10 23:13:49 +02:00
Mike Fährmann
bd6ec5c352 [foolfuuka] match 4chan filenames (#2577)
introduce two new metadata fields:
- filename_media: original filename of file uploaded to 4chan
- timestamp_ms  : timestamp with millisecond precision (tim)
2022-05-15 14:39:54 +02:00
Mike Fährmann
d26da3b9e5 add pre-generated 'pattern' for supported BaseExtractor sites 2022-05-09 22:20:09 +02:00
Mike Fährmann
dee0d22561 update extractor test results 2022-02-06 21:39:24 +01:00
Mike Fährmann
275543b2d2 update extractor test results 2021-11-27 19:26:44 +01:00
Mike Fährmann
211de95dd0 update extractor test results 2021-11-01 02:58:53 +01:00
Mike Fährmann
c04f7ab139 [foolfuuka] add 'gallery' extractor (#1785) 2021-08-21 22:46:23 +02:00
Mike Fährmann
21c2da454f update extractor test results 2021-07-04 22:00:32 +02:00
Mike Fährmann
407627ec86 [foolfuuka] support 'archive.wakarimasen.moe' (closes #1595) 2021-06-02 15:45:43 +02:00
Mike Fährmann
532ac79fb0 update extractor test results 2021-05-21 02:28:53 +02:00
Mike Fährmann
671a95cae5 [foolfuuka] use BaseExtractor 2021-01-26 18:48:37 +01:00
Mike Fährmann
e9a75e27d9 [foolfuuka] stop search when results are exhausted (#1174) 2021-01-17 22:48:21 +01:00
Mike Fährmann
56b460dcea [foolfuuka] add 'search' extractors (#1174) 2021-01-02 02:34:06 +01:00
Mike Fährmann
fb64183d53 [foolfuuka] add 'board' extractors (closes #1044) 2021-01-01 19:33:35 +01:00
Mike Fährmann
1e3dd7330e merge SharedConfigMixin functionality into Extractor 2020-11-17 00:34:07 +01:00
Mike Fährmann
f5b7ae01c1 update extractor test results 2020-09-15 18:07:08 +02:00
Mike Fährmann
82f7f4172a update test results 2020-01-01 16:05:38 +01:00
Mike Fährmann
41a3169c67 [foolfuuka] use '{extension}' in default filename format 2019-11-28 23:12:48 +01:00
Mike Fährmann
2a3bd4e3c7 rename extractor classes starting with a digit 2019-11-02 20:42:09 +01:00
Mike Fährmann
8de5866fd2 [twitter] replace unit test URLs
https://twitter.com/PicturesEarth was deleted
2019-05-09 10:17:55 +02:00
Mike Fährmann
591a07f20c small code changes and cleanups 2019-03-13 22:03:02 +01:00
Mike Fährmann
09d872a2b1 generalize extractor creation code 2019-03-07 22:55:26 +01:00
Mike Fährmann
4b1880fa5e propagate 'match' to base extractor constructor 2019-02-11 13:31:10 +01:00
Mike Fährmann
6284731107 simplify extractor constants
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
2019-02-08 13:45:40 +01:00
Mike Fährmann
4d656a81ca replace SharedConfigExtractor class with a Mixin 2019-02-04 13:46:02 +01:00
Mike Fährmann
12ff750111 [foolfuuka] smaller code changes and updates 2019-02-04 12:55:33 +01:00
Mike Fährmann
58a9eede38 [foolfuuka] dynamically generate extractor classes 2019-02-03 17:09:45 +01:00