15 Commits

Author SHA1 Message Date
Mike Fährmann
366b0750a8 [common] use extractor subcategory for 'notfound=True' 2026-01-19 11:19:35 +01:00
Mike Fährmann
e006d26c8e Revert "use f-strings when building 'pattern'"
revert d7c97d5a97.
2025-12-20 22:07:37 +01:00
Mike Fährmann
8f621b32bd [2chen] implement generic 2chen board extractors
support
- https://sturdychan.help/
- https://schan.help/ (#8680)
2025-12-13 18:08:50 +01:00
Mike Fährmann
968597a302 yield 3-tuples for Message.Directory
adapt tuples to the same length and semantics as other messages
2025-12-05 21:39:52 +01:00
Mike Fährmann
d7c97d5a97 use f-strings when building 'pattern' 2025-10-20 21:23:11 +02:00
Mike Fährmann
085616e0a8 [dt] replace 'text.parse_datetime()' & 'text.parse_timestamp()' 2025-10-17 17:43:06 +02:00
Mike Fährmann
9dbe33b6de replace old %-formatted and .format(…) strings with f-strings (#7671)
mostly using flynt
https://github.com/ikamensh/flynt
2025-06-29 17:50:19 +02:00
Mike Fährmann
41191bb60a 'match.group(N)' -> 'match[N]' (#7671)
2.5x faster
2025-06-18 13:05:58 +02:00
Mike Fährmann
a453335a9f remove test results in extractor modules
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
81bd2af83e [2chen] update domain to sturdychan.help 2023-04-19 13:54:44 +02:00
Mike Fährmann
6f0735568c [2chen] fix file URLs 2022-12-15 18:05:32 +01:00
enduser420
a2be06d873 [2chen] add '.club' support (#3406) 2022-12-15 17:51:02 +01:00
enduser420
4bc756dfe0 [2chen] fix extraction (#3356)
update 'archive_fmt'
update tests
update 'board' regex
2022-12-04 16:19:36 +01:00
Mike Fährmann
277be410a7 [2chen] update 'archive_fmt' 2022-10-14 00:19:27 +02:00
enduser420
f0321f423d [2chen] Add 2chen.moe extractor (#2707)
* [2chen] Add 2chen.moe extractor

* change "==" to is

* fix for "test_unique_pattern_matches"

* fix regex pattern and group matching

* fix regex again

* [2chen] add 'reply_no' and 'hash' metadata and change 'filename_fmt'

also made an entry in supportedsites.md

* [2chen] unescape 'title'

* [2chen] partition() -> rpartition()

* [2chen] extract 'date' and 'name' metadata

* [2chen] remove 'offset' argument

* [2chen] do some changes

* [2chen] do some more changes

* [2chen] unescape 'name' and 'filename'
2022-10-04 22:18:13 +02:00
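
Two of the commits above (41191bb60a and 9dbe33b6de, both for #7671) are pure syntax migrations. The sketch below shows what such a change looks like in plain Python. The regular expression, the URL path, and all variable names are hypothetical and chosen for illustration only; they are not taken from the extractor module. The "2.5x faster" figure comes from the commit message and is not measured here.

```python
import re

# Hypothetical pattern and URL, for illustration only.
pattern = re.compile(r"https?://([^/]+)/([^/]+)/(\d+)")
match = pattern.match("https://sturdychan.help/tv/12345")

# 'match.group(N)' -> 'match[N]':
# Match.__getitem__ (Python 3.6+) returns the same value as Match.group().
domain_old = match.group(1)
domain_new = match[1]
board, thread = match[2], match[3]

# %-formatting and str.format() -> f-strings:
# all three expressions build the same string.
url_percent = "https://%s/%s/%s" % (domain_new, board, thread)
url_format = "https://{}/{}/{}".format(domain_new, board, thread)
url_fstring = f"https://{domain_new}/{board}/{thread}"
```

Both rewrites leave the resulting values unchanged, so they can be applied mechanically across a codebase (the f-string commit mentions doing so mostly with flynt).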