48 Commits

Author SHA1 Message Date
Mike Fährmann
366b0750a8 [common] use extractor subcategory for 'notfound=True' 2026-01-19 11:19:35 +01:00
Mike Fährmann
09635352d0 [imagebam] raise 'NotFoundError' for deleted galleries 2026-01-19 11:19:35 +01:00
Mike Fährmann
8c9ca609ea [imagebam] raise 'NotFoundError' for deleted images (#8890) 2026-01-18 21:27:27 +01:00
Mike Fährmann
968597a302 yield 3-tuples for Message.Directory
adapt tuples to the same length and semantics as other messages
2025-12-05 21:39:52 +01:00
Mike Fährmann
f692380950 [imagebam] fix 'filename' & 'extension' for names without ext (#8476) 2025-10-29 11:27:52 +01:00
Mike Fährmann
9bf76c1352 replace 'util.re()' with 'text.re()'
remove unnecessary 'util' imports
2025-10-20 17:44:58 +02:00
Benjamin VERGNAUD
8d1c79c2b2 fix(imagebam): update cookies to bypass guard page
Signed-off-by: Benjamin VERGNAUD <ben@bvergnaud.fr>
2025-08-25 23:44:57 +02:00
Mike Fährmann
a097a373a9 simplify if statements by using walrus operators (#7671) 2025-07-22 20:57:54 +02:00
Mike Fährmann
41191bb60a 'match.group(N)' -> 'match[N]' (#7671)
2.5x faster
2025-06-18 13:05:58 +02:00
Mike Fährmann
e08ec7e083 update copyright notices 2025-06-13 00:03:41 +02:00
Mike Fährmann
811b665e33 remove @staticmethod decorators
There might have been a time when calling a static method was faster
than a regular method, but that is no longer the case. According to
micro-benchmarks, it is 70% slower in CPython 3.13 and it also makes
executing the code of a class definition slower.
2025-06-12 22:50:52 +02:00
Mike Fährmann
b5c88b3d3e replace standard library 're' uses with 'util.re()' 2025-06-06 13:24:52 +02:00
Mike Fährmann
b81fc5c124 replace text.rextract() with rextr() 2025-05-23 18:28:58 +02:00
Mike Fährmann
a453335a9f remove test results in extractor modules
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
a383eca7f6 decouple extractor initialization
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().

This makes it possible, for example, to adjust config access inside a Job
before most of it has already happened when calling 'extractor.find()'.
2023-07-25 22:16:16 +02:00
Mike Fährmann
d97b8c2fba consistent cookie-related names
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
b0cb4a1b9c replace 'text.extract()' with 'text.extr()' where possible 2022-11-05 01:14:09 +01:00
Mike Fährmann
b1b15d6cef [imagebam] add support for /view/ paths (closes #2378) 2022-03-14 08:38:20 +01:00
Mike Fährmann
1c79044433 [imagebam] set 'nsfw_inter' cookie (fixes #2334) 2022-02-27 16:12:28 +01:00
Mike Fährmann
8a909e478d [imagebam] fix extraction of NSFW images (#1534) 2021-05-22 21:41:44 +02:00
Mike Fährmann
15b0241bbc [imagebam] fix extraction 2021-05-06 16:47:36 +02:00
Mike Fährmann
eb7da159e2 [imagebam] update URL test results
Image URLs are now using https://, but the website itself is still
served as http://.
2019-08-07 21:47:44 +02:00
Mike Fährmann
155e1faeaf [imagebam] support galleries with >100 images (fixes #219) 2019-04-11 19:12:27 +02:00
Mike Fährmann
4b1880fa5e propagate 'match' to base extractor constructor 2019-02-11 13:31:10 +01:00
Mike Fährmann
6284731107 simplify extractor constants
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
2019-02-08 13:45:40 +01:00
Mike Fährmann
eb1c24b286 [imagebam] detect nonexistent galleries 2018-10-17 15:21:47 +02:00
Mike Fährmann
789608c107 [imagebam] fix extraction for certain galleries 2018-05-11 17:11:52 +02:00
Mike Fährmann
5008e105ee update archive IDs
... to behave in a more straightforward way when dealing with
bookmarks/favourites/etc.

specific IDs are now grouped by their owner, album-id, ... to
allow for duplicates when it would be expected.
2018-03-01 18:20:50 +01:00
Mike Fährmann
34873dbd90 set 'archive_fmt' values
These are going to be used to create a unique ID for each image.
2018-02-01 15:30:49 +01:00
Mike Fährmann
619387cbb1 update extractor unittest results 2018-01-28 18:29:05 +01:00
Mike Fährmann
92027f67f9 use consistent names for URL constants
root := <scheme>://<host>
base_url := <root>/<common path>
2017-11-06 20:56:49 +01:00
Mike Fährmann
68a0a7579c fix/improve some regular expressions 2017-10-09 22:37:50 +02:00
Mike Fährmann
6f30cf4c64 change keyword names to valid Python identifiers
This commit mostly replaces all minus-signs ('-') in keyword names with
underscores ('_') to allow them to be used in filter-expressions. For
example 'gallery-id' got renamed to 'gallery_id'.

(It is theoretically possible to access any variable, regardless of its
name, with 'locals()["NAME"]', but that seems a bit too convoluted if
just 'NAME' could be enough)
2017-09-10 22:20:47 +02:00
Mike Fährmann
c184e47ee3 put common directory- and filename formats in base classes 2017-05-30 12:10:16 +02:00
Mike Fährmann
94e10f249a code adjustments according to pep8 nr2 2017-02-01 00:53:19 +01:00
Mike Fährmann
56d810c896 update keyword hashes for tests 2016-09-25 17:28:46 +02:00
Mike Fährmann
19c2d4ff6f remove explicit (sub)category keywords 2016-09-25 14:22:07 +02:00
Mike Fährmann
d7e168799d consistent extractor naming scheme + docstrings 2016-09-12 10:34:31 +02:00
Mike Fährmann
2afa65cfc7 [imagebam] add single-image extractor 2016-09-02 08:25:29 +02:00
Mike Fährmann
000df8d1fa add 'encoding' argument for Extractor.request 2016-07-12 12:06:17 +02:00
Mike Fährmann
4d56b76aa8 update all other extractors 2015-11-21 04:26:30 +01:00
Mike Fährmann
c2f0720184 code cleanup to use nameext_from_url 2015-11-16 17:32:26 +01:00
Mike Fährmann
c0efea339e [imagebam] rewrite/fix 2015-11-04 00:03:48 +01:00
Mike Fährmann
3c13548f29 rewrite extractors to use config-module 2015-10-05 15:51:08 +02:00
Mike Fährmann
42b8e81a68 rewrite extractors to use text-module 2015-10-03 15:43:02 +02:00
Mike Fährmann
e41768d969 [imagebam] update to new extractor interface 2015-04-11 14:15:01 +02:00
Mike Fährmann
729d2d8b20 [imagebam] fixed issue with destination directory name 2014-11-20 21:45:59 +01:00
Mike Fährmann
98dd5f9a90 added extractor 'imagebam' 2014-11-20 21:27:57 +01:00