34 Commits

Author SHA1 Message Date
Mike Fährmann
a383eca7f6 decouple extractor initialization
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().

This makes it possible, for example, to adjust config access inside a Job
before most of it has already happened during the call to 'extractor.find()'.
2023-07-25 22:16:16 +02:00
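The two-phase setup described in this commit can be pictured with a minimal sketch. The class bodies, the 'config' dictionary, and the 'user-agent' option below are hypothetical stand-ins chosen for illustration, not the project's actual code; only the idea of a separate 'initialize()' step comes from the commit message.

```python
class Extractor:
    """Hypothetical stand-in for an extractor base class."""

    def __init__(self, url):
        # The constructor only records its arguments; no setup work yet.
        self.url = url
        self.config = {}
        self.session = None
        self.cookies = None
        self._initialized = False

    def initialize(self):
        """Perform the actual setup; calling it again is a no-op."""
        if self._initialized:
            return
        # Assumed setup steps: build a session from config, prepare cookies.
        self.session = {"user-agent": self.config.get("user-agent", "default")}
        self.cookies = {}
        self._initialized = True


class Job:
    """Hypothetical job that adjusts config before initialization runs."""

    def __init__(self, extractor, overrides):
        self.extractor = extractor
        # Config is changed while the extractor is still uninitialized.
        extractor.config.update(overrides)

    def run(self):
        self.extractor.initialize()
        return self.extractor.session


extractor = Extractor("https://example.org/gallery/1")
job = Job(extractor, {"user-agent": "custom-agent"})
session = job.run()
```

Because the constructor does no setup of its own, the override applied by the job is already in place when the session is built.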
Mike Fährmann
d97b8c2fba consistent cookie-related names
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
b0cb4a1b9c replace 'text.extract()' with 'text.extr()' where possible 2022-11-05 01:14:09 +01:00
Mike Fährmann
b1b15d6cef [imagebam] add support for /view/ paths (closes #2378) 2022-03-14 08:38:20 +01:00
Mike Fährmann
1c79044433 [imagebam] set 'nsfw_inter' cookie (fixes #2334) 2022-02-27 16:12:28 +01:00
Mike Fährmann
8a909e478d [imagebam] fix extraction of NSFW images (#1534) 2021-05-22 21:41:44 +02:00
Mike Fährmann
15b0241bbc [imagebam] fix extraction 2021-05-06 16:47:36 +02:00
Mike Fährmann
eb7da159e2 [imagebam] update URL test results
Image URLs are now using https://, but the website itself is still
served as http://.
2019-08-07 21:47:44 +02:00
Mike Fährmann
155e1faeaf [imagebam] support galleries with >100 images (fixes #219) 2019-04-11 19:12:27 +02:00
Mike Fährmann
4b1880fa5e propagate 'match' to base extractor constructor 2019-02-11 13:31:10 +01:00
Mike Fährmann
6284731107 simplify extractor constants
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
2019-02-08 13:45:40 +01:00
Mike Fährmann
eb1c24b286 [imagebam] detect nonexistent galleries 2018-10-17 15:21:47 +02:00
Mike Fährmann
789608c107 [imagebam] fix extraction for certain galleries 2018-05-11 17:11:52 +02:00
Mike Fährmann
5008e105ee update archive IDs
... to behave in a more straightforward way when dealing with
bookmarks/favourites/etc.

specific IDs are now grouped by their owner, album-id, ... to
allow for duplicates when it would be expected.
2018-03-01 18:20:50 +01:00
Mike Fährmann
34873dbd90 set 'archive_fmt' values
These are going to be used to create a unique ID for each image.
2018-02-01 15:30:49 +01:00
Mike Fährmann
619387cbb1 update extractor unittest results 2018-01-28 18:29:05 +01:00
Mike Fährmann
92027f67f9 use consistent names for URL constants
root := <scheme>://<host>
base_url := <root>/<common path>
2017-11-06 20:56:49 +01:00
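As a rough illustration of the naming scheme in this commit, the sketch below derives 'base_url' from 'root'. The class name, the host, and the '/gallery/' path are made up for the example and are not taken from any real extractor.

```python
class ExampleExtractor:
    """Hypothetical extractor showing the 'root' / 'base_url' convention."""

    # root := <scheme>://<host>
    root = "https://www.example.org"
    # base_url := <root>/<common path>
    base_url = root + "/gallery/"

    def gallery_url(self, gallery_id):
        # Every gallery URL is built from the shared base_url.
        return self.base_url + str(gallery_id)


url = ExampleExtractor().gallery_url(123)
```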
Mike Fährmann
68a0a7579c fix/improve some regular expressions 2017-10-09 22:37:50 +02:00
Mike Fährmann
6f30cf4c64 change keyword names to valid Python identifiers
This commit mostly replaces all minus-signs ('-') in keyword names with
underscores ('_') to allow them to be used in filter-expressions. For
example 'gallery-id' got renamed to 'gallery_id'.

(It is theoretically possible to access any variable, regardless of its
name, with 'locals()["NAME"]', but that seems a bit too convoluted if
just 'NAME' could be enough)
2017-09-10 22:20:47 +02:00
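The reasoning in this commit can be reproduced with a small sketch. It assumes that filter expressions are evaluated as Python expressions with the keywords as local names; the 'matches()' helper is hypothetical and only illustrates why a minus sign in a name gets in the way.

```python
def matches(expression, keywords):
    """Evaluate a filter expression with the keywords as local names."""
    return bool(eval(expression, {}, keywords))


old_keywords = {"gallery-id": 42}
new_keywords = {"gallery_id": 42}

# A valid identifier can be used directly in the expression.
result_new = matches("gallery_id == 42", new_keywords)

# 'gallery-id' parses as the subtraction 'gallery - id', so the lookup
# of the name 'gallery' fails.
try:
    matches("gallery-id == 42", old_keywords)
    old_error = None
except NameError as exc:
    old_error = exc

# The workaround mentioned in the commit message still reaches the value.
result_workaround = matches('locals()["gallery-id"] == 42', old_keywords)
```

The workaround does work, but the expression is considerably harder to read and type than the plain name.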
Mike Fährmann
c184e47ee3 put common directory- and filename formats in base classes 2017-05-30 12:10:16 +02:00
Mike Fährmann
94e10f249a code adjustments according to pep8 nr2 2017-02-01 00:53:19 +01:00
Mike Fährmann
56d810c896 update keyword hashes for tests 2016-09-25 17:28:46 +02:00
Mike Fährmann
19c2d4ff6f remove explicit (sub)category keywords 2016-09-25 14:22:07 +02:00
Mike Fährmann
d7e168799d consistent extractor naming scheme + docstrings 2016-09-12 10:34:31 +02:00
Mike Fährmann
2afa65cfc7 [imagebam] add single-image extractor 2016-09-02 08:25:29 +02:00
Mike Fährmann
000df8d1fa add 'encoding' argument for Extractor.request 2016-07-12 12:06:17 +02:00
Mike Fährmann
4d56b76aa8 update all other extractors 2015-11-21 04:26:30 +01:00
Mike Fährmann
c2f0720184 code cleanup to use nameext_from_url 2015-11-16 17:32:26 +01:00
Mike Fährmann
c0efea339e [imagebam] rewrite/fix 2015-11-04 00:03:48 +01:00
Mike Fährmann
3c13548f29 rewrite extractors to use config-module 2015-10-05 15:51:08 +02:00
Mike Fährmann
42b8e81a68 rewrite extractors to use text-module 2015-10-03 15:43:02 +02:00
Mike Fährmann
e41768d969 [imagebam] update to new extractor interface 2015-04-11 14:15:01 +02:00
Mike Fährmann
729d2d8b20 [imagebam] fixed issue with destination directory name 2014-11-20 21:45:59 +01:00
Mike Fährmann
98dd5f9a90 added extractor 'imagebam' 2014-11-20 21:27:57 +01:00