24 Commits

Author SHA1 Message Date
Mike Fährmann
53cdfaac37 [common] add reference to 'exception' module to Extractor class
- remove 'exception' imports
- replace with 'self.exc'
2026-02-15 10:57:22 +01:00
Mike Fährmann
12f5e24ab5 use sets for ' in { ... }' checks 2026-02-11 22:55:01 +01:00
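A minimal sketch of the kind of change this commit describes, assuming a membership test against a few constant strings. The function names and the extension list are hypothetical and not taken from the project.

```python
def is_video_tuple(ext):
    # Tuple literal: membership is checked element by element
    return ext in ("mp4", "webm", "mov")


def is_video_set(ext):
    # Set literal: membership is a hash lookup; CPython stores a
    # constant set literal used this way as a frozenset constant
    return ext in {"mp4", "webm", "mov"}
```

Both functions return the same result for hashable inputs such as strings; only the lookup strategy differs.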
Mike Fährmann
366b0750a8 [common] use extractor subcategory for 'notfound=True' 2026-01-19 11:19:35 +01:00
Mike Fährmann
e006d26c8e Revert "use f-strings when building 'pattern'"
revert d7c97d5a97.
2025-12-20 22:07:37 +01:00
Mike Fährmann
968597a302 yield 3-tuples for Message.Directory
adapt tuples to the same length and semantics as other messages
2025-12-05 21:39:52 +01:00
Mike Fährmann
d7c97d5a97 use f-strings when building 'pattern' 2025-10-20 21:23:11 +02:00
Mike Fährmann
9dbe33b6de replace old %-formatted and .format(…) strings with f-strings (#7671)
mostly using flynt
https://github.com/ikamensh/flynt
2025-06-29 17:50:19 +02:00
Mike Fährmann
41191bb60a 'match.group(N)' -> 'match[N]' (#7671)
2.5x faster
2025-06-18 13:05:58 +02:00
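For illustration, the two spellings named in this commit are interchangeable on a standard `re.Match` object (Python 3.6 and later). The pattern and input string below are made up for the example; the speed claim above is not measured here.

```python
import re

# Hypothetical pattern and input, used only to show the two spellings
match = re.match(r"(\w+)/(\d+)", "creator/12345")

name_old = match.group(1)   # method call
name_new = match[1]         # subscript, same result

post_id = match[2]
whole = match[0]            # index 0 is the entire match
```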
Mike Fährmann
575e5df539 [hotleak] fix AttributeError (#5950)
fixes regression introduced in 0432e057
2024-08-07 08:26:23 +02:00
Mike Fährmann
0432e05783 [hotleak] fix faulty image URLs (#5915) 2024-08-01 12:35:01 +02:00
Mike Fährmann
bffadf35b7 [hotleak] download files with 404 status code (#5395) 2024-04-19 16:08:31 +02:00
Mike Fährmann
3ecb512722 send Referer headers by default 2023-09-19 00:02:04 +02:00
Mike Fährmann
a453335a9f remove test results in extractor modules
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
a383eca7f6 decouple extractor initialization
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().

This makes it possible, for example, to adjust config access inside a Job
before most of it has already happened when calling 'extractor.find()'.
2023-07-25 22:16:16 +02:00
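The idea of this commit can be sketched roughly as follows. This is a simplified, hypothetical class, not the project's actual Extractor; the attribute names and the dictionary standing in for a session are assumptions made for the example.

```python
class Extractor:
    """Hypothetical, simplified extractor used only for illustration."""

    def __init__(self, url):
        # The constructor only records cheap state
        self.url = url
        self.referer = url
        self.session = None
        self._initialized = False

    def initialize(self):
        # The actual setup is deferred until explicitly requested
        if self._initialized:
            return
        self.session = {"headers": {"Referer": self.referer}}
        self._initialized = True


extr = Extractor("https://example.org/creator")
# A caller can still adjust settings here, before any setup has run
extr.referer = "https://example.org/"
extr.initialize()
```

Because construction no longer performs the setup, code that creates an extractor gets a window in which to change options before they are read.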
Balgden
4b141cce66 Fix indentation 2023-04-03 13:44:14 +00:00
Balgden
bbc5977121 Fix line length 2023-04-03 13:38:42 +00:00
Balgden
ffd30abcb3 [hotleak] Fix downloading of creators whose name starts with a category name
E.g. `hot4lexi` would start downloading the `hot` section by mistake

This happened because the regex had a negative lookahead for the category names, but didn't ensure that they were followed by either end-of-string or a slash.
2023-04-03 13:30:27 +00:00
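The bug and the fix can be reproduced with a small sketch. The category list and both patterns below are assumptions written for this example; they are not the extractor's real pattern.

```python
import re

# Hypothetical category names
CATEGORIES = "hot|creators|videos|photos"

# Before: rejects every name that merely starts with a category name
OLD = re.compile(r"/(?!(?:" + CATEGORIES + r"))([^/?#]+)")

# After: rejects a category name only when it is followed by
# a slash or the end of the string
NEW = re.compile(r"/(?!(?:" + CATEGORIES + r")(?:/|$))([^/?#]+)")


def creator(pattern, path):
    """Return the captured creator name, or None if the path is rejected."""
    m = pattern.match(path)
    return m[1] if m else None
```

With these patterns, `creator(OLD, "/hot4lexi")` is rejected, while `creator(NEW, "/hot4lexi")` yields the creator name and `creator(NEW, "/hot")` is still rejected as a category page.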
Mike Fährmann
7c9b1ec830 [hotleak] optimize decoding video URLs
- use binascii module
- combine slice and reverse step
2023-01-28 15:41:53 +01:00
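A sketch of what "combine slice and reverse step" can look like together with `binascii`. The prefix length and the obfuscation layout are assumptions for this example and are not taken from the extractor's code.

```python
import binascii

# Hypothetical number of junk characters in front of the payload
PREFIX = 16


def decode_two_steps(data):
    # Slice off the prefix, then reverse the remainder
    return binascii.a2b_base64(data[PREFIX:][::-1]).decode()


def decode_one_step(data):
    # One extended slice: start at the end, walk backwards,
    # stop before index PREFIX - 1 (valid because PREFIX > 0)
    return binascii.a2b_base64(data[:PREFIX - 1:-1]).decode()


# Build a sample input the same way: base64-encode, reverse, add a prefix
plain = "https://example.org/video.m3u8"
encoded = binascii.b2a_base64(plain.encode(), newline=False).decode()
sample = "x" * PREFIX + encoded[::-1]
```

Both functions decode `sample` back to `plain`; the second avoids building an intermediate string.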
nifnat
f14dbfe079 Make decode_video_url static (used in both post and creator extractor). 2023-01-28 14:36:49 +00:00
nifnat
bd23a701f3 Tidy up code. 2023-01-27 22:00:41 +00:00
nifnat
7f34f99a26 Reverse engineered obfuscated JS function and reimplemented in python. 2023-01-27 21:30:06 +00:00
Mike Fährmann
72c5d26e85 [hotleak] fix UnboundLocalError (#3288, #3293) 2022-11-23 22:21:59 +01:00
Mike Fährmann
b0cb4a1b9c replace 'text.extract()' with 'text.extr()' where possible 2022-11-05 01:14:09 +01:00
enduser420
bd846abba0 [hotleak] add hotleak extractor (#2909) (#2890) 2022-09-18 13:37:16 +02:00