Mike Fährmann
53cdfaac37
[common] add reference to 'exception' module to Extractor class
...
- remove 'exception' imports
- replace with 'self.exc'
2026-02-15 10:57:22 +01:00
Mike Fährmann
ec2267244f
[tumblr:search] prevent KeyError when using 'offset' pagination (#8720)
2025-12-29 16:57:04 +01:00
Mike Fährmann
00c6821a3f
replace 2-element f-strings with simple '+' concatenations
...
Python's 'ast' module and its 'NodeVisitor' class
were incredibly helpful in identifying these
2025-12-22 11:26:04 +01:00
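
As an illustration of the 'ast' / 'NodeVisitor' approach mentioned in the commit above, here is a minimal sketch of how two-element f-strings could be located. This is not the script used for the commit. The class name `TwoPartFStringFinder` and the helper `find_two_part_fstrings()` are hypothetical, and the sketch assumes Python 3.8+ (where literal parts are `ast.Constant` nodes). It only reports candidates. Whether a '+' concatenation is safe still depends on the interpolated value being a `str`.

```python
import ast


class TwoPartFStringFinder(ast.NodeVisitor):
    """Collect line numbers of f-strings built from exactly two parts."""

    def __init__(self):
        self.hits = []

    def visit_JoinedStr(self, node):
        # An f-string is parsed as a JoinedStr whose 'values' are
        # Constant (literal text) and FormattedValue ({...}) nodes.
        if len(node.values) == 2:
            kinds = {type(value) for value in node.values}
            if kinds == {ast.Constant, ast.FormattedValue}:
                expr = next(
                    value for value in node.values
                    if isinstance(value, ast.FormattedValue)
                )
                # Skip '!r'-style conversions and ':spec' format specs,
                # which a plain '+' concatenation cannot express.
                if expr.conversion == -1 and expr.format_spec is None:
                    self.hits.append(node.lineno)
        self.generic_visit(node)


def find_two_part_fstrings(source):
    """Return the line numbers of candidate f-strings in 'source'."""
    finder = TwoPartFStringFinder()
    finder.visit(ast.parse(source))
    return finder.hits
```

Calling `find_two_part_fstrings()` on a module's source text yields the lines worth reviewing by hand.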
Mike Fährmann
e006d26c8e
Revert "use f-strings when building 'pattern'"
...
revert d7c97d5a97.
2025-12-20 22:07:37 +01:00
Mike Fährmann
968597a302
yield 3-tuples for Message.Directory
...
adapt tuples to the same length and semantics as other messages
2025-12-05 21:39:52 +01:00
Mike Fährmann
d7c97d5a97
use f-strings when building 'pattern'
2025-10-20 21:23:11 +02:00
Mike Fährmann
9bf76c1352
replace 'util.re()' with 'text.re()'
...
remove unnecessary 'util' imports
2025-10-20 17:44:58 +02:00
Mike Fährmann
085616e0a8
[dt] replace 'text.parse_datetime()' & 'text.parse_timestamp()'
2025-10-17 17:43:06 +02:00
Mike Fährmann
69f7cfdd0c
[dt] replace 'datetime' imports
2025-10-16 11:42:42 +02:00
Mike Fährmann
951bf7c6b9
[tumblr] update
...
- provide 'search_tags' metadata for tag searches (#8160)
- support '/archive/tagged/' URLs (#8160)
- use self.groups
- remove __init__ constructors & _init functions
- remove "#category" test results
2025-09-02 10:26:53 +02:00
Mike Fährmann
ff147c2a32
[tumblr] fix pagination when using 'date-max'
2025-09-02 10:24:00 +02:00
Mike Fährmann
d9d8172364
[tumblr:search] fix 'ValueError: not enough values to unpack' (#8079)
...
fixes regression introduced in 21160a8b08
2025-08-20 08:45:19 +02:00
Mike Fährmann
ca22cb1487
[tumblr] add 'following' & 'followers' extractors (#8018)
2025-08-12 22:11:10 +02:00
Mike Fährmann
a097a373a9
simplify if statements by using walrus operators (#7671)
2025-07-22 20:57:54 +02:00
Mike Fährmann
d8ef1d693f
rename 'StopExtraction' to 'AbortExtraction'
...
for cases where StopExtraction was used to report errors
2025-07-09 21:07:28 +02:00
Mike Fährmann
9dbe33b6de
replace old %-formatted and .format(…) strings with f-strings (#7671)
...
mostly using flynt
https://github.com/ikamensh/flynt
2025-06-29 17:50:19 +02:00
Mike Fährmann
41191bb60a
'match.group(N)' -> 'match[N]' (#7671)
...
2.5x faster
2025-06-18 13:05:58 +02:00
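
The 'match.group(N)' -> 'match[N]' change above relies on the two spellings being equivalent for match objects (Python 3.6+). A minimal sketch follows. The pattern and the function name `parse_blog_post()` are made up for illustration and are not taken from the extractor. The speed difference noted in the commit is not measured here.

```python
import re


def parse_blog_post(url):
    """Return (blog, post_id) from a URL, or None if it does not match."""
    # Hypothetical pattern, for illustration only.
    match = re.search(r"([\w-]+)\.tumblr\.com/post/(\d+)", url)
    if match is None:
        return None
    # Subscript access returns the same value as match.group(N).
    assert match[1] == match.group(1)
    return match[1], int(match[2])
```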
Mike Fährmann
e08ec7e083
update copyright notices
2025-06-13 00:03:41 +02:00
Mike Fährmann
811b665e33
remove @staticmethod decorators
...
There might have been a time when calling a static method was faster
than a regular method, but that is no longer the case. According to
micro-benchmarks, it is 70% slower in CPython 3.13 and it also makes
executing the code of a class definition slower.
2025-06-12 22:50:52 +02:00
Mike Fährmann
b5c88b3d3e
replace standard library 're' uses with 'util.re()'
2025-06-06 13:24:52 +02:00
Mike Fährmann
a1fd329783
[tumblr] improve error message for dashboard-only blogs (#7455)
2025-05-03 11:02:38 +02:00
Mike Fährmann
21160a8b08
[tumblr] support URLs without subdomain (#7358)
2025-04-13 09:33:51 +02:00
Mike Fährmann
7916c8bf77
allow passing cookies to OAuth extractors
...
partially revert ce54b8c04c
2024-11-09 18:06:27 +01:00
Mike Fährmann
33778d35ba
[tumblr] update
...
- simplify
- fix search pagination
- support custom search mode and post types
2024-11-08 08:15:13 +01:00
Allen
0f94fa9015
[tumblr] search extractor minimal styling changes
2024-10-29 13:06:23 +01:00
Allen
d2ef9a590f
[tumblr] add search extractor
2024-09-03 08:18:58 +02:00
Mike Fährmann
785e6f2911
[tumblr] fix 401 Unauthorized for likes when using api-key (#5994)
...
fixes regression introduced in 540eaa5a
2024-08-12 09:09:59 +02:00
Mike Fährmann
540eaa5add
[tumblr] implement 'pagination' option (#5880)
...
restore pagination behavior from before de670bd7de
2024-07-23 20:31:04 +02:00
Mike Fährmann
141a93c8fd
[docs] update docs/configuration links (#5059, #5369, #5423)
2024-04-13 02:18:44 +02:00
Mike Fährmann
da76e13e3b
[tumblr] fix exception after waiting for rate limit (#4916)
...
use a loop instead of recursive function calls
2023-12-12 19:14:06 +01:00
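
The "loop instead of recursive function calls" fix above can be sketched roughly as follows. This is a generic illustration, not the extractor's code. `call_api()` and the `send` callable, which is assumed to return a `(status, data, retry_after)` tuple, are hypothetical names.

```python
import time


def call_api(send, sleep=time.sleep):
    """Call send() until it returns a response that is not rate limited.

    'send' is a hypothetical callable returning (status, data, retry_after).
    """
    while True:
        status, data, retry_after = send()
        if status != 429:
            return data
        # Rate limited: wait, then loop again instead of recursing,
        # so the call stack does not grow with each retry.
        sleep(retry_after)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.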
Mike Fährmann
d59d4ebff4
[tumblr] support infinite 'fallback-retries'
2023-12-11 23:40:13 +01:00
Mike Fährmann
7608201a44
[tumblr] fix 'day' extractor
...
another bug caused by a383eca7
2023-11-25 00:51:14 +01:00
Mike Fährmann
a453335a9f
remove test results in extractor modules
...
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
1d2b5d0c60
update test comment positions
...
always put them above the test they're referring to
2023-09-06 18:16:09 +02:00
Mike Fährmann
255d08b79e
add test for 'Extractor.initialize()' (#4359)
2023-07-28 16:58:16 +02:00
Mike Fährmann
a383eca7f6
decouple extractor initialization
...
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().
This makes it possible, for example, to adjust config access inside a Job
before most of it has already happened when calling 'extractor.find()'.
2023-07-25 22:16:16 +02:00
Mike Fährmann
d97b8c2fba
consistent cookie-related names
...
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
5297ee0cd9
[tumblr] add 'day' extractor (#3951)
2023-04-24 22:01:47 +02:00
Mike Fährmann
de670bd7de
[tumblr] update pagination logic (#2191)
2023-04-24 20:07:10 +02:00
Mike Fährmann
8fb043e8ff
[tumblr] raise more detailed errors for dashboard-only blogs
...
(#3628)
2023-02-12 19:38:14 +01:00
Mike Fährmann
b0cb4a1b9c
replace 'text.extract()' with 'text.extr()' where possible
2022-11-05 01:14:09 +01:00
ClosedPort22
4e80d3210e
[tumblr] Fallback to gifv when possible (#3095) (#3159)
2022-11-04 19:42:36 +01:00
Mike Fährmann
7c6af27eb8
[tumblr] add 'fallback-*' options (#2957)
...
specifically 'fallback-delay' and 'fallback-retries'
and change default number of retries to 2 (down from 3)
2022-10-26 13:59:09 +02:00
Mike Fährmann
68466a7d61
[tumblr] support 'https://www.tumblr.com/BLOGNAME' URLs (#3034)
2022-10-11 21:09:24 +02:00
Mike Fährmann
f1f89b2436
[tumblr] add 'offset' option
2022-10-11 10:54:23 +02:00
Mike Fährmann
e5d229c524
[tumblr] sleep between fallback retries (#2957)
2022-10-11 10:48:28 +02:00
Mike Fährmann
e1d714943b
[tumblr] catch exception when updating image token (#2957)
2022-09-30 15:08:21 +02:00
Mike Fährmann
f728b5ca06
[tumblr] add fallback for failed higher-resolution images (#2957)
2022-09-28 21:36:09 +02:00
Mike Fährmann
32c30754d1
[tumblr] warn when unable to fetch higher-resolution images (#2957)
...
and download the smaller version
instead of failing with a 404 error
2022-09-26 12:05:34 +02:00
Mike Fährmann
46fe469c53
[tumblr] implement 'ratelimit' option (#2919)
2022-09-17 14:10:33 +02:00