Mike Fährmann
2d64e76223
[job] implement 'follow' option ( #8752 )
...
Follow and process URLs found in the given format string result.
2026-02-07 21:47:17 +01:00
Mike Fährmann
22b12a1798
[tests:job] test 'parent-metadata' / '_extractor' handling
2026-02-05 22:37:30 +01:00
Mike Fährmann
f046529f28
[tests:job] add tests for DataJob 'resolve'
2026-02-05 22:37:30 +01:00
Mike Fährmann
f0f9575406
[job] fix 'AttributeError' when enabling 'init' for non-DownloadJob
...
fixes bug in 56dcd00391
2026-02-03 19:00:45 +01:00
Mike Fährmann
3445c51ca4
[job] add 'output.jsonl' option ( #8953 )
2026-01-30 09:36:28 +01:00
Mike Fährmann
968597a302
yield 3-tuples for Message.Directory
...
adapt tuples to the same length and semantics as other messages
2025-12-05 21:39:52 +01:00
Mike Fährmann
98d3354575
[wikimedia] implement config lookup for fandom/wikigg sites ( #7283 )
...
{
    "extractor": {
        "fandom": {
            "filename": "..."
        }
    }
}
2025-10-23 20:14:56 +02:00
Mike Fährmann
b9429de774
[tests] use f-strings ( #7671 )
2025-08-14 10:22:42 +02:00
Mike Fährmann
790e097edd
[tests:job] update TestDataJob.test_exception result
2025-06-24 18:59:50 +02:00
Mike Fährmann
41191bb60a
'match.group(N)' -> 'match[N]' ( #7671 )
...
2.5x faster
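A minimal sketch of the equivalence behind this change, assuming Python 3.6 or later (where match objects support indexing); the pattern and input string are made up for illustration and the speed figure above is not measured here:

```python
import re

# Illustrative pattern and input; neither is taken from the project.
match = re.match(r"(\w+):(\w+)", "gfycat:user")
assert match is not None

# Indexing a match object returns the same value as calling .group().
assert match[0] == match.group(0)
assert match[1] == match.group(1)
assert match[2] == match.group(2)
```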
2025-06-18 13:05:58 +02:00
Mike Fährmann
40bd145637
remove 'contextlib' imports
2024-04-06 16:59:09 +02:00
Mike Fährmann
ba062712ad
[tests] '__main__' -> "__main__"
2024-02-27 02:10:05 +01:00
Mike Fährmann
082d55de16
fix circular reference detection for -K
2023-03-21 23:46:36 +01:00
Mike Fährmann
2ab66ad899
update -K output to include quotes around keys
2023-03-21 22:28:04 +01:00
Mike Fährmann
f037429fa4
attempt to improve '-K' output for lists
...
- use [N] instead of [] to indicate a Number needs to be placed there
- enumerate list items
2022-10-28 12:04:58 +02:00
Mike Fährmann
688d6553b4
replace calls to print() with stdout_write() ( #2529 )
2022-05-19 17:09:24 +02:00
Mike Fährmann
010d65dcec
extend blacklist/whitelist syntax ( #2025 )
...
Each entry in such a list can now also include a subcategory
'<category>:<subcategory>'
and it is possible to use '*' or an empty string as placeholder
'*:<subcategory>', ':<subcategory>', '<category>:*'
For example
"blacklist": "imgur,*:tag,gfycat:user" or
"blacklist": ["imgur", "*:tag", "gfycat:user"]
will filter all 'imgur' extractors, all extractors with a 'tag'
subcategory (e.g. https://danbooru.donmai.us/posts?tags=bonocho ),
and all 'gfycat' user extractors.
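The matching rules described above can be sketched roughly as follows. This is an illustration only, not the project's actual implementation; the function names parse_entries and is_listed are hypothetical.

```python
def parse_entries(value):
    """Turn a comma-separated string or a list into (category, subcategory) pairs.

    Hypothetical helper: None stands for "match anything".
    """
    if isinstance(value, str):
        value = value.split(",")
    pairs = []
    for entry in value:
        category, _, subcategory = entry.strip().partition(":")
        # '*' and the empty string are both treated as placeholders
        pairs.append((
            None if category in ("", "*") else category,
            None if subcategory in ("", "*") else subcategory,
        ))
    return pairs


def is_listed(pairs, category, subcategory):
    """Return True if any entry covers the given category/subcategory."""
    return any(
        (c is None or c == category) and (s is None or s == subcategory)
        for c, s in pairs
    )


pairs = parse_entries("imgur,*:tag,gfycat:user")
print(is_listed(pairs, "imgur", "album"))   # an 'imgur' extractor
print(is_listed(pairs, "danbooru", "tag"))  # any 'tag' subcategory
print(is_listed(pairs, "gfycat", "image"))  # 'gfycat', but not 'user'
```

An entry without a colon, such as "imgur", leaves the subcategory empty and therefore covers every subcategory of that category.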
2021-11-23 20:31:43 +01:00
Mike Fährmann
da6806a161
fix job tests for Python 3.4 and 3.5
...
assert_called() and assert_not_called() got added in Python 3.6
2021-05-22 21:40:52 +02:00
Mike Fährmann
af9dba4684
add DataJob tests
2021-05-21 02:59:54 +02:00
Mike Fährmann
adf4d661b3
use '_extractor' info in UrlJobs
2021-05-19 15:52:30 +02:00
Mike Fährmann
559462789d
add some tests for job.py
2021-05-14 19:44:16 +02:00