262 Commits

Author SHA1 Message Date
Mike Fährmann
491d70f918 [job] apply 'extension-map' to 'SimulationJob' results 2025-08-02 07:28:12 +02:00
Mike Fährmann
2eb5e52055 extend '-A / --abort' & '"skip": "abort"' functionality (#7891)
implement ascending by more than 1 level or
up to an extractor with a specific subcategory
2025-07-30 00:01:49 +02:00
Mike Fährmann
64de6605ce [job] improve URL 'scheme' extraction performance 2025-07-29 22:26:21 +02:00
Mike Fährmann
a097a373a9 simplify if statements by using walrus operators (#7671) 2025-07-22 20:57:54 +02:00
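For readers unfamiliar with the pattern, the following is a generic sketch of this kind of simplification, not code taken from the repository. The `SCHEME` pattern and both function names are hypothetical; the walrus operator itself requires Python 3.8 or later.

```python
import re

# Hypothetical pattern for illustration only.
SCHEME = re.compile(r"^([a-z][a-z0-9+.-]*):")


def scheme_before(url):
    # Assignment and test as two separate statements.
    match = SCHEME.match(url)
    if match:
        return match.group(1)
    return None


def scheme_after(url):
    # The walrus operator binds 'match' inside the condition itself.
    if match := SCHEME.match(url):
        return match.group(1)
    return None
```

Both functions behave identically; the second simply removes one line and keeps the name next to the test that uses it.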
Mike Fährmann
dd09937d69 fix exit code for requests' JSONDecodeError (#4380) 2025-07-17 16:37:40 +02:00
Mike Fährmann
232e30f64e [actions] fix 'parse_logging' import (#7837)
fixes regressions introduced in bccf467d19
2025-07-17 16:30:10 +02:00
Mike Fährmann
98895b732f [reddit] improve archive IDs of fallback files (#7760)
prevent 'DASH...' and 'HLS...' entries
2025-07-11 22:59:44 +02:00
Mike Fährmann
d8a370da0b [signals] update FLAGS handling 2025-07-11 22:28:26 +02:00
Mike Fährmann
0210ffcdd8 initial 'signals-actions' implementation (#6582)
https://github.com/mikf/gallery-dl/issues/6582#issuecomment-2973285775

To stop gracefully after the current file finishes processing when
Ctrl+C was pressed, or after the current post finishes processing when
SIGUSR1 was received:

{
    "signals-actions": {
        "SIGINT" : "file",
        "SIGUSR1": "post"
    }
}
2025-07-09 23:02:23 +02:00
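One way such a mapping could be wired up is sketched below. This is an assumption-laden illustration and not the project's implementation: `install_signal_actions` and the `state` dict are hypothetical names, and the sketch only records which level was requested, leaving the actual stopping logic to the caller. Note that `signal.signal()` may only be called from the main thread.

```python
import signal


def install_signal_actions(config, state):
    """Register one handler per configured signal (hypothetical helper).

    Returns the names of the signals that were actually installed.
    """
    installed = []
    for name, level in config.items():
        signum = getattr(signal, name, None)
        if signum is None:
            # The signal does not exist on this platform
            # (SIGUSR1 is unavailable on Windows, for example).
            continue

        def handler(signum, frame, level=level):
            # Only record the request; a download loop would check
            # state["stop_after"] after each file or post.
            state["stop_after"] = level

        signal.signal(signum, handler)
        installed.append(name)
    return installed
```

A loop processing files could then check `state["stop_after"] == "file"` after each file and return early when it is set.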
Mike Fährmann
d8ef1d693f rename 'StopExtraction' to 'AbortExtraction'
for cases where StopExtraction was used to report errors
2025-07-09 21:07:28 +02:00
Mike Fährmann
4e9cb428d6 [pp] implement shortcuts for 'mode' and 'event' options
This makes it possible to specify 'mode' and/or 'event' options of a
postprocessor in its 'name' as
"NAME/MODE@EVENT" or "NAME/MODE" or "NAME@EVENT"

For example
"postprocessors": "metadata/jsonl@file,skip"

is equivalent to
"postprocessors": {
    "name" : "metadata",
    "mode" : "jsonl",
    "event": ["file", "skip"]
}
2025-07-09 12:40:37 +02:00
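The expansion described in this commit message can be sketched as follows. `expand_pp_shortcut` is a hypothetical name and this is not the project's parser; in particular, always returning 'event' as a list is an assumption based on the single example given above.

```python
def expand_pp_shortcut(spec):
    """Expand "NAME/MODE@EVENT" into an options dict (illustrative only)."""
    # Split off the optional "@EVENT" part first, then the optional "/MODE".
    name, _, event = spec.partition("@")
    name, _, mode = name.partition("/")

    options = {"name": name}
    if mode:
        options["mode"] = mode
    if event:
        # Assumption: several events are separated by commas.
        options["event"] = event.split(",")
    return options
```

With this sketch, `expand_pp_shortcut("metadata/jsonl@file,skip")` produces the same dict as the long form shown in the commit message.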
Mike Fährmann
1bbacba4ed [common] introduce 'status' attribute to Extractors
allows reporting error codes for exceptions that are not handled
by the Job.run() try-except block

- fixes Job.status being 0 in certain situations even when errors occurred
- fixes some URLs not getting written to -e/--error-file (#7758)
2025-07-05 21:33:01 +02:00
Mike Fährmann
59b266f883 [reddit] emit logging message when downloading previews (#7748) 2025-06-29 21:35:51 +02:00
Mike Fährmann
9dbe33b6de replace old %-formatted and .format(…) strings with f-strings (#7671)
mostly using flynt
https://github.com/ikamensh/flynt
2025-06-29 17:50:19 +02:00
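As a generic illustration of the conversion (the variables and message text are made up, not taken from the repository), the three styles below produce the same string:

```python
name, count = "reddit", 3

# Old styles replaced by the commit.
old_percent = "%s: %d files" % (name, count)
old_format = "{}: {} files".format(name, count)

# Equivalent f-string.
new_fstring = f"{name}: {count} files"
```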
Mike Fährmann
0334a7c48c [job] apply 'update_kwdict()' to Message.Queue metadata as well 2025-06-26 19:15:44 +02:00
Mike Fährmann
d7103f9bdd [job:data] wrap exceptions in a dict (#7723)
fixes exception when using 'num-to-str' with '-j'
2025-06-23 17:17:28 +02:00
Mike Fährmann
fcd1b8a155 [common] add a 'kwdict' member to extractor instances
to allow setting general metadata at any point and without having to
rely on a manually implemented 'metadata()' method
2025-06-19 19:08:35 +02:00
Mike Fährmann
e08ec7e083 update copyright notices 2025-06-13 00:03:41 +02:00
Mike Fährmann
811b665e33 remove @staticmethod decorators
There might have been a time when calling a static method was faster
than a regular method, but that is no longer the case. According to
micro-benchmarks, it is 70% slower in CPython 3.13 and it also makes
executing the code of a class definition slower.
2025-06-12 22:50:52 +02:00
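A micro-benchmark of this kind could look roughly like the sketch below. It is not the benchmark the author ran; the class and function names are hypothetical, and the measured ratio will vary with the interpreter version and machine, so no result is claimed here.

```python
import timeit


class WithStatic:
    @staticmethod
    def double(x):
        return x * 2


class WithRegular:
    def double(self, x):
        return x * 2


def benchmark(number=200_000):
    """Return (static_seconds, regular_seconds) for 'number' calls each."""
    s, r = WithStatic(), WithRegular()
    t_static = timeit.timeit(lambda: s.double(21), number=number)
    t_regular = timeit.timeit(lambda: r.double(21), number=number)
    return t_static, t_regular


if __name__ == "__main__":
    t_static, t_regular = benchmark()
    print(f"staticmethod: {t_static:.4f}s  regular: {t_regular:.4f}s")
```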
Mike Fährmann
33f3ed9f57 [job] refactor parent-child config paths (#7527)
- fixes TypeError when enabling 'category-transfer'
- fixes 'category-transfer' not applying to early config lookups
2025-05-31 17:41:54 +02:00
Mike Fährmann
19fc4e0ba4 [job] do not reset skip count when 'skip-filter' fails (#7433) 2025-04-27 19:16:02 +02:00
Mike Fährmann
8daf496a22 [archive] add 'archive-table' option (#6152) 2025-02-17 11:41:13 +01:00
Mike Fährmann
841bc9f66f [archive] implement support for PostgreSQL databases (#6152) 2025-02-16 17:56:52 +01:00
Mike Fährmann
5ab2ae17bc support wildcards for parent>child categories (#6673)
For example "reddit>*" for all reddit child extractors
2024-12-16 08:50:18 +01:00
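How a wildcard such as "reddit>*" might be resolved can be sketched with the standard `fnmatch` module. This is a guess at the general idea, not the project's lookup code: `lookup_child_config` is a hypothetical helper, and preferring an exact key over a wildcard is an assumption.

```python
from fnmatch import fnmatchcase


def lookup_child_config(config, parent, child):
    """Return the options for 'parent>child', honoring wildcard keys."""
    key = f"{parent}>{child}"

    # Assumption: an exact entry takes priority over any wildcard.
    if key in config:
        return config[key]

    for pattern, options in config.items():
        if fnmatchcase(key, pattern):
            return options
    return None
```

For a config of `{"reddit>*": {"skip": True}}`, the key "reddit>imgur" matches the wildcard entry, while "twitter>imgur" does not.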
Mike Fährmann
d8cf381904 [archive] use defaults when 'prefix'/'format' are 'null' 2024-11-29 16:36:35 +01:00
Mike Fährmann
55afd712d6 [pp] allow inheriting settings from global 'postprocessor' entries
No idea how to properly explain/document this, so here's an example:

The extractor.postprocessors object
gets its options from postprocessor.jl
and adds 'filename' itself.

{
    "extractor": {
        "postprocessors": {
            "type": "jl",
            "filename": "meta.jsonl"
        }
    },

    "postprocessor": {
        "jl": {
            "name": "metadata",
            "mode": "jsonl",
            "open": "a"
        }
    }
}
2024-11-16 21:16:13 +01:00
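The inheritance in the example above can be thought of as a dict merge. The sketch below is only a model of that idea, with a hypothetical `resolve_postprocessor` helper; whether the real code keeps or drops the 'type' key afterwards is not stated in the commit message, so the sketch leaves it in place.

```python
def resolve_postprocessor(pp_options, global_pps):
    """Merge a global 'postprocessor' entry into an extractor-level one."""
    # Look up the named global entry; fall back to no inherited options.
    base = global_pps.get(pp_options.get("type"), {})
    # Extractor-level options are applied last, so they win on conflicts.
    return {**base, **pp_options}


global_pps = {"jl": {"name": "metadata", "mode": "jsonl", "open": "a"}}
local_pp = {"type": "jl", "filename": "meta.jsonl"}
resolved = resolve_postprocessor(local_pp, global_pps)
```

Here `resolved` carries 'name', 'mode', and 'open' from the global "jl" entry plus the locally added 'filename'.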
Mike Fährmann
80454460ce [config] support accumulating non-list values
fixes 1264fc518b
2024-11-16 21:13:57 +01:00
Mike Fährmann
1264fc518b allow 'postprocessors' to be a single dict/str
do not require it to be a list with just one element

"postprocessors": "metadata"
"postprocessors": {"name": "metadata"}
2024-11-15 21:15:00 +01:00
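Accepting a single value in place of a one-element list usually comes down to a small normalization step. The following is a minimal sketch of that step under the assumption that the rest of the code expects a list; `normalize_postprocessors` is a hypothetical name, not a function from the project.

```python
def normalize_postprocessors(value):
    """Return the 'postprocessors' setting as a list (illustrative only)."""
    if value is None:
        return []
    if isinstance(value, (str, dict)):
        # A lone string or dict is treated as a list with one element.
        return [value]
    return list(value)
```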
Mike Fährmann
5bc3657c59 [util] implement 'compile_filter()' (#5262)
https://github.com/mikf/gallery-dl/issues/5262#issuecomment-2477029728

allow (theoretically*) all filter expression statements
to be a list of individual filters

(*) except for 'filename' and 'directory' conditionals,
as dict keys cannot be lists
2024-11-14 22:47:36 +01:00
Mike Fährmann
2e1dab3036 [pp] add 'error' event 2024-10-19 20:30:34 +02:00
Mike Fährmann
d3dcc44bd1 use child fallbacks only when a non-user error occurs (#6329) 2024-10-17 08:04:41 +02:00
Mike Fährmann
a051e1c955 directly pass exception instances as 'exc_info' logger argument 2024-09-19 14:50:08 +02:00
Mike Fährmann
dd56bb2187 include debug exception info for GalleryDLException errors 2024-09-19 13:51:27 +02:00
Mike Fährmann
8072dcf717 [pp:rename] recheck if file exists only when necessary 2024-09-05 17:42:29 +02:00
Mike Fährmann
359572162b [pp:rename] improve renaming files 'to' a format (#5846, #6044) 2024-09-03 21:17:31 +02:00
Mike Fährmann
8ecd408f53 add '-J/--resolve-json' command-line option (#5864) 2024-07-26 20:41:35 +02:00
Mike Fährmann
84a634fc14 [job] add 'resolve' argument to DataJob (#5864) 2024-07-19 14:32:42 +02:00
Mike Fährmann
f7a6401031 [actions] move LoggerAdapter from 'output' to 'actions' 2024-06-30 20:41:51 +02:00
Mike Fährmann
ea81fa985f [archive] implement 'archive-event' option (#5784)
With this, IDs of skipped files will no longer be written to an archive
by default. Use "archive-event": "file,skip" to restore the previous
behavior.
2024-06-27 22:00:59 +02:00
Mike Fährmann
895e633c44 implement 'keywords-eval' option (#5621)
to allow evaluating 'keywords' values as format strings
2024-05-22 22:53:34 +02:00
Mike Fährmann
d2f50ecf09 add 'skip-filter' option (#5255) 2024-05-10 22:59:52 +02:00
Mike Fährmann
fd734b9222 [archive] add 'archive-mode' option (#5255) 2024-05-10 22:59:51 +02:00
Mike Fährmann
88f94190f4 [archive] move DownloadArchive into its own module 2024-05-10 01:05:28 +02:00
Mike Fährmann
92fbf09643 remove single quotes in some logging messages (#4908)
('FileNotFoundError: [Errno 2] No such file or directory: ''')
->
(FileNotFoundError: [Errno 2] No such file or directory: '')
2023-12-11 19:13:45 +01:00
Mike Fährmann
aea15f6d17 add 'metadata-extractor' option (#4549) 2023-11-20 22:16:15 +01:00
Mike Fährmann
34a387b6e2 support 'metadata-*' names for '*-metadata' options
For example, instead of 'url-metadata' it is now also possible to use
'metadata-url' as option name.

- metadata-url
- metadata-path
- metadata-http
- metadata-version
- metadata-parent
2023-11-18 23:52:10 +01:00
Mike Fährmann
2cd801232b fix --range causing crashes (#4557)
regression caused by a383eca7
2023-09-22 16:28:20 +02:00
Mike Fährmann
7defb24e1e [reddit] provide video previews if available (#4322) 2023-08-28 22:22:10 +02:00
Mike Fährmann
14af15bd18 [reddit] download preview for 404ed imgur links (#4322)
This is a pretty ugly hack as the internal infrastructure doesn't
really support switching from external URL to regular download in
case the former fails, but it kind of works ...

Can be disabled by setting 'reddit.fallback' to 'false'.
2023-08-24 15:41:05 +02:00
Mike Fährmann
92f98e6f5e 'sys.exit' -> 'SystemExit' 2023-08-21 23:46:39 +02:00
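For context, `sys.exit(code)` works by raising `SystemExit(code)`, so raising the exception directly has the same effect without going through the `sys` module. The helper names in this small demonstration are hypothetical:

```python
import sys


def exit_code_of(func):
    """Call func() and return the exit code it requested, if any."""
    try:
        func()
    except SystemExit as exc:
        return exc.code
    return None


def via_sys_exit():
    sys.exit(3)


def via_raise():
    raise SystemExit(3)
```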