90 Commits

Author SHA1 Message Date
Mike Fährmann
2578f7b5c1 [flickr] extract public API key from website (#7564 #7649 #7700 #8553)
this breaks 'oauth:flickr' with the default key,
but it allows downloading without custom key / Flickr Pro
2025-11-14 11:38:28 +01:00
Mike Fährmann
d7c97d5a97 use f-strings when building 'pattern' 2025-10-20 21:23:11 +02:00
Mike Fährmann
c8fc790028 merge branch 'dt': move datetime utils into separate module
- use 'datetime.fromisoformat()' when possible (#7671)
- return a datetime-compatible object for invalid datetimes
  (instead of a 'str' value)
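
A minimal sketch of the two points above, not the project's actual 'dt' module. The names `InvalidDatetime`, `INVALID`, and `parse_iso()` are hypothetical and chosen only for illustration. It shows `datetime.fromisoformat()` handling the common case, and a falsy `datetime` subclass standing in for unparsable input so callers always receive a datetime-like object.

```python
from datetime import datetime


class InvalidDatetime(datetime):
    """Hypothetical sentinel: a datetime subclass that evaluates as false."""

    def __bool__(self):
        return False

    def __str__(self):
        return "[Invalid DateTime]"


# Single shared instance; the epoch value is an arbitrary placeholder.
INVALID = InvalidDatetime(1970, 1, 1)


def parse_iso(value):
    """Parse an ISO 8601 string; return INVALID (not a 'str') on failure."""
    try:
        return datetime.fromisoformat(value)
    except (TypeError, ValueError):
        # TypeError covers non-string input such as None.
        return INVALID
```

Because the sentinel is still a `datetime`, format strings and attribute access keep working, while `if not result:` remains a cheap validity check.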
2025-10-20 09:30:05 +02:00
Mike Fährmann
085616e0a8 [dt] replace 'text.parse_datetime()' & 'text.parse_timestamp()' 2025-10-17 17:43:06 +02:00
Mike Fährmann
8c62be343e [output] add 'Logger.traceback()' helper 2025-10-14 18:44:29 +02:00
Mike Fährmann
a097a373a9 simplify if statements by using walrus operators (#7671) 2025-07-22 20:57:54 +02:00
Mike Fährmann
d8ef1d693f rename 'StopExtraction' to 'AbortExtraction'
for cases where StopExtraction was used to report errors
2025-07-09 21:07:28 +02:00
Mike Fährmann
9dbe33b6de replace old %-formatted and .format(…) strings with f-strings (#7671)
mostly using flynt
https://github.com/ikamensh/flynt
2025-06-29 17:50:19 +02:00
Mike Fährmann
e08ec7e083 update copyright notices 2025-06-13 00:03:41 +02:00
Mike Fährmann
811b665e33 remove @staticmethod decorators
There might have been a time when calling a static method was faster
than a regular method, but that is no longer the case. According to
micro-benchmarks, it is 70% slower in CPython 3.13 and it also makes
executing the code of a class definition slower.
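
One way to check this kind of claim locally is a small `timeit` comparison. The sketch below is an assumption about how such a micro-benchmark could look, not the one the author ran. The class and function names are hypothetical, and results depend on the interpreter version and machine.

```python
import timeit


class WithStatic:
    @staticmethod
    def double(x):
        return x * 2


class WithRegular:
    def double(self, x):
        return x * 2


def measure(number=200_000):
    """Return (static_seconds, regular_seconds) for 'number' calls each."""
    s, r = WithStatic(), WithRegular()
    static_time = timeit.timeit(lambda: s.double(21), number=number)
    regular_time = timeit.timeit(lambda: r.double(21), number=number)
    return static_time, regular_time


if __name__ == "__main__":
    static_time, regular_time = measure()
    print(f"staticmethod: {static_time:.4f}s  regular: {regular_time:.4f}s")
```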
2025-06-12 22:50:52 +02:00
Mike Fährmann
0285473b04 [flickr] add 'profile' option 2025-05-13 11:47:34 +02:00
Mike Fährmann
204f1c5f92 [flickr] fix overwriting 'owner'/'user' data when 'info' is enabled 2025-05-13 11:11:03 +02:00
Mike Fährmann
6b84de6cf7 [flickr] add 'info' option (#4720 #6817) 2025-05-12 17:07:36 +02:00
Mike Fährmann
bd7fcdab4c [flickr] provide human-readable 'license_name' metadata 2025-05-12 16:44:22 +02:00
Mike Fährmann
cf6eff7ff7 [flickr] remove constructors 2025-05-12 16:27:14 +02:00
Mike Fährmann
b62c466c14 [flickr] fix video download URLs (#6464)
continuation of 0e18fa395d
fix video detection in '_file_url'
2024-11-13 20:56:37 +01:00
Mike Fährmann
7916c8bf77 allow passing cookies to OAuth extractors
partially revert ce54b8c04c
2024-11-09 18:06:27 +01:00
Mike Fährmann
0e18fa395d [flickr] use "download" URLs (#6360) 2024-11-09 17:33:27 +01:00
Mike Fährmann
7c43f9e152 [flickr] update default API credentials (#6300) 2024-10-10 11:50:34 +02:00
Mike Fährmann
9a0acbe7c4 [flickr] remove debug remains (#6252)
fixes regression introduced in a051e1c9
2024-09-29 13:01:51 +02:00
Mike Fährmann
a051e1c955 directly pass exception instances as 'exc_info' logger argument 2024-09-19 14:50:08 +02:00
Mike Fährmann
58113b73d1 [flickr] make album metadata extraction non-fatal (#3441)
https://github.com/mikf/gallery-dl/issues/3441#issuecomment-2313679156
2024-08-30 10:24:03 +02:00
Pedro Cunha
dcd44cf423 [flickr] reference the correct function 2024-08-22 17:00:35 +01:00
Mike Fährmann
e92a9ae343 [flickr] make exif and context metadata extraction non-fatal (#6002) 2024-08-14 09:44:04 +02:00
Mike Fährmann
5c1f5861b6 [flickr] add 'contexts' option (#5324) 2024-03-18 17:36:16 +01:00
Mike Fährmann
6e928300bc [flickr] handle non-JSON errors (#5131) 2024-02-06 21:22:10 +01:00
Mike Fährmann
4cdab8074e update/fix --list-extractors 2023-09-11 17:32:59 +02:00
Mike Fährmann
a453335a9f remove test results in extractor modules
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
a383eca7f6 decouple extractor initialization
Introduce an 'initialize()' function that does the actual init
(session, cookies, config options) and can be called separately from
the constructor __init__().

This allows, for example, adjusting config access inside a Job,
whereas previously most of it had already happened when calling
'extractor.find()'.
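
The idea can be sketched as follows. This is a simplified illustration, not the project's code: the `Extractor` class, the `find()` helper, and the option name are hypothetical stand-ins.

```python
class Extractor:
    """Hypothetical stand-in for an extractor base class."""

    def __init__(self, url):
        # The constructor only records its input; nothing expensive happens.
        self.url = url
        self.initialized = False
        self.options = {}

    def initialize(self, config=None):
        # The actual setup (session, cookies, config options) is deferred
        # until a caller explicitly asks for it.
        self.options = dict(config or {})
        self.initialized = True


def find(url):
    """Hypothetical lookup that builds an extractor without initializing it."""
    return Extractor(url)


extractor = find("https://example.org/photos/1")
# A caller (e.g. a Job) can adjust configuration here, before setup runs.
config = {"example-option": 1.0}
extractor.initialize(config)
```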
2023-07-25 22:16:16 +02:00
Mike Fährmann
7da954f810 [flickr] update default API credentials (#4332)
and add a delay between API requests
2023-07-22 15:38:33 +02:00
Mike Fährmann
d97b8c2fba consistent cookie-related names
- rename every cookie variable or method to 'cookies_*'
- simplify '.session.cookies' to just '.cookies'
- more consistent 'login()' structure
2023-07-22 01:20:50 +02:00
Mike Fährmann
c45a913bfd [flickr] add 'exif' option 2023-07-01 19:19:39 +02:00
Mike Fährmann
ccbc1a1d55 [flickr] add 'metadata' option (#4227) 2023-06-26 16:49:48 +02:00
Mike Fährmann
d0b73fec14 [flickr] add support for secure.flickr.com (#2910) 2022-09-14 16:19:27 +02:00
Vrihub
96fcff182c generic extractor (#735)
* Generic extractor, see issue #683

* Fix failed test_names test, no subcategory needed

* Prefix directory_fmt with "generic"

* Relax regex (would break some urls)

* Flake8 compliance

* pattern: don't require a scheme

This fixes a bug when we force the generic extractor on urls without a
scheme (that are allowed by all other extractors).

* Fix using g: and r: on urls without http(s) scheme

Almost all extractors accept urls without an initial http(s) scheme.

Many extractors also allow for generic subdomains in their "pattern"
variable; some of them implement this with the regex character class
"[^.]+" (everything but a dot).

This leads to a problem when the extractor is given a url starting
with g: or r: (to force using the generic or recursive extractor)
and without the http(s) scheme: e.g. with "r:foobar.tumblr.com"
the "r:" is wrongly considered part of the subdomain.

This commit fixes the bug, replacing the too generic "[^.]+" with the
more specific "[\w-]+" (letters, digits and "-", the only characters
allowed in domain names), which is already used by some extractors.

* Relax imageurl_pattern_ext: allow relative urls

* First round of small suggested changes

* Support image urls starting with "//"

* self.baseurl: remove trailing slash

* Relax regexp (didn't catch some image urls)

* Some fixes and cleanup

* Fix domain pattern; option to enable extractor

Fixed the domain section for "pattern", to pass "test_add" and
"test_add_module" tests.
Added the "enabled" configuration option (default False) to enable the
generic extractor. Using "g(eneric):URL" forces using the extractor.
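
The subdomain problem described in the "Fix using g: and r:" item above can be reproduced with a short sketch. The two patterns are simplified, hypothetical versions of an extractor "pattern"; real ones are longer.

```python
import re

# Simplified, hypothetical patterns; real extractor patterns are longer.
too_generic = re.compile(r"(?:https?://)?([^.]+)\.tumblr\.com")
specific = re.compile(r"(?:https?://)?([\w-]+)\.tumblr\.com")

url = "r:foobar.tumblr.com"

# match() anchors at the start, so "r:" ends up inside the subdomain group.
loose = too_generic.match(url)

# "[\w-]+" cannot consume ":", so the prefixed URL no longer matches here
# and can be left to the recursive extractor.
strict = specific.match(url)
plain = specific.match("foobar.tumblr.com")
```

With the generic class, `loose.group(1)` is `"r:foobar"`; with the specific class, the prefixed URL does not match at all, while the plain URL still yields `"foobar"`.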
2021-12-29 22:39:29 +01:00
Mike Fährmann
bd08ee2859 remove most 'yield Message.Version' statements
only leave them in oauth.py as noop results
2021-08-16 03:10:48 +02:00
Mike Fährmann
ca44111726 [flickr] update
- ensure every photo has an 'owner' (#828)
- change default directories to a more consistent schema
- create directory for each photo
2020-11-15 10:44:29 +01:00
Mike Fährmann
e6cd49e78b update extractor test results 2020-02-16 21:48:46 +01:00
Mike Fährmann
ce54b8c04c let extractors opt-out of cookie option usage
useful to avoid sending unnecessary cookies when all authentication
is done through OAuth tokens
2020-01-01 21:12:37 +01:00
Mike Fährmann
abfcb356fc [flickr] support 3k, 4k, 5k, and 6k photo sizes (closes #472) 2019-11-10 17:52:51 +01:00
Mike Fährmann
4409d00141 embed error messages in StopExtraction exceptions 2019-10-28 16:39:49 +01:00
Mike Fährmann
20fd2d8450 [flickr] skip unavailable images/videos (fixes #398) 2019-08-27 23:26:49 +02:00
Mike Fährmann
5499934ae2 [ngomik] fix extraction 2019-05-30 20:18:36 +02:00
Mike Fährmann
9890bfdf23 [flickr] improve code and metadata
- simplify pagination
- add more metadata and slightly change its structure
  - convert suitable values to int or list
  - move keys from ["photo"] to the base level
- proper video support (#246)
- rename method and variable names to better fit with other extractors
2019-05-14 22:10:50 +02:00
Mike Fährmann
d6ddb74cde update test results
- deviantart: 'index' is now an integer
- flickr: image file with lower quality
- paheal: image server name changed
- rule34: post got deleted
2019-04-12 09:59:48 +02:00
Mike Fährmann
87b0929bec Revert "[flickr] restore image quality"
This reverts commit 3f513f1056.

Both live.staticflickr and farmN.staticflickr servers now produce the
same image file with a lower overall quality than before this change on
Flickr's end.
2019-04-11 20:31:05 +02:00
Mike Fährmann
3f513f1056 [flickr] restore image quality
Flickr started serving images from live.staticflickr.com (see ec88ff1),
but the old farmN.staticflickr.com URLs still work - at least for the
time being.
Filesize (and most likely quality as well) for images from live.… is
severely reduced compared to images from farmN.… for non-original files,
so all live URLs are replaced to point to a randomly chosen farm server.
2019-04-06 11:26:10 +02:00
Mike Fährmann
ec88ff1562 [flickr] relax unit test results
Images are now randomly served from the 'live.staticflickr.com' domain
instead of the "old" 'farmN.staticflickr.com' one, making it impossible
to use static 'url' and 'keyword' hashes as results.

Image quality doesn't appear to be affected by which image-server is
used. Files from 'farmN' and 'live' are the same.
2019-03-30 18:31:59 +01:00
Mike Fährmann
5530871b5a change results of text.nameext_from_url()
Instead of getting a complete 'filename' from a URL and splitting that
into 'name' and 'extension', the new approach gets rid of the complete
version and renames 'name' to 'filename'. (Using anything other than
{extension} for a filename extension doesn't really work anyway)

Example: "https://example.org/path/filename.ext"

before:
- filename : filename.ext
- name     : filename
- extension: ext

now:
- filename : filename
- extension: ext
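
The new result shape can be illustrated with a small sketch. `nameext_sketch()` is a hypothetical helper written for this example. It is not the implementation of `text.nameext_from_url()` and only assumes the behavior described above.

```python
import os.path
from urllib.parse import urlsplit


def nameext_sketch(url):
    """Split the last path segment of 'url' into 'filename' and 'extension'."""
    basename = urlsplit(url).path.rpartition("/")[2]
    filename, ext = os.path.splitext(basename)
    # 'filename' no longer includes the extension; no separate 'name' key.
    return {"filename": filename, "extension": ext.lstrip(".")}


result = nameext_sketch("https://example.org/path/filename.ext")
```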
2019-02-14 16:07:17 +01:00
Mike Fährmann
89ee8cd7e4 filter "private" kwdict entries 2019-02-13 13:22:11 +01:00