1162 Commits

Author SHA1 Message Date
Mike Fährmann
46b6b71159 [wallhaven] extract 'search[tags]' and 'search[tag_id]' metadata
(#6772)
2025-01-06 17:18:04 +01:00
Mike Fährmann
270aaea8ab [pixiv] provide fallback URLs (#6762) 2025-01-06 15:27:32 +01:00
Mike Fährmann
770f41eb4a [util] support not splitting "contains" value (#6773)
by passing any "false" value as 'separator' argument except None
2025-01-06 13:47:32 +01:00
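The separator behavior described in this commit can be sketched as follows. This is a minimal illustration, assuming a helper named `contains(values, elements, separator)`; the name, signature, and body are assumptions for illustration, not the project's actual code.

```python
def contains(values, elements, separator=" "):
    """Return True if at least one of 'elements' is found in 'values'."""
    # Assumed rule: split string input unless the caller passed a falsy
    # separator other than None. None keeps str.split()'s default
    # whitespace splitting.
    if isinstance(values, str) and (separator or separator is None):
        values = values.split(separator)
    if not isinstance(elements, (tuple, list)):
        return elements in values
    return any(e in values for e in elements)
```

With the default separator, `contains("foo bar", "fo")` tests list membership and is false; with `separator=""` the string is left unsplit, so the same call becomes a substring test and is true.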
Mike Fährmann
5767c0854c merge #6758: [subscribestar] fix attachment downloads and add support for audio type
(#6721, #6724)
2025-01-02 18:25:37 +01:00
Mike Fährmann
671297a8cc [subscribestar] extend fix + add test
some attachments are inside an element with an additional class besides
'doc_preview', e.g. 'class="doc_preview for_post"'
2025-01-02 18:22:15 +01:00
Mike Fährmann
428eb53086 [hitomi] provide 'search_tags' metadata for search/tag results
(#1015, #6756)
2025-01-02 17:49:30 +01:00
Mike Fährmann
0c584f9be7 [sankaku] support alphanumeric book/pool IDs (#6757) 2025-01-02 15:49:07 +01:00
Mike Fährmann
7391dd208c [poipiku] always query 'ShowAppendFileF' when post has warning (#6736) 2024-12-27 20:32:50 +01:00
Mike Fährmann
bc7e95684d [piczel] fix extraction (#6735)
- fix pagination
- update API endpoints
- provide 'count' metadata field
- use BASE_PATTERN and self.groups[…]
2024-12-27 15:08:08 +01:00
Mike Fährmann
167a726972 [szurubooru] support 'visuabusters.com/booru' (#6729) 2024-12-26 19:04:16 +01:00
Mike Fährmann
998f949db1 [civitai] add 'user-videos' extractor (#6644) 2024-12-26 10:18:54 +01:00
Mike Fährmann
3024dce06b [8muses] skip albums without valid 'permalink' (#6717) 2024-12-24 13:49:19 +01:00
Mike Fährmann
f9d3603bfc [hitomi] fix searches (#6713) 2024-12-24 09:36:29 +01:00
Mike Fährmann
de9442ba75 [directlink] use domain as 'subcategory' (#6703) 2024-12-22 17:19:56 +01:00
Mike Fährmann
18491a4ce6 [tapas] fix TypeError for locked episodes (#6700) 2024-12-21 15:17:51 +01:00
Mike Fährmann
e0514817bd [saint] support 'saint2.cr' URLs (#6692) 2024-12-19 11:43:35 +01:00
Mike Fährmann
fd5869f7df [bilibili] support '/upload/opus' URLs (#6687) 2024-12-18 08:53:27 +01:00
Mike Fährmann
5fbd0c3a63 [bilibili] extract files from 'module_top' entries (#6687) 2024-12-18 08:45:29 +01:00
Mike Fährmann
9f3e4511c6 [tapas] restructure extractors (#6680)
- handle all episodes with TapasEpisodeExtractor
- prevent locked episodes from stopping processing of all following
  episodes
2024-12-17 21:36:37 +01:00
Mike Fährmann
b6b1008ef2 [kemonoparty] support new favorite URLs (#6676) 2024-12-16 07:45:33 +01:00
Mike Fährmann
7f6a53c347 [cohost] add 'avatar' and 'background' options (#6656) 2024-12-14 20:16:28 +01:00
Mike Fährmann
94d7df186f [bluesky] default to /posts if reposts/quoted is enabled (#6583) 2024-12-13 22:24:37 +01:00
Mike Fährmann
85a37ca039 [facebook] decode surrogate pairs in metadata values (#6599) 2024-12-12 20:20:30 +01:00
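One common way to decode surrogate pairs in Python is a round trip through UTF-16. The sketch below shows that general technique; the function name is hypothetical, and whether the extractor uses this approach is an assumption.

```python
def decode_surrogate_pairs(text):
    """Merge UTF-16 surrogate pairs in 'text' into single code points."""
    # 'surrogatepass' lets the encoder emit surrogate code points as raw
    # 16-bit units; decoding as UTF-16 then joins adjacent pairs.
    return text.encode("utf-16", "surrogatepass").decode("utf-16")


raw = "\ud83d\ude00"  # two separate surrogate code points
fixed = decode_surrogate_pairs(raw)  # one code point, U+1F600
```

Text without surrogates passes through unchanged. An unpaired surrogate is not repaired by this approach and makes the decode step raise `UnicodeDecodeError`.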
Mike Fährmann
a33065be86 [zerochan] parse API response manually when json.loads() fails (#6632) 2024-12-12 19:57:37 +01:00
Mike Fährmann
d2c66ac34d [zerochan] fix 'source' extraction when not logged in 2024-12-12 18:16:11 +01:00
Mike Fährmann
63008f77e2 merge #6607: [lofter] add initial support
(#650, #2294, #4095, #4728, #5656)
2024-12-11 20:41:52 +01:00
Mike Fährmann
717081dabd [lofter] update
- add tests
- update docs/supportedsites
- provide 'date' metadata
- simplify/restructure some code
2024-12-11 20:39:01 +01:00
Mike Fährmann
0e942f0829 merge #6613: [itaku] add 'search' extractor 2024-12-11 11:54:33 +01:00
Mike Fährmann
b58af14bdb [itaku] update
- simplify code
- update docs/supportedsites
- update test results
2024-12-11 11:52:42 +01:00
Mike Fährmann
86334f9c4a [yiffverse] add support (#6611) 2024-12-11 10:57:21 +01:00
Mike Fährmann
47311352de [cyberdrop] add extractor for media URLs (#2496)
https://github.com/mikf/gallery-dl/issues/2496#issuecomment-2495467133
2024-12-08 20:57:12 +01:00
Mike Fährmann
ef7ff31117 [realbooru] fix extraction (#6543)
- extract data from HTML pages since API is no longer usable
- move code into its own separate 'realbooru' module
2024-12-07 17:39:25 +01:00
Mike Fährmann
e1613fc0f4 [nhentai] select random file servers for download URLs (#6620)
i1, i2, i3, i4 instead of just i.nhentai.net
2024-12-07 17:39:25 +01:00
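Random server selection of this kind might look like the sketch below. Only the host prefixes i1 to i4 come from the commit message; the function name, the `rng` parameter, and the path layout are assumptions for illustration.

```python
import random

# Host prefixes named in the commit message.
SERVERS = ("i1", "i2", "i3", "i4")


def file_url(path, rng=random):
    """Build a download URL on a randomly chosen file server."""
    server = rng.choice(SERVERS)
    return "https://{}.nhentai.net/{}".format(server, path.lstrip("/"))
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible, which can help in tests.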
Shelvacu
b90c77d8f1 [itaku] add 'search' extractor 2024-12-05 21:09:38 -08:00
Mike Fährmann
624dc7f407 [bluesky] add 'info' extractor 2024-12-05 08:36:33 +01:00
Mike Fährmann
a526a3d00d [patreon] add 'format-images' option (#6569) 2024-12-04 21:38:01 +01:00
Mike Fährmann
d96717e2e6 [hentaicosplays] update domains (#6578)
inherit from BaseExtractor to make differentiating between sites easier
2024-12-03 13:56:32 +01:00
Mike Fährmann
63e042dec7 [e621] fix 'TypeError' when 'metadata' is enabled (#6587)
fixes regression introduced in 9184a564
2024-12-02 14:09:38 +01:00
Mike Fährmann
79fd3445ee [pixiv:ranking] add 'rank' metadata field (#6531) 2024-11-28 19:34:55 +01:00
Mike Fährmann
9e7d7a3bb3 merge #6548: [facebook] add more tests 2024-11-28 15:25:02 +01:00
Mike Fährmann
7c7b8a25c3 [kemonoparty] fix login / update favorites extractor (#6415) 2024-11-28 14:41:16 +01:00
Luca Russo
e36cfb73ff added more tests 2024-11-28 10:55:43 +01:00
Mike Fährmann
e1aa4a7162 [kemonoparty] support new discord channel URLs (#6542) 2024-11-27 15:19:27 +01:00
Mike Fährmann
2162fa7df2 [kemonoparty] fix 'comments' for posts without comments (#6415)
https://github.com/mikf/gallery-dl/issues/6415#issuecomment-2501966303
2024-11-26 23:23:39 +01:00
Mike Fährmann
74d855c693 [kemonoparty] update to new site layout / API endpoints
(#6415, #6503, #6528, #6530, #6536)

… at least for the most part. Favorites are still broken, but the rest
should be functional again.
2024-11-26 22:15:28 +01:00
Luca Russo
e9370b7b8a merge #5626: [facebook] add support (#470, #2612)
* [facebook] add initial support

* renamed extractors & subcategories

* better stability, modularity & naming

* added single photo extractor, warnings & retries

* more metadata + extract author followups

* renamed "album" mentions to "set" for consistency

* cookies are now only used when necessary

also added author followups for singular images

* removed f-strings

* added way to continue extraction from where it left off

also fixed some bugs

* fixed bug wrong subcategory

* added individual video extraction

* extract audio + added ytdl option

* updated setextract regex

* added option to disable start warning

the extractor should be ready :)

* fixed description metadata bug

* removed cookie "safeguard" + fixed for private profiles

I have removed the cookie "safeguard" (not using cookies until they are necessary) as I've come to the conclusion that it does more harm than good. There is no way to detect whether the extractor has skipped private images that could otherwise have been extracted. Also, the safeguard provides little to no advantage.

* fixed a few bugs regarding profile parsing

* a few bugfixes

Fixed some metadata attributes that were not decoded correctly for non-Latin languages, or were not shown at all.
Also improved a few patterns.

* retrigger checks

* Final cleanups

- Added tests
- Fixed video extractor giving incorrect URLs
- Removed start warning
- Listed supported site correctly

* fixed regex

* trigger checks

* fixed livestream playback extraction + bugfixes

I've chosen to remove the "reactions", "comments" and "views" attributes as I've felt that they require additional maintenance even though nobody would ever actually use them to order files.
I've also removed the "title" and "caption" video attributes for their inconsistency across different videos.
Feel free to share your thoughts.

* fixed regex

* fixed filename fallback

* fixed retrying when a photo url is not found

* fixed end line

* post url fix + better naming

* fix posts

* fixed tests

* added profile.php url

* made most of the requested changes

* flake

* archive: false

* removed unnecessary url extract

* [facebook] update

- more 'Sec-Fetch-…' headers
- simplify 'text.nameext_from_url()' calls
- replace 'sorted(…)[-1]' with 'max(…)'
- fix '_interval_429' usage
- use replacement fields in logging messages

* [facebook] update URL patterns

get rid of '.*' and '.*?'

* added a few remaining tests

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2024-11-26 21:49:11 +01:00
Mike Fährmann
b78c35fd15 [motherless] add 'media' and 'gallery' extractors
(#2074, #4413, #6221)
2024-11-22 21:06:32 +01:00
Mike Fährmann
9b2d782cb7 [pp:classify] rewrite & simplify (#5213)
Do not manually build paths, which get later overwritten by
pathfmt.build_path() anyway. Just update the target directory and let
the rest of the "path logic" handle it.

Fixes skipping previously downloaded and categorized files,
which was broken since 8124c16a50
2024-11-19 08:05:11 +01:00
Mike Fährmann
b069783578 [newgrounds] fix metadata extraction (#6463)
- fix 'comment' metadata
- fix 'following' extractor pattern
- use own 'type' values, since 'og:type' is no longer available
- update test results
2024-11-18 16:21:59 +01:00
Mike Fährmann
50acf2ac84 [danbooru] add 'artist-search' extractor (#5348) 2024-11-17 16:58:54 +01:00