Commit Graph

385 Commits

Author SHA1 Message Date
Mike Fährmann
f867e690c1 merge #6855: [turboimagehost] add support for galleries 2025-01-19 17:51:48 +01:00
arebokert
556fbb1a44 [turboimagehost] add support for galleries
- added support
- raise error if gallery not found
- fix test
- fix lint issues
- simplify
2025-01-19 17:28:45 +01:00
Mike Fährmann
438c61601b [xfolio] add initial support (#5514, #6351, #6837) 2025-01-18 15:57:56 +01:00
Mike Fährmann
6e919a3695 [e621] support e621.cc and e621.anthro.fr frontend URLs (#6809) 2025-01-15 14:35:37 +01:00
Mike Fährmann
bde99cc6ce [cohost] remove module
cohost.org now redirects to archive.org
2025-01-13 14:38:35 +01:00
Mike Fährmann
91bd3e37f2 [pexels] add support (#2286, #4214, #6769) 2025-01-12 16:50:12 +01:00
Mike Fährmann
1d75c8308c [weebcentral] add support (#6778) 2025-01-10 23:04:51 +01:00
Mike Fährmann
167a726972 [szurubooru] support 'visuabusters.com/booru' (#6729) 2024-12-26 19:04:16 +01:00
Mike Fährmann
998f949db1 [civitai] add 'user-videos' extractor (#6644) 2024-12-26 10:18:54 +01:00
Mike Fährmann
63008f77e2 merge #6607: [lofter] add initial support
(#650, #2294, #4095, #4728, #5656)
2024-12-11 20:41:52 +01:00
Mike Fährmann
717081dabd [lofter] update
- add tests
- update docs/supportedsites
- provide 'date' metadata
- simplify/restructure some code
2024-12-11 20:39:01 +01:00
Mike Fährmann
0e942f0829 merge #6613: [itaku] add 'search' extractor 2024-12-11 11:54:33 +01:00
Mike Fährmann
b58af14bdb [itaku] update
- simplify code
- update docs/supportedsites
- update test results
2024-12-11 11:52:42 +01:00
Mike Fährmann
86334f9c4a [yiffverse] add support (#6611) 2024-12-11 10:57:21 +01:00
Mike Fährmann
47311352de [cyberdrop] add extractor for media URLs (#2496)
https://github.com/mikf/gallery-dl/issues/2496#issuecomment-2495467133
2024-12-08 20:57:12 +01:00
Mike Fährmann
ef7ff31117 [realbooru] fix extraction (#6543)
- extract data from HTML pages since API is no longer usable
- move code into its own separate 'realbooru' module
2024-12-07 17:39:25 +01:00
Mike Fährmann
624dc7f407 [bluesky] add 'info' extractor 2024-12-05 08:36:33 +01:00
Mike Fährmann
d96717e2e6 [hentaicosplays] update domains (#6578)
inherit from BaseExtractor to make differentiating between sites easier
2024-12-03 13:56:32 +01:00
Luca Russo
e9370b7b8a merge #5626: [facebook] add support (#470, #2612)
* [facebook] add initial support

* renamed extractors & subcategories

* better stability, modularity & naming

* added single photo extractor, warnings & retries

* more metadata + extract author followups

* renamed "album" mentions to "set" for consistency

* cookies are now only used when necessary

also added author followups for singular images

* removed f-strings

* added way to continue extraction from where it left off

also fixed some bugs

* fixed wrong subcategory bug

* added individual video extraction

* extract audio + added ytdl option

* updated setextract regex

* added option to disable start warning

the extractor should be ready :)

* fixed description metadata bug

* removed cookie "safeguard" + fixed for private profiles

I have removed the cookie "safeguard" (not using cookies until they are necessary) as I've come to the conclusion that it does more harm than good. There is no way to detect whether the extractor has skipped private images that could otherwise have been extracted. Also, the safeguard provides little to no advantage.

* fixed a few bugs regarding profile parsing

* a few bugfixes

Fixed some metadata attributes that were not decoded correctly for non-Latin languages, or were not shown at all.
Also improved a few patterns.

* retrigger checks

* Final cleanups

- Added tests
- Fixed video extractor giving incorrect URLs
- Removed start warning
- Listed supported site correctly

* fixed regex

* trigger checks

* fixed livestream playback extraction + bugfixes

I've chosen to remove the "reactions", "comments" and "views" attributes as I've felt that they require additional maintenance even though nobody would ever actually use them to order files.
I've also removed the "title" and "caption" video attributes for their inconsistency across different videos.
Feel free to share your thoughts.

* fixed regex

* fixed filename fallback

* fixed retrying when a photo url is not found

* fixed end line

* post url fix + better naming

* fix posts

* fixed tests

* added profile.php url

* made most of the requested changes

* flake

* archive: false

* removed unnecessary url extract

* [facebook] update

- more 'Sec-Fetch-…' headers
- simplify 'text.nameext_from_url()' calls
- replace 'sorted(…)[-1]' with 'max(…)'
- fix '_interval_429' usage
- use replacement fields in logging messages

* [facebook] update URL patterns

get rid of '.*' and '.*?'

* added a few remaining tests

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2024-11-26 21:49:11 +01:00
Mike Fährmann
b78c35fd15 [motherless] add 'media' and 'gallery' extractors
(#2074, #4413, #6221)
2024-11-22 21:06:32 +01:00
Mike Fährmann
50acf2ac84 [danbooru] add 'artist-search' extractor (#5348) 2024-11-17 16:58:54 +01:00
Mike Fährmann
e5c2882320 [koharu] cleanup
- update BASE_PATTERN formatting
- fix groups indices
- add tests for new domains
- update docs/supportedsites
2024-11-15 22:41:40 +01:00
Mike Fährmann
a3276e3b5d [hentaifoundry] add 'tag' extractor (#6465) 2024-11-13 20:56:37 +01:00
Mike Fährmann
82d561e825 [bilibili] update
- use self.groups[…] to access matched values
- extract more metadata (count, width, height, size)
- remove type hint
- add tests
- update docs/supportedsites
2024-11-10 17:59:24 +01:00
Mike Fährmann
6205e255f4 merge #6394: [tumblr] add 'search' extractor 2024-11-08 08:17:46 +01:00
Mike Fährmann
0b3ddd01af [hiperdex] update domain to 'hipertoon.com' (#6420)
and fix 'description' extraction
2024-11-05 15:54:42 +01:00
Mike Fährmann
cb0d8cae77 merge #6227: [everia] add support (#1067, #2472, #4091) 2024-11-03 17:52:17 +01:00
Mike Fährmann
cea062ffc5 [everia] update
- implement general _pagination method
- simplify code
- adjust URL patterns
- update test results
2024-11-03 17:51:04 +01:00
missionfloyd
d31a3b5da3 [everia.club] Add support
- Unescape title and URL
- Add tags and categories metadata
    Lookup tag id with API instead of downloading tag page
- Add category extractor
- Add tests
- Rename EveriaExtractor to EveriaPostExtractor
- Fix EveriaPostExtractor example
- Lookup tags/categories by post id
- Add date extractor
- Remove leftover pages parameter
- Add error handling for invalid dates.
- Add filename numbering
    Parse date
- Rename extract() to images()
- Remove html import
- Fix search/date URLs with page number
- Fix tag/category search
- Fix post extractor
- Fix tag, category extractors
- Fix search extractor
- Only load first page once
- Fix date extractor
- Fix tests
- Clean up search extractor
2024-11-03 14:09:07 +01:00
Mike Fährmann
d787c0c4ea [rule34xyz] add support (#1078, #4960) 2024-11-03 10:12:26 +01:00
Mike Fährmann
6f54328a39 [hitomi] update
- remove f-strings
- fix flake8 warnings
- move tests to test/results/hitomi.py
2024-10-29 16:56:52 +01:00
Allen
0f94fa9015 [tumblr] search extractor minimal styling changes 2024-10-29 13:06:23 +01:00
Mike Fährmann
655e42dc92 merge #6240: [rule34vault] add support (#5708) 2024-10-28 22:31:05 +01:00
ssdaniel24
3d0263b3ab [rule34vault] Added initial support for rule34vault.com
- Added playlists support for rule34vault
- Added support for posts in rule34vault
- Fixed supported sites with script
- Fixed posts pattern in rule34vault
- Added tests for rule34vault
- Clean
- Fixed lint warnings
2024-10-28 22:26:47 +01:00
Mike Fährmann
10c076e7f2 [saint] add 'album' and 'media' extractors (#4405, #6324) 2024-10-27 22:27:30 +01:00
Mike Fährmann
a4791f5243 [bluesky] add 'hashtag' extractor (#4438)
https://github.com/mikf/gallery-dl/issues/4438#issuecomment-2439979958
2024-10-27 13:59:46 +01:00
Mike Fährmann
0fd98f67ba [mangadex] add 'author' extractor (#6372) 2024-10-24 14:57:17 +02:00
Mike Fährmann
66aa514c25 [scrolller] add initial support (#295, #3418, #5051) 2024-10-21 14:17:18 +02:00
Mike Fährmann
69a75b1de2 [civitai] add extractors for global 'models' and 'images' (#6310) 2024-10-16 23:00:51 +02:00
Mike Fährmann
09d4c281b6 [shimmie2] remove 'loudbooru.com' 2024-10-10 18:32:42 +02:00
Mike Fährmann
6807bf9c11 [komikcast] update domain to 'komikcast.cz' 2024-10-10 16:52:40 +02:00
Mike Fährmann
bcd920e24d [lolisafe] remove 'xbunkr.com' 2024-10-10 16:19:08 +02:00
Mike Fährmann
4a1cbe94a9 [pururin] remove module
"This domain name has been seized in accordance with a seizure warrant
 issued by the United States District Court for the District of Idaho"
2024-10-10 15:57:17 +02:00
Mike Fährmann
3194bcbccc [blogger] remove 'micmicidol.club' 2024-10-10 14:23:58 +02:00
Mike Fährmann
a25aa26577 [chevereto] remove 'deltaporno.com' 2024-10-10 13:07:08 +02:00
Mike Fährmann
9757eacce1 [civitai] add 'post' extractors (#6279)
- https://civitai.com/posts/12345
- https://civitai.com/user/USER/posts
2024-10-06 17:48:48 +02:00
Mike Fährmann
7f945c44f5 [pixiv] support unlisted artworks (#5162) 2024-10-05 17:10:03 +02:00
Mike Fährmann
274d99e7d6 [boosty] add 'feed' and 'following' extractors (#2387) 2024-10-03 18:09:31 +02:00
Mike Fährmann
1ad58cab84 [boosty] add initial support (#2387) 2024-10-02 20:39:55 +02:00
Mike Fährmann
a937b72034 [ao3] add 'subscriptions' extractor (#6247) 2024-09-29 13:01:51 +02:00