gallery-dl

Author	SHA1	Message	Date
Mike Fährmann	4c8c98a14d	use internal, non-caching version of re.compile for extractor patterns speeds up total compile time of extractor patterns by ~10ms	2025-04-15 22:47:19 +02:00
Mike Fährmann	7a6899c647	[imhentai] support 'hentaienvy.com' and 'hentaizap.com' (#7192 #7218 ) and move 'hentaifox' support to this module as well	2025-03-24 15:33:19 +01:00
hdk5	d900e868e4	[arcalive] add support (#5657 #7100 ) * [arca.live] Add extractor skeleton * [arcalive] update names and formatting * [arcalive] implement initial file extraction code * [arcalive] improve '_extract_media()' performance compile and cache regex on demand * [arcalive] improve image extraction - extract 'data-originalurl' URLs if available - replace URL query strings with 'type=orig' - ignore emoticons by default * [arcalive] update defaults - include 'title' in filenames - use 0.5-1.5s delay between requests * [arcalive] use ext from 'data-orig' if available * [arcalive] update docs/supportedsites * [arcalive] add tests * [arcalive] update 'board' extractor pattern so it doesn't also match 'post' URLs --------- Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>	2025-03-14 10:52:21 +01:00
Mike Fährmann	2f3265a8ae	[tenor] add initial support (#6075 )	2025-03-03 19:04:50 +01:00
CasualYouTuber31	daac2c6e04	[tiktok] add support (#3061 #4177 #5646 #6878 #6708 ) * Add TikTok photo support #3061 #4177 * Address linting errors * Fix more test failures * Forgot to update category names in tests * Looking into re issue * Follow default yt-dlp output template * Fix format string error on 3.5 * Support downloading videos and audio Respond to comments Improve archiving and file naming * Forgot to update supportedsites.md * Support user profiles * Fix indentation * Prevent matching with more than one TikTok extractor * Fix TikTok regex * Support TikTok profile avatars * Fix supportedsites.md * TikTok: Ignore no formats error In my limited experience, this doesn't mean that gallery-dl can't download the photo post (but this could mean that you can't download the audio) * Fix error reporting message * TikTok: Support more URL formats vt.tiktok.com www.tiktok.com/t/ * TikTok: Only download avatar when extracting user profile * TikTok: Document profile avatar limitation * TikTok: Add support for www.tiktokv.com/share links * Address Share -> Sharepost issue * TikTok: Export post's creation date in JSON (ISO 8601) * [tiktok] update * [tiktok] update 'vmpost' handling just perform a HEAD request and handle its response * [tiktok] build URLs from post IDs instead of reusing unchanged input URLs * [tiktok] combine 'post' and 'sharepost' extractors * [tiktok] update default filenames put 'id' and 'num' first to ensure better file order * [tiktok] improve ytdl usage - speed up extraction by passing '"extract_flat": True' - pass more user options and cookies - pre-define 'TikTokUser' extractor usage * [tiktok] Add _COOKIES entry to AUTH_MAP * [tiktok] Always download user avatars * [tiktok] Add more documentation to supportedsites.md * [tiktok] Address review comments --------- Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>	2025-02-25 20:10:48 +01:00
Mike Fährmann	52d4e1a100	[imhentai] inherit from BaseExtractor combine all imhentai-like sites into one module	2025-02-19 22:14:52 +01:00
Mike Fährmann	d4c56b08d7	[hentaiera] add support (#3046 #6952 #7020 )	2025-02-19 17:42:04 +01:00
Mike Fährmann	4396029d36	[furry34] add support (#1078 #7018 )	2025-02-19 16:35:48 +01:00
Mike Fährmann	82493a6672	[hentairox] add support (#7003 )	2025-02-18 21:45:30 +01:00
Luca Russo	95c446fcd1	[discord] add support (#6836 ) * first commit * add -- * skip video embeds * fix typo * removed ambiguity * add category support * code tweaks * more reliable embed extraction * handle 403 errors (testing done) * added "parent_id" keyword * added "parent", "parent_type" keywords the extractor should be now ready to merge! * removed unnecessary dict unpacking * added empty text messages extraction * added "channel_topic" * even more metadata extraction can now extract all embeds images & text, as well as server banners. also code is much better. * added user avatar and banner * better pagination * fix regression * minor tweaks * Made requested changes	2025-02-18 18:45:39 +01:00
Mike Fährmann	55034d9638	[imhentai] add support (#1660 #3046 #3824 #4338 #5936 )	2025-02-10 21:42:07 +01:00
Mike Fährmann	b271a874ed	[fanleaks] remove module DNS record of fanleaks.club no longer exists	2025-01-26 16:35:46 +01:00
Mike Fährmann	05fa6dd354	[nekohouse] add initial support (#5241 , #6738 )	2025-01-20 20:15:34 +01:00
Mike Fährmann	438c61601b	[xfolio] add initial support (#5514 , #6351 , #6837 )	2025-01-18 15:57:56 +01:00
Mike Fährmann	bde99cc6ce	[cohost] remove module cohost.org now redirects to archive.org	2025-01-13 14:38:35 +01:00
Mike Fährmann	91bd3e37f2	[pexels] add support (#2286 , #4214 , #6769 )	2025-01-12 16:50:12 +01:00
Mike Fährmann	1d75c8308c	[weebcentral] add support (#6778 )	2025-01-10 23:04:51 +01:00
Mike Fährmann	63008f77e2	merge #6607 : [lofter] add initial support (#650, #2294, #4095, #4728, #5656)	2024-12-11 20:41:52 +01:00
Mike Fährmann	86334f9c4a	[yiffverse] add support (#6611 )	2024-12-11 10:57:21 +01:00
hdk5	0466fcab4c	[lofter]: add initial support	2024-12-08 19:37:42 +02:00
Mike Fährmann	ef7ff31117	[realbooru] fix extraction (#6543 ) - extract data from HTML pages since API is no longer usable - move code into its own separate 'realbooru' module	2024-12-07 17:39:25 +01:00
Luca Russo	e9370b7b8a	merge #5626 : [facebook] add support (#470 , #2612 ) * [facebook] add initial support * renamed extractors & subcategories * better stability, modularity & naming * added single photo extractor, warnings & retries * more metadata + extract author followups * renamed "album" mentions to "set" for consistency * cookies are now only used when necessary also added author followups for singular images * removed f-strings * added way to continue extraction from where it left off also fixed some bugs * fixed bug wrong subcategory * added individual video extraction * extract audio + added ytdl option * updated setextract regex * added option to disable start warning the extractor should be ready :) * fixed description metadata bug * removed cookie "safeguard" + fixed for private profiles I have removed the cookie "safeguard" (not using cookies until they are necessary) as I've come to the conclusion that it does more harm than good. There is no way to detect whether the extractor has skipped private images, that could have been possibly extracted otherwise. Also, doing this provides little to no advantages. * fixed a few bugs regarding profile parsing * a few bugfixes Fixed some metadata attributes from not decoding correctly from non-latin languages, or not showing at all. Also improved few patterns. * retrigger checks * Final cleanups -Added tests -Fixed video extractor giving incorrect URLs -Removed start warning -Listed supported site correctly * fixed regex * trigger checks * fixed livestream playback extraction + bugfixes I've chosen to remove the "reactions", "comments" and "views" attributes as I've felt that they require additional maintenance even though nobody would ever actually use them to order files. I've also removed the "title" and "caption" video attributes for their inconsistency across different videos. Feel free to share your thoughts. * fixed regex * fixed filename fallback * fixed retrying when a photo url is not found * fixed end line * post url fix + better naming * fix posts * fixed tests * added profile.php url * made most of the requested changes * flake * archive: false * removed unnecessary url extract * [facebook] update - more 'Sec-Fetch-…' headers - simplify 'text.nameext_from_url()' calls - replace 'sorted(…)[-1]' with 'max(…)' - fix '_interval_429' usage - use replacement fields in logging messages * [facebook] update URL patterns get rid of '.' and '.?' * added few remaining tests --------- Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>	2024-11-26 21:49:11 +01:00
Mike Fährmann	d1ad97ae0c	[motherless] add to 'modules' list	2024-11-22 21:18:13 +01:00
hdk5	6eef3e3495	[bilibili] initial support (#2824 )	2024-11-10 00:21:27 +02:00
Mike Fährmann	cb0d8cae77	merge #6227 : [everia] add support (#1067 , #2472 , #4091 )	2024-11-03 17:52:17 +01:00
missionfloyd	d31a3b5da3	[everia.club] Add support - Unescape title and URL - Add tags and categories metadata Lookup tag id with API instead of downloading tag page - Add category extractor - Add tests - Rename EveriaExtractor to EveriaPostExtractor - Fix EveriaPostExtractor example - Lookup tags/categories by post id - Add date extractor - Remove leftover pages parameter - Add error handling for invalid dates. - Add filename numbering Parse date - Rename extract() to images() - Remove html import - Fix search/date URLs with page number - Fix tag/category search - Fix post extractor - Fix tag, category extractors - Fix search extractor - Only load first page once - Fix date extractor - Fix tests - Clean up search extractor	2024-11-03 14:09:07 +01:00
Mike Fährmann	d787c0c4ea	[rule34xyz] add support (#1078 , #4960 )	2024-11-03 10:12:26 +01:00
Mike Fährmann	655e42dc92	merge #6240 : [rule34vault] add support (#5708 )	2024-10-28 22:31:05 +01:00
ssdaniel24	3d0263b3ab	[rule34vault] Added initial support for rule34vault.com - Added playlists support for rule34vault - Added support for posts in rule34vault - Fixed supported sites with script - Fixed posts pattern in rule34vault - Added tests for rule34vault - Clean - Fixed lint warnings	2024-10-28 22:26:47 +01:00
Mike Fährmann	5de8576ff6	[noop] add 'noop' extractor	2024-10-28 19:45:24 +01:00
Mike Fährmann	10c076e7f2	[saint] add 'album' and 'media' extractors (#4405 , #6324 )	2024-10-27 22:27:30 +01:00
Mike Fährmann	66aa514c25	[scrolller] add initial support (#295 , #3418 , #5051 )	2024-10-21 14:17:18 +02:00
Mike Fährmann	4a1cbe94a9	[pururin] remove module "This domain name has been seized in accordance with a seizure warrant issued by the United States District Court for the District of Idaho"	2024-10-10 15:57:17 +02:00
Mike Fährmann	1ad58cab84	[boosty] add initial support (#2387 )	2024-10-02 20:39:55 +02:00
Mike Fährmann	93eca64a73	[civitai] add initial support (#3706 , #3787 , #4129 , #5995 )	2024-09-20 17:21:17 +02:00
Mike Fährmann	638a676495	[ao3] add initial support (#6013 )	2024-09-15 22:38:21 +02:00
Mike Fährmann	df0d7d4a12	[cohost] add 'user' and 'post' extractors (#4483 )	2024-09-11 18:03:33 +02:00
Mike Fährmann	399ba85841	[fallenangels] remove module	2024-07-30 17:33:16 +02:00
Mike Fährmann	aa6d00613f	[cien] initial support (#2885 , #4103 , #5240 )	2024-07-28 19:27:12 +02:00
Mike Fährmann	c9aeedeafd	[koharu] add 'gallery' and 'search' extractors (#5893 , #4707 )	2024-07-28 12:22:18 +02:00
Mike Fährmann	226ead728e	[agnph] add 'tag' and 'post' extractors (#5284 , #5890 )	2024-07-27 12:17:47 +02:00
Mike Fährmann	8fce9ea6d5	[hentainexus] restore module (#5275 ) revert `97641cd151`	2024-06-05 16:48:25 +02:00
Mike Fährmann	ce228ee163	[photobucket] remove module had been broken for years and the new site is paid access only	2024-06-02 01:40:31 +02:00
Mike Fährmann	8a11b72253	remove extractor/test.py (#4504 )	2024-02-27 01:37:57 +01:00
Mike Fährmann	cf7d6be2d4	[bluesky] initial support (#4438 , #4708 , #4722 , #5047 )	2024-02-07 19:09:33 +01:00
Mike Fährmann	6f8592eaff	[hbrowse] remove from modules list	2024-01-20 18:25:38 +01:00
Ailothaen	e33056adcd	[wikimedia] Add Wikipedia/Wikimedia extractor	2024-01-16 02:32:25 +01:00
hunter-gatherer8	6c4abc982e	[2ch] add 'thread' and 'board' extractors - [2ch] add thread extractor - [2ch] add board extractor - [2ch] add new entry to supported sites	2024-01-15 03:51:03 +01:00
Mike Fährmann	355b909f46	merge #5041 : [steamgriddb] add support (#5033 )	2024-01-13 00:59:15 +01:00
blankie	2ccb7d3bd3	[steamgriddb] add support	2024-01-09 17:12:56 +11:00


390 Commits