Commit Graph

454 Commits

Author SHA1 Message Date
Mike Fährmann
11127726e4 [scripts/pyprint] update maxlen selection
- consider only values smaller than 'lmax' instead of taking the max of
  all lengths and trimming that
- remove 'lmin'
2025-07-27 09:37:45 +02:00
Mike Fährmann
4138ac06e2 [scripts/pyprint] fix indentation 2025-07-25 23:43:08 +02:00
Mike Fährmann
d99b2ab47a [imgadult] add 'image' extractor (#7893) 2025-07-25 18:58:29 +02:00
Mike Fährmann
3eb0b28d6d [facebook] implement 'include' option & add 'avatar' extractor (#7848)
rename 'profile' extractor to 'photos'
2025-07-25 18:20:05 +02:00
Farahat
cf2e5a1619 [leakgallery] add support (#7872)
* add new extractor for leakgallery.com

    Added support for downloading photo and video posts from leakgallery.com.

    Supports:
    * Individual post URLs
    * User profile URLs with pagination via AJAX
    * Optional type/sort filters (e.g. /Photos/MostRecent)
    * Proper file extension handling
    * Creator-based folder structure
    * Compatibility with --download-archive

    Tested locally and functional, but may still need review or improvement.
    
* [leakgallery] add support
    Added leakgallery to extractor module imports so it's recognized and used.
* [leakgallery] update extractor structure
    - Refactored using LeakGalleryExtractorBase to remove duplication
    - Moved init logic into items() using self.groups
    - Replaced re with text.re as per upstream guidance
    - Added creator fallback and media deduplication
    - Aligned structure with gallery-dl maintainer review tips
* [leakgallery] add support
    - Added leakgallery entry to supportedsites.md
    - Includes post, user, trending, and most-liked subcategories
* add exported extractor results
* [leakgallery] fix flake8 style issues
    Cleaned up code to comply with flake8 rules, especially:
    - removed unused imports
    - split long lines >79 chars
    - ensured newline at EOF
    No functional changes made; purely formatting to satisfy CI checks.
* [tests] update extractor results
* [leakgallery] fix flake8 style issues (part 2)
    Fix remaining flake8 issues in leakgallery.py:
    - Reformat line breaks to avoid W503 (line break before binary operator)
    - Wrap long lines to respect E501 (line too long > 79 characters)
    - Cleaned up exception logging for better clarity
    - Confirmed all flake8 checks now pass successfully
    This supersedes the previous commit which partially fixed formatting violations.
* [leakgallery] fix flake8 style issues (part 3)
* [leakgallery] rename extractor classes
* [tests] update extractor results
* [tests] rename extractor results
* [leakgallery] rename extractor classes (part 2)
* [leakgallery] rename example
* update docs/supportedsites
* update test results
    and convert line endings to '\n'
* update
    - convert line endings to '\n'
    - use _pagination method
    - fix logging calls
* return more metadata for _pagination() results
2025-07-22 22:50:25 +02:00
Mike Fährmann
e8b2a496ba [scripts] ensure files use 'utf-8' encoding and '\n' newlines (#7872) 2025-07-22 20:57:54 +02:00
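A minimal sketch of what "ensure files use 'utf-8' encoding and '\n' newlines" can look like in Python. The helper name `write_text_lf` is hypothetical and is not taken from the gallery-dl scripts. Only the `encoding` and `newline` parameters of the built-in `open()` are standard.

```python
import os
import tempfile


def write_text_lf(path, text):
    """Write text as UTF-8 with '\\n' line endings on every platform.

    Hypothetical helper for illustration; not the actual script code.
    """
    # newline="\n" disables the translation of "\n" to os.linesep,
    # so Windows does not silently produce "\r\n" line endings.
    with open(path, "w", encoding="utf-8", newline="\n") as fp:
        fp.write(text)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.txt")
        write_text_lf(path, "Fährmann\nline 2\n")
        # Read back in binary mode to inspect the raw bytes on disk.
        with open(path, "rb") as fp:
            print(fp.read())
```

Passing both arguments explicitly keeps the output independent of the platform's default encoding and line separator.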
Mike Fährmann
0b991148a1 [civitai] rename 'generate' to 'generated' (#7796) 2025-07-19 18:30:29 +02:00
Mike Fährmann
67a4472bc2 [civitai] add 'generate' extractor (#7796) 2025-07-18 18:34:17 +02:00
Mike Fährmann
1561284815 [madokami] add 'manga' extractor (#7828) 2025-07-17 20:40:26 +02:00
Mike Fährmann
493fc483c6 [scripts/init] handle subdomains when building BASE_PATTERN 2025-07-17 18:38:54 +02:00
Mike Fährmann
df946faf40 [scripts/init] fix extra blank line without copyright
1686f32a0d (commitcomment-162021403)
2025-07-14 16:54:21 +02:00
NecRaul
a7ebb835ea [iwara] Add support (#2652 #5840 #7785)
* [iwara] Add initial support
* [iwara] Add search support
* [iwara] Code cleanup
* [iwara] Small fixes and additions
* [iwara] Add tag support
* [iwara] Add mime-type to metadata
* [iwara] Refactor patterns/matching using urllib
* [iwara] Add unit tests
* [iwara] Update docs
* [iwara] Fix linting on older Python versions
* [iwara] update 'IwaraAPI' interface class
    - define endpoints inside methods
    - implement and use _call() and _pagination()
    - cache auth tokens
* [iwara] split and rename 'profile' extractor
    TODO:
    - update test results
    - simplify code
* [iwara] simplify '_user_params()' usage
* [iwara] update 'video' extractor
    and move user data extraction into 'yield_video'
* [iwara] update 'image' extractor
    and move user info extraction into 'yield_image()'
* [iwara] update 'playlist' extractor
* [iwara] update 'search' extractor
* [iwara] update 'tag' extractor
* [iwara] simplify 'yield_image' usage
    perform API calls to get full 'files' list inside the function
* [iwara] add video "image" test
* [iwara] provide 'date' metadata
* [iwara] simplify 'source()'
    remove urllib.parse usage
* [iwara] small optimizations
    * get("key", {}) -> get("key") or {}
    * split("…", 1) -> partition("…")
    * use f-strings for all patterns
* [iwara] add missing 'keyarg=1' to profile() memcache decorator
* [tests/iwara] update results
* [iwara] extract more 'user' metadata
* [iwara] update default format strings
    include 'date' in filenames to order them chronologically
* [iwara] restructure image/video handling
    - use fewer generators
    - make processing individual media items non-fatal
* [iwara] fix login and token handling
* [iwara] add 'favorite' extractor
* [iwara] add 'following' and 'followers' extractors

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2025-07-13 21:30:25 +02:00
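The "small optimizations" listed in the iwara commit above name two general Python idioms. The sketch below shows how each pair behaves. The function names and sample data are hypothetical and are not code from the iwara extractor.

```python
def files_default(post):
    # The default of get() is only used when the key is absent.
    return post.get("files", {})


def files_or(post):
    # 'or {}' also replaces a present-but-falsy value such as None
    # (for example a JSON null), so later .get() calls stay safe.
    return post.get("files") or {}


def head_split(value):
    # split() builds a list of up to two items.
    return value.split("/", 1)[0]


def head_partition(value):
    # partition() always returns a 3-tuple (head, separator, tail),
    # even when the separator is missing.
    return value.partition("/")[0]


if __name__ == "__main__":
    post = {"files": None}  # hypothetical API response with a null field
    print(files_default(post))
    print(files_or(post))
    print(head_split("a/b/c"), head_partition("a/b/c"))
    print("abc".partition("/"))
```

Note that `or {}` replaces every falsy value, including an empty list or `0`, so it fits only where such values should be treated like a missing key.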
Mike Fährmann
1686f32a0d [scripts/init] split into separate scripts
- init.py:
    - generate initial extractor module code and test result file
    - insert new entries into modules list and site names
- generate_test_result.py:
    - generate test result for a given URL
    - insert it into the test result file generated by init.py
      (or an already existing one)
2025-07-12 21:14:29 +02:00
Mike Fährmann
b6bd675a9e [scripts/pre-commit] disable user site-packages when running flake8 2025-07-07 15:07:56 +02:00
Mike Fährmann
82891b4d0c [pixiv] move 'novel' extractors to a 'pixiv-novel' category (#7746)
TODO:
- restore full 'include' functionality
- allow remapping category:subcategory pairs
2025-07-04 20:13:19 +02:00
Mike Fährmann
e7922ababd [naver] change categories (#7746)
- 'naver'        -> 'naver-blog'
- 'chzzk'        -> 'naver-chzzk'
- 'naverwebtoon' -> 'naver-webtoon'
2025-07-02 23:20:40 +02:00
Mike Fährmann
5e61fe8668 [rule34xyz] implement login with username & password (#7736) 2025-06-27 22:35:59 +02:00
Mike Fährmann
c1db879b6c [scripts] publish 'pre-commit' hook script (#6582)
https://github.com/mikf/gallery-dl/issues/6582#issuecomment-3010067010
2025-06-26 23:55:12 +02:00
Mike Fährmann
dd759d34dd [scripts/init] support adding test results via '--url' 2025-06-23 17:16:11 +02:00
Mike Fährmann
68960e29a1 [dankefuerslesen] add support (#7669) 2025-06-22 12:13:12 +02:00
Mike Fährmann
eaeabda7ac [scripts] implement 'init.py'
Initial attempt at a helper script to generate new extractor module
files and the required boilerplate code.
2025-06-22 10:13:06 +02:00
Mike Fährmann
60cb4468b2 [options] update --help Usage formatting 2025-06-20 23:43:22 +02:00
Mike Fährmann
1f429da650 [scripts/options] make output width independent of terminal size 2025-06-17 18:52:46 +02:00
SpiffyChatterbox
e0f65be36b [nudostar] add support (#5735 #6556)
* Drafting initial basic extractor layout
* Better debug logging
* Update nudostar.py
    Still tinkering
* Update nudostar.py
    Basic extractor is working. Now starting on Gallery
* Update nudostar.py
    Still a work in progress.
    Got individual posts working, galleries are not.
* Update nudostar.py
* Site now appears working. Added Tests.
* PEP Updates
* PEP - Line Length Updates
* Update nudostar.py
    Resolving PEP8 issues.
* update 'gallery' extractor, rename to 'model'
* update 'image' extractor
* expand tests
* update docs/supportedsites

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2025-06-16 19:21:49 +02:00
SpiffyChatterbox
48ac41605d [redbust] add support (#6759 #6918 #7043)
* init - Redbust.com Support
* Added Test
    Could use a second set of eyes on this
* update 'gallery' extractor
    - extract more metadata
    - simplify image extraction
    - support legacy galleries
* add tests
* update 'image' extractor
* add 'tag' extractor
* add 'archive' extractor
* restrict 'image' extractor pattern
* update docs/supportedsites
* replace quotes inside f-string

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2025-06-16 12:10:42 +02:00
hunter-gatherer8
96f5cfb305 [girlswithmuscle] add support (#4493 #6016)
* [girlswithmuscle] init
* [girlswithmuscle]: fix metadata extraction (site layout change)
* [girlswithmuscle]: fix tags extraction (site layout change)
* update login code
* update 'post' extractor
* update 'gallery' extractor, rename to 'search' extractor
* update docs
* add test cases

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2025-06-14 23:05:49 +02:00
Mike Fährmann
3e423937d2 [misskey] implement 'include' option (#5347) 2025-06-06 20:52:03 +02:00
Mike Fährmann
09eed2a515 [docs:supportedsites] update 'civitai' entries (#7608 #7609) 2025-06-01 10:33:31 +02:00
Mike Fährmann
ec523c2c2c [mangasee] remove module 2025-05-30 18:04:55 +02:00
Mike Fährmann
922c296482 [kemono][coomer][schalenetwork] rename modules & extractors
category changes:

- kemonoparty -> kemono
- coomerparty -> coomer
- koharu      -> schalenetwork

also wanted to rename '2chan' -> 'sturdychan',
but the site's main page is still titled '2chen'
2025-05-30 17:51:49 +02:00
Mike Fährmann
129fc00962 [pyinstaller] exclude 'pkg_resources' module (#7592) 2025-05-28 09:30:11 +02:00
bradenhilton
3ba4404d21 [pixeldrain] add support for filesystem URLs (#7473) 2025-05-21 17:28:09 +02:00
Mike Fährmann
7907d0d3bd [mangadex] add 'following' extractor (#7487)
also fixes the URL pattern for the Updates feed at
https://mangadex.org/titles/feed
2025-05-12 12:58:22 +02:00
Mike Fährmann
7b2bcf68a5 [manganelo] support 'nelomanga.net' and mirror domains (#7423)
- natomanga.com
- nelomanga.net
- manganato.gg
- mangakakalot.gg
2025-04-29 21:12:37 +02:00
Mike Fährmann
f7cd4367c6 [chevereto] support 'imagepond.net' (#7278) 2025-04-01 10:41:54 +02:00
Mike Fährmann
24bbcbcfa3 [danbooru] add 'favgroup' extractor 2025-03-26 20:58:49 +01:00
Mike Fährmann
7a6899c647 [imhentai] support 'hentaienvy.com' and 'hentaizap.com' (#7192 #7218)
and move 'hentaifox' support to this module as well
2025-03-24 15:33:19 +01:00
Mike Fährmann
31e57bafab [arcalive] add 'user' extractor (#5657) 2025-03-14 18:58:10 +01:00
Mike Fährmann
fa7114ee20 [docs] update supportedsites 2025-02-28 10:48:28 +01:00
CasualYouTuber31
daac2c6e04 [tiktok] add support (#3061 #4177 #5646 #6878 #6708)
* Add TikTok photo support

#3061
#4177

* Address linting errors

* Fix more test failures

* Forgot to update category names in tests

* Looking into re issue

* Follow default yt-dlp output template

* Fix format string error on 3.5

* Support downloading videos and audio

Respond to comments
Improve archiving and file naming

* Forgot to update supportedsites.md

* Support user profiles

* Fix indentation

* Prevent matching with more than one TikTok extractor

* Fix TikTok regex

* Support TikTok profile avatars

* Fix supportedsites.md

* TikTok: Ignore no formats error

In my limited experience, this doesn't mean that gallery-dl can't download the photo post (but this could mean that you can't download the audio)

* Fix error reporting message

* TikTok: Support more URL formats

vt.tiktok.com
www.tiktok.com/t/

* TikTok: Only download avatar when extracting user profile

* TikTok: Document profile avatar limitation

* TikTok: Add support for www.tiktokv.com/share links

* Address Share -> Sharepost issue

* TikTok: Export post's creation date in JSON (ISO 8601)

* [tiktok] update

* [tiktok] update 'vmpost' handling

just perform a HEAD request and handle its response

* [tiktok] build URLs from post IDs

instead of reusing unchanged input URLs

* [tiktok] combine 'post' and 'sharepost' extractors

* [tiktok] update default filenames

put 'id' and 'num' first to ensure better file order

* [tiktok] improve ytdl usage

- speed up extraction by passing '"extract_flat": True'
- pass more user options and cookies
- pre-define 'TikTokUser' extractor usage

* [tiktok] Add _COOKIES entry to AUTH_MAP

* [tiktok] Always download user avatars

* [tiktok] Add more documentation to supportedsites.md

* [tiktok] Address review comments

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2025-02-25 20:10:48 +01:00
Mike Fährmann
a9853cd273 merge #6781: [bilibili] add 'user-articles-favorite' extractor (#6725) 2025-02-23 18:19:51 +01:00
mmmpipi
e4cc3419c5 add bilibili User Articles FavList support
- fix whitespace
- fix extractor names
- Add favlist url user check
- apply changes
- add test
- update docs/supportedsites
2025-02-23 18:18:45 +01:00
Mike Fährmann
fe958ed5d9 merge #6768: [boosty] add 'direct-messages' extractor 2025-02-23 18:17:10 +01:00
Dominik Prange
ff5f6fe70f [boosty] added new direct message extractor
- formatting
- fixed linting formatting errors
- fixed E999 SyntaxError: invalid syntax
- fixed class naming
- fixed mandatory extractor.boosty.metadata as true requirement
- update
  - apply changes
  - add test
  - update docs/supportedsites
- improve 'dialog' pagination logic
2025-02-23 18:14:59 +01:00
Mike Fährmann
b1487df381 [scripts/pull-request] handle branch already existing 2025-02-23 18:12:14 +01:00
Mike Fährmann
52d4e1a100 [imhentai] inherit from BaseExtractor
combine all imhentai-like sites into one module
2025-02-19 22:14:52 +01:00
Mike Fährmann
d4c56b08d7 [hentaiera] add support (#3046 #6952 #7020) 2025-02-19 17:42:04 +01:00
Mike Fährmann
4396029d36 [furry34] add support (#1078 #7018) 2025-02-19 16:35:48 +01:00
Mike Fährmann
82493a6672 [hentairox] add support (#7003) 2025-02-18 21:45:30 +01:00
Luca Russo
95c446fcd1 [discord] add support (#6836)
* first commit

* add --

* skip video embeds

* fix typo

* removed ambiguity

* add category support

* code tweaks

* more reliable embed extraction

* handle 403 errors (testing done)

* added "parent_id" keyword

* added "parent", "parent_type" keywords

the extractor should be now ready to merge!

* removed unnecessary dict unpacking

* added empty text messages extraction

* added "channel_topic"

* even more metadata extraction

can now extract all embeds images & text, as well as server banners. also code is much better.

* added user avatar and banner

* better pagination

* fix regression

* minor tweaks

* Made requested changes
2025-02-18 18:45:39 +01:00