2688 Commits

Author SHA1 Message Date
Mike Fährmann
bd6ec5c352 [foolfuuka] match 4chan filenames (#2577)
introduce two new metadata fields:
- filename_media: original filename of file uploaded to 4chan
- timestamp_ms  : timestamp with millisecond precision (tim)
2022-05-15 14:39:54 +02:00
Mike Fährmann
feb470d19a [shopify] natively support a few more sites (closes #2089)
- chelseacrew.com
- michaels.com.au
- modcloth.com
- pinupgirlclothing.com
- raidlondon.com (loveraid.com)
- unique-vintage.com
2022-05-10 15:49:36 +02:00
Mike Fährmann
60f4d59b1e [gelbooru_v01] remove 'tlb.booru.org' from supported domains
the site responds with "403 Forbidden" (nginx);
it is also no longer listed on https://booru.org/top
2022-05-10 12:23:05 +02:00
Mike Fährmann
6b6eb0b8f6 [lolisafe] implement 'domain' option (#2575) 2022-05-10 12:17:59 +02:00
Mike Fährmann
d26da3b9e5 add pre-generated 'pattern' for supported BaseExtractor sites 2022-05-09 22:20:09 +02:00
Mike Fährmann
6ae3a5cdb0 [pixiv] make retrieving ugoira metadata non-fatal (#2562) 2022-05-08 20:05:38 +02:00
Mike Fährmann
6742f3bc1e implement --cookies-from-browser (#1606)
most of the code is adapted from yt-dlp's implementation
and *should* work the same.
2022-05-07 23:06:37 +02:00
Mike Fährmann
c4b9f7bab8 update functions working with cookies.txt files
- rename
  - load_cookiestxt -> cookiestxt_load
  - save_cookiestxt -> cookiestxt_store
- in cookiestxt_load, add cookies directly to a cookie jar
  instead of storing them in a list first
- other unnoticeable performance increases
2022-05-06 13:21:29 +02:00
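
To illustrate the "add cookies directly to a cookie jar" point in the entry above, here is a minimal sketch that parses Netscape-format cookies.txt lines and sets each cookie on an http.cookiejar.CookieJar as soon as it is read, without building an intermediate list. The function name load_cookiestxt_sketch and the simplified parsing rules are assumptions made for illustration; this is not gallery-dl's actual cookiestxt_load.

```python
import io
from http.cookiejar import Cookie, CookieJar


def load_cookiestxt_sketch(fp, jar):
    """Parse Netscape cookies.txt lines from 'fp' and add them to 'jar'.

    Hypothetical helper for illustration only.
    """
    for line in fp:
        line = line.strip()
        # some exporters mark HttpOnly cookies with this prefix
        if line.startswith("#HttpOnly_"):
            line = line[len("#HttpOnly_"):]
        # skip blank lines and comments
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 7:
            continue  # ignore malformed lines
        domain, domain_flag, path, secure, expires, name, value = fields
        # add the cookie to the jar right away, no temporary list
        jar.set_cookie(Cookie(
            0, name, value,
            None, False,
            domain, domain_flag == "TRUE", domain.startswith("."),
            path, False,
            secure == "TRUE",
            int(expires) if expires and expires != "0" else None,
            False, None, None, {},
        ))
    return jar


sample = io.StringIO(
    "# Netscape HTTP Cookie File\n"
    ".example.org\tTRUE\t/\tFALSE\t0\tsid\tabc\n"
)
jar = load_cookiestxt_sketch(sample, CookieJar())
```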
Mike Fährmann
f190018e37 [mangasee] use randomly generated PHPSESSID cookie (#2560) 2022-05-05 19:35:32 +02:00
Mike Fährmann
4c47dfffdd [instagram] report redirects to captcha challenges (#2543) 2022-05-05 13:18:24 +02:00
Mike Fährmann
4598d32370 [imgur] prevent exception for empty albums (closes #2557) 2022-05-04 17:34:50 +02:00
Mike Fährmann
435e9c5d2e [vk] report errors for private albums (#2556) 2022-05-04 17:34:50 +02:00
Mike Fährmann
9adea93aef [pixiv] updates to avatar/background extractors (#2495)
- add 'date' metadata to avatar/background files when available
  and use that in default filenames / archive ids
- remove deprecation warnings as their option names clash with
  subcategory names
2022-05-04 17:30:54 +02:00
Mike Fährmann
3e6aba05ab [vk] add fallback for user ID extraction (#2535) 2022-05-03 13:42:45 +02:00
Mike Fährmann
52b47c3cf9 [gelbooru_v01] add 'favorite' extractor (#2546) 2022-05-02 11:33:28 +02:00
Mike Fährmann
5b7423d14c [vk] fix URLs for older photos (#2535) 2022-05-02 11:19:18 +02:00
Mike Fährmann
3346f58a2a [twitter] use twMediaDownloader strategy for user URLs
- use media timeline + search for default user URLs like
  https://twitter.com/SCREEN_NAME
- fetches all/most media for the type of twitter URL that most users
  use with gallery-dl
- can be disabled by setting 'strategy' to any truthy value,
  like "timeline"
2022-05-02 09:03:35 +02:00
Mike Fährmann
84756982e9 [pixiv] implement 'include' option
- split 'user' extractor and its 'avatar' and 'background' options into
  separate extractors ('artworks', 'avatar', 'background')
- avatars can now be downloaded with
  https://www.pixiv.net/en/users/ID/avatar
  as URL and will use a proper archive key; similar for backgrounds
- options for the 'user' subcategory must be moved to 'artworks' to have
  the same effect as before
2022-05-02 09:03:35 +02:00
Mike Fährmann
d11e2191ae [nijie] support /history_nuita.php listings (closes #2541) 2022-05-02 09:03:34 +02:00
Mike Fährmann
4aca29b7b4 [naverwebtoon] support (best)challenge comics (closes #2542)
and update URL pattern to match URLs without '.nhn'
2022-05-02 09:03:34 +02:00
Mike Fährmann
3e926bd465 [realbooru] fix extraction (fixes #2530) 2022-05-02 09:03:34 +02:00
Mike Fährmann
82eee72b39 [pixiv] update API interface
- start all endpoints with '/'
- use extractor.wait() for rate limit
- retry with while loop instead of recursion
- in case of error, write entire response to debug log
2022-05-02 09:03:34 +02:00
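
The "retry with while loop instead of recursion" item above can be sketched roughly as follows. The names api_call_sketch, fetch, and wait, the response shape, and the 300-second delay are all hypothetical and chosen only for illustration; the real pixiv API code in gallery-dl may look quite different.

```python
def api_call_sketch(fetch, wait, max_waits=5):
    """Call 'fetch()' until it returns a response without an error.

    'fetch' and 'wait' are injected callables (assumptions for this
    sketch); 'wait' stands in for a rate-limit pause.
    """
    waits = 0
    while True:
        response = fetch()
        if "error" not in response:
            return response
        # rate limited: pause and try again in the same loop iteration
        # chain, so the call stack does not grow with each retry
        if "rate limit" in str(response["error"]).lower() and waits < max_waits:
            waits += 1
            wait(300)
            continue
        raise RuntimeError(f"API request failed: {response}")


# usage: two rate-limit errors followed by a successful response
responses = iter([
    {"error": "Rate Limit"},
    {"error": "Rate Limit"},
    {"illusts": [1, 2, 3]},
])
pauses = []
result = api_call_sketch(lambda: next(responses), pauses.append)
```

A loop keeps the stack depth constant no matter how many times the rate limit is hit, which is the main practical difference from a recursive retry.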
Mike Fährmann
1bc77efa02 [artstation] use "browser": "firefox" by default (#2527) 2022-05-02 09:03:13 +02:00
Mike Fährmann
a39e7b7366 [vk] handle photos without width/height info (fixes #2535) 2022-05-02 09:03:00 +02:00
Federico Ravasio
0381752575 [photovogue] switch to .com, update api endpoint (#2494) 2022-04-27 22:37:53 +02:00
Mike Fährmann
3f02e483c6 [e621] fix applying request_interval_min (#2533)
Setting this property after calling Extractor.__init__() has no effect.
2022-04-27 21:10:34 +02:00
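
A small sketch of why a property set after the base initializer has no effect, assuming the base class reads the value once during construction. The classes below are hypothetical stand-ins, not gallery-dl's Extractor, and defining the value as a class attribute is only one possible fix.

```python
class BaseExtractorSketch:
    """Hypothetical base class that captures the setting in __init__."""
    request_interval_min = 0.0

    def __init__(self):
        # the value is read exactly once, here
        self._interval_min = self.request_interval_min


class BrokenExtractor(BaseExtractorSketch):
    def __init__(self):
        BaseExtractorSketch.__init__(self)
        # too late: __init__ has already captured the old value
        self.request_interval_min = 1.0


class FixedExtractor(BaseExtractorSketch):
    # defined on the class, so the base __init__ sees it
    request_interval_min = 1.0


broken = BrokenExtractor()
fixed = FixedExtractor()
```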
Mike Fährmann
afde76269c [weibo] fix infinite retries for deleted accounts (fixes #2521) 2022-04-27 20:23:11 +02:00
Mike Fährmann
d85e66bcac [vk] fix extraction (#2512)
Use a different API endpoint, since thumbnail URLs from the old one
cannot be transformed into URLs for "original" photos anymore.
2022-04-21 14:01:50 +02:00
Mike Fährmann
9e6ff42a9d [pixiv] implement 'background' option (#623, #1124, #2495) 2022-04-21 13:53:02 +02:00
Mike Fährmann
4d1896830f [mangadex] download chapters with 'externalUrl' (fixes #2503)
if they have pages hosted on mangadex
2022-04-18 18:09:52 +02:00
Mike Fährmann
97e8a15295 [deviantart] implement 'pagination' option (#2488) 2022-04-18 18:08:01 +02:00
Mike Fährmann
1f9a0e2fd8 update extractor test results 2022-04-18 17:24:00 +02:00
Mike Fährmann
ad5a4b1756 [twitter] fix various syndication issues
- handle retweets
- fix videos without dimensions in URL (3e942a58)
- fix '"retweets": "self"' filter (#2499)
2022-04-15 20:49:26 +02:00
Mike Fährmann
12bd9ba33a [readcomiconline] add 'quality' option (#2467) 2022-04-15 18:10:37 +02:00
Mike Fährmann
60ad46ddcc [readcomiconline] unobfuscate image URLs (#2481) 2022-04-15 18:04:09 +02:00
Mike Fährmann
a6c4ff58fb [cyberdrop] match cyberdrop.to URLs (closes #2496) 2022-04-15 15:39:29 +02:00
Mike Fährmann
13ed18b9aa [lolisafe] fix typo
LolisafelbumExtractor -> LolisafeAlbumExtractor
2022-04-15 15:02:30 +02:00
Mike Fährmann
3e942a58be [twitter] improve syndication video selection (#2354)
- ignore .m3u8 manifests
- always select largest format
2022-04-11 17:06:10 +02:00
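
One way to picture the two rules above (ignore .m3u8 manifests, always select the largest format) is the sketch below. It assumes each variant is a dict with a "src" URL that may contain a WIDTHxHEIGHT path segment; that data shape and the function name select_largest_variant are assumptions for illustration, not the syndication API's documented format or gallery-dl's code.

```python
import re


def select_largest_variant(variants):
    """Return the non-manifest variant with the largest pixel area.

    Hypothetical helper; variants without dimensions in the URL
    count as area 0 and are only chosen if nothing better exists.
    """
    best, best_area = None, -1
    for variant in variants:
        url = variant["src"]
        if ".m3u8" in url:
            continue  # skip HLS manifests
        match = re.search(r"/(\d+)x(\d+)/", url)
        area = int(match[1]) * int(match[2]) if match else 0
        if area > best_area:
            best, best_area = variant, area
    return best


variants = [
    {"src": "https://video.example/pl/abc.m3u8"},
    {"src": "https://video.example/vid/480x270/a.mp4"},
    {"src": "https://video.example/vid/1280x720/b.mp4"},
    {"src": "https://video.example/vid/640x360/c.mp4"},
]
largest = select_largest_variant(variants)
```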
Mike Fährmann
0794027100 [issuu] fix extraction (#2483) 2022-04-10 14:23:10 +02:00
Mike Fährmann
5d5a08cc69 [sexcom] add fallback for empty files (#2485) 2022-04-10 14:22:07 +02:00
thatfuckingbird
4527a35aba [twitter] accept fxtwitter.com URLs (#2484) 2022-04-08 14:32:08 +02:00
Mike Fährmann
c1768972c2 [newgrounds] update and fix pagination (#2456) 2022-04-07 15:38:41 +02:00
Mike Fährmann
78e5d0c423 [kissgoddess] extract all images (closes #2473)
and not only the first two per page
https://github.com/mikf/gallery-dl/issues/1052#issuecomment-1047367383
2022-04-06 21:28:40 +02:00
Mike Fährmann
0b33435da5 [pinterest] support multiple files per pin (closes #1619, #2452) 2022-04-06 21:21:33 +02:00
Mike Fährmann
9c5d2d7af3 [pinterest] add extractor for created pins (#2452) 2022-04-01 16:59:58 +02:00
Mike Fährmann
1171911dc3 [twitter] add 'syndication' option (#2354)
to fetch age-restricted content using Twitter's syndication API
2022-04-01 16:56:47 +02:00
Mike Fährmann
a53cfc845e [newgrounds] warn about age-restricted posts (#2456) 2022-03-30 16:18:33 +02:00
Mike Fährmann
ecee315bbf [mangasee] unescape manga names (fixes #2454) 2022-03-30 16:18:18 +02:00
loragja
7e545a3ae9 [gofile] add gofile.io extractor (#2364)
* Add gofile extractor
* add gofile extractor to module list
* add support for tiny monitors and ancient python versions
* seriously, f-strings are not *that* new...
* i love flake8 :)
* add 'api-token' and 'recursive' options
* add tests
2022-03-29 17:31:57 +02:00
Layerex
625f4d4cc4 [telegraph] Add telegra.ph extractor (#2312) 2022-03-28 19:18:13 +02:00