Mike Fährmann
cb1a75eefc
[twitter] handle errors during file extraction ( #6647 )
2025-01-21 18:23:54 +01:00
Mike Fährmann
d9c4fcc7fa
[twitter] generate longer CSRF token values
2025-01-21 18:19:25 +01:00
Mike Fährmann
105c027411
[path] handle exception when using --rename-to --no-download ( #6861 )
...
Catch a possible FileExistsError exception when attempting to create a
new directory during handling of a FileNotFoundError exception.
FileNotFoundError may also occur when the file at self.temppath is
missing because it hasn't been downloaded due to --no-download.
2025-01-20 20:50:31 +01:00
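A minimal sketch of the failure mode described in 105c027411, not the project's actual code: `finalize`, `temppath`, and `realpath` are hypothetical names used for illustration. A `FileNotFoundError` raised while moving a file can mean either that the target directory is missing or that the source file was never downloaded; attempting to create a directory that already exists raises `FileExistsError`, which identifies the second case.

```python
import os


def finalize(temppath, realpath):
    """Move temppath to realpath, creating the target directory on demand.

    Returns True if the file was moved, and False if the source file
    does not exist (for example, because the download was skipped).
    Assumes realpath contains a directory component.
    """
    while True:
        try:
            os.replace(temppath, realpath)
            return True
        except FileNotFoundError:
            # Either the target directory or the source file is missing.
            try:
                os.makedirs(os.path.dirname(realpath))
            except FileExistsError:
                # The directory is already there, so the source file
                # must be the missing piece; give up without crashing.
                return False
            # The directory was just created; retry the move.
```

The loop runs at most twice: the first failure creates the directory, and a second failure hits `FileExistsError` and returns.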
Mike Fährmann
05fa6dd354
[nekohouse] add initial support ( #5241 , #6738 )
2025-01-20 20:15:34 +01:00
Mike Fährmann
6ce310d865
[weebcentral] fix extraction ( #6860 )
2025-01-19 18:14:03 +01:00
Mike Fährmann
f867e690c1
merge #6855 : [turboimagehost] add support for galleries
2025-01-19 17:51:48 +01:00
Mike Fährmann
0f50dd17ba
merge #6606 : [docs] add nix docs to README
2025-01-19 17:50:05 +01:00
arebokert
556fbb1a44
[turboimagehost] add support for galleries
...
- added support
- raise error if gallery not found
- fix test
- fix lint issues
- simplify
2025-01-19 17:28:45 +01:00
DontEatOreo
b15283cf6d
README.rst: add nix docs
2025-01-19 17:46:57 +02:00
Mike Fährmann
bb2f9b8443
[release] include 'scripts/run_tests.py' in release tarball ( #6856 )
2025-01-19 15:58:23 +01:00
Mike Fährmann
438c61601b
[xfolio] add initial support ( #5514 , #6351 , #6837 )
2025-01-18 15:57:56 +01:00
Mike Fährmann
dc7b46be21
[khinsider] add 'covers' option ( #6844 )
2025-01-18 15:57:56 +01:00
Mike Fährmann
5a31a2ad22
[khinsider] extract more 'album' metadata ( #6844 )
...
- year
- catalog
- developer
- publisher
- uploader
2025-01-18 15:57:55 +01:00
Mike Fährmann
3849b3fa92
[batoto] use 'chapter_id' in default archive IDs ( #6835 )
...
instead of '{chapter}{chapter_minor}' since some chapters have no actual
chapter number and end up as '0', potentially causing ID overlap
2025-01-15 14:52:18 +01:00
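The ID overlap described in 3849b3fa92 can be illustrated with a small sketch. The metadata values and both format strings below are made up for the example and are assumptions, not the extractor's real output or its exact archive format.

```python
# Hypothetical chapter metadata: two chapters without a real chapter
# number both end up with chapter == 0.
chapters = [
    {"chapter_id": 1001, "chapter": 0, "chapter_minor": ""},
    {"chapter_id": 1002, "chapter": 0, "chapter_minor": ""},
    {"chapter_id": 1003, "chapter": 1, "chapter_minor": ".5"},
]

# Assumed format strings, for illustration only.
old_fmt = "{chapter}{chapter_minor}_{page}"
new_fmt = "{chapter_id}_{page}"

# Build the archive ID of page 1 for every chapter under each format.
old_ids = {old_fmt.format(page=1, **ch) for ch in chapters}
new_ids = {new_fmt.format(page=1, **ch) for ch in chapters}

print(sorted(old_ids))  # two distinct IDs for three chapters
print(sorted(new_ids))  # one distinct ID per chapter
```

With the number-based format, the second unnumbered chapter would look like it had already been archived and would be skipped.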
Mike Fährmann
6e919a3695
[e621] support e621.cc and e621.anthro.fr frontend URLs ( #6809 )
2025-01-15 14:35:37 +01:00
Mike Fährmann
843a39a6c6
[bunkr] extract correct 'filename' data ( #6824 )
2025-01-14 19:45:48 +01:00
Mike Fährmann
d17a423245
[xhamster] fix 'gallery' extractor ( #6818 )
2025-01-13 18:58:08 +01:00
Mike Fährmann
bde99cc6ce
[cohost] remove module
...
cohost.org now redirects to archive.org
2025-01-13 14:38:35 +01:00
Mike Fährmann
42070240ae
[tests] allow testing for types + values
2025-01-12 20:55:37 +01:00
Mike Fährmann
2b46b82f9c
[release] prevent overwriting ${CHANGELOG}.orig with truncated file
...
to avoid deleting most of CHANGELOG.md by accident when the release.sh
script gets interrupted halfway through, as happened during the v1.28.3
release in commit 7e8ca377fc
2025-01-12 18:05:35 +01:00
Mike Fährmann
6e3f51a05e
release version 1.28.4
2025-01-12 17:22:09 +01:00
Mike Fährmann
91bd3e37f2
[pexels] add support ( #2286 , #4214 , #6769 )
2025-01-12 16:50:12 +01:00
Mike Fährmann
1ae3ac5e39
[common] add '_extract_nextdata' method
2025-01-12 11:48:36 +01:00
Mike Fährmann
3f48e2f820
[common] add '_extract_jsonld' method ( #5272 )
2025-01-12 11:07:48 +01:00
Mike Fährmann
88f1ef7c3c
[bunkr] fix metadata extraction ( #6805 )
2025-01-11 12:48:41 +01:00
Mike Fährmann
1d75c8308c
[weebcentral] add support ( #6778 )
2025-01-10 23:04:51 +01:00
Mike Fährmann
4853406fe3
[common] allow MangaExtractors to skip loading manga_url
2025-01-10 21:30:58 +01:00
Mike Fährmann
af9c06f812
[bunkr] fix album extraction ( #6798 )
2025-01-10 13:01:04 +01:00
Mike Fährmann
118b994cf2
[bunkr] support '/f/...' media URLs
2025-01-10 13:01:04 +01:00
Mike Fährmann
ba0443115a
[bunkr] fix ValueError on relative redirects ( #6790 )
2025-01-10 13:00:52 +01:00
Mike Fährmann
89276c5b3e
[e621] match 'tag' search URLs with empty tag ( #6783 )
2025-01-07 20:00:26 +01:00
Mike Fährmann
d18f311fe2
[plurk] fix 'user' data extraction and make it non-fatal ( #6742 )
2025-01-06 20:27:37 +01:00
Mike Fährmann
b1ffb62644
[docs] update 'sleep-request' value for 'wallhaven'
2025-01-06 17:24:04 +01:00
Mike Fährmann
46b6b71159
[wallhaven] extract 'search[tags]' and 'search[tag_id]' metadata
...
(#6772 )
2025-01-06 17:18:04 +01:00
Mike Fährmann
270aaea8ab
[pixiv] provide fallback URLs ( #6762 )
2025-01-06 15:27:32 +01:00
Mike Fährmann
770f41eb4a
[util] support not splitting "contains" value ( #6773 )
...
by passing any "false" value as 'separator' argument except None
2025-01-06 13:47:32 +01:00
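One way to read the behavior described in 770f41eb4a is sketched below. This is a hypothetical reimplementation based only on the commit message; the real function's signature and details may differ.

```python
def contains(values, elements, separator=" "):
    """Return True if any of 'elements' is found in 'values'.

    A string 'values' is split on 'separator' first, unless 'separator'
    is a false value other than None (e.g. "" or False), in which case
    the string is left whole and a substring test is performed.
    None keeps str.split()'s default whitespace splitting.
    """
    if isinstance(values, str) and (separator or separator is None):
        values = values.split(separator)
    if not isinstance(elements, (tuple, list)):
        elements = (elements,)
    return any(element in values for element in elements)


print(contains("foo bar", "foo"))       # word match after splitting
print(contains("foo bar", "oo b"))      # no such word after splitting
print(contains("foo bar", "oo b", ""))  # substring match, no splitting
```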
Mike Fährmann
a3b9cc7785
[options] mark '--list-extractors' argument as optional
2025-01-05 21:37:44 +01:00
Mike Fährmann
7e8ca377fc
release version 1.28.3
2025-01-04 16:42:02 +01:00
Mike Fährmann
107798eeab
[subscribestar] strip whitespace from 'content'
2025-01-04 16:19:22 +01:00
Mike Fährmann
a53ce6103c
[deviantart:tiptap] smaller fixes
...
- fix text indentation in headings
- fix deviation formats without 'c' path
- support custom 'target' in links
2025-01-03 22:48:06 +01:00
Mike Fährmann
1dcb40be7c
merge #6760 : [boosty] support 'file' post attachments ( #2387 )
...
https://github.com/mikf/gallery-dl/issues/2387#issuecomment-2564671646
2025-01-03 15:59:03 +01:00
Mike Fährmann
bce9be66c2
merge #6761 : [subscribestar] improve 'content' metadata extraction
2025-01-03 15:56:17 +01:00
Wyoh Knott
22d4e84372
[subscribestar] Better extraction of content
...
The structure of content is like this:
```
<div class="post-content" data-role="post_content-text">
<div class="trix-content">
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<body>
<div>
Unspeakable thing are written here<br />
<br />
haiiiiiiiiiiiiiiii hi hi hiii its meee back againnn, plspls leave a comment if uuuu liked it mwah
<3
</div>
</body>
</html>
</div>
</div>
<div class="post-uploads
```
Currently we extract content with:
```
(extr('<div class="post-content', '<div class="post-uploads').partition(">")[2])
```
I propose we just take the body parts:
```
extr('<body>', '</body>')
```
which only occur around the actual content.
It is then easier to use the content in a filename with the `!H`
formatter: `{content[:160]!H}`. Otherwise the content currently extracted
can't be decoded with it.
2025-01-03 14:57:12 +01:00
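The difference between the two extraction approaches in 22d4e84372 can be sketched as follows. `between` is a hypothetical stand-in for the `extr` helper used in the commit message, `clean` is only a rough approximation of what an "unescape and strip tags" step might do, and the sample page is shortened from the one quoted above.

```python
import html
import re


def between(text, begin, end):
    """Return the text between the first 'begin' and the next 'end'."""
    start = text.find(begin)
    if start < 0:
        return ""
    start += len(begin)
    stop = text.find(end, start)
    return text[start:stop] if stop >= 0 else ""


def clean(markup):
    """Replace tags with spaces, collapse whitespace, unescape entities."""
    text = re.sub(r"<[^>]+>", " ", markup)
    return html.unescape(" ".join(text.split()))


page = (
    '<div class="post-content" data-role="post_content-text">'
    '<div class="trix-content">'
    '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">'
    "<html><body><div>Hello<br /><br />world &lt;3</div></body></html>"
    "</div></div>"
    '<div class="post-uploads'
)

# Previous approach: everything after the opening tag of 'post-content',
# which still includes the wrapper <div> and the DOCTYPE declaration.
old = between(page, '<div class="post-content', '<div class="post-uploads')
old = old.partition(">")[2]

# Proposed approach: only what is inside <body> ... </body>.
new = between(page, "<body>", "</body>")

print(clean(new))
```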
Dominik
ea6594734d
[boosty] Fixed formatting
2025-01-03 08:27:11 +01:00
Dominik
8c9221f0a6
[boosty] Added post attachment download
2025-01-03 08:18:57 +01:00
Mike Fährmann
5767c0854c
merge #6758 : [subscribestar] fix attachment downloads and add support for audio type
...
(#6721 , #6724)
2025-01-02 18:25:37 +01:00
Mike Fährmann
671297a8cc
[subscribestar] extend fix + add test
...
some attachments are inside an element with an additional class besides
'doc_preview', e.g. 'class="doc_preview for_post"'
2025-01-02 18:22:15 +01:00
Mike Fährmann
2dd2c71c53
[docs] update configuration.rst
2025-01-02 17:54:47 +01:00
Mike Fährmann
428eb53086
[hitomi] provide 'search_tags' metadata for search/tag results
...
(#1015 , #6756 )
2025-01-02 17:49:30 +01:00
Mike Fährmann
0c584f9be7
[sankaku] support alphanumeric book/pool IDs ( #6757 )
2025-01-02 15:49:07 +01:00