Mike Fährmann
f867e690c1
merge #6855 : [turboimagehost] add support for galleries
2025-01-19 17:51:48 +01:00
arebokert
556fbb1a44
[turboimagehost] add support for galleries
...
- added support
- raise error if gallery not found
- fix test
- fix lint issues
- simplify
2025-01-19 17:28:45 +01:00
Mike Fährmann
438c61601b
[xfolio] add initial support ( #5514 , #6351 , #6837 )
2025-01-18 15:57:56 +01:00
Mike Fährmann
dc7b46be21
[khinsider] add 'covers' option ( #6844 )
2025-01-18 15:57:56 +01:00
Mike Fährmann
5a31a2ad22
[khinsider] extract more 'album' metadata ( #6844 )
...
- year
- catalog
- developer
- publisher
- uploader
2025-01-18 15:57:55 +01:00
Mike Fährmann
3849b3fa92
[batoto] use 'chapter_id' in default archive IDs ( #6835 )
...
instead of '{chapter}{chapter_minor}' since some chapters have no actual
chapter number and end up as '0', potentially causing ID overlap
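A small illustration of the overlap described above. This is only a sketch: the metadata dicts and their values are made up, and plain `str.format` stands in for the real archive ID handling.

```python
# Hypothetical metadata for two distinct chapters that have no actual
# chapter number; both end up with chapter == 0 (values are invented).
chapters = [
    {"chapter_id": 1001, "chapter": 0, "chapter_minor": ""},
    {"chapter_id": 1002, "chapter": 0, "chapter_minor": ""},
]

# Old-style ID built from chapter number and minor part.
old_ids = ["{chapter}{chapter_minor}".format(**c) for c in chapters]

# New-style ID built from the chapter's unique ID.
new_ids = ["{chapter_id}".format(**c) for c in chapters]

# The old IDs collide, the new ones stay distinct.
print(old_ids, len(set(old_ids)))
print(new_ids, len(set(new_ids)))
```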
2025-01-15 14:52:18 +01:00
Mike Fährmann
6e919a3695
[e621] support e621.cc and e621.anthro.fr frontend URLs ( #6809 )
2025-01-15 14:35:37 +01:00
Mike Fährmann
843a39a6c6
[bunkr] extract correct 'filename' data ( #6824 )
2025-01-14 19:45:48 +01:00
Mike Fährmann
d17a423245
[xhamster] fix 'gallery' extractor ( #6818 )
2025-01-13 18:58:08 +01:00
Mike Fährmann
bde99cc6ce
[cohost] remove module
...
cohost.org now redirects to archive.org
2025-01-13 14:38:35 +01:00
Mike Fährmann
91bd3e37f2
[pexels] add support ( #2286 , #4214 , #6769 )
2025-01-12 16:50:12 +01:00
Mike Fährmann
1ae3ac5e39
[common] add '_extract_nextdata' method
2025-01-12 11:48:36 +01:00
Mike Fährmann
3f48e2f820
[common] add '_extract_jsonld' method ( #5272 )
2025-01-12 11:07:48 +01:00
Mike Fährmann
88f1ef7c3c
[bunkr] fix metadata extraction ( #6805 )
2025-01-11 12:48:41 +01:00
Mike Fährmann
1d75c8308c
[weebcentral] add support ( #6778 )
2025-01-10 23:04:51 +01:00
Mike Fährmann
4853406fe3
[common] allow MangaExtractors to skip loading manga_url
2025-01-10 21:30:58 +01:00
Mike Fährmann
af9c06f812
[bunkr] fix album extraction ( #6798 )
2025-01-10 13:01:04 +01:00
Mike Fährmann
118b994cf2
[bunkr] support '/f/...' media URLs
2025-01-10 13:01:04 +01:00
Mike Fährmann
ba0443115a
[bunkr] fix ValueError on relative redirects ( #6790 )
2025-01-10 13:00:52 +01:00
Mike Fährmann
89276c5b3e
[e621] match 'tag' search URLs with empty tag ( #6783 )
2025-01-07 20:00:26 +01:00
Mike Fährmann
d18f311fe2
[plurk] fix 'user' data extraction and make it non-fatal ( #6742 )
2025-01-06 20:27:37 +01:00
Mike Fährmann
46b6b71159
[wallhaven] extract 'search[tags]' and 'search[tag_id]' metadata ( #6772 )
2025-01-06 17:18:04 +01:00
Mike Fährmann
270aaea8ab
[pixiv] provide fallback URLs ( #6762 )
2025-01-06 15:27:32 +01:00
Mike Fährmann
107798eeab
[subscribestar] strip whitespace from 'content'
2025-01-04 16:19:22 +01:00
Mike Fährmann
a53ce6103c
[deviantart:tiptap] smaller fixes
...
- fix text indentation in headings
- fix deviation formats without 'c' path
- support custom 'target' in links
2025-01-03 22:48:06 +01:00
Mike Fährmann
1dcb40be7c
merge #6760 : [boosty] support 'file' post attachments ( #2387 )
...
https://github.com/mikf/gallery-dl/issues/2387#issuecomment-2564671646
2025-01-03 15:59:03 +01:00
Wyoh Knott
22d4e84372
[subscribestar] Better extraction of content
...
The structure of content is like this:
```
<div class="post-content" data-role="post_content-text">
<div class="trix-content">
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<body>
<div>
Unspeakable thing are written here<br />
<br />
haiiiiiiiiiiiiiiii hi hi hiii its meee back againnn, plspls leave a comment if uuuu liked it mwah
<3
</div>
</body>
</html>
</div>
</div>
<div class="post-uploads
```
Currently we extract content with:
```
(extr('<div class="post-content', '<div class="post-uploads').partition(">")[2])
```
I propose we just take the body parts:
```
extr('<body>', '</body>')
```
since these tags only appear around the actual content.
This makes it easier to use in a filename format string with the `!H`
conversion: `{content[:160]!H}`. The content as currently extracted
can't be decoded with it.
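The difference between the two expressions can be sketched as follows. The `extr` function here is a simplified stand-in written for this example (an assumption, not the project's own helper), and the page snippet is a shortened, made-up version of the structure shown above.

```python
def extr(txt, begin, end, default=""):
    """Stand-in helper: text between the first 'begin' and the next 'end'."""
    try:
        first = txt.index(begin) + len(begin)
        return txt[first:txt.index(end, first)]
    except ValueError:
        return default


# Shortened sample page following the structure from the commit message.
page = (
    '<div class="post-content" data-role="post_content-text">'
    '<div class="trix-content"><!DOCTYPE html><html><body>'
    '<div>Hello<br />world</div>'
    '</body></html></div></div>'
    '<div class="post-uploads">'
)

# Current approach: everything after the opening tag of 'post-content',
# which still includes the wrapper div, doctype and html/body tags.
old = extr(page, '<div class="post-content',
           '<div class="post-uploads').partition(">")[2]

# Proposed approach: only what sits between <body> and </body>.
new = extr(page, '<body>', '</body>')

print(old)
print(new)
```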
2025-01-03 14:57:12 +01:00
Dominik
ea6594734d
[boosty] Fixed formatting
2025-01-03 08:27:11 +01:00
Dominik
8c9221f0a6
[boosty] Added post attachment download
2025-01-03 08:18:57 +01:00
Mike Fährmann
5767c0854c
merge #6758 : [subscribestar] fix attachment downloads and add support for audio type ( #6721 , #6724 )
2025-01-02 18:25:37 +01:00
Mike Fährmann
671297a8cc
[subscribestar] extend fix + add test
...
some attachments are inside an element with an additional class besides
'doc_preview', e.g. 'class="doc_preview for_post"'
2025-01-02 18:22:15 +01:00
Mike Fährmann
428eb53086
[hitomi] provide 'search_tags' metadata for search/tag results ( #1015 , #6756 )
2025-01-02 17:49:30 +01:00
Mike Fährmann
0c584f9be7
[sankaku] support alphanumeric book/pool IDs ( #6757 )
2025-01-02 15:49:07 +01:00
Wyoh Knott
a46f7981ee
[subscribestar] Fix attachment download and add support for audio type
...
- Change the third argument of text.extr to match the current structure
  ('class="post-edit_form"')
- Add support for uploads-audios, based on a structure similar to the
  attachment type:
  - id = data-upload-id
  - name = audio_preview-title
  - url = src
  - type = audio
Fixes #6721
2025-01-02 15:47:09 +01:00
Mike Fährmann
bd7320fb7d
[deviantart:tiptap] support more content block types
...
- anchor
- blockquote
- da-gif
- da-video
- lists
- listItem
- orderedList
- bulletList
- text indentation
2025-01-02 14:17:32 +01:00
Mike Fährmann
5c5b6d6276
[deviantart:tiptap] fix deviation embeds without 'token'
2024-12-28 19:47:05 +01:00
Mike Fährmann
7391dd208c
[poipiku] always query 'ShowAppendFileF' when post has warning ( #6736 )
2024-12-27 20:32:50 +01:00
Mike Fährmann
bc7e95684d
[piczel] fix extraction ( #6735 )
...
- fix pagination
- update API endpoints
- provide 'count' metadata field
- use BASE_PATTERN and self.groups[…]
2024-12-27 15:08:08 +01:00
Mike Fährmann
167a726972
[szurubooru] support 'visuabusters.com/booru' ( #6729 )
2024-12-26 19:04:16 +01:00
Mike Fährmann
998f949db1
[civitai] add 'user-videos' extractor ( #6644 )
2024-12-26 10:18:54 +01:00
Mike Fährmann
99de0e1867
[instagram] fix 'pinned' values for '/reels' results ( #6719 )
2024-12-25 19:42:50 +01:00
Mike Fährmann
3024dce06b
[8muses] skip albums without valid 'permalink' ( #6717 )
2024-12-24 13:49:19 +01:00
Mike Fährmann
09b2f8ea9e
[batoto] update domains ( #6714 )
...
- support 'fto.to' and 'jto.to'
- use 'xbato.org' for deprecated domains
2024-12-24 09:38:07 +01:00
Mike Fährmann
f9d3603bfc
[hitomi] fix searches ( #6713 )
2024-12-24 09:36:29 +01:00
Mike Fährmann
081856b9ce
[kemonoparty] handle 'discord' favorites ( #6706 )
2024-12-22 18:56:21 +01:00
Mike Fährmann
de9442ba75
[directlink] use domain as 'subcategory' ( #6703 )
2024-12-22 17:19:56 +01:00
Mike Fährmann
18491a4ce6
[tapas] fix TypeError for locked episodes ( #6700 )
2024-12-21 15:17:51 +01:00
Mike Fährmann
6059ffccf8
[deviantart] improve 'tiptap' to HTML conversion ( #6686 )
...
- fix "KeyError: 'attrs'" for links without 'href'
- support 'strike' text markers
- support 'heading' content blocks
2024-12-20 16:45:19 +01:00
Mike Fährmann
e0514817bd
[saint] support 'saint2.cr' URLs ( #6692 )
2024-12-19 11:43:35 +01:00
Mike Fährmann
8fbcdc1a3d
[instagram] extract 'date' for stories ( #6677 )
...
generalize 'date' extraction for all post types
2024-12-18 16:33:21 +01:00