[arcalive] add support (#5657 #7100)

* [arca.live] Add extractor skeleton

* [arcalive] update names and formatting

* [arcalive] implement initial file extraction code

* [arcalive] improve '_extract_media()' performance

compile and cache regex on demand
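
A minimal sketch of the "compile on demand, then cache" idea, not the extractor's actual code. The helper names and the tag pattern are hypothetical and chosen only for illustration:

```python
import functools
import re


@functools.lru_cache(maxsize=None)
def cached_pattern(pattern):
    """Compile 'pattern' on first use and reuse the compiled object afterwards."""
    return re.compile(pattern)


def find_media_tags(html):
    """Return all <img ...> and <video ...> opening tags found in 'html'."""
    # Hypothetical pattern: the real extractor may match different markup.
    return cached_pattern(r"<(?:img|video)\s[^>]*>").findall(html)
```

Nothing is compiled at import time, and repeated calls share one compiled pattern object.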

* [arcalive] improve image extraction

- extract 'data-originalurl' URLs if available
- replace URL query strings with 'type=orig'
- ignore emoticons by default
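
The three rules above could be sketched roughly as follows. This is an illustration under assumptions, not the committed implementation: the function name, the 'arca-emoticon' class check, and applying 'type=orig' to 'data-originalurl' values as well are all guesses.

```python
import html
import re
from urllib.parse import urlsplit, urlunsplit


def original_url(tag, emoticons=False):
    """Return a best-guess original-quality URL for one media tag, or None."""
    # Assumption: emoticon images carry an 'arca-emoticon' class.
    if not emoticons and "arca-emoticon" in tag:
        return None

    # Prefer 'data-originalurl' and fall back to 'src'.
    match = (re.search(r'data-originalurl="([^"]+)"', tag) or
             re.search(r'\ssrc="([^"]+)"', tag))
    if match is None:
        return None

    url = html.unescape(match.group(1))
    if url.startswith("//"):
        url = "https:" + url

    # Drop the existing query string and request the original file.
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "type=orig", ""))
```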

* [arcalive] update defaults

- include 'title' in filenames
- use 0.5-1.5s delay between requests
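
For illustration, a randomized 0.5-1.5s delay can be expressed as below. The function name is hypothetical; the extractor presumably relies on the project's own request-delay setting rather than code like this.

```python
import random
import time


def sleep_between_requests(low=0.5, high=1.5):
    """Sleep for a random duration between 'low' and 'high' seconds."""
    delay = random.uniform(low, high)
    time.sleep(delay)
    return delay
```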

* [arcalive] use ext from 'data-orig' if available

* [arcalive] update docs/supportedsites

* [arcalive] add tests

* [arcalive] update 'board' extractor pattern

so it doesn't also match 'post' URLs
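
One way to keep a 'board' pattern from also matching 'post' URLs is to anchor it at the end of the path. The patterns below are a sketch, assuming board URLs look like 'arca.live/b/BOARD' and post URLs like 'arca.live/b/BOARD/ID'; they are not copied from the commit.

```python
import re

BASE = r"(?:https?://)?(?:www\.)?arca\.live"

# Board: nothing but an optional slash, query, or fragment may follow the name.
BOARD_PATTERN = re.compile(BASE + r"/b/([^/?#]+)/?(?:\?([^#]*))?(?:#.*)?$")

# Post: a numeric ID follows the board name.
POST_PATTERN = re.compile(BASE + r"/b/[^/?#]+/(\d+)")
```

Without the trailing '$', the board pattern would match the prefix of every post URL.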

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
hdk5
2025-03-14 11:52:21 +02:00
committed by GitHub
parent 22d46f2462
commit d900e868e4
6 changed files with 311 additions and 0 deletions

@@ -97,6 +97,12 @@ Consider all listed sites to potentially be NSFW.
<td>Posts, Tag Searches</td>
<td></td>
</tr>
<tr>
<td>Arcalive</td>
<td>https://arca.live/</td>
<td>Boards, Posts</td>
<td></td>
</tr>
<tr>
<td>Architizer</td>
<td>https://architizer.com/</td>