gallery-dl/gallery_dl/extractor
Wyoh Knott 22d4e84372 [subscribestar] Better extraction of content
The structure of content is like this:

```
<div class="post-content" data-role="post_content-text">
                <div class="trix-content">
                    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
                    <html>
                        <body>
                            <div>
                                Unspeakable thing are written here<br />
                                <br />
                                haiiiiiiiiiiiiiiii hi hi hiii its meee back againnn, plspls leave a comment if uuuu liked it mwah
                                &lt;3
                            </div>
                        </body>
                    </html>
                </div>
            </div>
            <div class="post-uploads
```

Currently we extract content with:

```
(extr('<div class="post-content', '<div class="post-uploads').partition(">")[2])
```

I propose we just take the body parts:

```
extr('<body>', '</body>')
```

which only appear around the actual content.
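
To make the difference concrete, here is a minimal sketch comparing the two
extractions on a shortened copy of the markup above. The `extract_from` helper
is a hypothetical stand-in written for this example; it only mimics the
begin/end style of the `extr` calls shown here and is not gallery-dl's own
code.

```python
def extract_from(page, pos=0):
    """Hypothetical stand-in for a begin/end marker extractor."""
    def extr(begin, end, default=""):
        nonlocal pos
        try:
            first = page.index(begin, pos) + len(begin)
            last = page.index(end, first)
        except ValueError:
            return default
        pos = last + len(end)  # continue after the end marker next time
        return page[first:last]
    return extr


# Shortened sample page, based on the structure quoted in the commit message.
PAGE = (
    '<div class="post-content" data-role="post_content-text">\n'
    '  <div class="trix-content">\n'
    '    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" '
    '"http://www.w3.org/TR/REC-html40/loose.dtd">\n'
    '    <html><body><div>Unspeakable thing are written here'
    '<br /><br />&lt;3</div></body></html>\n'
    '  </div>\n'
    '</div>\n'
    '<div class="post-uploads">'
)

# Current approach: everything between the two wrapper divs.
old = extract_from(PAGE)(
    '<div class="post-content', '<div class="post-uploads').partition(">")[2]

# Proposed approach: only what sits inside <body>...</body>.
new = extract_from(PAGE)('<body>', '</body>')

print(repr(old[:60]))  # starts with wrapper markup, not the post text
print(repr(new))
```

In this sample the old result still carries the `trix-content` wrapper and the
DOCTYPE line, while the new result is just the post's own markup.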

This makes it easier to use the content in a filename with the `!H`
formatter: `{content[:160]!H}`. The content as currently extracted
can't be decoded with it.
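
As a rough illustration of why the shorter extraction helps here, the sketch
below slices both variants to 160 characters and then converts them to plain
text. `html_to_text` is a hypothetical approximation of an "HTML to text"
conversion (strip tags, collapse whitespace, unescape entities); it is an
assumption made for this example, not the actual implementation behind `!H`.

```python
import html
import re


def html_to_text(value):
    """Hypothetical HTML-to-text step: drop tags, then unescape entities."""
    without_tags = re.sub(r"<[^>]+>", " ", value)
    collapsed = " ".join(without_tags.split())
    return html.unescape(collapsed)


# Body-only content, as the proposed extraction would return it.
body = '<div>Unspeakable thing are written here<br /><br />&lt;3</div>'

# The same content with the wrapper markup the current extraction keeps.
wrapped = (
    '\n  <div class="trix-content">\n'
    '    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" '
    '"http://www.w3.org/TR/REC-html40/loose.dtd">\n'
    '    <html><body>' + body + '</body></html>\n  </div>\n</div>\n'
)

# Slice first, then convert, mirroring the order in the format string.
print(html_to_text(body[:160]))
print(repr(html_to_text(wrapped[:160])))
```

With the wrapper included, the 160-character slice is used up by markup before
the post text begins, so none of the text survives the conversion in this
sample.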
2025-01-03 14:57:12 +01:00