Mike Fährmann
cd931e1139
update extractor test results
2022-12-08 18:58:29 +01:00
Mike Fährmann
b0cb4a1b9c
replace 'text.extract()' with 'text.extr()' where possible
2022-11-05 01:14:09 +01:00
Mike Fährmann
86cbf485ab
[webtoons] extract real episode number ( #2591 )
...
The metadata field holding the number from the 'episode_no'
query parameter got renamed to 'episode_no'.
2022-05-17 22:33:29 +02:00
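A minimal sketch of reading the 'episode_no' query parameter mentioned in the commit above. It only shows the query-string side, not how the real episode number is obtained. The function name `episode_no_from_url` and the example URL are illustrative assumptions, not gallery-dl code.

```python
from urllib.parse import parse_qs, urlsplit


def episode_no_from_url(url):
    """Return the 'episode_no' query parameter as a string, or None if absent."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("episode_no")
    return values[0] if values else None


# Hypothetical viewer URL, used only to exercise the helper.
example = "https://www.webtoons.com/en/genre/title/ep/viewer?title_no=1&episode_no=42"
print(episode_no_from_url(example))
```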
Kyle Anthony Williams
a14b72be21
[webtoons] Use swebtoon-phinf.pstatic.net instead of webtoon-phinf.pstatic.net ( #2005 )
...
* [webtoons] Use swebtoon-phinf.pstatic.net instead of webtoon-phinf.pstatic.net
This trick to avoid having to set a Referer header comes from
Webtoon's RSS feeds. The two URLs below are equivalent in content:
https://webtoon-phinf.pstatic.net/20210929_153/1632867980912DmcGK_JPEG/16328679808882705182.jpg?type=q90
https://swebtoon-phinf.pstatic.net/20210929_153/1632867980912DmcGK_JPEG/16328679808882705182.jpg?type=q90
The URL with the domain "webtoon-phinf.pstatic.net" needs a Referer
header, and the domain "swebtoon-phinf.pstatic.net" does not. This
is because of the environment "swebtoon" images live in, one without
explicit network control: RSS feeds on sites such as Feedly. This change should
make it easier for gallery-dl developers to embed Webtoon comics without
worrying about headers.
2021-11-11 20:03:34 +01:00
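The domain swap described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the extractor's actual code: the helper name `to_swebtoon` is hypothetical, and only the two host names and the sample URL come from the commit message.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "webtoon-phinf.pstatic.net"
NEW_HOST = "swebtoon-phinf.pstatic.net"


def to_swebtoon(url):
    """Rewrite an image URL on OLD_HOST to NEW_HOST; leave other URLs unchanged."""
    parts = urlsplit(url)
    if parts.netloc == OLD_HOST:
        parts = parts._replace(netloc=NEW_HOST)
    return urlunsplit(parts)


url = ("https://webtoon-phinf.pstatic.net/20210929_153/"
       "1632867980912DmcGK_JPEG/16328679808882705182.jpg?type=q90")
print(to_swebtoon(url))
```

Comparing the parsed host, rather than doing a plain string replacement, avoids touching URLs that merely contain the old host name somewhere in their path or query.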
Mike Fährmann
8bdeb2a6dd
[webtoons] match arbitrary language codes ( closes #1643 )
2021-06-21 19:25:28 +02:00
Mike Fährmann
d88e34f17e
[webtoons] use GalleryExtractor
2021-04-18 20:28:31 +02:00
Mike Fährmann
c4210b5371
[webtoons] update agegate/GDPR cookies
2021-04-18 20:28:31 +02:00
Christian Paul
41fbc20020
[webtoons]: Add cookie rstagGDPR_DE=true ( #1431 )
2021-04-07 21:42:55 +02:00
Mike Fährmann
2919d78bfc
update extractor test results
2021-02-14 15:37:39 +01:00
Mike Fährmann
193dca2ce1
update extractor test results
2021-01-21 21:35:42 +01:00
Mike Fährmann
912eea29bc
update extractor test results
2020-12-27 17:41:08 +01:00
Mike Fährmann
47114339a2
[webtoons] update 'ageGate' cookie
2020-12-07 14:56:32 +01:00
Mike Fährmann
968d3e8465
remove '&' from URL patterns
...
'/?&#' -> '/?#' and '?&#' -> '?#'
According to https://www.ietf.org/rfc/rfc3986.txt , URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
2020-10-22 23:31:25 +02:00
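To illustrate the commit above, here is a sketch that assumes the quoted character sets appear as regular-expression character classes that mark the end of a path segment. The pattern, the host `example.org`, and the helper `post_id` are hypothetical and not taken from gallery-dl.

```python
import re

# Hypothetical pattern: a numeric ID must be followed by end of string
# or by one of the RFC 3986 component delimiters '/', '?' or '#'.
PATTERN = re.compile(r"https?://example\.org/post/(\d+)(?:[/?#]|$)")


def post_id(url):
    """Return the numeric ID from the URL path, or None if it does not match."""
    match = PATTERN.match(url)
    return match.group(1) if match else None


for candidate in (
    "https://example.org/post/123",
    "https://example.org/post/123/",
    "https://example.org/post/123?page=2",
    "https://example.org/post/123#top",
    "https://example.org/post/123&x=1",  # '&' is not a component delimiter
):
    print(candidate, post_id(candidate))
```

Because '&' only separates parameters inside a query string, a URL in which it directly follows the path is rejected by the stricter character class.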
Mike Fährmann
bb882b8cdb
improve output of '-K' for parent extractors ( #825 )
2020-06-14 21:39:21 +02:00
Mike Fährmann
998d1d3a5c
[webtoons] generalize and improve comic extraction ( fixes #820 )
2020-06-10 21:44:42 +02:00
Leonardo Taccari
bcac31b7c7
[webtoons] make archive_fmt unique ( #779 )
...
close #778
2020-05-25 21:23:54 +02:00
Mike Fährmann
0378d079a5
[webtoons] fixes and simplifications ( #593 , #761 )
...
- fix episode listings for french comics
- allow input URLs without explicit scheme
- add 'lang'/'language' metadata
- use str.format() instead of '+' to assemble URLs
2020-05-18 20:20:03 +02:00
Leonardo Taccari
39cd389679
[webtoons] Add a new extractor for webtoons.com ( #761 )
...
The webtoons extractor can extract a single episode or an entire comic (all
episodes) from webtoons.com.
All the logic of the extractors should be straightforward except for two
kludges:
- The `ageGatePass' cookie is always set to avoid a possible redirect that
would stop extraction, especially in the comic extractor
- The image URLs returned by the episode extractor cannot be fetched
directly; the `Referer:' HTTP header must be sent to fetch them
Close #593 .
2020-05-18 19:04:20 +02:00
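The two kludges from the commit above can be sketched with the standard library alone. This is not the extractor's implementation: the function names are hypothetical, the cookie value "true" is an assumption (the commit only names the cookie), and the requests are built but not sent.

```python
from urllib.request import Request


def build_page_request(page_url):
    """Build a page request that always carries the age-gate cookie."""
    request = Request(page_url)
    # Cookie name from the commit message; the value "true" is assumed.
    request.add_header("Cookie", "ageGatePass=true")
    return request


def build_image_request(image_url, episode_url):
    """Build an image request with the episode page as Referer."""
    request = Request(image_url)
    request.add_header("Referer", episode_url)
    return request


# Placeholder URLs, used only to show the resulting headers.
page = build_page_request("https://www.webtoons.com/en/example/list")
image = build_image_request(
    "https://webtoon-phinf.pstatic.net/example.jpg",
    "https://www.webtoons.com/en/example/viewer",
)
print(page.get_header("Cookie"))
print(image.get_header("Referer"))
```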