remove '&' from URL patterns
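Change '/?&#' -> '/?#' and '?&#' -> '?#' in the character classes of the URL patterns.
According to https://www.ietf.org/rfc/rfc3986.txt, URLs are "organized hierarchically" by using "the slash ("/"), question mark ("?"), and number sign ("#") characters to delimit components". '&' is not one of these delimiters, so the patterns do not need to exclude it.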
@@ -158,7 +158,7 @@ class HitomiTagExtractor(Extractor):
     subcategory = "tag"
     pattern = (r"(?:https?://)?hitomi\.la/"
                r"(tag|artist|group|series|type|character)/"
-               r"([^/?&#]+)\.html")
+               r"([^/?#]+)\.html")
     test = (
         ("https://hitomi.la/tag/screenshots-japanese.html", {
             "pattern": HitomiGalleryExtractor.pattern,
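
The effect of the changed character class can be checked with a minimal sketch like the one below. It copies the old and new pattern strings from the diff; the helper `name` and the URL containing '&' are hypothetical and only serve as illustration, they are not taken from the repository or the site.

```python
import re

# Both patterns are copied from the diff: OLD excludes '&', NEW does not.
PREFIX = (r"(?:https?://)?hitomi\.la/"
          r"(tag|artist|group|series|type|character)/")
OLD = re.compile(PREFIX + r"([^/?&#]+)\.html")
NEW = re.compile(PREFIX + r"([^/?#]+)\.html")

# Test URL taken from the diff.
url_plain = "https://hitomi.la/tag/screenshots-japanese.html"
# Hypothetical URL with '&' inside the name (an assumption for illustration).
url_amp = "https://hitomi.la/tag/foo&bar.html"


def name(pattern, url):
    """Return the captured name (group 2), or None if the URL does not match."""
    match = pattern.match(url)
    return match.group(2) if match else None


print(name(OLD, url_plain), name(NEW, url_plain))
print(name(OLD, url_amp), name(NEW, url_amp))
```

Under these assumptions, both patterns accept the plain URL, while only the new pattern accepts a name containing '&'. The characters '/', '?' and '#' still end the captured name in either version.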