
24 Commits

Author SHA1 Message Date
Mike Fährmann
a2af2d2965 adjust cache maxage values 2019-03-14 22:21:49 +01:00
Mike Fährmann
5530871b5a change results of text.nameext_from_url()
Instead of getting a complete 'filename' from a URL and splitting that
into 'name' and 'extension', the new approach gets rid of the complete
version and renames 'name' to 'filename'. (Using anything other than
{extension} for a filename extension doesn't really work anyway)

Example: "https://example.org/path/filename.ext"

before:
- filename : filename.ext
- name     : filename
- extension: ext

now:
- filename : filename
- extension: ext
2019-02-14 16:07:17 +01:00
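
The before/after fields in the commit above can be illustrated with a minimal sketch. This is not the project's implementation of text.nameext_from_url(); the function name `nameext_from_url_sketch` and its handling of names without a dot are assumptions made only for illustration.

```python
import posixpath
from urllib.parse import urlsplit, unquote


def nameext_from_url_sketch(url):
    """Hypothetical stand-in: derive 'filename' and 'extension' from a URL."""
    # Take the last path component and undo percent-encoding.
    basename = unquote(posixpath.basename(urlsplit(url).path))
    # Split at the last dot; 'filename' no longer includes the extension.
    name, dot, ext = basename.rpartition(".")
    if not dot:
        # Assumption: without a dot, the whole basename is the filename.
        name, ext = basename, ""
    return {"filename": name, "extension": ext}


print(nameext_from_url_sketch("https://example.org/path/filename.ext"))
```

With the example URL from the commit message, 'filename' is `filename` and 'extension' is `ext`, matching the "now" listing.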
Mike Fährmann
61741d7333 provide type information for Queue messages
Child extractors are now directly constructed with Extractor.from_url()
if the extractor class is known beforehand, instead of using
extractor.find() and searching through all possible extractor classes.
2019-02-12 21:32:32 +01:00
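
The difference between constructing a known extractor class directly and searching all classes might look roughly like the sketch below. Only the names Extractor.from_url() and find() come from the commit message; the class bodies, the `EXTRACTORS` tuple, and the example.org patterns are hypothetical and do not reproduce the project's code.

```python
import re


class Extractor:
    pattern = None  # assumption: one regex string per extractor class

    def __init__(self, match):
        self.match = match
        self.url = match.string

    @classmethod
    def from_url(cls, url):
        # Direct construction: only this class's pattern is tried.
        match = re.match(cls.pattern, url)
        return cls(match) if match else None


class ChapterExtractor(Extractor):
    pattern = r"https?://example\.org/chapter/(\d+)"


class MangaExtractor(Extractor):
    pattern = r"https?://example\.org/title/(\d+)"


# Hypothetical registry of all known extractor classes.
EXTRACTORS = (ChapterExtractor, MangaExtractor)


def find(url):
    # Generic lookup: try every registered class until one matches.
    for cls in EXTRACTORS:
        extr = cls.from_url(url)
        if extr is not None:
            return extr
    return None
```

When a parent extractor already knows that a queued URL belongs to, say, `ChapterExtractor`, calling `ChapterExtractor.from_url(url)` skips the loop in `find()` entirely.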
Mike Fährmann
580baef72c change Chapter and MangaExtractor classes
- unify and simplify constructors
- rename get_metadata and get_images to just metadata() and images()
- rename self.url to chapter_url and manga_url
2019-02-11 18:38:47 +01:00
Mike Fährmann
4b1880fa5e propagate 'match' to base extractor constructor 2019-02-11 13:31:10 +01:00
Mike Fährmann
6284731107 simplify extractor constants
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
2019-02-08 13:45:40 +01:00
Mike Fährmann
a303efb597 [mangadex] handle manga pages without chapters 2019-01-03 16:22:12 +01:00
Mike Fährmann
b47af4637a [mangadex] update URL pattern
Manga URLs now begin with /title/ instead of /manga/
2018-08-31 20:16:50 +02:00
Mike Fährmann
2af2bb7911 [mangadex] fix relative page URLs 2018-08-25 11:07:26 +02:00
Mike Fährmann
b55e39d1ee [mangadex] improve extraction
- cache manga API results
- add artist, author and date fields to chapter metadata
- remove Manga-/ChapterExtractor inheritance
- minor code simplifications and improvements
2018-08-10 16:50:07 +02:00
Mike Fährmann
b1c4c1e13c [mangadex] fix extraction 2018-08-08 18:08:26 +02:00
Mike Fährmann
2d1a104739 [mangadex] unescape manga names and chapter titles
pretty sure I previously tested if unescaping strings from the
embedded JSON object was necessary ... maybe they changed it
2018-06-11 17:53:21 +02:00
Mike Fährmann
a47c6136cd [simplyhentai] avoid redirects for all-pages.json (#89) 2018-06-01 22:06:34 +02:00
Mike Fährmann
15cce22d82 [mangadex] fix parsing of unusual chapter strings 2018-05-23 18:40:39 +02:00
Mike Fährmann
7f899bd5d8 Merge branch 'master' into 1.4-dev 2018-05-14 14:50:02 +02:00
Mike Fährmann
e2157f594e [mangadex] fix manga extraction (closes #84)
Chapter listings for manga now use
https://mangadex.org/manga/<id>/_/chapters/2/
as URL instead of
https://mangadex.org/manga/<id>/_//2/
2018-05-06 17:43:50 +02:00
Mike Fährmann
95392554ee use text.urljoin() 2018-04-26 17:00:26 +02:00
Mike Fährmann
2721417dd8 Merge branch 'master' into 1.4-dev 2018-04-24 11:33:02 +02:00
Mike Fährmann
e54b43be08 [mangadex] add title info for chapter extractors 2018-04-22 16:20:04 +02:00
Mike Fährmann
cc36f88586 rename safe_int to parse_int; move parse_* to text module 2018-04-20 14:53:21 +02:00
Mike Fährmann
d1c91a1f2b [mangadex] fix manga-page extraction 2018-03-25 17:22:12 +02:00
Mike Fährmann
85ed023c2e [mangadex] remove the trailing ' - MangaDex' in a better way
str.rstrip() works differently than assumed.
2018-03-10 15:54:50 +01:00
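
The str.rstrip() pitfall mentioned above: its argument is treated as a set of characters to strip, not as a suffix. The sketch below shows the effect and one possible suffix-safe alternative; the helper name `strip_suffix` and the sample title are made up for illustration and are not taken from the commit itself.

```python
def strip_suffix(text, suffix):
    """Remove 'suffix' from the end of 'text' only if it is really there."""
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text


title = "Banana - MangaDex"

# rstrip() removes any trailing characters found in the given set,
# so it keeps eating into the actual title.
print(repr(title.rstrip(" - MangaDex")))

# A suffix check removes exactly the trailing ' - MangaDex'.
print(repr(strip_suffix(title, " - MangaDex")))
```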
Mike Fährmann
1400868f53 [mangadex] general improvements
- support >100 chapter entries per manga
- custom archive ID format
- detect non-existing chapters
2018-03-06 14:15:15 +01:00
Mike Fährmann
749fbbfa6c [mangadex] add chapter- and manga-extractor 2018-03-05 18:37:21 +01:00