226 Commits

Author SHA1 Message Date
Mike Fährmann
0bc8ef51c8 [smugmug] Handle albums with no explicit owner (#100) 2018-09-01 12:55:02 +02:00
Mike Fährmann
590c0b3ad5 re-implement and improve filename formatter
A format string now gets parsed only once instead of re-parsing it each
time it is applied to a set of data.

The initial parsing makes directory path creation about 2x slower
than before, since each format string there is used only once,
but building a filename, the more common operation, is at least 2x
faster. The "directory slowness" cancels out at about 5 filenames, and
everything above that is significantly faster.
2018-08-25 10:45:14 +02:00
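
The parse-once idea from commit 590c0b3ad5 can be illustrated with a minimal sketch. This is not the project's actual formatter: `compile_format` is a hypothetical helper built on the standard library's `string.Formatter().parse()`, and it ignores conversions such as `!r`.

```python
import string


def compile_format(format_string):
    """Parse `format_string` once; return a function that applies it to data."""
    parts = []
    for literal, field, spec, _conv in string.Formatter().parse(format_string):
        if literal:
            parts.append(literal)              # plain text between fields
        if field is not None:
            parts.append((field, spec or ""))  # replacement field + format spec

    def apply(data):
        out = []
        for part in parts:
            if isinstance(part, str):
                out.append(part)
            else:
                name, spec = part
                out.append(format(data[name], spec))
        return "".join(out)

    return apply


# The parsing cost is paid once; every later call only walks `parts`.
build_filename = compile_format("{category}_{id:03}.{extension}")
print(build_filename({"category": "test", "id": 7, "extension": "jpg"}))
```

The trade-off matches the commit message: a format string that is applied only once pays the up-front parsing cost without any later benefit.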
Mike Fährmann
34b556922d update/restore tests 2018-08-23 15:47:40 +02:00
Mike Fährmann
e3055d356c release version 1.5.1 2018-08-17 13:21:36 +02:00
Mike Fährmann
f9ded38d89 [test:results] add support for "range" options in tests 2018-08-15 21:49:44 +02:00
Mike Fährmann
c9e6ccbd7c [test:extractor] small fixes and improvements 2018-08-15 21:49:33 +02:00
Mike Fährmann
7f4e41c989 increase timeout during extractor tests
cloudflare's 522 response takes longer than 30 seconds
2018-08-10 16:51:05 +02:00
Mike Fährmann
b55e39d1ee [mangadex] improve extraction
- cache manga API results
- add artist, author and date fields to chapter metadata
- remove Manga-/ChapterExtractor inheritance
- minor code simplifications and improvements
2018-08-10 16:50:07 +02:00
Mike Fährmann
2a9f3341a2 [behance] fix title extraction 2018-08-08 10:48:58 +02:00
Mike Fährmann
a86f2bfc80 [pinterest] update not-found redirects 2018-08-07 12:13:19 +02:00
Mike Fährmann
7442d2940c release version 1.5.0 2018-08-03 17:50:27 +02:00
Mike Fährmann
b040ca0718 [rule34] small unit test fixes 2018-08-03 17:28:47 +02:00
Mike Fährmann
f3793660ef update tests 2018-08-02 14:57:28 +02:00
Mike Fährmann
42a346413b fix "re:" prefix for keyword tests 2018-08-02 14:48:51 +02:00
Mike Fährmann
e0dd8dff5f implement L<maxlen>/<replacement>/ format option
The L option allows for the contents of a format field to be replaced
with <replacement> if its length is greater than <maxlen>.

Example:
{f:L5/too long/} -> "foo"      (if "f" is "foo")
                 -> "too long" (if "f" is "foobar")

(#92) (#94)
2018-07-29 13:52:07 +02:00
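
The behaviour of the `L<maxlen>/<replacement>/` option from commit e0dd8dff5f could be sketched as follows. `apply_maxlen` is a hypothetical helper, not the project's code, and it assumes the spec contains exactly the `L` option and nothing else.

```python
def apply_maxlen(value, spec):
    """Return `replacement` if `value` is longer than `maxlen`, else `value`.

    `spec` is expected to look like "L5/too long/".
    """
    maxlen, replacement, _rest = spec[1:].split("/", 2)
    return replacement if len(value) > int(maxlen) else value


print(apply_maxlen("foo", "L5/too long/"))
print(apply_maxlen("foobar", "L5/too long/"))
```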
Mike Fährmann
bb89a1e6d7 [mangahere] use http://
invalid SSL cert for quite some time now
2018-07-26 18:11:31 +02:00
Mike Fährmann
ce34d82cb4 fix skipping tests on 5xx status codes 2018-07-19 18:47:23 +02:00
Mike Fährmann
a6fe2bb594 [whatisthisimnotgoodwithcomputers] remove extractor 2018-07-14 09:53:16 +02:00
Mike Fährmann
0ba93650e0 [8chan] replace unit test URL
the other thread is no longer accessible
2018-07-14 09:53:16 +02:00
Mike Fährmann
8fe9056b16 implement string slicing for format strings
It is now possible to slice string (or list) values of format string
replacement fields with the same syntax as in regular Python code.

"{digits}"       -> "0123456789"
"{digits[2:-2]}" -> "234567"
"{digits[:5]}"   -> "01234"

The optional third parameter (step) has been left out to simplify things.
2018-07-14 09:53:15 +02:00
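
A rough sketch of the two-part slicing from commit 8fe9056b16, assuming a hypothetical `resolve_field` helper that receives the text between the braces. It handles only the `[start:stop]` form, not plain indexing, and is not the project's implementation.

```python
def resolve_field(field, data):
    """Look up `field` in `data`, applying an optional [start:stop] slice."""
    name, bracket, rest = field.partition("[")
    value = data[name]
    if not bracket:
        return value
    start, _, stop = rest.rstrip("]").partition(":")
    # Empty bounds become None, just like in regular Python slicing.
    return value[slice(int(start) if start else None,
                       int(stop) if stop else None)]


data = {"digits": "0123456789"}
print(resolve_field("digits", data))
print(resolve_field("digits[2:-2]", data))
print(resolve_field("digits[:5]", data))
```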
Mike Fährmann
269dc2bbd5 [sankaku] add 'tags' option (#94) 2018-07-14 09:53:01 +02:00
Mike Fährmann
764331823b release version 1.4.2 2018-07-06 16:02:40 +02:00
Mike Fährmann
2eefaa99a3 [mangapark] support .net and .com mirrors 2018-07-05 14:45:05 +02:00
Mike Fährmann
188e956c4e [imagefap] use HTTPS + update test results 2018-06-30 19:40:46 +02:00
Mike Fährmann
a699787d01 [deviantart] update URL patterns to new format
DeviantArt changed its URL format from
https://<name>.deviantart.com/...
to
https://www.deviantart.com/<name>/...

With this change both formats will be supported.
2018-06-28 20:21:59 +02:00
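
One way to accept both DeviantArt URL formats named in commit a699787d01 is a single regular expression with two alternatives. The pattern and the `username` helper below are illustrative assumptions, not the extractor's actual pattern.

```python
import re

# Either https://www.deviantart.com/<name>/... (new format)
# or     https://<name>.deviantart.com/...     (old format)
PATTERN = re.compile(
    r"(?:https?://)?(?:"
    r"www\.deviantart\.com/(?P<new>[\w-]+)"
    r"|(?!www\.)(?P<old>[\w-]+)\.deviantart\.com"
    r")(?:/|$)"
)


def username(url):
    """Return the user name from either URL format, or None if no match."""
    match = PATTERN.match(url)
    if not match:
        return None
    return match.group("new") or match.group("old")


print(username("https://name.deviantart.com/gallery/"))
print(username("https://www.deviantart.com/name/gallery/"))
```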
Mike Fährmann
0055fdd714 change OAuth test server
DNS record for oauthbin.com expired
2018-06-28 14:32:02 +02:00
Mike Fährmann
b8c97d2295 use 'extractor.request()' for more HTTP requests 2018-06-25 23:40:59 +02:00
Mike Fährmann
7a98cc9798 [smugmug] update tests
My test account expired and all uploaded images got deleted.
2018-06-22 15:04:31 +02:00
Mike Fährmann
4eb94aca17 [postprocessor:ugoira] pass '-f' if not present 2018-06-22 13:26:17 +02:00
Mike Fährmann
a9e276bc37 reset delete-flag
Since 'PathFormat' objects are being reused, setting `delete`
to True once caused all files downloaded afterwards to be deleted as well.
2018-06-20 18:12:59 +02:00
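
The bug class described in commit a9e276bc37 can be shown with a small stand-in. `PathState` and its method names are hypothetical; they only mimic an object that is reused across downloads and carries a `delete` flag.

```python
class PathState:
    """Hypothetical stand-in for a path object reused for every file."""

    def __init__(self):
        self.filename = None
        self.delete = False

    def set_filename(self, filename):
        self.filename = filename
        # Reset per-file state; without this line, a flag set for one
        # file would still be set for every file that follows.
        self.delete = False

    def mark_for_deletion(self):
        self.delete = True


path = PathState()
path.set_filename("a.zip")
path.mark_for_deletion()
path.set_filename("b.jpg")
print(path.delete)
```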
Mike Fährmann
6ac403c5d3 add postprocessor config example 2018-06-08 18:31:59 +02:00
Mike Fährmann
2403c405e3 Merge branch 'postprocessor' 2018-06-08 17:43:11 +02:00
Mike Fährmann
b344f2290f fix downloader tests 2018-06-07 22:27:36 +02:00
Mike Fährmann
a47c6136cd [simplyhentai] avoid redirects for all-pages.json (#89) 2018-06-01 22:06:34 +02:00
Mike Fährmann
ae9a37a528 implement text.split_html() 2018-05-27 15:00:41 +02:00
Mike Fährmann
0a1863fce3 [pixiv] respect more query parameters for user URLs
The API endpoint responsible for user illustrations does not
provide sufficient filter capabilities* to match the actual
website, so we are rolling our own filters.

Respected parameters are
    'type': illust, manga, ugoira
    'tag' : any image tag (this was already supported)
    'p'   : the page to start on

*
- API can filter for illustrations and manga, but not for ugoira.
- 'offset' is applied before filtering
- no 'tag' filter
2018-05-18 15:36:30 +02:00
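
The client-side filtering described in commit 0a1863fce3 might look roughly like the sketch below. The shape of each work (a dict with `type` and `tags` keys) and the `filter_works` helper are assumptions made for illustration. Skipping happens after filtering, which is what an API-side 'offset' cannot do.

```python
from itertools import islice


def filter_works(works, work_type=None, tag=None, skip=0):
    """Yield works matching `work_type` and `tag`, skipping the first `skip` matches."""

    def matches(work):
        if work_type is not None and work["type"] != work_type:
            return False
        if tag is not None and tag not in work["tags"]:
            return False
        return True

    # Apply the filters first, then skip, so `skip` counts matching works only.
    return islice(filter(matches, works), skip, None)


works = [
    {"id": 1, "type": "illust", "tags": ["a"]},
    {"id": 2, "type": "ugoira", "tags": ["a", "b"]},
    {"id": 3, "type": "ugoira", "tags": ["b"]},
]
print([w["id"] for w in filter_works(works, work_type="ugoira", skip=1)])
```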
Mike Fährmann
7f899bd5d8 Merge branch 'master' into 1.4-dev 2018-05-14 14:50:02 +02:00
Mike Fährmann
4cea886177 [imgur] allow longer album hashes 2018-05-13 11:21:51 +02:00
Mike Fährmann
e1e23165a0 [pinterest] catch JSON decode errors 2018-05-11 17:37:27 +02:00
Mike Fährmann
6a31ada9e3 re-implement OAuth1.0 code
OAuth support for SmugMug needs some additional features
(auth-rebuild on redirect, query parameters in URL, ...)
and fixing this in the old code wouldn't work all that well.
2018-05-10 18:47:05 +02:00
Mike Fährmann
e2157f594e [mangadex] fix manga extraction (closes #84)
Chapter listings for manga now use
https://mangadex.org/manga/<id>/_/chapters/2/
as URL instead of
https://mangadex.org/manga/<id>/_//2/
2018-05-06 17:43:50 +02:00
Mike Fährmann
69a5e6ddb3 Merge branch 'master' into 1.4-dev 2018-05-04 10:19:02 +02:00
Mike Fährmann
3fe653d940 fix test_results for empty sets
{} is an empty dict and doesn't support set operations
2018-04-29 22:43:37 +02:00
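
The pitfall behind commit 3fe653d940 is plain Python behaviour: the literal `{}` creates a dict, and an empty set has to be written as `set()`. The `supports_set_ops` helper below is only a demonstration device, not part of the test suite.

```python
def supports_set_ops(obj):
    """Return True if `obj` can be intersected with a set."""
    try:
        obj & set()
    except TypeError:
        return False
    return True


print(type({}).__name__)          # dict, not set
print(supports_set_ops({}))       # False
print(supports_set_ops(set()))    # True
```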
Mike Fährmann
d96b3474e5 [puremashiro] remove module
site has been unreachable for a couple of weeks
and now the DNS record is gone as well
2018-04-28 14:24:20 +02:00
Mike Fährmann
b44a296404 [gomanga] remove module
site has been unreachable for a couple of weeks
and the cloudflare status page shows host errors
2018-04-28 14:24:21 +02:00
Mike Fährmann
2395d870dd [pinterest] unquote board and user names, better errors 2018-04-26 16:38:12 +02:00
Mike Fährmann
55d4d23860 [pinterest] use Pinterest's "Web" API (#83)
no access tokens, no user credentials of any kind ...
2018-04-24 22:28:10 +02:00
Mike Fährmann
f471161920 Merge branch 'master' into 1.4-dev 2018-04-21 12:15:40 +02:00
Mike Fährmann
cc36f88586 rename safe_int to parse_int; move parse_* to text module 2018-04-20 14:53:21 +02:00
Mike Fährmann
10cc59f3b5 fix extractor names 2018-04-18 18:12:57 +02:00