
1979 Commits

Author SHA1 Message Date
Mike Fährmann
1adafdd3d0 document cache file requirement for DeviantArt refresh tokens 2019-10-13 23:01:57 +02:00
Mike Fährmann
df2b3c6888 restore OAuth2 authentication error messages 2019-10-13 22:48:01 +02:00
Mike Fährmann
6779512fc7 [nozomi] add post and tag extractors (#388) 2019-10-13 22:16:03 +02:00
Mike Fährmann
6abe5f5bbb [patreon] fix pagination (#444)
The Patreon-provided URLs for the next set of posts aren't
always complete, i.e. they can be missing their scheme and
the subsequent double slash: "www.patreon.com/…"
2019-10-12 22:30:51 +02:00
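The incomplete-URL problem described in the Patreon commit above can be illustrated with a small sketch. The helper name `ensure_scheme` and the example paths are hypothetical and not taken from the project's code; the sketch assumes the incomplete URL begins with the host name, optionally preceded by slashes.

```python
def ensure_scheme(url, scheme="https"):
    """Return 'url' with a scheme, adding one if it is missing.

    Hypothetical helper: assumes a scheme-less URL starts with the
    host name, e.g. "www.patreon.com/...".
    """
    if url.startswith(("http://", "https://")):
        return url
    # covers both "www.patreon.com/..." and "//www.patreon.com/..."
    return scheme + "://" + url.lstrip("/")


print(ensure_scheme("www.patreon.com/example"))
print(ensure_scheme("https://www.patreon.com/example"))
```

A URL that already carries a scheme is returned unchanged, so the helper can be applied to every "next page" URL without checking first.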
Mike Fährmann
ff1e4a86aa release version 1.10.6 2019-10-11 20:15:56 +02:00
Mike Fährmann
d4ffd6c952 [yaplog] improve metadata extraction (#443)
- provide a fallback if there is no numerical image ID
- add a 'filename' field
- convert 'date' to an actual datetime object
2019-10-11 18:39:52 +02:00
Mike Fährmann
15af2f8464 [hitomi] fall back to /reader/ page if main page returns 404
Some galleries return a 404: Not Found error when trying to access
them through the main gallery URL, but their content is still
available on the respective /reader/ page.
2019-10-11 18:39:52 +02:00
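A minimal sketch of the fallback described in the hitomi commit above. The function name, the `fetch` callable, the `example.org` root, and both URL layouts are assumptions made for illustration; only the idea of retrying on the /reader/ page after a 404 comes from the commit message.

```python
def fetch_gallery_page(gallery_id, fetch, root="https://example.org"):
    """Return the page text of a gallery, trying /reader/ after a 404.

    'fetch' is a hypothetical callable that takes a URL and returns a
    (status_code, text) tuple; the URL layouts are placeholders.
    """
    main_url = "{}/galleries/{}.html".format(root, gallery_id)
    status, text = fetch(main_url)
    if status == 404:
        # main gallery URL is gone; the content may still be on /reader/
        reader_url = "{}/reader/{}.html".format(root, gallery_id)
        status, text = fetch(reader_url)
    if status != 200:
        raise LookupError(
            "gallery {} unavailable (HTTP {})".format(gallery_id, status))
    return text
```

Passing the request function in as a parameter keeps the sketch independent of any particular HTTP library and makes the fallback easy to exercise with a stub.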
Mike Fährmann
8af59a4bba fix & update docs
- update Requests links
- add example for --exec
- set '-dev' version
2019-10-11 18:36:25 +02:00
Mike Fährmann
dc6ad81e2e [yaplog] prevent crash on empty posts (#443) 2019-10-10 21:19:09 +02:00
Mike Fährmann
94eb7c6cad [deviantart] fix sta.sh extraction (#436) 2019-10-10 18:40:15 +02:00
Mike Fährmann
1032cfa34b [downloader:http] extend mimetype map with archive formats 2019-10-10 18:30:23 +02:00
Mike Fährmann
27b5b2497e [deviantart] fix download URLs (#436)
... except for sta.sh content.

The old '/api/v1/oauth2/deviation/download' endpoint started
delivering URLs to 404 pages a while ago. A download URL can instead
be obtained from the relatively new
'/_napi/da-browse/shared_api/deviation/extended_fetch' endpoint
used by DeviantArt's Eclipse interface.

The current strategy is therefore:
- Iterate over deviations using the OAuth2 API
- Fetch original download URLs with the new NAPI/Shared API
2019-10-09 20:35:52 +02:00
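The two-step strategy from the DeviantArt commit above can be sketched as a generator that combines both sources. Everything here is hypothetical: `deviations` stands in for the results of the OAuth2 API, `fetch_download_url` stands in for a lookup against the second endpoint, and the 'download_url' key is an invented field name.

```python
def iter_with_download_urls(deviations, fetch_download_url):
    """Yield each deviation together with its original download URL.

    'deviations' is any iterable of dicts (step 1: listing), and
    'fetch_download_url' is a hypothetical callable that returns the
    download URL for one deviation (step 2: per-item lookup).
    """
    for deviation in deviations:
        enriched = dict(deviation)  # leave the caller's dict untouched
        enriched["download_url"] = fetch_download_url(deviation)
        yield enriched
```

Because the function is a generator, the per-item lookup only happens for deviations that are actually consumed.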
Mike Fährmann
93aac8dfea [yaplog] fix incomplete image URLs (#443) 2019-10-09 17:42:15 +02:00
Mike Fährmann
a782b009b8 [yaplog] match blog names with '-' (#443) 2019-10-09 17:40:30 +02:00
Mike Fährmann
cf5e716b9d [hitomi] fix image URLs 2019-10-09 17:21:37 +02:00
Mike Fährmann
ad81c07204 [postprocessor] match logger names of downloader modules
The logger name for a postprocessor object got changed to
"postprocessor.<module-name>" instead of just
"postprocessor"
2019-10-06 23:30:18 +02:00
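The naming scheme from the postprocessor commit above maps directly onto Python's standard logging module. The helper name `postprocessor_logger` is hypothetical; only the "postprocessor.<module-name>" pattern comes from the commit message.

```python
import logging


def postprocessor_logger(module_name):
    """Return a logger named "postprocessor.<module-name>"."""
    return logging.getLogger("postprocessor." + module_name)


log = postprocessor_logger("exec")
print(log.name)
```

With dotted names, handlers and levels configured on the "postprocessor" logger still apply to each per-module logger through the logging hierarchy.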
Mike Fährmann
03bc8adfc7 [postprocessor:exec] run after file moved to target location (#421)
2019-10-06 23:12:22 +02:00
Mike Fährmann
35958bebd4 [postprocessor:exec] fix filename quoting on Windows (#421) 2019-10-06 15:09:00 +02:00
Mike Fährmann
b06c372e4d [postprocessor:exec] improve; add command-line option (#421) 2019-10-05 23:46:55 +02:00
Mike Fährmann
5a54efa025 [xhamster] unescape 'title' and 'description' 2019-10-04 14:44:51 +02:00
Mike Fährmann
1b9bf4fc6e [behance] fix 'tags' extraction 2019-10-03 17:36:02 +02:00
Mike Fährmann
bb97e87989 [komikcast] ignore banner image 2019-10-03 17:34:06 +02:00
Mike Fährmann
0ff90a3f7d [gfycat] include title in default filenames (closes #434) 2019-10-02 21:46:01 +02:00
Mike Fährmann
fabdc3b0c6 release version 1.10.5 2019-09-28 22:13:41 +02:00
Mike Fährmann
de4e2029d1 [nsfwalbum] update test album
the old one is no longer available
2019-09-28 20:48:15 +02:00
Mike Fährmann
1faec285d1 [nijie] further improvements (closes #423)
- provide a 'user_name' metadata field
  - usually the same as 'artist_id', except for favorite downloads
- extract the whole description text and properly escape HTML entities
- fixed an issue with titles or tags containing double quotes
2019-09-27 23:14:32 +02:00
Mike Fährmann
6d0a533d68 [reddit] respect 'comments:0' for single submissions (#429) 2019-09-27 23:11:28 +02:00
Mike Fährmann
803d8f814e [oauth] update scope for reddit tokens (#428)
'/user/<username>/...' requires the 'history' scope to be accessible
(https://www.reddit.com/dev/api/#GET_user_{username}_{where})
2019-09-27 17:38:55 +02:00
Mike Fährmann
46ba173ded [reddit] fix documentation inconsistencies (closes #429)
- Require 'reddit.comments' to be a number and convert it to an
  integer to be extra sure
- Link to the README's OAuth section where appropriate
2019-09-27 17:34:10 +02:00
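The first point of the reddit commit above, requiring a number and converting it to an integer, might look roughly like the following. The function name is hypothetical, and rejecting booleans is an extra assumption of this sketch, not something the commit message states.

```python
def parse_comments_option(value):
    """Validate a 'comments' setting and return it as an integer.

    Hypothetical helper: accepts int and float values, rejects
    everything else (including bool, which is an assumption here).
    """
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError(
            "'comments' must be a number, got {!r}".format(value))
    return int(value)
```

A string such as "0" is rejected rather than silently converted, which keeps the option's documented type and its accepted values consistent.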
Mike Fährmann
20eb6c401f [nijie] improvements and fixes (#423)
- ignore unavailable image pages
- more metadata fields: artist_name, date, tags
- rename 'index' to 'num'
- improved code structure
2019-09-26 21:45:01 +02:00
Mike Fährmann
d1ea08c67d [weibo] fixes and improvements
- ignore unavailable videos (fixes #427)
- handle empty 'geo' fields
- consistent metadata fields for images and videos
2019-09-26 14:57:35 +02:00
Mike Fährmann
38d97f3da6 [deviantart] add debug message about API credentials (#424) 2019-09-25 21:20:55 +02:00
Mike Fährmann
80c2104fb5 [deviantart] fix 429 handling if 'fatal' is False (closes #424) 2019-09-25 21:16:35 +02:00
Mike Fährmann
913460240d [reddit] fix 'extractor.blacklist()' arguments
The second argument must support 'append()'.
2019-09-24 23:01:12 +02:00
Mike Fährmann
22bac14452 [pixiv] match '/artworks/' URLs 2019-09-24 21:53:14 +02:00
Mike Fährmann
66cac207ac [twitter] match and use 'i/web' status URLs 2019-09-24 21:18:05 +02:00
Mike Fährmann
5a1a0f5325 change text representation of user extractors to "User Profiles" 2019-09-22 22:21:48 +02:00
Mike Fährmann
946f2751e2 [reddit] add 'user' extractor (closes #350) 2019-09-22 22:18:17 +02:00
Mike Fährmann
c14abb9fb8 [reddit] improve URL parameter handling for subreddit links 2019-09-22 22:03:22 +02:00
Mike Fährmann
ee8b654464 [instagram] implement 'highlights' option (closes #329) 2019-09-21 23:38:20 +02:00
Mike Fährmann
f63c3097a9 [instagram] rework some code paths
- combine fetching an HTML page and extracting its 'shared_data'
- move 'shared_data' and field access info out of '_extract_page()'
- introduce a '_request_graphql()' method
2019-09-21 23:10:41 +02:00
Mike Fährmann
4330133114 [imgur] add 'favorite' extractor (closes #420)
… and use a newer site-internal API endpoint for user posts
2019-09-19 15:54:26 +02:00
Mike Fährmann
ee5e20221f [imgth] fix image URLs 2019-09-19 14:56:48 +02:00
Mike Fährmann
b63b126808 [hentaicafe] extend URL pattern 2019-09-18 19:08:45 +02:00
Mike Fährmann
d780f0357e [imgur] add user extractor 2019-09-17 22:58:18 +02:00
Mike Fährmann
11ea689013 [simplyhentai] fix image and video URLs 2019-09-16 21:37:16 +02:00
Mike Fährmann
15632a1570 [tsumino] fix extraction 2019-09-15 22:09:59 +02:00
Mike Fährmann
d92802fd37 [luscious] fix detection of unavailable galleries 2019-09-15 21:16:25 +02:00
Mike Fährmann
f99da2b866 [imgbb] detect invalid album and user profile links
and update test results, since the old album got deleted
2019-09-14 23:22:08 +02:00
Mike Fährmann
01bc7adadc [deviantart] improve journal detection (#419)
Some journal-like posts are not reported to be journals (isJournal
is set to False), even though they have a textContent field.

https://www.deviantart.com/gliitchlord/art/brashstrokes-812942668
2019-09-14 22:45:22 +02:00
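The journal check described in the last DeviantArt commit above can be sketched as follows. The field names 'isJournal' and 'textContent' come from the commit message; the function name and the dict-based input are assumptions for illustration.

```python
def looks_like_journal(deviation):
    """Return True if a deviation should be treated as a journal.

    Treats a post as a journal when it is flagged as one, or when it
    carries a 'textContent' field despite 'isJournal' being False.
    """
    return bool(deviation.get("isJournal")) or "textContent" in deviation
```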