46 Commits

Author SHA1 Message Date
Mike Fährmann
d3c77cca80 [sexcom] fix 'tags' when passing cookies (#8880) 2026-01-15 11:02:03 +01:00
Mike Fährmann
00c6821a3f replace 2-element f-strings with simple '+' concatenations
Python's 'ast' module and its 'NodeVisitor' class
were incredibly helpful in identifying these
2025-12-22 11:26:04 +01:00
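
The message above credits Python's 'ast' module and 'NodeVisitor' for finding the affected f-strings. The actual script is not part of this log; the following is only a minimal sketch of how such a search could look, assuming "2-element" means an f-string whose parsed 'JoinedStr' node has exactly two parts. The names 'TwoPartFStringFinder', 'find_two_part_fstrings', and 'sample' are hypothetical.

```python
import ast


class TwoPartFStringFinder(ast.NodeVisitor):
    """Collect line numbers of f-strings that consist of exactly two parts."""

    def __init__(self):
        self.lines = []

    def visit_JoinedStr(self, node):
        # An f-string is parsed into a JoinedStr node. Its 'values' list
        # holds literal pieces (Constant) and replacement fields
        # (FormattedValue) in source order.
        if len(node.values) == 2:
            self.lines.append(node.lineno)
        # Keep walking so nested f-strings are inspected as well.
        self.generic_visit(node)


def find_two_part_fstrings(source):
    """Return the line numbers of all two-part f-strings in 'source'."""
    finder = TwoPartFStringFinder()
    finder.visit(ast.parse(source))
    return finder.lines


# Hypothetical input: line 2 has two parts, line 3 has four.
sample = (
    'root = "https://example.org"\n'
    'a = f"{root}/pin"\n'
    'b = f"{root}/{a}/end"\n'
)

print(find_two_part_fstrings(sample))
```

Each reported line is a candidate for a plain '+' concatenation such as root + "/pin"; whether a rewrite is safe still has to be checked by hand, since '+' requires both operands to be strings.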
Mike Fährmann
d0f06be0d2 use 'operator +' when building 'pattern' 2025-12-20 22:07:44 +01:00
Mike Fährmann
e006d26c8e Revert "use f-strings when building 'pattern'"
revert d7c97d5a97.
2025-12-20 22:07:37 +01:00
Mike Fährmann
968597a302 yield 3-tuples for Message.Directory
adapt tuples to the same length and semantics as other messages
2025-12-05 21:39:52 +01:00
gengenson-code
a9018d1911 Added support for sex.com's feed (cookie required) (#8519)
* Added Extractor for sex.com feed
* Removed old comment
* add test URL
* update supportedsites
* simplify & fix flake8 newlines
* warn about missing session cookie

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2025-11-09 11:57:12 +01:00
Mike Fährmann
d7c97d5a97 use f-strings when building 'pattern' 2025-10-20 21:23:11 +02:00
Mike Fährmann
69f7cfdd0c [dt] replace 'datetime' imports 2025-10-16 11:42:42 +02:00
Mike Fährmann
b0a33d402d [sexcom] update 'search' extractor (#7807) 2025-07-18 21:02:32 +02:00
Mike Fährmann
9dbe33b6de replace old %-formatted and .format(…) strings with f-strings (#7671)
mostly using flynt
https://github.com/ikamensh/flynt
2025-06-29 17:50:19 +02:00
Mike Fährmann
e08ec7e083 update copyright notices 2025-06-13 00:03:41 +02:00
Mike Fährmann
a14671992c [sexcom] prevent '.css' file downloads (#7632)
by detecting homepage redirects
and improve redirect handling in general
2025-06-11 22:32:08 +02:00
Mike Fährmann
df4845bf60 [sexcom] update
- fix 'title' and 'type' of pictures
- remove '#' from the beginning of each tag
- add 'gifs' option
2025-06-04 07:44:43 +02:00
wankio
47c7c85f46 [sexcom] support '/pics/' URLs (#7611)
* Update sexcom.py

For Pics url
https://www.sex.com/en/pics/1459016

* reorder 'pattern'

* strip '?width=…' query parameter

* add test

---------

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2025-06-01 11:13:17 +02:00
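
One step in the commit above strips the '?width=…' query parameter from file URLs. The extractor's actual code is not shown in this log; as a rough sketch under that caveat, dropping everything from the first '?' onward could look like this. The function name 'strip_query' and the example URL are hypothetical.

```python
def strip_query(url):
    """Return 'url' without its query string (everything from '?' on)."""
    # str.partition() splits at the first '?'; if there is none,
    # the whole URL is returned unchanged as the first element.
    return url.partition("?")[0]


# Hypothetical image URL carrying a resize parameter.
print(strip_query("https://cdn.example.org/images/1.jpg?width=300"))
```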
Mike Fährmann
b81fc5c124 replace text.rextract() with rextr() 2025-05-23 18:28:58 +02:00
Mike Fährmann
e381c482ec [sexcom] extract 'date_url' metadata (#7239) 2025-03-28 16:30:12 +01:00
Mike Fährmann
d4ce8be1f5 [sexcom] support new-style '/videos' URLs (#7239) 2025-03-27 21:56:06 +01:00
Mike Fährmann
c6bc46f5ba [sexcom] support new-style '/gifs' URLs (#7239) 2025-03-27 20:01:57 +01:00
Mike Fährmann
12327b076e [sexcom] fix 'gif' pin extraction (#7239)
with much less metadata
2025-03-27 19:54:17 +01:00
Mike Fährmann
fc868b02f6 [sexcom] remove constructors 2025-03-27 19:05:44 +01:00
Mike Fährmann
7fe0f35998 [sexcom] add 'likes' extractor (#6149) 2024-09-06 07:44:06 +02:00
Mike Fährmann
a453335a9f remove test results in extractor modules
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
9501579279 [sexcom] fix fetching HD videos 2023-04-13 15:40:53 +02:00
Mike Fährmann
a2f7274eae [sexcom] fix pagination (#3906) 2023-04-13 15:39:15 +02:00
pubak42
e7326cdf1d [sex.com] Download videos from cdn (#3408)
The format of video sources was changed recently to be a full URL beginning with https://.
The original extractor code appended the video source URL to the root URL of the website, thus yielding
an invalid URL of the form ...sex.comhttps... that failed to resolve.
2022-12-16 10:16:40 +01:00
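
The bug described above comes from concatenating a root URL with a source that is already absolute. The fix in the commit itself is not shown here; the sketch below only illustrates the failure mode and one possible way to handle both relative and absolute sources, using the standard library's urllib.parse.urljoin. 'ROOT', 'resolve_source', and the example hosts are hypothetical.

```python
from urllib.parse import urljoin

ROOT = "https://www.example.org"  # hypothetical site root


def resolve_source(root, src):
    """Resolve a video source that may be relative or already absolute."""
    # urljoin() returns 'src' unchanged when it is a full URL,
    # and resolves it against 'root' when it is a relative path.
    return urljoin(root, src)


absolute = "https://cdn.example.org/videos/1.mp4"
relative = "/videos/1.mp4"

# Plain concatenation reproduces the broken '...orghttps...' form.
print(ROOT + absolute)

# urljoin() handles both cases.
print(resolve_source(ROOT, absolute))
print(resolve_source(ROOT, relative))
```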
Mike Fährmann
b0cb4a1b9c replace 'text.extract()' with 'text.extr()' where possible 2022-11-05 01:14:09 +01:00
Mike Fährmann
5d5a08cc69 [sexcom] add fallback for empty files (#2485) 2022-04-10 14:22:07 +02:00
Mike Fährmann
cc7dce5755 [sexcom] add 'pins' extractor (closes #2265) 2022-02-04 20:55:00 +01:00
Mike Fährmann
efb3e65a6a [sexcom] extend URL pattern (fixes #2220) 2022-01-24 01:19:40 +01:00
Mike Fährmann
4376b39a2b [sexcom] fix and improve embed extraction (fixes #2145) 2021-12-28 21:59:39 +01:00
Mike Fährmann
bd08ee2859 remove most 'yield Message.Version' statements
only leave them in oauth.py as noop results
2021-08-16 03:10:48 +02:00
Mike Fährmann
968d3e8465 remove '&' from URL patterns
'/?&#' -> '/?#' and '?&#' -> '?#'

According to https://www.ietf.org/rfc/rfc3986.txt, URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
2020-10-22 23:31:25 +02:00
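
The change above rests on the fact that only '/', '?', and '#' delimit URL components, so '&' does not need to be excluded when matching a path segment. A small sketch of a pattern written in that style follows; the pattern, the host, and 'extract_user' are hypothetical and not taken from the extractor module.

```python
import re

# '[^/?#]+' matches one path segment: it stops at the next slash,
# at the start of the query ('?'), or at the fragment ('#').
PATTERN = re.compile(
    r"(?:https?://)?(?:www\.)?example\.org/user/([^/?#]+)")


def extract_user(url):
    """Return the user segment of 'url', or None if it does not match."""
    match = PATTERN.match(url)
    return match.group(1) if match else None


print(extract_user("https://www.example.org/user/alice?page=2#top"))
print(extract_user("https://example.org/user/a&b/pins"))
```

In the second call the '&' stays part of the captured segment, because it is not a component delimiter within the path.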
Mike Fährmann
846d3a2466 [sexcom] replace 404ed test 2020-05-18 19:04:51 +02:00
Mike Fährmann
32e36d8f02 [sexcom] replace tests 2020-03-17 22:47:45 +01:00
Mike Fährmann
4e361b3008 add tests for specific datetime values 2020-02-23 16:48:30 +01:00
Mike Fährmann
b38cf59711 [sexcom] fix image URLs & parse 'date' fields 2020-02-04 22:52:00 +01:00
Mike Fährmann
fca87974fe [sexcom] fix video downloads by sending specific Referer headers 2019-11-19 23:52:34 +01:00
Mike Fährmann
f15eedb634 [sexcom] set Referer header for file downloads (closes #464) 2019-11-03 13:27:58 +01:00
Mike Fährmann
4409d00141 embed error messages in StopExtraction exceptions 2019-10-28 16:39:49 +01:00
Mike Fährmann
ef17d94469 update test results 2019-10-21 21:53:21 +02:00
Mike Fährmann
23251356cb require 'extension' data for each URL (#382) 2019-08-14 20:03:03 +02:00
Mike Fährmann
fdec59f8e2 replace extractor.request() 'expect' argument
with
- 'fatal': allow 4xx status codes
- 'notfound': raise NotFoundError on 404
2019-07-05 00:42:16 +02:00
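
The commit above names two arguments, 'fatal' and 'notfound', but this log does not show how request() implements them. The following is a standalone sketch of one plausible reading, not the project's code: it assumes that fatal=False lets 4xx status codes through, that a truthy 'notfound' turns a 404 into a NotFoundError, and that 5xx codes always raise. 'check_status', 'HttpError', and this 'NotFoundError' class are hypothetical stand-ins.

```python
class HttpError(Exception):
    """Hypothetical error for unexpected HTTP status codes."""


class NotFoundError(Exception):
    """Hypothetical error for resources that do not exist (404)."""


def check_status(status, fatal=True, notfound=None):
    """Return 'status' if acceptable, otherwise raise (assumed semantics)."""
    if status < 400:
        return status
    # 'notfound' names the missing resource, e.g. "pin" or "board".
    if notfound and status == 404:
        raise NotFoundError(f"requested {notfound} could not be found")
    # Assumption: only 4xx codes can be tolerated via fatal=False.
    if fatal or status >= 500:
        raise HttpError(f"unexpected status code {status}")
    return status


print(check_status(200))
print(check_status(403, fatal=False))
```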
Mike Fährmann
b4da8c5a97 [sexcom] add extractor for related pins (#325) 2019-07-03 21:04:23 +02:00
Mike Fährmann
69997e92db [sexcom] skip unavailable pins (#325) 2019-07-02 22:05:54 +02:00
Mike Fährmann
0318c610dc [sexcom] add extractor for search results (#147) 2019-04-24 22:10:01 +02:00
Mike Fährmann
a247c94c34 [sexcom] add pin and board extractors (#147) 2019-04-24 22:09:19 +02:00