Mike Fährmann
d3c77cca80
[sexcom] fix 'tags' when passing cookies (#8880)
2026-01-15 11:02:03 +01:00
Mike Fährmann
00c6821a3f
replace 2-element f-strings with simple '+' concatenations
...
Python's 'ast' module and its 'NodeVisitor' class
were incredibly helpful in identifying these
2025-12-22 11:26:04 +01:00
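The commit above credits Python's 'ast' module and 'NodeVisitor' with finding the f-strings to replace. A minimal sketch of how such a search could look, assuming "2-element" means an f-string whose parsed 'values' list has exactly two entries; the class name and the sample source are hypothetical and not taken from the repository:

```python
import ast


class TwoElementFStringFinder(ast.NodeVisitor):
    """Collect line numbers of f-strings made of exactly two parts."""

    def __init__(self):
        self.lines = []

    def visit_JoinedStr(self, node):
        # An f-string is parsed as a JoinedStr node. Its 'values' list holds
        # Constant nodes (literal text) and FormattedValue nodes ({...}).
        if len(node.values) == 2:
            self.lines.append(node.lineno)
        # Keep walking so nested expressions are inspected as well.
        self.generic_visit(node)


# Hypothetical sample input, for illustration only.
source = '''
a = f"{root}/pin"
b = f"{root}/user/{name}/likes"
c = "plain"
'''

finder = TwoElementFStringFinder()
finder.visit(ast.parse(source))
print(finder.lines)
```

Only the first assignment is reported here: it consists of one placeholder and one literal, so it could be rewritten as root + "/pin".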
Mike Fährmann
d0f06be0d2
use 'operator +' when building 'pattern'
2025-12-20 22:07:44 +01:00
Mike Fährmann
e006d26c8e
Revert "use f-strings when building 'pattern'"
...
revert d7c97d5a97.
2025-12-20 22:07:37 +01:00
Mike Fährmann
968597a302
yield 3-tuples for Message.Directory
...
adapt tuples to the same length and semantics as other messages
2025-12-05 21:39:52 +01:00
gengenson-code
a9018d1911
Added support for sex.com's feed (cookie required) (#8519)
...
* Added Extractor for sex.com feed
* Removed old comment
* add test URL
* update supportedsites
* simplify & fix flake8 newlines
* warn about missing session cookie
---------
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2025-11-09 11:57:12 +01:00
Mike Fährmann
d7c97d5a97
use f-strings when building 'pattern'
2025-10-20 21:23:11 +02:00
Mike Fährmann
69f7cfdd0c
[dt] replace 'datetime' imports
2025-10-16 11:42:42 +02:00
Mike Fährmann
b0a33d402d
[sexcom] update 'search' extractor (#7807)
2025-07-18 21:02:32 +02:00
Mike Fährmann
9dbe33b6de
replace old %-formatted and .format(…) strings with f-strings (#7671)
...
mostly using flynt
https://github.com/ikamensh/flynt
2025-06-29 17:50:19 +02:00
Mike Fährmann
e08ec7e083
update copyright notices
2025-06-13 00:03:41 +02:00
Mike Fährmann
a14671992c
[sexcom] prevent '.css' file downloads (#7632)
...
by detecting homepage redirects
and improve redirect handling in general
2025-06-11 22:32:08 +02:00
Mike Fährmann
df4845bf60
[sexcom] update
...
- fix 'title' and 'type' of pictures
- remove '#' from the beginning of each tag
- add 'gifs' option
2025-06-04 07:44:43 +02:00
wankio
47c7c85f46
[sexcom] support '/pics/' URLs (#7611)
...
* Update sexcom.py for '/pics/' URLs,
e.g. https://www.sex.com/en/pics/1459016
* reorder 'pattern'
* strip '?width=…' query parameter
* add test
---------
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2025-06-01 11:13:17 +02:00
Mike Fährmann
b81fc5c124
replace text.rextract() with rextr()
2025-05-23 18:28:58 +02:00
Mike Fährmann
e381c482ec
[sexcom] extract 'date_url' metadata (#7239)
2025-03-28 16:30:12 +01:00
Mike Fährmann
d4ce8be1f5
[sexcom] support new-style '/videos' URLs (#7239)
2025-03-27 21:56:06 +01:00
Mike Fährmann
c6bc46f5ba
[sexcom] support new-style '/gifs' URLs (#7239)
2025-03-27 20:01:57 +01:00
Mike Fährmann
12327b076e
[sexcom] fix 'gif' pin extraction (#7239)
...
with much less metadata
2025-03-27 19:54:17 +01:00
Mike Fährmann
fc868b02f6
[sexcom] remove constructors
2025-03-27 19:05:44 +01:00
Mike Fährmann
7fe0f35998
[sexcom] add 'likes' extractor (#6149)
2024-09-06 07:44:06 +02:00
Mike Fährmann
a453335a9f
remove test results in extractor modules
...
and add generic example URLs
2023-09-11 16:30:55 +02:00
Mike Fährmann
9501579279
[sexcom] fix fetching HD videos
2023-04-13 15:40:53 +02:00
Mike Fährmann
a2f7274eae
[sexcom] fix pagination (#3906)
2023-04-13 15:39:15 +02:00
pubak42
e7326cdf1d
[sex.com] Download videos from cdn (#3408)
...
The format of video sources recently changed to a full URL beginning with https://.
The original extractor code appended the video source URL to the root URL of the website, yielding
an invalid URL of the form ...sex.comhttps... that failed to resolve.
2022-12-16 10:16:40 +01:00
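The failure described in the commit above can be reproduced in a few lines. This is a sketch only: 'resolve_source' is a hypothetical helper, the CDN host is a placeholder, and urllib's 'urljoin' is swapped in for illustration rather than being the fix the extractor actually uses:

```python
from urllib.parse import urljoin

# Site root, assumed from the URLs quoted elsewhere in this log.
ROOT = "https://www.sex.com"


def resolve_source(src, root=ROOT):
    """Return an absolute URL for a video source (hypothetical helper)."""
    # urljoin() returns 'src' unchanged when it already carries a scheme
    # and resolves it against 'root' when it is a relative path.
    return urljoin(root, src)


# Plain concatenation breaks once the source is already a full URL.
broken = ROOT + "https://cdn.example.org/video.mp4"

# Joining handles both the new absolute form and the old relative form.
fixed = resolve_source("https://cdn.example.org/video.mp4")
relative = resolve_source("/video/stream/123")

print(broken)
print(fixed)
print(relative)
```

The first printed value shows the "...sex.comhttps..." shape from the commit message, while the other two are well-formed URLs.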
Mike Fährmann
b0cb4a1b9c
replace 'text.extract()' with 'text.extr()' where possible
2022-11-05 01:14:09 +01:00
Mike Fährmann
5d5a08cc69
[sexcom] add fallback for empty files (#2485)
2022-04-10 14:22:07 +02:00
Mike Fährmann
cc7dce5755
[sexcom] add 'pins' extractor (closes #2265)
2022-02-04 20:55:00 +01:00
Mike Fährmann
efb3e65a6a
[sexcom] extend URL pattern (fixes #2220)
2022-01-24 01:19:40 +01:00
Mike Fährmann
4376b39a2b
[sexcom] fix and improve embed extraction (fixes #2145)
2021-12-28 21:59:39 +01:00
Mike Fährmann
bd08ee2859
remove most 'yield Message.Version' statements
...
only leave them in oauth.py as noop results
2021-08-16 03:10:48 +02:00
Mike Fährmann
968d3e8465
remove '&' from URL patterns
...
'/?&#' -> '/?#' and '?&#' -> '?#'
According to https://www.ietf.org/rfc/rfc3986.txt, URLs are
"organized hierarchically" by using "the slash ("/"), question
mark ("?"), and number sign ("#") characters to delimit components"
2020-10-22 23:31:25 +02:00
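To illustrate the commit above, here is a small sketch of a URL pattern that stops at the three component delimiters quoted from RFC 3986. The pattern, host, and sample URLs are hypothetical and only written in the style the commit describes; they are not copied from the extractor:

```python
import re

# The captured ID ends at the first '/', '?' or '#'.
# '&' is left out of the character class: it only separates parameters
# inside a query string and does not start a new URL component.
pattern = re.compile(r"https?://example\.org/pin/([^/?#]+)")

# Hypothetical sample URLs, for illustration only.
urls = [
    "https://example.org/pin/12345/",
    "https://example.org/pin/12345?page=2&sort=new",
    "https://example.org/pin/12345#comments",
]

ids = [pattern.match(url).group(1) for url in urls]
print(ids)
```

All three URLs yield the same ID, because each one reaches a '/', '?' or '#' directly after the digits.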
Mike Fährmann
846d3a2466
[sexcom] replace 404ed test
2020-05-18 19:04:51 +02:00
Mike Fährmann
32e36d8f02
[sexcom] replace tests
2020-03-17 22:47:45 +01:00
Mike Fährmann
4e361b3008
add tests for specific datetime values
2020-02-23 16:48:30 +01:00
Mike Fährmann
b38cf59711
[sexcom] fix image URLs & parse 'date' fields
2020-02-04 22:52:00 +01:00
Mike Fährmann
fca87974fe
[sexcom] fix video downloads by sending specific Referer headers
2019-11-19 23:52:34 +01:00
Mike Fährmann
f15eedb634
[sexcom] set Referer header for file downloads (closes #464)
2019-11-03 13:27:58 +01:00
Mike Fährmann
4409d00141
embed error messages in StopExtraction exceptions
2019-10-28 16:39:49 +01:00
Mike Fährmann
ef17d94469
update test results
2019-10-21 21:53:21 +02:00
Mike Fährmann
23251356cb
require 'extension' data for each URL (#382)
2019-08-14 20:03:03 +02:00
Mike Fährmann
fdec59f8e2
replace extractor.request() 'expect' argument
...
with
- 'fatal': allow 4xx status codes
- 'notfound': raise NotFoundError on 404
2019-07-05 00:42:16 +02:00
Mike Fährmann
b4da8c5a97
[sexcom] add extractor for related pins (#325)
2019-07-03 21:04:23 +02:00
Mike Fährmann
69997e92db
[sexcom] skip unavailable pins (#325)
2019-07-02 22:05:54 +02:00
Mike Fährmann
0318c610dc
[sexcom] add extractor for search results (#147)
2019-04-24 22:10:01 +02:00
Mike Fährmann
a247c94c34
[sexcom] add pin and board extractors (#147)
2019-04-24 22:09:19 +02:00