gallery-dl

Author	SHA1	Message	Date
Mike Fährmann	c978fe18d4	[text] add 'extract_urls()' helper	2026-02-07 21:47:17 +01:00
Mike Fährmann	37aa7337dc	[text] reject long filename extensions (#8491); fixes regression introduced in `3252ead7c7`, ref `bc868e7bb8`	2025-11-01 10:35:33 +01:00
Mike Fährmann	c8fc790028	merge branch 'dt': move datetime utils into separate module - use 'datetime.fromisoformat()' when possible (#7671) - return a datetime-compatible object for invalid datetimes (instead of a 'str' value)	2025-10-20 09:30:05 +02:00
Mike Fährmann	085616e0a8	[dt] replace 'text.parse_datetime()' & 'text.parse_timestamp()'	2025-10-17 17:43:06 +02:00
Mike Fährmann	17156ab7a2	[text] implement 'nameext_from_name()'	2025-10-15 11:14:49 +02:00
Mike Fährmann	724ae3661b	[text] add 'empty' argument to 'parse_query()' (#8377); enables including query parameters without value	2025-10-09 12:10:23 +02:00
Mike Fährmann	7bb4053396	[text] add 'sanitize_whitespace()'	2025-07-19 20:49:48 +02:00
Mike Fährmann	c08833aed9	[util] move 're' functions to text.py	2025-06-23 20:05:20 +02:00
Mike Fährmann	8f79ec67f4	[text] add 'build_query()'	2025-06-18 20:49:12 +02:00
Mike Fährmann	41191bb60a	'match.group(N)' -> 'match[N]' (#7671); 2.5x faster	2025-06-18 13:05:58 +02:00
Mike Fährmann	6d928f3805	remove some pre-3.8 workarounds (#7671)	2025-06-17 12:56:47 +02:00
Mike Fährmann	e84df260c0	[util] generalize 'build_duration_func'	2025-06-08 20:01:16 +02:00
Mike Fährmann	fe39b7d8c8	[text] slightly improve performance of 'extract' functions by using 'None' instead of '0' as default 'pos' value; this only saves a few nanoseconds per call, but still	2025-05-23 17:53:28 +02:00
Mike Fährmann	f3ed15573a	[text] add 'rextr()'	2025-05-23 17:28:58 +02:00
Mike Fährmann	04464b6cf0	[text] add second argument to 'parse_query_list()' (#7138); return only values whose name is in 'as_list' as a list	2025-03-10 09:36:50 +01:00
Mike Fährmann	db19990a82	[text] allow calling 'extract_iter' with invalid arguments	2025-03-02 10:44:06 +01:00
Mike Fährmann	b03ee3c4c4	[text] implement 'parse_query_list()'	2024-10-01 20:28:30 +02:00
Mike Fährmann	9f49cf16e8	[text] implement 'parse_query()' without using 'urllib.parse.parse_qsl'; doesn't support bytes anymore, but is twice as fast	2024-10-01 20:28:11 +02:00
Mike Fährmann	2c7a0c3ca8	add alternatives for deprecated utc datetime functions	2024-09-19 20:47:05 +02:00
Mike Fährmann	5227bb6b1d	[text] catch general Exceptions	2024-04-13 18:51:40 +02:00
Mike Fährmann	76581c13f7	handle URLs without '/' after their TLD (#5252)	2024-02-29 15:05:46 +01:00
Mike Fährmann	05255f5be0	add 'default' argument to 'text.extr()'	2022-11-09 11:00:32 +01:00
Mike Fährmann	eb33e6cf2d	add 'text.extr()': a stripped-down version of text.extract() that always returns a string (like 'extract_from'), only returns a string, does not deal with 'pos' arguments, and is ~20% faster	2022-11-04 21:37:36 +01:00
Mike Fährmann	67bad04dda	[formatter] add 'g' conversion to sluGify a string (#2410)	2022-08-26 17:57:17 +02:00
Mike Fährmann	bddcec49f1	implement 'text.root_from_url()'; use domain from input URL for kemono	2022-03-01 03:09:57 +01:00
Mike Fährmann	bc0e853d30	combine KeyError & IndexError to common base class LookupError	2022-02-11 00:42:49 +01:00
Mike Fährmann	bc868e7bb8	consider apparently long extensions as part of the filename (#1516)	2021-05-02 21:15:50 +02:00
Mike Fährmann	387fe415d5	unescape items in text.split_html()	2021-03-29 02:12:29 +02:00
Mike Fährmann	78fd63b8f0	remove 'text.clean_xml()'; was not used anywhere	2021-03-28 04:05:16 +02:00
Mike Fährmann	8553b218d9	replace calls to 'os.path.splitext()' with 'str.rpartition()'. Makes functions who used it more than twice as fast and we can get rid of an import as well.	2021-03-28 04:01:27 +02:00
Mike Fährmann	a09f42f6b3	improve filename_from_url() performance. Manually extracting the part between the last '/' and '?' instead of relying on the standard libraries' 'urllib.parse.urlsplit()' increases performance by ~400%. urlsplit(): 3.64 secs per 1.000.000 iterations; partition(): 0.87 secs per 1.000.000 iterations	2020-10-23 00:14:06 +02:00
Mike Fährmann	37d71f6e09	strip microseconds in text.parse_datetime()	2020-06-17 21:40:16 +02:00
Mike Fährmann	6294e2c540	add 'text.ensure_http_scheme()'	2020-05-19 22:32:53 +02:00
Mike Fährmann	a0f4c295c0	add optional 'utcoffset' argument to 'parse_datetime()'	2020-04-11 02:05:00 +02:00
Mike Fährmann	f6c5edb76b	pre-compile regex pattern for remove_html() and split_html()	2020-03-13 23:31:54 +01:00
Mike Fährmann	b1bea8aaeb	add 'restrict-filenames' option (#348)	2019-07-23 17:41:24 +02:00
Mike Fährmann	1740086d8a	add 'repl' and 'sep' arguments to text.replace_html()	2019-07-17 14:48:24 +02:00
Mike Fährmann	b171befa87	implement 'parse_unicode_escapes()'	2019-06-16 21:47:24 +02:00
Mike Fährmann	2b1999476e	implement 'text.rextract()'	2019-05-28 21:03:41 +02:00
Mike Fährmann	2316e0ed3d	fix strptime workaround from `b0e85a4`. Don't return a modified version of 'date_time' if strptime fails.	2019-05-25 23:22:26 +02:00
Mike Fährmann	b0e85a42e3	apply workaround from `4736912` in parse_datetime() itself	2019-05-09 21:53:17 +02:00
Mike Fährmann	d09864b581	implement text.parse_datetime()	2019-05-08 15:43:59 +02:00
Mike Fährmann	6264a46212	use 'utcfromtimestamp()'; 'fromtimestamp()' converts its results to the local timezone and causes problems when running tests on a different machine.	2019-04-21 16:22:53 +02:00
Mike Fährmann	d670de0344	implement 'text.parse_timestamp()'	2019-04-21 15:28:27 +02:00
Mike Fährmann	21a7e395a7	implement convenience wrapper for text.extract functionality	2019-04-19 22:30:11 +02:00
Mike Fährmann	8f249f1d54	improve text.extract_iter() performance by roughly 40% through inlining code, pre-calculating reused values, and entering a try-except block only once	2019-04-18 23:37:17 +02:00
Mike Fährmann	5530871b5a	change results of text.nameext_from_url(): Instead of getting a complete 'filename' from an URL and splitting that into 'name' and 'extension', the new approach gets rid of the complete version and renames 'name' to 'filename'. (Using anything other than {extension} for a filename extension doesn't really work anyway.) Example: "https://example.org/path/filename.ext"; before: filename = filename.ext, name = filename, extension = ext; now: filename = filename, extension = ext	2019-02-14 16:07:17 +01:00
Mike Fährmann	e1d3e9a926	add 'ext_from_url' to text.py	2019-01-31 12:23:25 +01:00
Mike Fährmann	2d2953a5bf	add 'text.parse_float()' + cleanup in text.py	2019-01-29 16:46:21 +01:00
Mike Fährmann	ae9a37a528	implement text.split_html()	2018-05-27 15:00:41 +02:00
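
Commits `a09f42f6b3`, `8553b218d9` and `5530871b5a` above describe extracting a filename from a URL with 'str.partition()' / 'str.rpartition()' and returning only 'filename' and 'extension'. The following is a minimal sketch of that idea, not the code from the repository: the function bodies and the dict return shape are assumptions made for illustration, and the long-extension handling from `bc868e7bb8` and `37aa7337dc` is deliberately left out.

```python
def filename_from_url(url):
    """Return the part of 'url' between its last '/' and the first '?'.

    Hypothetical re-implementation for illustration only.
    """
    return url.partition("?")[0].rpartition("/")[2]


def nameext_from_url(url):
    """Split the last path segment into 'filename' and 'extension'.

    Assumed return shape: a dict with exactly these two keys,
    mirroring the "now" case in the commit message example.
    """
    filename = filename_from_url(url)
    name, _, ext = filename.rpartition(".")
    if name:
        return {"filename": name, "extension": ext}
    # No usable '.' separator: keep the whole segment as the filename.
    return {"filename": filename, "extension": ""}


if __name__ == "__main__":
    print(nameext_from_url("https://example.org/path/filename.ext"))
```

With the example URL from the commit message, this sketch yields 'filename' as the name and 'ext' as the extension, which matches the "now" behaviour described there.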

74 Commits