[http] Improve fetch_javascript_url (#462)

Replace `click_selector` with `wait_for`, a selector that must appear on the page before the page is considered loaded.

Also add `select_wait_for`, which returns the waited-for element instead of the page content. Oddly, this is needed in some cases (such as `artifactory.py`) where `page.content()` does not contain the waited-for element.
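
The behavior described above could be sketched roughly as follows. This is not the code from this commit: `render_page` is a hypothetical helper, its parameter names only mirror the options named in the message, and returning the element's inner HTML for `select_wait_for` is an assumption. It takes an already-open Playwright `Page` (or any object with the same `goto`, `wait_for_selector` and `content` methods).

```python
from typing import Any, Optional


def render_page(page: Any, url: str, wait_until: Optional[str] = None,
                wait_for: Optional[str] = None, select_wait_for: bool = False) -> str:
    """Hypothetical sketch: load `url` in `page` and return the rendered HTML."""
    # Only forward wait_until when it is set, so Playwright keeps its own default.
    goto_kwargs = {"wait_until": wait_until} if wait_until else {}
    page.goto(url, **goto_kwargs)

    if wait_for is None:
        return page.content()

    # Blocks until the selector matches (Playwright's default state is "visible")
    # and returns a handle to the matched element.
    element = page.wait_for_selector(wait_for)

    if select_wait_for:
        # Assumption: return the element's own markup when page.content()
        # does not include it.
        return element.inner_html()
    return page.content()
```

With a real browser, `page` would typically come from `sync_playwright()` in `playwright.sync_api`, for example via `p.chromium.launch().new_page()`. Passing the page in keeps the waiting logic separate from browser setup, so it can be exercised with a stub.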
Marc Wrobel
2025-07-09 22:59:37 +02:00
parent 9d63ed9252
commit 1f7a3772d6
3 changed files with 20 additions and 11 deletions

@@ -17,8 +17,8 @@ necessary information. Available configuration options are:
- header_selector (mandatory, default = thead tr): A CSS selector used to locate the table's header row.
- rows_selector (mandatory, default = tbody tr): A CSS selector used to locate the table's rows.
- render_javascript (optional, default = false): A boolean value indicating whether to render JavaScript on the page.
-- render_javascript_click_selector (optional, default = None): A playwright selector used to click on an element after
-the JavaScript rendering. Only use when render_javascript is true.
+- render_javascript_wait_for (optional, default = None): Wait until the given selector appear on the page. Only use when
+render_javascript is true.
- render_javascript_wait_until (optional, default = None): Argument to pass to Playwright, one of "commit",
"domcontentloaded", "load", or "networkidle". Only use when render_javascript is true and if the script fails without it.
- ignore_empty_releases (optional, default = false): A boolean value indicating whether to ignore releases with no
@@ -154,8 +154,8 @@ class Field:
config = config_from_argv()
with ProductData(config.product) as product_data:
render_javascript = config.data.get("render_javascript", False)
-render_javascript_click_selector = config.data.get("render_javascript_click_selector", None)
render_javascript_wait_until = config.data.get("render_javascript_wait_until", None)
+render_javascript_wait_for = config.data.get("render_javascript_wait_for", None)
ignore_empty_releases = config.data.get("ignore_empty_releases", False)
header_row_selector = config.data.get("header_selector", "thead tr")
rows_selector = config.data.get("rows_selector", "tbody tr")
@@ -164,8 +164,8 @@ with ProductData(config.product) as product_data:
fields = [Field(name, definition) for name, definition in config.data["fields"].items()]
if render_javascript:
-response_text = http.fetch_javascript_url(config.url, click_selector=render_javascript_click_selector,
-wait_until=render_javascript_wait_until)
+response_text = http.fetch_javascript_url(config.url, wait_until=render_javascript_wait_until,
+wait_for=render_javascript_wait_for)
else:
response_text = http.fetch_url(config.url).text
soup = BeautifulSoup(response_text, features="html5lib")