Browser Actions
Web Scraping API Browser Actions
Scraper APIs can perform a number of browser actions on a page before retrieving the desired result.
Actions
click
| Name | Arguments | Description |
|---|---|---|
| click | selector: type, value | Performs a click action on a specified element and waits a set number of seconds. |
Example:
{
"target": "universal",
"url": "https://www.yahoo.com/",
"browser_actions": [
{
"type": "click",
"selector": {
"type": "xpath",
"value": "//button"
}
}
]
}
input
| Name | Arguments | Description |
|---|---|---|
| input | selector: type, value; value | Inserts text into a specified input element on the page. |
Example:
{
"target": "universal",
"url": "https://www.yahoo.com/",
"browser_actions": [
{
"type": "input",
"selector": {
"type": "xpath",
"value": "//input"
},
"value": "Hello world"
}
]
}
scroll
| Name | Arguments | Description |
|---|---|---|
| scroll | x, y | Scrolls the content of a page by a specified number of pixels. |
Example:
{
"target": "universal",
"url": "https://www.yahoo.com/",
"browser_actions": [
{
"type": "scroll",
"x": "0",
"y": "300"
}
]
}
scroll_to_bottom
| Name | Arguments | Description |
|---|---|---|
| scroll_to_bottom | timeout_s: integer | Scrolls down the page for a set number of seconds. |
Example:
{
"target": "universal",
"url": "https://www.yahoo.com/",
"browser_actions": [
{
"type": "scroll_to_bottom",
"timeout_s": "5"
}
]
}
wait
| Name | Arguments | Description |
|---|---|---|
| wait | wait_time_s: integer | Pauses for a specified number of seconds. |
Example:
{
"target": "universal",
"url": "https://www.yahoo.com/",
"browser_actions": [
{
"type": "wait",
"wait_time_s": "5"
}
]
}
wait_for_element
| Name | Arguments | Description |
|---|---|---|
| wait_for_element | selector: type, value; timeout_s | Waits for a specified duration for an element to load. |
Example:
{
"target": "universal",
"url": "https://www.yahoo.com/",
"browser_actions": [
{
"type": "wait_for_element",
"selector": {
"type": "css",
"value": ".submit-button"
},
"timeout_s": "5"
}
]
}
fetch_resource
fetch_resource cannot be combined with any other instructions and should be used in separate requests.
| Name | Arguments | Description |
|---|---|---|
| fetch_resource | filter | Retrieves the first Fetch or XHR resource that matches the specified pattern. |
Example:
{
"target": "universal",
"url": "https://www.yahoo.com/",
"browser_actions": [
{
"type": "fetch_resource",
"filter": "https://api.example.com/products/*",
"on_error": "error"
}
]
}
General Arguments
Arguments available for all actions above.
type
| Name | Description |
|---|---|
| type | Type of browser action used. |
timeout_s
| Name | Description |
|---|---|
| timeout_s | Maximum time in seconds to wait before the execution of the action is terminated. |
wait_time_s
| Name | Description |
|---|---|
| wait_time_s | Time in seconds explicitly allotted to executing the action. |
on_error
| Name | Description |
|---|---|
| on_error | Indicates what to do if an action fails: "error" stops the execution of browser actions; "skip" continues with the next action. |
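The general arguments can be attached to individual actions in a single request. The Python sketch below only assembles and prints a request body in the same shape as the examples above; it does not send anything. The helper name build_payload is hypothetical, the check treats "cannot be combined with any other instructions" as "fetch_resource must be the only browser action" (an interpretation, not a documented rule), and timeout_s is written as an integer as documented in the tables, although the examples on this page quote the value.

```python
import json


def build_payload(url, actions, target="universal"):
    """Assemble a request body shaped like the examples on this page.

    Hypothetical helper: it only builds a dict, it does not call the API.
    """
    types = [action["type"] for action in actions]
    # Assumption: fetch_resource has to be the only browser action in a request.
    if "fetch_resource" in types and len(actions) > 1:
        raise ValueError("fetch_resource must be sent in a separate request")
    return {"target": target, "url": url, "browser_actions": list(actions)}


actions = [
    {
        "type": "wait_for_element",
        "selector": {"type": "css", "value": ".submit-button"},
        "timeout_s": 5,
        "on_error": "skip",  # continue with the next action if this one fails
    },
    {
        "type": "click",
        "selector": {"type": "xpath", "value": "//button"},
        "on_error": "error",  # stop executing browser actions on failure
    },
    {"type": "scroll_to_bottom", "timeout_s": 5},
]

payload = build_payload("https://www.yahoo.com/", actions)
print(json.dumps(payload, indent=2))
```

Building the body in code makes it easier to reuse the same action list across several URLs and to catch a misplaced fetch_resource before the request is sent.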
Fetching a Network Request
If a website populates content by fetching a JSON object, you can scrape just the network request and thus avoid having to deal with HTML altogether. To do this, you can use the fetch_resource browser action, as shown below.
fetch_resource cannot be combined with any other instructions and can only be used in separate requests.
For example, when loading yahoo.com with the browser's Network tab open, we can see an exp.json file being loaded.
If you wish to scrape just the contents of this request, you can use the fetch_resource browser action. Note that filter is a regular expression that matches the filename:
{
"target": "universal",
"url": "https://www.yahoo.com/",
"browser_actions": [
{
"type": "fetch_resource",
"filter": "/ybar/exp"
}
]
}
Results:
{
"results": [
{
"content": "{\n \"expCount\":5,\n \"selection\":\"individual\",\n ... }",
"status_code": 200,
"url": "https://example.com/api/product/1",
"task_id": "7131940420107377665",
"created_at": "2023-11-19 09:46:41",
"updated_at": "2023-11-19 09:47:08"
}
]
}
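In the Results example above, the content field holds the fetched resource as a string. A minimal Python sketch for decoding it follows, assuming the matched resource is JSON (as in the exp.json case). The helper name extract_resource is hypothetical, and the sample response is a trimmed, hand-written stand-in for a real one.

```python
import json


def extract_resource(response_text):
    """Decode the JSON body carried in the first result's content field.

    Hypothetical helper; assumes the fetched resource itself is JSON.
    """
    response = json.loads(response_text)
    first = response["results"][0]
    if first["status_code"] != 200:
        raise RuntimeError(f"unexpected status code: {first['status_code']}")
    # content is a JSON document serialized as a string, so decode it again.
    return json.loads(first["content"])


# Hand-written sample mirroring the shape of the Results example.
sample_response = json.dumps(
    {
        "results": [
            {
                "content": '{"expCount": 5, "selection": "individual"}',
                "status_code": 200,
            }
        ]
    }
)

data = extract_resource(sample_response)
print(data["expCount"])  # prints 5
```

If the matched resource is not JSON, the second json.loads call raises a json.JSONDecodeError, so callers may prefer to return the raw content string in that case.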