Tips for improving the scrolling function in Java with Selenium for optimal performance

I'm currently working on a Java project built with Maven. My task involves loading a URL, scrolling down the page, and extracting all the links to other items on that website.

So far, I have been able to achieve this dynamically using Selenium, but the process is quite slow. I'm looking for ways to optimize it. Your assistance would be greatly appreciated.

For instance, I'm focusing on one specific page; its URL is the url1 value in the code below.

My Queries:

  1. Scrolling through the webpage with Selenium is slow. How can I speed it up? (Please suggest an alternative method or help me improve the existing one.)

Thank you in advance for your support. Looking forward to your response.

Dynamically Fetching and Scrolling Code:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import java.io.IOException;
import java.io.File;
import org.apache.commons.io.FileUtils;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.firefox.FirefoxProfile;

/**
 *
 * @author jhamb
 */
public class Scroll_down {

    private static FirefoxProfile createFirefoxProfile() {
        File profileDir = new File("/tmp/firefox-profile-dir");
        if (profileDir.exists()) {
            return new FirefoxProfile(profileDir);
        }
        FirefoxProfile firefoxProfile = new FirefoxProfile();
        File dir = firefoxProfile.layoutOnDisk();
        try {
            profileDir.mkdirs();
            FileUtils.copyDirectory(dir, profileDir);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return firefoxProfile;
    }

    public static void main(String[] args) throws InterruptedException {
        String url1 = "http://www.jabong.com/men/shoes/men-sports-shoes/?source=home-leftnav";
        // printf substitutes %s; println would print the literal "%s".
        System.out.printf("Fetching %s...%n", url1);
        WebDriver driver = new FirefoxDriver(createFirefoxProfile());

        driver.get(url1);  

        JavascriptExecutor jse = (JavascriptExecutor)driver;
        jse.executeScript("window.scrollBy(0,250)", "");
        // Scroll 200px once per second for 60 seconds.
        for (int second = 0; second < 60; second++) {
            jse.executeScript("window.scrollBy(0,200)", "");
            Thread.sleep(1000);
        }
        String html = driver.getPageSource();
        // quit() closes every window and ends the browser session.
        driver.quit();

        Document document = Jsoup.parse(html);

        Elements links = document.select("div");

        for (Element link : links) {
            System.out.println(link.attr("data-url"));
        }
    }
}

Answer №1

In Selenium, scrolling is done through JavaScript. Without knowing what your project is ultimately meant to achieve, it is hard to judge how effective the code is. If you are confident that the data loads quickly, removing the sleep calls may help. They slow Selenium down, but they also ensure elements are fully loaded before the script proceeds. Which trade-off to make is up to you.
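One way to keep the safety of waiting without a fixed one-second sleep per step is to scroll to the bottom and stop as soon as the page height stops growing. The sketch below is only an illustration of that idea, not the asker's code: the class name, method name, and parameters are hypothetical, and the browser is replaced by two callbacks so the loop can run on its own. The comments show how the callbacks might be wired to the `jse` object from the question.

```java
import java.util.function.LongSupplier;

class ScrollUntilStable {

    /**
     * Scrolls until the page height stops growing or maxScrolls is reached.
     * After each scroll it polls every pollMillis, for at most settleMillis,
     * instead of sleeping a fixed second. Returns the number of scrolls done.
     */
    public static int scrollUntilStable(Runnable scrollToBottom, LongSupplier pageHeight,
                                        int maxScrolls, long pollMillis, long settleMillis) {
        int scrolls = 0;
        long lastHeight = pageHeight.getAsLong();
        while (scrolls < maxScrolls) {
            scrollToBottom.run();
            scrolls++;
            long deadline = System.currentTimeMillis() + settleMillis;
            long newHeight = pageHeight.getAsLong();
            // Wait only as long as needed for new content to appear.
            while (newHeight == lastHeight && System.currentTimeMillis() < deadline) {
                try {
                    Thread.sleep(pollMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return scrolls;
                }
                newHeight = pageHeight.getAsLong();
            }
            if (newHeight == lastHeight) {
                break; // nothing new loaded within the settle window
            }
            lastHeight = newHeight;
        }
        return scrolls;
    }

    public static void main(String[] args) {
        // Stand-in for a browser: the "page" grows by 500px on the first three scrolls.
        long[] height = {1000};
        int[] growthsLeft = {3};
        Runnable fakeScroll = () -> {
            if (growthsLeft[0] > 0) {
                height[0] += 500;
                growthsLeft[0]--;
            }
        };
        int scrolls = scrollUntilStable(fakeScroll, () -> height[0], 60, 5, 50);
        System.out.println("scrolls=" + scrolls + ", height=" + height[0]);
        // prints "scrolls=4, height=2500"

        // Possible Selenium wiring (assumes the jse variable from the question):
        //   Runnable scroll = () ->
        //       jse.executeScript("window.scrollTo(0, document.body.scrollHeight)");
        //   LongSupplier h = () ->
        //       ((Number) jse.executeScript("return document.body.scrollHeight")).longValue();
    }
}
```

The settle window is a guess that has to be tuned per site: too short and the loop stops before lazy-loaded items arrive, too long and the final, empty scroll costs that much extra time.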

Answer №2

What is the method for scrolling down a page?

To scroll down a page, you can use the ele.sendKeys(Keys.PAGE_DOWN) method, where ele represents any existing element on the page.

Repeat this command until the item you are looking for appears.
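A minimal sketch of that repeat-until-found loop, with an upper bound so it cannot run forever if the item never appears. The class and method names are hypothetical, and the key press and the "is it there yet" check are passed in as callbacks so the loop runs without a browser. The comments show one possible Selenium wiring, where the `target-item` id is an assumed placeholder.

```java
import java.util.function.BooleanSupplier;

class PageDownUntilFound {

    /**
     * Presses "page down" until found reports true or maxPresses is used up.
     * Returns true if the item was found.
     */
    public static boolean pageDownUntil(Runnable pressPageDown, BooleanSupplier found,
                                        int maxPresses) {
        for (int i = 0; i < maxPresses; i++) {
            if (found.getAsBoolean()) {
                return true;
            }
            pressPageDown.run();
        }
        // Check once more after the last press.
        return found.getAsBoolean();
    }

    public static void main(String[] args) {
        // Stand-in for a browser: the item becomes visible after three presses.
        int[] presses = {0};
        boolean ok = pageDownUntil(() -> presses[0]++, () -> presses[0] >= 3, 10);
        System.out.println("found=" + ok + " after " + presses[0] + " presses");
        // prints "found=true after 3 presses"

        // Possible Selenium wiring (driver as in the question; the id is hypothetical):
        //   WebElement body = driver.findElement(By.tagName("body"));
        //   Runnable press = () -> body.sendKeys(Keys.PAGE_DOWN);
        //   BooleanSupplier found = () ->
        //       !driver.findElements(By.id("target-item")).isEmpty();
    }
}
```

Using `findElements` and checking for an empty list avoids the exception that `findElement` throws when nothing matches.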
