What is the process of server-side web scraping?

Is the title clear enough for what I'm trying to convey? It's a bit challenging to explain, but here goes.

My goal: I need to verify whether multiple engine numbers (from vehicles) are registered with the local transportation authority. The web interface they provide lets me check only one number at a time, which is not feasible with over 200 numbers. I previously wrote a Python script that does this with web scraping, but now I want to run it on a server.

A user will input all the numbers in a text file and upload or paste the contents into a text field. Then, I'll automate the form submission on the transportation website for each number using web scraping and display the final status of all the numbers.

I am seeking guidance on how to achieve this on a server. What technologies would be useful? I have experience with Java and JavaScript, but no knowledge of PHP (willing to learn if necessary). I am unsure about implementing this on the server side and any assistance or ideas would be greatly welcomed.

Thank you.

Answer №1

jsoup, a popular Java library, offers an intuitive API for parsing and querying HTML with CSS selectors.

It can also fetch the HTML for you: Jsoup.connect(url).get() downloads a page and parses it in one call.

Combining the two gives you what you need for a server-side web scraper.

[update] On rereading your question, it seems you need more than scraping: you also want to submit an HTML form to an external server automatically from Java. That is an interesting problem, and one I have wondered about myself.

You may find a solution here: How to send post form with java?
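As a rough illustration of the form-submission side, here is a minimal sketch that uses only the JDK's built-in java.net.http.HttpClient (Java 11+). It assumes the transportation site accepts a plain url-encoded POST with a single field. FORM_URL, FIELD_NAME and the class name EngineNumberChecker are hypothetical placeholders, not taken from the real site; you would read the actual form action and input name from the page source. The returned HTML still has to be parsed for the status, for example with jsoup.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class EngineNumberChecker {

    // Hypothetical values: replace with the real form action URL and input name.
    static final String FORM_URL = "https://transport.example.org/check";
    static final String FIELD_NAME = "engineNumber";

    /** Splits pasted or uploaded text into trimmed engine numbers, skipping blank lines. */
    static List<String> parseNumbers(String text) {
        List<String> numbers = new ArrayList<>();
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                numbers.add(trimmed);
            }
        }
        return numbers;
    }

    /** Encodes one form field as application/x-www-form-urlencoded. */
    static String formBody(String name, String value) {
        return URLEncoder.encode(name, StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    /** POSTs one engine number to the form and returns the raw response HTML. */
    static String submit(HttpClient client, String number)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(FORM_URL))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(FIELD_NAME, number)))
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    /** Submits every number in the text, one request each, keeping input order. */
    static Map<String, String> checkAll(String text)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        Map<String, String> results = new LinkedHashMap<>();
        for (String number : parseNumbers(text)) {
            results.put(number, submit(client, number));
        }
        return results;
    }
}
```

On the server, a servlet or similar request handler could pass the uploaded text to checkAll and render the resulting map as the status list. With 200+ numbers it is probably worth pausing between requests so the authority's site is not flooded.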
