Having difficulty retrieving text data from a web URL using JavaScript

I am trying to extract text data from a PDF file located at a web URL.

I tried two approaches, each using a different Node module.

1) Using crawler-request

it('Read Pdf Data using crawler',function(){
        const crawler = require('crawler-request');
        function response_text_size(response){
            response["size"] = response.text.length;
            return response;
        }
        crawler("http://www.africau.edu/images/default/sample.pdf",response_text_size).then(function(response){
            // handle response

            console.log("Response =" + response.size);
        });

    });

The issue here is that nothing is printed to the console, although I expected to see the response size.

2) Using pfd2json/pdfparser

it('Read Data from url',function(){
        var request = require('request');
        var pdf = require('pfd2json/pdfparser');
        var fs = require('fs');
        var pdfUrl = "http://www.africau.edu/images/default/sample.pdf";
        let databuffer = fs.readFileSync(pdfUrl);
        pdf(databuffer).then(function(data){
            var arr:Array<String> = data.text;
            var n = arr.includes('Thursday 02 May');
            console.log("Print Array " + n);
        });

    });
  • Failed: ENOENT: no such file or directory, open ''

While I can access data from a local path successfully, extracting it from a URL seems to be causing issues.

Answer №1

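The problem lies in your use of the fs (File System) module to read a file from a remote server. fs.readFileSync treats the string it receives as a local path, so passing it a URL fails with ENOENT.

The module name in your require call also appears to be misspelled ('pfd2json' instead of 'pdf2json'), which would likely have resulted in an error as well.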

Make sure you have imported the request module. This will enable you to fetch the file from the remote location. Here's one approach to achieve this:

it('Read Data from url', function () {
    var request = require('request');
    var PDFParser = require('pdf2json');

    var pdfUrl = 'http://unec.edu.az/application/uploads/2014/12/pdf-sample.pdf';

    // The second argument (1) asks the parser to keep the raw text content
    var pdfParser = new PDFParser(this, 1);

    // Return a promise so the test runner waits for the download and the parsing to finish
    return new Promise(function (resolve, reject) {
        // Executed if there's an error during parsing
        pdfParser.on("pdfParser_dataError", errData => reject(errData.parserError));
        // Executed when parsing is complete
        pdfParser.on("pdfParser_dataReady", pdfData => {
            console.log(pdfParser.getRawTextContent());
            resolve();
        });

        // encoding: null makes request return the body as a Buffer, which is then passed to the pdf parser
        request({ url: pdfUrl, encoding: null }, (error, response, body) => {
            if (error) { return reject(error); }
            pdfParser.parseBuffer(body);
        });
    });
});

By following these steps, you should be able to read the remote .pdf file from within your application.
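
If you would rather not depend on request (the package has been deprecated since 2020), the download step can be sketched with Node's built-in http and https modules. This is only a minimal sketch under some assumptions: isRemoteUrl and loadPdfBuffer are hypothetical helper names, redirects are not followed, and any status other than 200 is treated as an error.

```javascript
const fs = require('fs');
const http = require('http');
const https = require('https');

// Returns true when `source` is an http(s) URL rather than a local path.
function isRemoteUrl(source) {
    return /^https?:\/\//i.test(source);
}

// Hypothetical helper: resolves with a Buffer holding the bytes of the file.
function loadPdfBuffer(source) {
    if (!isRemoteUrl(source)) {
        // Local path: fs can read it directly.
        return fs.promises.readFile(source);
    }
    // Remote URL: pick the client that matches the protocol.
    const client = source.toLowerCase().startsWith('https') ? https : http;
    return new Promise((resolve, reject) => {
        client.get(source, (res) => {
            if (res.statusCode !== 200) {
                res.resume(); // discard the body so the socket is released
                reject(new Error('Unexpected status code: ' + res.statusCode));
                return;
            }
            const chunks = [];
            res.on('data', (chunk) => chunks.push(chunk));
            res.on('end', () => resolve(Buffer.concat(chunks)));
            res.on('error', reject);
        }).on('error', reject);
    });
}
```

The Buffer that loadPdfBuffer resolves with could then be handed to pdfParser.parseBuffer in place of the body returned by request.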

If you wish to explore further capabilities, I suggest referring to the pdf2json documentation. This will help you extract textual content from the .pdf file once the parsing process is completed.
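
Since the original goal was to check whether the text contains 'Thursday 02 May', here is a rough sketch of that check once the raw text is available. The helper names normalizeWhitespace and textContainsPhrase are made up for illustration, and collapsing whitespace is my own assumption about how to cope with line breaks in the extracted text.

```javascript
// Collapse runs of whitespace (including line breaks) into single spaces so a
// phrase that is split across lines in the extracted text can still be matched.
function normalizeWhitespace(text) {
    return text.replace(/\s+/g, ' ').trim();
}

// Returns true when `phrase` occurs in `rawText`, ignoring whitespace differences.
function textContainsPhrase(rawText, phrase) {
    return normalizeWhitespace(rawText).includes(normalizeWhitespace(phrase));
}

// Intended use inside the "pdfParser_dataReady" handler (not executed here):
//   const found = textContainsPhrase(pdfParser.getRawTextContent(), 'Thursday 02 May');
//   console.log('Phrase found: ' + found);
```

Note that getRawTextContent() returns a single string, so String.prototype.includes does a substring search here rather than the array lookup used in the question.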
