Tips for retrieving specific information from Wikipedia using AJAX

Is there a way to use AJAX to retrieve the information that is consistently displayed in the box on the right-hand side (the infobox) during searches? I've tried the Wikipedia API, but haven't been able to find the specific information I need.

https://i.sstatic.net/wqJEc.png

Answer №1

I never expected to spend this much time answering a question on Stack Overflow.

Below is a crude but working snippet:

// Wikipedia article title, as it appears in the URL
const wiki_article_title = 'Grand_Theft_Auto_V';

// please check https://www.mediawiki.org/wiki/API:Get_the_contents_of_a_page
const url_api = `https://en.wikipedia.org/w/api.php?action=parse&page=${wiki_article_title}&prop=text&formatversion=2&origin=*`;

function extractInfoboxFromWiki(doc) {
  // without a format parameter the API answers with an HTML page;
  // the JSON payload sits inside its <pre> element
  const json = doc.querySelector('pre');
  const obj = JSON.parse(json.innerText);
  let html = obj.parse.text;

  // literal "\n" sequences (a backslash followed by n) can remain in the HTML text,
  // so strip them here; JSON.parse() has already run, this is only a cleanup step
  html = html.replace(/\\n/gm, '');

  // get the interesting part of the API response
  const node = document.createElement('div');
  node.innerHTML = html;
  const infobox = node.querySelector('.infobox');
  // table.rows avoids relying on <tbody> being the table's first child node
  let infos = [...infobox.rows];

  let output = {};

  // parse title
  output['title'] = infos[0].querySelector('th').innerText;
  infos.shift();

  // parse image url
  output['image_url'] = infos[0].querySelector('a').getAttribute("href");
  infos.shift();

  // traverse the remaining rows to map captions to values
  infos.forEach( tr => {
    const th = tr.querySelector('th');
    // skip rows without a caption cell instead of throwing on null
    if (!th) return;
    const key = th.innerText;

    if(tr.querySelector('ul')) {
      // list-valued field: collect every item
      const lis = tr.querySelectorAll('li');
      const values = [...lis].map( li => li.innerText);
      output[key] = values;
    } else {
      const td = tr.querySelector('td');
      output[key] = td ? td.innerText : '';
    }

  });

  // return beautified json
  return JSON.stringify(output, null, 4);
}

fetch(url_api)
  .then(response => response.text())
  .then(text => {
    const parser = new DOMParser();
    const doc = parser.parseFromString(text, 'text/html');

    const DESIRED_RESULT = extractInfoboxFromWiki(doc);
    const formattedOutput = `<pre>${DESIRED_RESULT}</pre>`;

    document.write(formattedOutput);
  });
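
The snippet above reads the JSON out of the HTML page that the API returns by default. As a possible simplification, the request can ask for plain JSON with `format=json`, which makes the `DOMParser` step unnecessary. This is only a sketch under that assumption: `buildParseUrl` and `fetchArticleHtml` are hypothetical helper names, not part of the original answer, and the `host` default is assumed to be the English Wikipedia.

```javascript
// Hypothetical helper: builds an action=parse URL that returns plain JSON.
// URLSearchParams takes care of encoding the article title.
function buildParseUrl(title, host = 'https://en.wikipedia.org') {
  const params = new URLSearchParams({
    action: 'parse',
    page: title,
    prop: 'text',
    format: 'json',      // ask for JSON instead of the default HTML view
    formatversion: '2',  // parse.text is then a plain HTML string
    origin: '*',         // allow anonymous cross-origin requests
  });
  return `${host}/w/api.php?${params}`;
}

// Hypothetical helper: fetches the rendered article HTML as a string.
async function fetchArticleHtml(title) {
  const response = await fetch(buildParseUrl(title));
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  const data = await response.json();
  // the API reports failures such as a missing page in an "error" object
  if (data.error) throw new Error(data.error.info);
  return data.parse.text;
}
```

The string returned by `fetchArticleHtml('Grand_Theft_Auto_V')` could then be assigned to `node.innerHTML` exactly as in the function above, skipping the `<pre>` lookup and the `JSON.parse()` call.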

If you test it with the Grand Theft Auto V article, you will observe:

{
    "title": "Grand Theft Auto V",
    "image_url": "/wiki/File:Grand_Theft_Auto_V.png",
    "Developer(s)": "Rockstar North[a]",
    "Publisher(s)": "Rockstar Games",
    "Producer(s)": [
        "Leslie Benzies",
        "Imran Sarwar"
    ],
    "Designer(s)": [
        "Leslie Benzies",
        "Imran Sarwar"
    ],
    "Programmer(s)": "Adam Fowler",
    "Artist(s)": "Aaron Garbut",
    "Writer(s)": [
        "Dan Houser",
        "Rupert Humphries",
        "Michael Unsworth"
    ],
    "Composer(s)": [
        "Tangerine Dream",
        "Woody Jackson",
        "The Alchemist",
        "Oh No"
    ],
    "Series": "Grand Theft Auto",
    "Engine": "RAGE",
    "Platform(s)": [
        "PlayStation 3",
        "Xbox 360",
        "PlayStation 4" ...
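
Note that the `image_url` value in the output above is a site-relative link to the file's description page, not a full URL. A minimal sketch of resolving it against the wiki's origin follows. `toAbsoluteWikiUrl` is a hypothetical helper name, and the origin is assumed to be the English Wikipedia, as in the request URL.

```javascript
// Assumption: the article was fetched from en.wikipedia.org.
const WIKI_ORIGIN = 'https://en.wikipedia.org';

// Hypothetical helper: resolves a site-relative href against the wiki origin.
function toAbsoluteWikiUrl(href, origin = WIKI_ORIGIN) {
  return new URL(href, origin).href;
}

console.log(toAbsoluteWikiUrl('/wiki/File:Grand_Theft_Auto_V.png'));
// prints "https://en.wikipedia.org/wiki/File:Grand_Theft_Auto_V.png"
```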
