Is it possible to retrieve HTML content using YQL?

Imagine the scenario where you need to retrieve information from a web page structured like this:

<table>
  <tr>
    <td><a href="Link 1">Column 1 Text</a></td>
    <td>Column 2 Text</td>
    <td>Column 3 Text</td>
  </tr>
  <tr>
    <td><a href="Link 2">Column 1 Text</a></td>
    <td>Column 2 Text</td>
    <td>Column 3 Text</td>
  </tr>
  ...
</table>

and transform it into JSON like this:

[
  {
    link: 'Link 1',
    text: 'Column 1 Text',
    data: 'Column 3 Text'
  },
  {
    link: 'Link 2',
    text: 'Column 1 Text',
    data: 'Column 3 Text'
  }
]

Is it possible to achieve this using YQL? If so, could you provide an example query?

Any assistance on this matter would be greatly appreciated!

Answer №1

Here is a query to start from. It uses YQL's html table together with an XPath expression (see Extracting HTML Content With XPath for more on this method):

select * from html where url="http://cantoni.org/test/table.html" and xpath='//table/tr'

This query returns JSON results with a structure like the following:

{
 "query": {
  "count": 2,
  "created": "2012-01-06T20:16:46Z",
  "lang": "en-US",
  "results": {
   "tr": [
    {
     "td": [
      {
       "a": {
        "href": "Link%201",
        "content": "Column 1 Text"
       }
      },
      {
       "p": "Column 2 Text"
      },
      {
       "p": "Column 3 Text"
      }
     ]
    },
    {
     "td": [
      {
       "a": {
        "href": "Link%202",
        "content": "Column 1 Text"
       }
      },
      {
       "p": "Column 2 Text"
      },
      {
       "p": "Column 3 Text"
      }
     ]
    }
   ]
  }
 }
}
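
The result above is not yet in the flat shape the question asks for, so some post-processing on the client is still needed. The following is a minimal Python sketch of that step, not part of the original answer: it assumes the response has exactly the structure shown above (link in the first cell, data in the third cell under a "p" key), and the names `sample_response` and `rows_to_records` are hypothetical. It also assumes the percent-encoded `href` values should be decoded back to plain text.

```python
from urllib.parse import unquote

# Trimmed copy of the response shown above; only the fields this sketch reads.
sample_response = {
    "query": {
        "count": 2,
        "results": {
            "tr": [
                {"td": [
                    {"a": {"href": "Link%201", "content": "Column 1 Text"}},
                    {"p": "Column 2 Text"},
                    {"p": "Column 3 Text"},
                ]},
                {"td": [
                    {"a": {"href": "Link%202", "content": "Column 1 Text"}},
                    {"p": "Column 2 Text"},
                    {"p": "Column 3 Text"},
                ]},
            ]
        },
    }
}


def rows_to_records(response):
    """Flatten the nested tr/td structure into link/text/data records."""
    rows = response["query"]["results"]["tr"]
    # Assumption: a page with a single row might yield one object instead
    # of a list, so normalize to a list before iterating.
    if isinstance(rows, dict):
        rows = [rows]

    records = []
    for row in rows:
        cells = row["td"]
        anchor = cells[0]["a"]  # first column holds the link
        records.append({
            "link": unquote(anchor["href"]),  # "Link%201" -> "Link 1"
            "text": anchor["content"],
            "data": cells[2]["p"],  # third column holds the data
        })
    return records


records = rows_to_records(sample_response)
print(records)
```

If the real page has empty cells or a different column layout, the indexing into `cells` would need to be adjusted accordingly.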
