I have been working on a solution based on the discussion in this thread, but I'm stuck because my JSON data is formatted differently from the example given there. This may be simpler than I think; any guidance or pointers would be greatly appreciated.
UPDATE: The JSON structure I'm dealing with looks like this:
[{"test":{"field1":"test123"},"info":"2021-10-04\nPage visit 09:57:33\n - URL: https://www.google.com/\nPage visit 09:57:50\n - URL: https://www.google.com/blah-blah-blah/\nPage visit 09:56:03\n - URL: https://google.com/random-text/blah/\n\n2021-11-04\nPage visit 13:46:03\n - URL: https://www.google.com/blah/blah-blah/\n\n"}]
My aim is to parse this string, extract each URL, date, and timestamp, combine each date with its timestamps into a single "yyyy-mm-dd hh:mm:ss" datetime (e.g., "2021-10-04 09:57:33"), and store each datetime alongside its URL in a two-column array.
I can isolate the URLs and dates with regex, but I can't work out how to pair each timestamp with the right date, since each date is listed only once, above the group of visits it applies to.
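// Extract URLs, dates, and times from the "info" string (held in `text`)
const urlMatches = text.match(/\bhttps?:\/\/\S+/gi);
const dateMatches = text.match(/(\d{1,4}([.\-/])\d{1,2}([.\-/])\d{1,4})/g);
// Match hh:mm:ss only; a looser \D separator also picks up date fragments such as "21-10-04"
const timeMatches = text.match(/\b\d{1,2}:\d{2}:\d{2}\b/g);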
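One way to keep each timestamp tied to its date might be to walk the string line by line and remember the most recent date and time seen, rather than running three independent match calls. Below is a minimal sketch, assuming the layout shown above: a date line, then "Page visit hh:mm:ss" lines, each followed by a " - URL: ..." line. The name pairVisits is a hypothetical helper, not something from the thread, and text is assumed to hold the "info" string (in the real case it would presumably come from something like JSON.parse(raw)[0].info).

```javascript
// Sample "info" string rebuilt from the JSON in the question (assumed layout).
const text = [
  "2021-10-04",
  "Page visit 09:57:33",
  " - URL: https://www.google.com/",
  "Page visit 09:57:50",
  " - URL: https://www.google.com/blah-blah-blah/",
  "Page visit 09:56:03",
  " - URL: https://google.com/random-text/blah/",
  "",
  "2021-11-04",
  "Page visit 13:46:03",
  " - URL: https://www.google.com/blah/blah-blah/",
  "",
  "",
].join("\n");

// Hypothetical helper: scans the text line by line, carrying the most recent
// date and time forward so that each URL can be paired with both.
function pairVisits(text) {
  const dateRe = /^(\d{4}-\d{2}-\d{2})$/;
  const timeRe = /^Page visit (\d{2}:\d{2}:\d{2})$/;
  const urlRe = /^- URL: (\S+)$/;

  const rows = [];
  let currentDate = null;
  let currentTime = null;

  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    let m;
    if ((m = dateRe.exec(line))) {
      currentDate = m[1];
      currentTime = null; // a new date block starts, so forget the old time
    } else if ((m = timeRe.exec(line))) {
      currentTime = m[1];
    } else if ((m = urlRe.exec(line)) && currentDate && currentTime) {
      // Two-column row: [datetime, url]
      rows.push([`${currentDate} ${currentTime}`, m[1]]);
    }
  }
  return rows;
}

const rows = pairVisits(text);
console.log(rows);
```

Because the date is carried forward until the next date line appears, the order of the visits within a block does not matter. Lines that fit none of the three patterns are simply skipped, so the patterns would need adjusting if the real data uses other date or time formats.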