Is there a reliable method for parsing a CSV file that contains JSON strings as values?

When parsing CSV files, I encountered values that include strings representing JSON objects, along with boolean, normal strings, and other data. The CSV file has a header, and as I loop through the non-header rows, I utilize Javascript's split method with a specific regex pattern to extract the value from each 'cell' of the CSV row:

let currentLine = lines[i].split(/,(?=(?:(?:[^"]*"){2})*[^"]*$)/)

However, the current method fails to properly separate lines containing valid JSON strings. Some strings are captured correctly, but others like the one below are truncated in strange places, disrupting the parsing of the entire CSV file:

'{"an object" : [{"sub-object 1": {
    "description": "sub-object is for blah blah",
    "nestedArray": ["param1","param2","param3"]}},
  {"sub-object 2": {
    "description": "sub-object is for blah blah",
    "nestedArray": ["param1","param2","param3"]}},
  {"sub-object 3": {
    "description": "sub-object is for blah blah",
    "nestedArray": ["param1","param2","param3"]}}
  ]
}'

Upon inspection, the above appears to be valid JSON (strangely, validates it when pasted without the single quotes - the reason behind this behavior is unclear). Any suggestions on how to correctly handle this parsing issue? While I initially thought it would be straightforward, embedding JSON within CSV values is proving to be a challenge. I'm unsure if there is a standard or recommended approach to address this.

EDIT: To clarify, if anyone has a regex pattern that can correctly capture the JSON string within a CSV file as shown above, I would greatly appreciate it if you could share it with me

Answer №1

This code snippet serves as a foundation for parsing CSV files in a versatile manner.

(?:(?:^|,|\r?\n)[^\S\r\n]*)(?:("[^"\\]*(?:\\[\S\s][^"\\]*)*"|'[^'\\]*(?:\\[\S\s][^'\\]*)*'|[^,\r\n]*)(?:[^\S\r\n]*(?=$|,|\r?\n)))

test: https://regex101.com/r/AnQqyv/1
Capture group 1 is used for trimming the fields

 (?:                           # Delimiter comma or newline
    (?: ^ | , | \r? \n )
    [^\S\r\n]*                    # leading optional whitespaces
 )
 (?:
    (                             # (1 start), field
       "                             # " Quoted
       [^"\\]* 
       (?: \\ [\S\s] [^"\\]* )*
       "
     |                              # or
       '                             # ' Quoted
       [^'\\]* 
       (?: \\ [\S\s] [^'\\]* )*
       '
     |                              # or
       [^,\r\n]*                     # Non-quoted
    )                             # (1 end)
    (?:
       [^\S\r\n]*                    # trailing optional whitespaces 
       (?= $ | , | \r? \n )          # Delimiter ahead, comma or newline
    )
 )

Similar questions

If you have not found the answer to your question or you are interested in this topic, then look at other similar questions below or use the search

Querying an array using the Contentful API

Recently, I've been experimenting with the Contentful API (content delivery npm module) and have encountered a challenge that I'm not sure how to overcome. In my Contentful setup, I have a content type called tag which consists of one field, als ...

Developing a customized ResourceConfig that mimics the default Jetty's Jersey registration process

I am currently working on an endpoint implementation that looks like this: @POST @Path("/test") @Consumes(MediaType.APPLICATION_JSON) @Produces(MediaType.APPLICATION_JSON) public String canaryTest(String JSON) { return JSON; } When I integrate this e ...

Having difficulty incorporating custom JavaScript files into a testing framework

I am facing a unique challenge while integrating the Page Object design pattern into my test suite using selenium-webdriver and node.js. The first page object, pageObject/admin/login/index.js, works seamlessly. It contains selectors and methods to fill ou ...

What is the process for transferring information from HTML to Python and then receiving the output in HTML?

I am completely unfamiliar with the connection between HTML and Python, so I am reaching out for some assistance. I hope that someone here can lend me a hand. Currently, my HTML is hosted on an Apache server, and I access the website using the address "". ...

Adjust fancybox height using jQuery

I am working on a project where I need to display a fancybox containing an iframe from another domain. The iframe has dynamic content and its height may change based on the pages it navigates to or the content it displays. I have access to the code of the ...

How to execute a system command or external command in Node.js

I am encountering an issue with Node.js. When using Python, I would typically perform an external command execution like this: import subprocess subprocess.call("bower init", shell=True) Although I have explored child_process.exec and spawn in Node.js, I ...

Retrieve the element by clicking on its individual tooltip

I am currently struggling with a jQuery UI tooltip issue. Specifically, I would like to retrieve the element that the tooltip is associated with when clicking on it. My Approach So Far $(".sample").tooltip({ content: function () { return $(t ...

AngularJS button click not redirecting properly with $location.path

When I click a button in my HTML file named `my.html`, I want to redirect the user to `about.html`. However, when I try using `$location.path("\about")` inside the controller, nothing happens and only my current page is displayed without loading `abou ...

Mysterious source utilizing leafletView for showcasing popups within the Angular Leaflet Directive

I've been attempting to implement a pop-up window on an Angular Leaflet map using Prune Cluster, but I keep running into an error stating that 'leafletView' is an unknown provider. Despite following the examples provided on this page, https: ...

Changes in a deep copy of an object in a child component are directly reflected in the parent object in VueJS

Let's begin by discussing the layout. I have a page dedicated to showcasing information about a specific company, with a component Classification.vue. This component displays categories of labels and the actual labels assigned to the current company. ...

Unlimited Possibilities in Designing Shared React Components

Seeking the most effective strategies for empowering developers to customize elements within my React shared component. For example, I have a dropdown and want developers to choose from predefined themes that allow them to define highlight color, font siz ...

Unable to refresh JSON information

My current task involves refreshing a "Bar chart" with JSON data on the server. However, I am facing an issue where Android is unable to update that JSON data even though I have tried using a runnable method. The fetched data remains constant and does not ...

Disabling the 'fixed navigation bar' feature for mobile devices

Here's a code snippet that I'm working with. It has the ability to make the navigation stick to the top of the page when scrolled to. HTML <script> $(document).ready(function() { var nav = $("#nav"); var position = nav.position(); ...

Tips for emphasizing specific sections of text in CodeMirror utilizing substring positions

I am currently utilizing CodeMirror () as a text editor with additional functionalities. One of these features includes highlighting specific words or groups of words based on their positions within the original string. I have an external structure that st ...

Encountered an issue: Error message stating that a Handshake cannot be enqueued as another Handshake has already been enqueued

I am currently working on setting up a node.js server to handle POST requests that involve inserting data into two separate MySQL tables. Below is the code snippet for my node.js server: let mysql = require("mysql"); const http = require('h ...

Disable link 2 when link 1 is clicked

Looking to create a feedback form with two exclusive links. Want to ensure that if someone clicks the first link, they cannot click the second link and vice versa. Interested in exploring options like using cookies to prevent multiple clicks or possibly ...

Ways to arrange an array in JavaScript or jQuery when each array record contains multiple objects

Before giving a negative vote, please note that I have thoroughly searched for solutions to this problem and found none similar to what I am facing. I am looking to alphabetically sort the array by image.name using JavaScript or jQuery: var myArray = [{ ...

Storing and Retrieving User Identifiers in Next.js

Currently, I am developing a project using Next.js and I have the requirement to securely store the userId once a user logs in. This unique identifier is crucial for accessing personalized user data and creating dynamic URLs for the user profile menu. The ...

What options are available for managing state in angularjs, similar to Redux?

Currently, I'm involved in an extensive project where we are developing a highly interactive Dashboard. This platform allows users to visualize and analyze various data sets through charts, tables, and more. In order to enhance user experience, we ha ...

Data from AngularFire not displaying in my list application

While going through tutorials on the Angular website, I encountered a roadblock while attempting to create a list that utilizes Firebase for data storage. Strangely, everything seems to be functional on the Angular site, but clicking on the "Edit Me" link ...