Problem with repeated fields when importing JSON into Google BigQuery

I've been attempting to manually upload the JSON data into BigQuery, but I keep encountering the following error message.

Error occurred while reading data, error message: Parsing error in JSON at row starting from position 0: Repeated field must be imported as a JSON array. Field: custom_fields.value.

I have already converted the file into newline-delimited JSON, so that is not the issue here. Inspecting the custom_fields.value field mentioned in the error, I found the following:

$ cat convert2working.json | jq .custom_fields[].value
0
"Basics of information security\n"
"2021"

The crux of the problem appears to lie in the fact that custom_fields.value contains data with varying types.

How can I standardize these data types? Or could you suggest an alternative solution? Preferably, I would like to stick with JavaScript.

Below is a condensed excerpt of my JSON code:

{
    "id": "example",
    "custom_fields": [
        {
            "id": "example",
            "name": "Interval",
            "type": "drop_down",
            "type_config": {
                "default": 0,
                "placeholder": null,
                "options": [
                    {
                        "id": "example",
                        "name": "yearly",
                        "color": null,
                        "orderindex": 0
                    }
                ]
            },
            "date_created": "1611228211303",
            "hide_from_guests": false,
            "value": 0,
            "required": false
        },
        {
            "id": "example",
            "name": "Description",
            "type": "text",
            "type_config": {},
            "date_created": "1611228263444",
            "hide_from_guests": false,
            "value": "Basics of information security\n",
            "required": false
        },
        {
            "id": "example",
            "name": "Year",
            "type": "number",
            "type_config": {},
            "date_created": "1611228275285",
            "hide_from_guests": false,
            "value": "2021",
            "required": false
        }
    ]
}

Answer №1

To ensure successful schema auto-detection in BigQuery, it is essential to normalize your data structure. When the `value` property contains both numbers and strings interchangeably, this can cause the auto-detection process to fail.

There are several ways to normalize your data for BigQuery. It may take some experimentation to find the most effective one, because schema auto-detection scans only a limited sample of rows (up to the first 500 according to the current documentation; older versions cited 100), so a type that first appears later in the file is not taken into account.
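Before choosing a normalization, it can help to confirm which types actually occur in `value`. The following is a minimal sketch, assuming the records have already been parsed into an array; `valueTypes` and `records` are hypothetical names used only for illustration, and the sample data is trimmed from the question.

```javascript
// Hypothetical helper: collect the JavaScript types seen in `value`
// across the custom fields of all records.
function valueTypes(records) {
  const seen = new Set();
  for (const record of records) {
    for (const field of record.custom_fields || []) {
      // typeof null is "object", so report null explicitly.
      seen.add(field.value === null ? "null" : typeof field.value);
    }
  }
  return [...seen].sort();
}

// Trimmed version of the record from the question.
const records = [
  {
    id: "example",
    custom_fields: [
      { name: "Interval", type: "drop_down", value: 0 },
      { name: "Description", type: "text", value: "Basics of information security\n" },
      { name: "Year", type: "number", value: "2021" }
    ]
  }
];

console.log(valueTypes(records)); // [ 'number', 'string' ]
```

More than one entry in the result means `value` cannot be mapped to a single column type as-is.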

One possible solution is to segregate different value types into separate fields:

const fields = data.custom_fields.map(x => {
    const f = {
        id: x.id,
        name: x.name
    };

    // Store the value under a type-specific key, e.g. value_text or value_number.
    f['value_' + x.type] = x.value;

    return f;
});

This approach will result in:

[{
  id: "example",
  name: "Interval",
  value_drop_down: 0
}, {
  id: "example",
  name: "Description",
  value_text: "Basics of information security\n"
}, {
  id: "example",
  name: "Year",
  value_number: "2021"
}]

However, I'm not sure BigQuery can reliably merge the inferred schema from this structure, because which `value_*` fields it encounters depends on the rows it happens to sample.

A more dependable strategy (assuming knowledge of all possible `type` values) involves explicitly transforming records into a standardized format. This method allows for specialized transformations on field values if needed:

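// Every record gets the same three value_* keys, so the shape never varies.
const fields2 = data.custom_fields.map(x => ({
    id: x.id,
    name: x.name,
    value_text: x.type === 'text' ? x.value : null,
    // Parse numeric strings such as "2021" into real numbers.
    value_number: x.type === 'number' ? parseInt(x.value, 10) : null,
    // Resolve the stored option index to the option's name.
    value_dropdown: x.type === 'drop_down' ? x.type_config.options.find(o => o.orderindex === x.value).name : null
}));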

Depending on the dataset characteristics, adjustments to the transformation logic may be necessary to handle optional or empty values effectively. Using the provided example data, this transformation yields:

[{
  "id": "example",
  "name": "Interval",
  "value_text": null,
  "value_number": null,
  "value_dropdown": "yearly"
}, {
  "id": "example",
  "name": "Description",
  "value_text": "Basics of information security\n",
  "value_number": null,
  "value_dropdown": null
}, {
  "id": "example",
  "name": "Year",
  "value_text": null,
  "value_number": 2021,
  "value_dropdown": null
}]
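To handle optional or empty values, the same transformation can be written more defensively. This is only a sketch under some assumptions: `normalizeField` and `toNdjson` are hypothetical names, unknown `type` values are assumed to be acceptable as all-null rows, and `Number` is swapped in for `parseInt` so that decimal values are not truncated.

```javascript
// Hypothetical helper: map one custom field onto a fixed set of typed columns.
function normalizeField(field) {
  const out = {
    id: field.id,
    name: field.name,
    value_text: null,
    value_number: null,
    value_dropdown: null
  };

  // Treat missing, null, and empty-string values as "no value".
  if (field.value === undefined || field.value === null || field.value === "") {
    return out;
  }

  if (field.type === "text") {
    out.value_text = String(field.value);
  } else if (field.type === "number") {
    const n = Number(field.value);
    out.value_number = Number.isFinite(n) ? n : null;
  } else if (field.type === "drop_down") {
    // Guard against a missing type_config or an index with no matching option.
    const options = (field.type_config && field.type_config.options) || [];
    const match = options.find(o => o.orderindex === field.value);
    out.value_dropdown = match ? match.name : null;
  }
  // Any other type falls through with all value_* columns left as null.
  return out;
}

// Serialize records as newline-delimited JSON: one object per line.
function toNdjson(records) {
  return records
    .map(r => JSON.stringify({
      ...r,
      custom_fields: (r.custom_fields || []).map(normalizeField)
    }))
    .join("\n");
}

// Usage with a trimmed version of the record from the question.
const sample = {
  id: "example",
  custom_fields: [
    {
      id: "example", name: "Interval", type: "drop_down", value: 0,
      type_config: { options: [{ id: "example", name: "yearly", orderindex: 0 }] }
    },
    { id: "example", name: "Description", type: "text", type_config: {}, value: "Basics of information security\n" },
    { id: "example", name: "Year", type: "number", type_config: {}, value: "2021" },
    { id: "example", name: "Notes", type: "text", type_config: {} }
  ]
};

console.log(toNdjson([sample]));
```

Because every field object carries the same keys with consistent types, each column holds either one type or null, regardless of which rows are sampled.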

For experimenting with the code, you can access a JSFiddle [here](https://jsfiddle.net/pbgstveh/2/).
