Can Node be made to write Unicode characters as surrogate pairs when writing JSON to a file?

From what I have read on this topic, JSON is supposed to be written using surrogate pairs automatically.

However, that has not been my experience.

When I run the code below with Node.js 6.9.2, some characters in the output file are still not encoded as surrogate pairs.

const fs = require('fs')

const infile = fs.readFile('raw.json', 'utf8', (err, data) => {
    if (err) {
        throw err
    }

    data = JSON.stringify(data)

    fs.writeFile('final.json', data, 'utf8', (err) => {
      if (err) {
        throw err
      }
      console.log('done')
    })

})

In my text editor, which supports Unicode well and uses a font with glyphs for all of these characters, I can see special characters such as "題" in raw.json.

After the script writes final.json, those characters are still there unchanged; they have not been converted into surrogate pairs.

I also tried changing the output encoding from utf8 to utf16le, but that made no difference.

Is there a way to force the use of surrogate pairs when encoding JSON?

Answer №1

The question rests on a misunderstanding if it assumes that JSON.stringify converts characters outside the Basic Multilingual Plane into a sequence of \u-escaped surrogate values. It does not: JSON.stringify escapes only the backslash (\), the double quote ("), and control characters (newer engines also escape unpaired surrogates, but never a valid pair).

So a character that takes more than one octet, like the '題' in the example, is written to the output file as that character and not as an escape sequence. As long as the file is written and read back with the same encoding, the character read back is the same one that was in the input.
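The behaviour described above can be checked with a minimal sketch. The sample string is an assumption chosen for illustration; it mixes a BMP character, an emoji, a quote, a backslash, and a newline.

```javascript
// Sample input (hypothetical): BMP character, emoji, quotes, backslash, newline.
const sample = '題😍 "quoted" back\\slash\nnewline';

// JSON.stringify escapes the quotes, the backslash, and the newline,
// but leaves '題' and the emoji as literal characters.
const encoded = JSON.stringify(sample);
console.log(encoded);

// The encoded text contains no \u escapes at all.
console.log(encoded.includes('\\u')); // false
```

Parsing the result with JSON.parse gives back the original string, so nothing is lost by leaving the characters unescaped.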

If the goal is to convert JSON text to pure ASCII, using \u escapes for non-ASCII characters and surrogate pairs for characters outside the BMP, it is enough to scan the JSON string one character at a time, because JSON.stringify has already taken care of quotes, backslashes, and control characters:

var jsonComponent = '"2®π≤題😍"'; // sample JSON text (a JSON string literal)

// Replace every UTF-16 code unit at or above 0x7F with a \uXXXX escape.
// Indexing a string yields code units, so a character outside the BMP
// comes out as two escapes: its high and low surrogates.
function jsonToAscii(jsonText) {
    var s = "";

    for (var i = 0; i < jsonText.length; ++i) {
        var c = jsonText[i];
        if (c >= '\x7F') {
            // Zero-pad the hex value to four digits.
            var hex = c.charCodeAt(0).toString(16);
            c = "\\u" + ("0000" + hex).slice(-4);
        }
        s += c;
    }
    return s;
}

console.log(jsonToAscii(jsonComponent));

This works because JavaScript strings are already UTF-16, surrogate pairs included, while index notation and the .charAt method expose them as a sequence of 16-bit UCS-2 code units. Note that '題' is in the BMP and takes only two octets in UTF-16, whereas the emoji lies outside plane 0 and takes four octets (a surrogate pair).
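To make the code-unit point concrete, here is a small sketch that inspects the two characters from the example. The helper name codeUnits is hypothetical, introduced only for this illustration.

```javascript
// Hypothetical helper: list the UTF-16 code units of a string as hex.
function codeUnits(text) {
  const units = [];
  for (let i = 0; i < text.length; i++) {
    units.push(text.charCodeAt(i).toString(16));
  }
  return units;
}

console.log(codeUnits('題'));  // [ '984c' ]          one code unit (BMP)
console.log(codeUnits('😍')); // [ 'd83d', 'de0d' ]  a surrogate pair

// Byte counts in UTF-16: two octets for '題', four for the emoji.
console.log(Buffer.byteLength('題', 'utf16le'));  // 2
console.log(Buffer.byteLength('😍', 'utf16le')); // 4
```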

If that is not the goal, there is probably nothing to worry about.
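If ASCII-only output is the goal, one way to gain confidence in the escaping is a round-trip check. The following is a sketch only: escapeNonAscii is a hypothetical regex-based variant of the same idea, and the sample object is made up for illustration.

```javascript
// Hypothetical helper: escape each UTF-16 code unit >= 0x7F as \uXXXX.
// Without the `u` flag the regex matches single code units, so an emoji
// is replaced by two escapes (its surrogate pair).
function escapeNonAscii(jsonText) {
  return jsonText.replace(/[\u007f-\uffff]/g, (unit) =>
    '\\u' + ('0000' + unit.charCodeAt(0).toString(16)).slice(-4));
}

// Made-up sample data.
const original = { title: '題', mood: '😍', note: 'plain "ascii"' };

const ascii = escapeNonAscii(JSON.stringify(original));
console.log(ascii);
// {"title":"\u984c","mood":"\ud83d\ude0d","note":"plain \"ascii\""}

// The escaped text is still valid JSON and parses back to the same data.
const roundTripped = JSON.parse(ascii);
console.log(roundTripped.title === original.title); // true
```

The resulting string contains only ASCII, so it can be passed to fs.writeFile with any ASCII-compatible encoding.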
