Transforming the output byte array into a Blob results in file corruption

I am currently developing an Add-in for Word using Angular and the Office JavaScript API.

My goal is to retrieve a Word document through the API, convert it to a file, and then upload it to a server via a POST request.

The code I have implemented (shown below) closely resembles the sample code provided by Microsoft in their documentation.

The server endpoint requires multipart form uploads, so I am creating a FormData object and appending the file (as a blob) along with some metadata when making the $http call.

Although the file is successfully transmitted to the server, upon opening it, I discovered that it was corrupted and could not be opened in Word.

Inspecting the output of Office.context.document.getFileAsync, I found that the returned byte array is converted into a string named fileContent. Logging this string to the console shows what looks like compressed data, as expected.

My assumption is that there might be a preprocessing step required before converting the string to a Blob. However, Base64 conversion with atob (which decodes rather than encodes) made no difference.

let sendFile = (fileContent) => {

  let blob = new Blob([fileContent], {
      type: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
    }),
    fd = new FormData();

  blob.lastModifiedDate = new Date();

  fd.append('file', blob, 'uploaded_file_test403.docx');
  fd.append('case_id', caseIdReducer.data());

  $http.post('/file/create', fd, {
      transformRequest: angular.identity,
      headers: {
        'Content-Type': undefined
      }
    })
    .success(() => {

      console.log('upload succeeded');

    })
    .error(() => {
      console.log('upload failed');
    });

};


function onGotAllSlices(docdataSlices) {

  let docdata = [];

  for (let i = 0; i < docdataSlices.length; i++) {
    docdata = docdata.concat(docdataSlices[i]);
  }

  let fileContent = new String();

  for (let j = 0; j < docdata.length; j++) {
    fileContent += String.fromCharCode(docdata[j]);
  }

  // Now all the file content is stored in 'fileContent' variable,
  // you can do something with it, such as print, fax...

  sendFile(fileContent);

}

function getSliceAsync(file, nextSlice, sliceCount, gotAllSlices, docdataSlices, slicesReceived) {
  file.getSliceAsync(nextSlice, (sliceResult) => {

    if (sliceResult.status === 'succeeded') {
      if (!gotAllSlices) { // Failed to get all slices, no need to continue.
        return;
      }

      // Got one slice, store it in a temporary array.
      // (Or you can do something else, such as
      // send it to a third-party server.)
      docdataSlices[sliceResult.value.index] = sliceResult.value.data;
      if (++slicesReceived === sliceCount) {
        // All slices have been received.
        file.closeAsync();

        onGotAllSlices(docdataSlices);

      } else {
        getSliceAsync(file, ++nextSlice, sliceCount, gotAllSlices, docdataSlices, slicesReceived);
      }
    } else {

      gotAllSlices = false;
      file.closeAsync();
      console.log(`getSliceAsync Error: ${sliceResult.error.message}`);
    }
  });
}

// User clicks button to start document retrieval from Word and uploading to server process
ctrl.handleClick = () => {

  Office.context.document.getFileAsync(Office.FileType.Compressed, {
      sliceSize: 65536 /*64 KB*/
    },
    (result) => {
      if (result.status === 'succeeded') {

        // If the getFileAsync call succeeded, then
        // result.value will return a valid File Object.
        let myFile = result.value,
          sliceCount = myFile.sliceCount,
          slicesReceived = 0,
          gotAllSlices = true,
          docdataSlices = [];

        // Get the file slices.
        getSliceAsync(myFile, 0, sliceCount, gotAllSlices, docdataSlices, slicesReceived);

      } else {

        console.log(`Error: ${result.error.message}`);

      }
    }
  );
};

Answer №1

In my approach, I used the fileContent string to generate an array of bytes:

let bytes = new Uint8Array(fileContent.length);

for (let i = 0; i < bytes.length; i++) {
    bytes[i] = fileContent.charCodeAt(i);
}

These bytes were then used to create a Blob:

let blob = new Blob([bytes], { type: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document' });

Sending this Blob via a POST request resulted in a file that could be opened correctly by Word.
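To see why the round trip through a typed array matters, here is a minimal sketch. The sample bytes are made up for illustration, and it assumes an environment with a global Blob (a modern browser or Node 18+). The Blob constructor encodes string parts as UTF-8, so every character with a code of 128 or above is written as two bytes, while a Uint8Array is copied through unchanged:

```javascript
// Illustrative bytes only. 0x50 0x4B 0x03 0x04 is the ZIP signature that a
// .docx starts with; 0x80 and 0xFF stand in for compressed data.
const docdata = [0x50, 0x4b, 0x03, 0x04, 0x80, 0xff];

// The question's approach: one character per byte, then Blob([string]).
let fileContent = '';
for (const b of docdata) {
  fileContent += String.fromCharCode(b);
}
const stringBlob = new Blob([fileContent]);

// This answer's approach: copy the char codes back into a typed array.
const bytes = new Uint8Array(fileContent.length);
for (let i = 0; i < bytes.length; i++) {
  bytes[i] = fileContent.charCodeAt(i);
}
const byteBlob = new Blob([bytes]);

console.log(stringBlob.size); // 8: 0x80 and 0xFF each became two UTF-8 bytes
console.log(byteBlob.size);   // 6: same length as the input
```

The two extra bytes in the first Blob are exactly the kind of change that makes a ZIP-based format such as .docx unreadable.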

I believe there may be a simpler solution out there with fewer steps involved. If anyone has any suggestions for improvement, I would love to hear them.

Answer №2

Thanks for the helpful response! Using Uint8Array proved to be the perfect solution in this case. To make it even better, here's a slight tweak to prevent unnecessary string creation:

// docdata is the flat array of byte values built in onGotAllSlices
let byteArray = new Uint8Array(docdata.length);
for (let index = 0; index < docdata.length; index++) {
    byteArray[index] = docdata[index];
}
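Going one step further, the intermediate docdata array can probably be skipped as well by copying each slice straight into one preallocated Uint8Array. This is only a sketch: slicesToBytes is a hypothetical helper name, and it assumes each slice's data is an array of byte values, as it appears to be for Office.FileType.Compressed:

```javascript
// Hypothetical helper: joins the per-slice byte arrays into one Uint8Array.
function slicesToBytes(docdataSlices) {
  // Work out the total length first so the buffer is allocated once.
  const total = docdataSlices.reduce((sum, slice) => sum + slice.length, 0);
  const bytes = new Uint8Array(total);
  let offset = 0;
  for (const slice of docdataSlices) {
    bytes.set(slice, offset); // copy this slice in at the current offset
    offset += slice.length;
  }
  return bytes;
}

// Stand-in data; in the add-in these would be the sliceResult.value.data arrays.
const bytes = slicesToBytes([[0x50, 0x4b], [0x03, 0x04, 0xff]]);
console.log(bytes.length); // 5
```

The result could then be handed to sendFile and wrapped as new Blob([bytes], { type: ... }) there.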

Answer №3

Seriously, why doesn't the API just hand back a File instance? Come on, Microsoft!

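It's risky to convert binary data into a string in JavaScript. When the string is later written out (for example by the Blob constructor) it is encoded as UTF-8, so any byte value of 128 or above turns into two bytes and the file ends up corrupted. The safer option is to pass the byte array into the Blob constructor.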

Try this approach:

var byteArray = new Uint8Array(3)
byteArray[0] = 97
byteArray[1] = 98
byteArray[2] = 99
new Blob([byteArray])

If the chunk is a typed array or a blob/file instance, you can simply do:

blob = new Blob([blob, chunk])
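Applied to the slices in the question, that pattern might look like the sketch below. The slice values are stand-ins, and it assumes each slice arrives as a plain array of byte values, which is why each one is wrapped in a Uint8Array first (a plain array passed to Blob would be converted to a string):

```javascript
// Stand-in slices; in the add-in these would come from getSliceAsync.
const docdataSlices = [[0x50, 0x4b, 0x03], [0x04, 0x80], [0xff]];

let blob = new Blob([]);
for (const slice of docdataSlices) {
  const chunk = new Uint8Array(slice); // typed array, so the bytes are kept as-is
  blob = new Blob([blob, chunk]);      // append the chunk to what we have so far
}

console.log(blob.size); // 6
```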

And please avoid base64 encoding it, as that makes the file roughly a third larger and slower to process.
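A quick way to check the size cost, as a rough sketch with made-up data (btoa is available in browsers and as a global in recent Node versions):

```javascript
// 3000 arbitrary bytes standing in for file content.
const raw = new Uint8Array(3000).fill(0xab);

// btoa expects a "binary string" with one character per byte.
const encoded = btoa(String.fromCharCode(...raw));

console.log(raw.length);     // 3000
console.log(encoded.length); // 4000: every 3 bytes become 4 characters
```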
