I'm working with a Uint8Array that contains the content of a PDF file. My goal is to locate a specific string within this array in order to insert additional content at that particular position.
My current approach involves converting the Uint8Array into a string and then searching for the desired string within that newly created string.
Here's a snippet of my code:
const pdfStr = new TextDecoder('utf-8').decode(array);
// find ByteRange
const byteRangePos = this.getSubstringIndex(pdfStr, '/ByteRange [', 1);
if (byteRangePos === -1) {
  throw new Error(
    'Failed to locate ByteRange.'
  );
}
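// Returns the index of the n-th occurrence of substring in str, or -1 if there are fewer than n.
getSubstringIndex = (str, substring, n) => {
  // Start at -1 so the first search begins at index 0
  // (starting from null made index + 1 evaluate to 1, skipping a match at position 0).
  let index = -1;
  for (let times = 0; times < n; times++) {
    index = str.indexOf(substring, index + 1);
    if (index === -1) break; // fewer than n occurrences
  }
  return index;
}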
array = this.updateArray(array, (byteRangePos + '/ByteRange '.length), byteRange);
The problem is that UTF-8 encodes each character using a variable number of bytes (1 to 4), so any multi-byte sequence in the array decodes to fewer characters than bytes. As a result, the decoded string is shorter than the Uint8Array, and the index returned by the string search does not match the byte offset of the '/ByteRange' string in the array, so the content is inserted at the wrong position.
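To illustrate the mismatch, here is a minimal sketch using made-up sample bytes rather than a real PDF. The names `marker`, `sample` and `decoded` are only for this example: a two-byte UTF-8 sequence placed before the marker shifts the string index by one relative to the byte offset.

```javascript
// Hypothetical sample data: "é" encoded as UTF-8 (0xC3 0xA9), followed by the marker bytes.
const marker = new TextEncoder().encode('/ByteRange [');
const sample = new Uint8Array([0xc3, 0xa9, ...marker]);

const decoded = new TextDecoder('utf-8').decode(sample);

// The two-byte sequence becomes a single character, so the string is shorter than the array.
const stringLength = decoded.length; // 13
const byteLength = sample.length;    // 14

// The string index is one less than the real byte offset of the marker.
const stringIndex = decoded.indexOf('/ByteRange ['); // 1
const byteOffset = 2; // the marker starts at byte 2 in `sample`

console.log({ stringLength, byteLength, stringIndex, byteOffset });
```

Every additional multi-byte sequence before the marker widens the gap further.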
Is there a way to get a string representation of the Uint8Array in which every byte maps to exactly one character, similar to ASCII?
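To make the goal concrete, the behaviour I am after is roughly what the sketch below does by hand. `bytesToBinaryString` is a hypothetical helper written only for this question, not an existing API, and I have not checked how well it scales to large files. It maps each byte to the character with the same code (0 to 255), so string indices equal byte offsets.

```javascript
// Hypothetical helper: one character per byte, so indices line up with byte offsets.
function bytesToBinaryString(bytes) {
  const chunkSize = 0x8000; // convert in chunks to keep argument lists small
  let result = '';
  for (let i = 0; i < bytes.length; i += chunkSize) {
    // String.fromCharCode treats each byte value as a character code.
    result += String.fromCharCode.apply(null, bytes.subarray(i, i + chunkSize));
  }
  return result;
}

// Made-up sample: "é" in UTF-8 (two bytes), then "/B".
const sampleBytes = new Uint8Array([0xc3, 0xa9, 0x2f, 0x42]);
const binaryString = bytesToBinaryString(sampleBytes);

console.log(binaryString.length);        // 4, same as sampleBytes.length
console.log(binaryString.indexOf('/B')); // 2, the real byte offset
```

The resulting string is only useful for searching: non-ASCII text in it is not readable, but the positions match the original array.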