I committed the example to Mozilla's pdf.js repository, where it is available in the examples directory.
The example I originally contributed to pdf.js no longer exists, but I believe that this one demonstrates text selection. The pdf.js code has since been reorganized, and the text-selection logic now lives inside the text layer, which is created by a factory. Specifically, PDFJS.DefaultTextLayerFactory takes care of the basics of text selection.
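As a rough illustration of the factory-based approach, here is a minimal sketch. It assumes the legacy global PDFJS namespace together with the viewer components bundle of that era (which provided PDFJS.PDFPageView), so it will not work unchanged with current pdf.js releases. The helper name createSelectablePageView is hypothetical, and the PDFJS object is passed in as a parameter only to keep the sketch self-contained.

```javascript
// Sketch only: assumes the legacy `PDFJS` namespace plus the viewer
// components of that era. createSelectablePageView is a hypothetical helper.
function createSelectablePageView(PDFJS, container, pdfPage, scale) {
  var pageView = new PDFJS.PDFPageView({
    container: container,                        // element that receives the page
    id: pdfPage.pageNumber,                      // 1-based page number
    scale: scale,
    defaultViewport: pdfPage.getViewport(scale),
    // The factory creates the text layer that makes selection possible.
    textLayerFactory: new PDFJS.DefaultTextLayerFactory()
  });
  pageView.setPdfPage(pdfPage);                  // associate the page with the view
  return pageView;                               // the caller renders it with draw()
}
```

With a page obtained from pdf.getPage(1), calling createSelectablePageView(PDFJS, document.getElementById("pdfContainer"), page, 1).draw() should render the canvas and lay the selectable text layer over it.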
Note: The following example is outdated and retained here for historical reference.
I struggled with this problem for 2-3 days, but I finally figured it out. A demo that shows how to load a PDF with text selection enabled is available here.
The challenge lay in disentangling the text-selection logic from the viewer code (viewer.js, viewer.html, viewer.css). To get it working, I had to extract the relevant code and CSS (the JavaScript file referenced there can also be accessed here). The result is a minimal demo that should prove helpful. To implement text selection properly, the CSS in viewer.css is also extremely important, because it sets up the styles for the divs that are created later and used to make text selection work.
The heavy lifting is done by the TextLayerBuilder object, which creates the selection divs. You can see references to this object within viewer.js.
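To make the idea concrete, here is a simplified sketch of what building selection divs involves: each run of text gets one absolutely positioned div placed over the spot where the canvas drew that text. This is not the actual TextLayerBuilder code. The function layoutTextDivs and the item fields (str, x, y, fontSize) are hypothetical, and the baseline adjustment is only an approximation.

```javascript
// Illustrative sketch, NOT the real TextLayerBuilder implementation.
// layoutTextDivs and the item fields (str, x, y, fontSize) are hypothetical.
function layoutTextDivs(items, pageHeight, scale) {
  return items.map(function (item) {
    var fontSize = item.fontSize * scale;
    return {
      text: item.str,
      style: {
        position: "absolute",
        left: item.x * scale + "px",
        // PDF y grows upward from the bottom edge; CSS top grows downward.
        // Subtracting the font size roughly moves from the text baseline
        // to the top of the glyph box.
        top: (pageHeight - item.y) * scale - fontSize + "px",
        fontSize: fontSize + "px"
      }
    };
  });
}
```

In a browser, each returned descriptor could be turned into a div (set textContent, copy the style properties onto div.style) and appended to the .textLayer container. The transparent text colour from the CSS then keeps the divs invisible while still letting the browser select their text.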
Below are the code and the associated CSS. Keep in mind that you will still need the pdf.js file. My fiddle links to a customized version taken from Mozilla's GitHub repo for pdf.js. I chose not to link to the repo's version directly, because it is under constant development and may break.
HTML:
<html>
    <head>
        <title>Basic pdf.js text-selection showcase</title>
    </head>
    <body>
        <div id="pdfContainer" class="pdf-content">
        </div>
    </body>
</html>
CSS:
.pdf-content {
    border: 1px solid #000000;
}
/* CSS classes used by TextLayerBuilder to style the text layer divs */
/* Important: otherwise the text in the divs becomes visible when it is selected */
::selection { background: rgba(0,0,255,0.3); }
::-moz-selection { background: rgba(0,0,255,0.3); }
.textLayer {
    position: absolute;
    left: 0;
    top: 0;
    right: 0;
    bottom: 0;
    color: #000;
    font-family: sans-serif;
    overflow: hidden;
}
.textLayer > div {
    color: transparent;
    position: absolute;
    line-height: 1;
    white-space: pre;
    cursor: text;
}
.textLayer .highlight {
    margin: -1px;
    padding: 1px;
    background-color: rgba(180, 0, 170, 0.2);
    border-radius: 4px;
}
.textLayer .highlight.begin {
    border-radius: 4px 0px 0px 4px;
}
.textLayer .highlight.end {
    border-radius: 0px 4px 4px 0px;
}
.textLayer .highlight.middle {
    border-radius: 0px;
}
.textLayer .highlight.selected {
    background-color: rgba(0, 100, 0, 0.2);
}
JavaScript:
// Minimal PDF rendering and text-selection demo using pdf.js, by Vivin Suresh Paliath (http://vivin.net)
// This fiddle uses a compiled version of pdf.js that contains all the modules it requires.
//
// For simplicity, the PDF data is not fetched from an external source. It is stored inline instead.
//
// Text selection was hard to figure out because the selection logic is heavily intertwined with viewer.html and viewer.js.
// I extracted the relevant parts into a separate file that implements text selection only. The key component is
// TextLayerBuilder, which handles the creation of the text-selection divs. It is added as an external resource.
//
// This demo renders a single-page PDF. It can be extended to render more pages; the focus here is text selection.
// The included CSS is also important, because it sets up the styling for the text overlays used during selection.
//
// For reference, this is the PDF document being rendered:
// http://vivin.net/pub/pdfjs/TestDocument.pdf
var pdfBase64 = "..."; //contains base64 representing the PDF
var scale = 1; //Set zoom factor as required.
/**
 * Converts a base64 string into a Uint8Array
 */
function base64ToUint8Array(base64) {
    var raw = atob(base64); // binary string, one character per byte
    var uint8Array = new Uint8Array(raw.length);
    for (var i = 0; i < raw.length; i++) {
        uint8Array[i] = raw.charCodeAt(i);
    }
    return uint8Array;
}
function loadPdf(pdfData) {
    PDFJS.disableWorker = true;
    var pdf = PDFJS.getDocument(pdfData);
    pdf.then(renderPdf);
}
function renderPdf(pdf) {
    pdf.getPage(1).then(renderPage);
}
function renderPage(page) {
    var viewport = page.getViewport(scale);
    var $canvas = jQuery("<canvas></canvas>");
    var canvas = $canvas.get(0);
    var context = canvas.getContext("2d");
    canvas.height = viewport.heigh...