When Projector.unprojectVector() is called, it treats the Vector3 as a position, so the result includes the camera's translation. That is why we call .sub(camera.position) on it: subtracting the camera's position turns the unprojected point into a vector pointing away from the camera. After this subtraction, the vector still needs to be normalized before it can serve as a ray direction.
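The subtract-then-normalize step can be sketched in plain JavaScript. This is not the three.js code itself; `cameraPosition` and `unprojectedPoint` are hypothetical stand-ins for `camera.position` and the result of unprojection.

```javascript
// Turn an unprojected world-space point into a ray direction.
// `cameraPosition` and `unprojectedPoint` are hypothetical inputs.
function directionFrom(cameraPosition, unprojectedPoint) {
  // Subtracting the camera position converts the point into a vector
  // pointing from the camera toward that point.
  const d = {
    x: unprojectedPoint.x - cameraPosition.x,
    y: unprojectedPoint.y - cameraPosition.y,
    z: unprojectedPoint.z - cameraPosition.z,
  };
  // Normalizing keeps only the direction, discarding the distance.
  const len = Math.hypot(d.x, d.y, d.z);
  return { x: d.x / len, y: d.y / len, z: d.z / len };
}
```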
In this post, I will be adding some graphics to help illustrate these concepts. For now, let's delve into the geometry behind these operations.
Geometrically, imagine the camera's viewing volume as a pyramid with the tip cut off - a frustum. It is bounded by 6 planes: left, right, top, bottom, near, and far (with near being the plane closest to the tip).
If we could watch these operations in a 3D environment, we would see this pyramid sitting at some arbitrary position and rotation in space. Let's put the pyramid's origin at its tip, with its negative z-axis pointing toward the base - the same convention cameras use, looking down -z.
Anything contained within these 6 planes will ultimately be rendered on our screen through the application of various matrix transformations. In OpenGL, the sequence typically looks like:
clipCoordinates = projectionMatrix * viewMatrix * modelMatrix * position.xyzw;
// NDC = clipCoordinates.xyz / clipCoordinates.w (the perspective divide)
This series of transformations takes a vertex from object space to world space, then to camera (view) space, and finally through the perspective projection matrix, which squeezes everything inside the frustum into a small cube spanning -1 to 1 on each axis - normalized device coordinates (NDC).
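The chain above can be sketched with plain arrays as row-major 4x4 matrices, so nothing depends on a math library. The model matrix below is a toy translation; view and projection are left as identity just to show the plumbing.

```javascript
// Row-major 4x4 matrix multiply.
function mulMat4(a, b) {
  const out = new Array(16).fill(0);
  for (let r = 0; r < 4; r++)
    for (let c = 0; c < 4; c++)
      for (let k = 0; k < 4; k++)
        out[4 * r + c] += a[4 * r + k] * b[4 * k + c];
  return out;
}

// Apply a row-major 4x4 matrix to a homogeneous vector [x, y, z, w].
function applyMat4(m, v) {
  return [0, 1, 2, 3].map(r =>
    m[4 * r + 0] * v[0] + m[4 * r + 1] * v[1] +
    m[4 * r + 2] * v[2] + m[4 * r + 3] * v[3]);
}

const identity = [1,0,0,0, 0,1,0,0, 0,0,1,0, 0,0,0,1];
const model = [1,0,0,1, 0,1,0,2, 0,0,1,3, 0,0,0,1]; // translate by (1, 2, 3)
const view = identity, projection = identity;       // toy values for brevity

// clip = projectionMatrix * viewMatrix * modelMatrix * position
const clip = applyMat4(mulMat4(projection, mulMat4(view, model)), [0, 0, 0, 1]);
// NDC = clip.xyz / clip.w (the perspective divide)
const ndc = clip.slice(0, 3).map(c => c / clip[3]);
```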
Object space is the xyz coordinate system in which a mesh is generated procedurally or modeled by an artist - usually centered and aligned neatly with the axes for symmetry. Architectural models from programs like REVIT or AutoCAD often are not aligned this way; an additional objectMatrix can sit between the model matrix and view matrix to handle such adjustments beforehand.
Our screen is flat and 2D, but we can imagine it having depth, like the NDC cube. Because the screen is rarely square, we correct for the aspect ratio: the field of view is defined over the screen height, and x coordinates are scaled by the width-to-height ratio so the image is not stretched.
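A minimal sketch of that aspect correction, assuming a vertical field of view as in three.js's PerspectiveCamera - the y scale comes from the fov, and x is divided by the aspect ratio:

```javascript
// Vertical field of view sets the y scale; x is divided by the aspect
// ratio (width / height) so a wide screen doesn't stretch the image.
function projectionScales(fovDegrees, aspect) {
  const f = 1 / Math.tan((fovDegrees * Math.PI / 180) / 2); // y scale
  return { sx: f / aspect, sy: f };
}
```

With a 90° fov and a 2:1 screen, x ends up scaled at half the rate of y, exactly compensating for the doubled width.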
Now back into the 3D realm...
Suppose we are inside a 3D scene that contains our pyramid. Stripping away the surroundings and moving everything so the pyramid's tip sits at the origin with its base pointing down the -z axis is exactly what this part of the transformation does:
viewMatrix * modelMatrix * position.xyzw
Multiplying this by the projection matrix then warps the shape: x and y are scaled with depth, so the small near face and the large far face end up the same size, and the point at the tip is effectively stretched into a square face - the pyramid becomes a box. At the same time everything is scaled to fit the -1 to 1 range on each axis; this is the perspective projection, and the frustum has become the rectangular NDC volume.
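The depth part of this mapping can be sketched directly. Assuming the standard OpenGL-style perspective matrix, view-space z on the near plane lands at NDC -1 and the far plane at +1:

```javascript
// Depth row of the standard OpenGL perspective projection.
// zView is negative in front of the camera (looking down -z).
function ndcDepth(zView, near, far) {
  const zClip = (-(far + near) / (far - near)) * zView
              - (2 * far * near) / (far - near);
  const wClip = -zView; // the perspective divide uses -z_view
  return zClip / wClip;
}
```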
The mouse lives on this flat 2D screen, which we can think of as one face of the NDC cube. A mouse event gives us only X and Y; there is no depth information, which is why we need ray casting to find out what lies under the cursor along Z.
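Getting the mouse into NDC is a small conversion worth showing. This hypothetical helper maps pixel coordinates into the -1 to 1 range, flipping y because screen y points down while NDC y points up:

```javascript
// Convert a mouse event's pixel coordinates into NDC.
// (px, py) are pixels from the top-left; width/height is the viewport size.
function mouseToNdc(px, py, width, height) {
  return {
    x: (px / width) * 2 - 1,   // left edge -> -1, right edge -> 1
    y: -(py / height) * 2 + 1, // top edge -> 1, bottom edge -> -1 (y flipped)
  };
}
```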
In NDC the ray is trivial: it enters through the near face of the cube at the mouse's X and Y and travels straight through, perpendicular to that face, looking for objects to intersect. To actually compute those intersections, the ray must be transformed from NDC space back into world space.
A ray, unlike a plain vector, is a half-line: it starts at a specific point in space (its origin) and extends infinitely in one direction. The Raycaster handles this setup for us.
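A minimal sketch of that origin-plus-direction structure - not the three.js implementation, just the idea a Raycaster stores internally:

```javascript
// A ray is a point plus a direction: every point on it is
// origin + t * direction for t >= 0.
class Ray {
  constructor(origin, direction) {
    this.origin = origin;       // [x, y, z]
    this.direction = direction; // [x, y, z], assumed normalized
  }
  // Point at parameter t along the ray.
  at(t) {
    return this.origin.map((o, i) => o + t * this.direction[i]);
  }
}
```

Because the direction is normalized, `t` is simply the distance traveled from the origin.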
Returning to the pyramid-box analogy: applying the inverse transformations squeezes the box back into the original pyramid, and our straight NDC ray becomes a ray originating at the camera's tip, traveling out through the scene between the near and far planes.
The method involved transforms the vector as a direction - ignoring translation - and keeps it normalized, which makes the later intersection computations straightforward.
Throughout all of this, the NDC cube keeps things consistent: the near plane always maps to -1 and the far plane to 1.
Unpacking this: we build a point (mouse.x, mouse.y, someZ) in NDC and push it back through the inverse of the camera's matrices; the camera's world matrix supplies the translation and rotation that place the resulting ray correctly in the scene.
Unprojection reverses the projection procedure, turning an NDC point back into an actual position in space. Once the camera's translation and rotation are accounted for, subtracting the camera's position converts that recovered point into the direction the ray should travel.
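The round trip can be sketched in view space, assuming the standard OpenGL-style perspective matrix with scales `sx`, `sy` (as derived earlier from the fov and aspect). `project` applies the matrix and the perspective divide; `unproject` inverts both analytically:

```javascript
// Project a view-space point [x, y, z] (z negative in front of the
// camera) into NDC, using the standard perspective depth mapping.
function project(v, sx, sy, near, far) {
  const A = -(far + near) / (far - near);
  const B = -(2 * far * near) / (far - near);
  const w = -v[2]; // perspective divide by -z_view
  return [sx * v[0] / w, sy * v[1] / w, (A * v[2] + B) / w];
}

// Invert the mapping: recover view-space z from NDC depth, then undo
// the divide and the x/y scales.
function unproject(ndc, sx, sy, near, far) {
  const A = -(far + near) / (far - near);
  const B = -(2 * far * near) / (far - near);
  const zView = -B / (A + ndc[2]);
  const w = -zView;
  return [ndc[0] * w / sx, ndc[1] * w / sy, zView];
}
```

In a full pipeline you would then apply the camera's world matrix - its translation and rotation - to carry this view-space point into world space, which is exactly where the `.sub(camera.position)` step from the beginning comes back in.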