What is the best method to retrieve data from a SQL table using knex when the row values are in consecutive order?

Question

Imagine I have a database that represents a library, with a table storing the words within each book. Let's refer to this table as "books" and assume it includes rows like the following:

| book_name | word_in_book | word    |
|-----------|--------------|---------|
| Moby Dick | 1            | call    |
| Moby Dick | 2            | me      |
| Moby Dick | 3            | ishmael |

If I am looking for a specific sequence of words (which could vary in length), what SQL query could I execute to retrieve a list of book_names containing that exact sequence where word_in_book is consecutive? For instance, if my sequence is ["call", "me", "ishmael"], the query should return "Moby Dick" since the book contains those words in order. However, searching for ["call", "me", "ahab"] would not yield any results because those words do not form a subarray within the book's words (the query should only return books with a matching subarray, not a matching subsequence).
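The subarray requirement can be illustrated with a small plain-JavaScript sketch. The helper name `containsSubarray` is hypothetical and only describes the matching rule; it is not meant as the way to run the search against the database.

```javascript
// Hypothetical helper: true if `needle` appears in `haystack` as a
// contiguous run (a subarray), not merely as a subsequence.
function containsSubarray(haystack, needle) {
  if (needle.length === 0) return true;
  for (let start = 0; start + needle.length <= haystack.length; start++) {
    if (needle.every((word, offset) => haystack[start + offset] === word)) {
      return true;
    }
  }
  return false;
}

const mobyDick = ["call", "me", "ishmael"];
console.log(containsSubarray(mobyDick, ["call", "me", "ishmael"])); // true
console.log(containsSubarray(mobyDick, ["call", "ishmael"]));       // false: in order, but not consecutive
console.log(containsSubarray(mobyDick, ["call", "me", "ahab"]));    // false
```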

I am utilizing knex alongside Express to construct my SQL statements. My assumption is that I will need to use knex to iterate through the array of words being searched and dynamically add elements to my query object for each word, but I am uncertain about how to go about doing this.

Here is the approach I have considered so far:

const knex = require("knex")({
  // Connection details here ...
});
const words = ["call", "me", "ishmael"];

let query = knex("books");
words.forEach(word => {
  query = ??? // Unsure about constructing my query
});

The actual database I am dealing with at my workplace closely resembles this example. The main distinction is that there are thousands of books, yet each book does not contain an extensive number of words (typically only a few hundred at most). The challenge lies in the fact that retrieving all content from every book and cross-checking all words using JavaScript would be quite sluggish, hence why I prefer knex/SQL to manage as much of this process as possible. What would be the most effective way to achieve this?

javascript sql knex.js

Answer №1

To start, the query you need to execute is somewhat similar to this:

SELECT books.book_name
FROM books
JOIN books bw2 ON bw2.book_name = books.book_name AND bw2.word_in_book = books.word_in_book + 1 AND bw2.word = 'me'
JOIN books bw3 ON bw3.book_name = books.book_name AND bw3.word_in_book = books.word_in_book + 2 AND bw3.word = 'ishmael'
WHERE books.word = 'call'
GROUP BY books.book_name -- avoid returning the same book twice

As you can see, the query needs one self-join for each additional word in the sequence, each matching the word at the next position. There may be a simpler approach using user-defined variables in certain databases, but Knex does not seem to support it, based on the provided link.
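The join pattern generalizes to a sequence of any length. The following is a minimal sketch in plain JavaScript that builds the SQL text with `?` placeholders and a matching bindings array. The helper name `buildSequenceQuery` is hypothetical, and the table and column names are taken from the question.

```javascript
// Hypothetical helper: builds the self-join query for any word sequence.
// Returns SQL with `?` placeholders plus the bindings in placeholder order.
function buildSequenceQuery(words) {
  if (words.length === 0) throw new Error("words must not be empty");
  const joins = [];
  const bindings = [];
  // The word at offset i (i >= 1) must sit i positions after the first word.
  for (let i = 1; i < words.length; i++) {
    const alias = "bw" + i;
    joins.push(
      `JOIN books ${alias} ON ${alias}.book_name = books.book_name` +
      ` AND ${alias}.word_in_book = books.word_in_book + ${i}` +
      ` AND ${alias}.word = ?`
    );
    bindings.push(words[i]);
  }
  // The WHERE placeholder appears after all JOIN placeholders in the SQL text.
  bindings.push(words[0]);
  const sql = [
    "SELECT books.book_name",
    "FROM books",
    ...joins,
    "WHERE books.word = ?",
    "GROUP BY books.book_name",
  ].join("\n");
  return { sql, bindings };
}

const sequenceQuery = buildSequenceQuery(["call", "me", "ishmael"]);
console.log(sequenceQuery.sql);
console.log(sequenceQuery.bindings); // [ 'me', 'ishmael', 'call' ]
```

The result could then be passed to `knex.raw(sql, bindings)`, so the words are sent as bound parameters rather than concatenated into the SQL string. Only the loop counter, which is always an integer, is written into the SQL directly.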

To ensure this query runs efficiently, consider adding a composite index on three columns (assuming MySQL or MariaDB):

ALTER TABLE books ADD INDEX (word, book_name, word_in_book);

Indexing your table properly will play a crucial role in the performance of this query.

To build the query with Knex:

const words = ["call", "me", "ishmael"];

// Start from the first word, then self-join once per following word.
let query = knex("books").select({
    book_name_searched: 'books.book_name'
}).where('books.word', words[0]);

words.forEach((word, index) => {
    if (index < 1) return; // the first word is handled by the WHERE clause
    const alias = 'bw' + index;
    query = query.join('books as ' + alias, function() {
        this.on(alias + '.book_name', '=', 'books.book_name')
            // Bind the word as a parameter instead of concatenating it into
            // the SQL, so that quotes in a word cannot break the query
            .andOn(knex.raw('?? = ?', [alias + '.word', word]))
            .andOn(knex.raw('?? = ?? + ?', [alias + '.word_in_book', 'books.word_in_book', index]));
    });
});

query = query.groupBy('books.book_name');

console.log(query.toString());
// Prints the generated SQL query string for reference

While I haven't executed this against an actual database using Knex, the generated query string appears to be correct. Please let me know if you encounter any issues, but hopefully, this provides guidance for constructing your query.

Answer №2

Great suggestion provided by hsibboni. Here's a simpler query you can use when the position at which the sequence starts is known (position 1 in this example):

SELECT title
FROM novels
WHERE
    (word = 'search' AND word_in_novel = 1) OR -- word_in_novel = index
    (word = 'for' AND word_in_novel = 2) OR
    (word = 'captain' AND word_in_novel = 3)
GROUP BY title
HAVING COUNT(1) = 3 -- number of words searched
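The same pattern can be generated for a sequence of any length. This is a minimal plain-JavaScript sketch under stated assumptions: `buildFixedPositionQuery` and its `startPosition` parameter are hypothetical names, the `novels` table is the one used in the query above, and each (title, word_in_novel) pair is assumed to occur at most once so that the row count equals the number of matched positions.

```javascript
// Hypothetical helper: builds the OR / HAVING query for a word sequence
// that is expected to start at a known position (1 by default).
function buildFixedPositionQuery(words, startPosition = 1) {
  if (words.length === 0) throw new Error("words must not be empty");
  const conditions = [];
  const bindings = [];
  words.forEach((word, offset) => {
    // One condition per word: the word itself and its expected position.
    conditions.push("(word = ? AND word_in_novel = ?)");
    bindings.push(word, startPosition + offset);
  });
  const sql = [
    "SELECT title",
    "FROM novels",
    "WHERE " + conditions.join(" OR "),
    "GROUP BY title",
    "HAVING COUNT(*) = ?", // every word of the sequence must have matched
  ].join("\n");
  bindings.push(words.length);
  return { sql, bindings };
}

const fixedQuery = buildFixedPositionQuery(["search", "for", "captain"]);
console.log(fixedQuery.sql);
console.log(fixedQuery.bindings); // [ 'search', 1, 'for', 2, 'captain', 3, 3 ]
```

Because the positions are fixed, this form does not find a sequence that starts elsewhere in the text; for that case the self-join query from the first answer still applies.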
