How can we efficiently run large-scale text searches using static index files served over HTTP?

I'm looking for a lightweight yet scalable way to implement a full-text search index in JavaScript using static files served over HTTP. I want to make about 100k documents searchable online without paying for hosted search such as Elasticsearch or private Google search servers. My resources are limited, but I can host JSON and plain-text files cheaply, so I'm exploring ways to build a basic search engine. Simple keyword search is enough; I don't need a complex query language.

One approach I've considered involves parsing all documents, creating bag-of-words representations for each file, and generating index files listing document IDs and word counts. For search functionality, a straightforward JavaScript or Python script would retrieve index files for user queries, identify document IDs with the highest term counts, and generate search results accordingly.
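The approach described above can be sketched roughly as follows. This is a minimal illustration, not a tuned implementation: the two-character sharding rule (`shard_key`), the JSON layout, and the sample documents are all assumptions made for the example. The point is that a query only has to fetch the small shard files for its own terms, so a client reading them over HTTP never downloads the whole index.

```python
import json
import re
import tempfile
from collections import Counter, defaultdict
from pathlib import Path

TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return TOKEN_RE.findall(text.lower())


def shard_key(term):
    """Hypothetical sharding rule: group terms by their first two characters."""
    return term[:2]


def build_index(docs, out_dir):
    """Write one JSON file per shard, mapping term -> {doc_id: count}."""
    shards = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            shards[shard_key(term)].setdefault(term, {})[doc_id] = count
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for key, postings in shards.items():
        (out_dir / f"{key}.json").write_text(json.dumps(postings))


def search(query, index_dir, limit=10):
    """Load only the shards the query terms need and rank by summed counts."""
    scores = Counter()
    for term in set(tokenize(query)):
        shard_file = Path(index_dir) / f"{shard_key(term)}.json"
        if not shard_file.exists():  # over HTTP this would be a 404
            continue
        postings = json.loads(shard_file.read_text())
        for doc_id, count in postings.get(term, {}).items():
            scores[doc_id] += count
    return [doc_id for doc_id, _ in scores.most_common(limit)]


# Made-up sample documents, for illustration only.
docs = {
    "doc1": "static index files served over HTTP",
    "doc2": "index files, index shards, and more index files",
    "doc3": "an unrelated note about hosting costs",
}
index_dir = tempfile.mkdtemp()
build_index(docs, index_dir)
results = search("index files", index_dir)
print(results)
```

In a browser, the same `search` logic would replace the local file reads with one HTTP fetch per shard. Finer sharding gives smaller downloads per query at the cost of more files to host.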

While cost-effective and feasible for my needs, this method is limited in efficiency by the size of the index files and the processing they require. Despite extensive research, I haven't come across similar client-side solutions that use server-generated static index files. Existing options either involve expensive full-text search servers or load large indexes on the client side, neither of which is viable given my constraints.

I'm open to suggestions on optimizing the structure of index files or discovering more efficient tools or approaches for this type of search implementation. Any insights or recommendations would be greatly appreciated!

Answer №1

You can use SQL.js with a virtual filesystem that supports HTTP Range requests, so that filesystem pages are read on demand instead of downloading the whole database. See the links below for more information.
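To illustrate what reading pages on demand means at the HTTP level, here is a small sketch of how one page of a SQLite file maps to a Range header. It is not the SQL.js or wa-sqlite API; the function names and the URL are hypothetical, and the request is only built, never sent. The only facts it relies on are that SQLite stores its page size as a big-endian 16-bit integer at byte offset 16 of the file header and that pages are numbered from 1.

```python
import struct
import urllib.request


def sqlite_page_size(header):
    """Read the page size from the first bytes of a SQLite file.

    The size is a big-endian 16-bit integer at byte offset 16;
    the stored value 1 stands for 65536.
    """
    (raw,) = struct.unpack(">H", header[16:18])
    return 65536 if raw == 1 else raw


def page_byte_range(page_number, page_size):
    """Return the inclusive (first, last) byte offsets of a 1-based page."""
    first = (page_number - 1) * page_size
    return first, first + page_size - 1


def page_request(url, page_number, page_size):
    """Build (but do not send) a GET request for a single database page."""
    first, last = page_byte_range(page_number, page_size)
    return urllib.request.Request(url, headers={"Range": f"bytes={first}-{last}"})


# Fake 100-byte header declaring 4096-byte pages, for illustration only.
fake_header = b"SQLite format 3\x00" + struct.pack(">H", 4096) + bytes(82)
page_size = sqlite_page_size(fake_header)

# Hypothetical URL; any static host that honors Range requests would do.
request = page_request("https://example.com/catalog.sqlite3", 3, page_size)
print(request.get_header("Range"))
```

A virtual filesystem layer issues requests like this for each page the query engine asks for, which is why a well-indexed database needs only a handful of small fetches per search.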

If you want to build a searchable catalog without a query server (for example, as part of a web3 project), consider https://github.com/rhashimoto/wa-sqlite with a custom virtual filesystem that supports Range requests. This makes it possible to host large SQLite files on platforms like Sia Skynet.

A plaintext solution that serves only index-to-index data to clients is also worth exploring if you have ample static hosting space. It would take significant effort and could improve efficiency, but SQL.js with an HTTP VFS and well-designed indexes is already reasonably efficient.

Additional tools worth mentioning:

  • Consider client-side FTS with https://github.com/tinysearch/tinysearch (no lazy loading index support)
  • Explore client-side FTS capabilities with (expect potential lazy loading index functionality in the future)
