Prerender and AngularJS can cause crawlers to time out

Setup:

I have set up prerender (https://github.com/prerender/prerender) on my personal Ubuntu 16 server.

This is my .htaccess file, which sends the request to prerender when a crawler is detected. For instance, a crawler request for a page on www.example.nl is redirected to http://example.nl:3000/ followed by the full original URL.

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^.*$ - [NC,L]

RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|redditbot|slackbot|msnbot|googlebot|duckduckbot|bingbot|rogerbot|linkedinbot|embedly|flipboard|tumblr|bitlybot|SkypeUriPreview|nuzzel|Discordbot|quora\ link\ preview|showyoubot|outbrain|pinterest [NC,OR]
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=$
RewriteRule ^(.*)$  http://example.nl:3000/http://www.example.nl/$1? [R=301,L]
#RewriteRule ^(.*)$  http://art.example.net/$1? [R=301,L] 

RewriteRule ^(.*)/(.*)$ /#$1/$2 [NC,L]

The issue at hand:

Meta data fails to load on Skype, Reddit, and Twitter when the request goes through prerender. Redirecting the URL to the old PHP website, art.example.net (the rule currently commented out in the .htaccess), does work. Since the meta tags are the same on the PHP and Angular sites, prerender is the likely cause of the problem.

Error reported by Twitter while using Prerender:

ERROR: Failed to fetch page due to: HttpConnectionTimeout
WARN:  this card is redirected to http://example.nl:3000/http://www.example.nl/63/Merry

Twitter loads the card successfully when redirecting to art.example.net:

INFO:  Page fetched successfully
INFO:  19 metatags were found
INFO:  twitter:card = summary_large_image tag found
INFO:  Card loaded successfully
WARN:  this card is redirected to http://art.example.net/63/Merry

The PHP version works correctly and all meta data is retrieved.

I want to phase out the PHP website completely, so the Angular site has to work with Prerender. Prerender works fine in Discord and in Postman (with a modified User-Agent header). I don't understand why it fails with some other agents.

Answer №1

Your rewrite rule should proxy the request instead of redirecting it. Redirecting to your prerender server can cause several problems, such as Google sending users straight to your prerender server from the search results (which is highly undesirable).

The rewrite rule should use the [P] flag in place of [R=301]:

RewriteRule ^(.*)$  http://example.nl:3000/http://www.example.nl/$1? [P,L]
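For context, here is a sketch of how the crawler block from the question might look with the proxy flag in place. It assumes that mod_rewrite, mod_proxy, and mod_proxy_http are enabled on the server (the question does not say whether they are), and the user-agent list is shortened purely for readability.

```apache
# Assumption: mod_rewrite, mod_proxy and mod_proxy_http are loaded;
# the [P] flag requires mod_proxy.
RewriteEngine on

# Match crawlers by User-Agent (list shortened here) or by the
# _escaped_fragment_ query string, as in the question.
RewriteCond %{HTTP_USER_AGENT} facebookexternalhit|twitterbot|redditbot|slackbot|googlebot|bingbot [NC,OR]
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=$
# [P] fetches the page from the prerender server on the crawler's behalf,
# so the crawler never receives a redirect to port 3000.
RewriteRule ^(.*)$ http://example.nl:3000/http://www.example.nl/$1? [P,L]
```

With a rule like this, a request sent with a crawler User-Agent (as the question did with Postman) should come back as the prerendered HTML directly, rather than as a 301 pointing at the prerender server.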

Answer №2

Here's how to solve the issue:

Twitter and some other crawlers have trouble with extra dots and colons in URLs. They do not support plain IP addresses or port numbers.

To work around this, set up a subdomain that proxies to your Node.js application (the prerender server).

This is an example of my Apache virtual host for the subdomain:

<VirtualHost *:80>
    ServerAdmin [email protected]
    ServerName prerender.example.net
    ServerAlias prerender.example.net  
    ProxyPass / http://localhost:3000/ connectiontimeout=5 timeout=30   
</VirtualHost>


I was able to successfully implement this solution with guidance from prerender.io themselves.

Although the social media crawlers do not seem to care whether the request is proxied or redirected, using the proxy flag ([P]) is still the recommended practice.
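The answer does not show the matching rewrite rule, so the following is only a sketch of how the .htaccess rule from the question might point at the subdomain instead of at port 3000. It assumes prerender.example.net is the virtual host shown above and that mod_proxy is enabled; the user-agent list is shortened for readability.

```apache
# Assumption: prerender.example.net is the subdomain from the virtual host
# above, which forwards to the prerender server on localhost:3000.
RewriteCond %{HTTP_USER_AGENT} twitterbot|redditbot|SkypeUriPreview [NC,OR]
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=$
# The target URL now contains no port number; [P] proxies instead of redirecting.
RewriteRule ^(.*)$ http://prerender.example.net/http://www.example.nl/$1? [P,L]
```

The design point is that the crawler only ever deals with a hostname on the default port, while the port-3000 hop stays inside the server.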
