I have a single-page application (SPA) website. All of the content is loaded via Ajax, e.g. .load('page/subpage3.html'). The URL in the address bar never changes, which is by design: I want my visitors to always start browsing the site from the first page, where I write the news.

From the menu, the user can change the content that appears under the menu. The content files are named xyz.html, so that they appear formatted in Notepad++, and they live in a subfolder. They have no header, no styles and no JS files of their own, since those are attached to my index.html. If a user opens the site via the mysite.com domain, everything appears fine.

The (big) problem is that Google lists my content pages by their full path, as if they were separate, complete, working pages. But when a user opens one of those links, only that content file is loaded: no pictures, no CSS or JS files, no menu, and even the characters appear wrong, since nothing is specified in these content files.

My question is: how can I force index.html to always open instead when someone clicks on (or types) a content page's URL? Those URLs (my subpages, i.e. the content files) should be redirected to my domain name, as I did with the 404 error handling.

Solving this probably requires changing the .htaccess file (which, luckily, I have full access to). The site runs on an Apache server.

The best I have found is the following code, but I'm afraid to try it:

RewriteEngine on
RewriteCond %{REQUEST_URI} !^/index.html$
RewriteRule . index.html [R=302,L]

because I don't know whether it can break any in-site features (the menu, the photo gallery, ...), and I am not sure whether Google will still be able to search and list the matches after these changes, or whether it will ignore those content pages because of the redirect. (Or, even if it works, could my page get a lower ranking?) I'd like to keep the Google results for the entire site the same as they are now. Thank you.

JaSon

1 Answer

google is listing my content pages using their full path

Since you are not changing the URL when navigating, maybe Google is clever enough to parse your scripts and extract the full paths of the subpages. If all you want is for users to start navigating from the first page, you can use RewriteCond %{HTTP_REFERER} !<your root url> [NC] to redirect to the root URL whenever a user follows a deep link from somewhere else, such as a search engine. This will not affect the Ajax calls from your own website, but make sure your own pages contain no direct hyperlinks to the content files, since those would pass the Referer check too; otherwise you will need another mechanism to determine whether a request is an XHR or not.
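To make the Referer idea concrete, here is a minimal .htaccess sketch. It is only an illustration under assumptions: the host is taken to be mysite.com and the content files are taken to live in page/ (both borrowed from the question), and the file is assumed to sit in the document root. Adjust the names to your actual layout before trying it.

```apache
RewriteEngine On

# Assumption: the site is served from mysite.com (with or without www).
# The condition is true when the Referer does NOT start with the site's own URL.
# An empty Referer (typed URL, bookmark, most crawlers) also counts as "not from the site".
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?mysite\.com/ [NC]

# Assumption: the content fragments live under page/ and end in .html.
# Only those requests are redirected to the site root; everything else is untouched.
RewriteRule ^page/.+\.html$ / [R=302,L]
```

Keep in mind that some browsers, extensions and proxies strip the Referer header. For those visitors the Ajax calls would be redirected as well, so test this with R=302 (temporary) before making anything permanent.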

If you don't want Google to list your deep links in the search results, use robots.txt to restrict the crawlers; see this SO thread.

I don't know if it can break any in-site features (the menu, photo gallery, ...)

People usually keep the views (HTML) and the assets (images, CSS, etc.) in separate folders, so rewriting only the paths under the view folder will not affect the other files.
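As a rough sketch of what "only the view folder" could look like, the rule below matches nothing except paths under page/ (the folder name used in the question's .load() call; treat it as an assumption about your layout). Unlike the RewriteRule . from the question, which matches every request, this pattern never matches your images, CSS or JS. It also uses the X-Requested-With header, which jQuery adds to same-origin Ajax requests, as one possible way to tell XHR calls apart from direct visits.

```apache
RewriteEngine On

# Skip requests made through jQuery Ajax (.load, $.get, ...), which send
# "X-Requested-With: XMLHttpRequest" on same-origin calls.
RewriteCond %{HTTP:X-Requested-With} !=XMLHttpRequest

# Assumption: all content fragments are under page/. Requests for other folders
# (e.g. css/, js/, images/) do not match this pattern and are served as before.
RewriteRule ^page/ / [R=302,L]
```

This relies on the content being loaded through jQuery; requests made with another client that does not send the header would be redirected too.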

I am not sure if google will be able to search and list the matches fine after these changes or it will ignore those content pages cause of the redirect?

Redirecting the subpages will of course hurt SEO, but some frameworks (e.g. Angular, React) offer a workaround in the form of server-side rendering. I'm not sure what tools you are using to build your SPA, but you can try Fetch as Google first to see whether Google can index your current SPA anyway.

LKHO