Managing Correct Status Codes in Modern SPA Sites

10th Mar 19 (updated: 30th Oct 19)

Server Side Rendering to the Rescue

I love the new opportunities and abilities JavaScript frameworks are bringing us. The gap between website and app-like experience is narrowing all the time, and they give us some amazing tools to create sites that are delightful for our visitors to use. But there can be implications for how search engines crawl, index and ultimately rank these works of art we have lovingly crafted.

Server Side Rendering (SSR) solves many of these indexing woes by providing the critical content of these pages directly in the initial HTML served. It enables Google and other search engines to extract the relevant content from your URLs without having to wait for a relatively costly (for Google) execution of your scripts and processing of the resulting rendered page. That rendering can be thought of as a second wave, if you like, one that can often be delayed until those precious resources become available, because even Google has finite resources to throw at a task, as much as we probably perceive their infrastructure to be unconstrained. Naturally, that's assuming the search engine in question can even render your site (many don't play nicely with JavaScript, some don't even try!). Don't forget all those other potentially critical social traffic sources, like Twitter, Facebook and so on. They all love to create a snippet when someone shares a link, and if your site is all client side, that snippet may well suck.

These benefits are massive in terms of getting your site crawled and indexed well, and are hugely important, but there's another massive benefit to SSR that I don't see people talking about quite so much, and that's the fact you regain control over status codes.

301 & 404 Still Matter!

Things move on, but some things are foundational, and correct status codes are one of those. They help the search engines understand what's going on with your site. Have you deleted that old article, and don't want it indexed anymore? 404 is the way to go! Moved that product from category A to category B? 301 that URL so search engines, and your users, know where to find it now, and you're not starting from scratch!

One of the major drawbacks of SPA-style, client-side rendered sites is that the client cannot generate a proper status code. It can't affect the HTTP headers, as those were already sent when the browser requested the shell file, i.e. index.html. So, out of the box, you are quite limited in what you can tell the bots about what's happening with the site structure when you have to make changes.

I'm Not Ready for SSR!

Ultimately, SSR offers you the opportunity to control these, and in my opinion, in the best, most flexible way. It's worth acknowledging that sometimes it's not a trivial task to go back and refactor your site into something like Nuxt or Next.js. Angular's Universal is a little bit easier to 'retrofit', but it's not always a case of install, configure, profit! So if you are already in production with a site, and can't change right now, let's cover what you can do.

Leverage Your Routing for 404

If you are using your framework's routing, it probably already has a way to match catch-all URLs. Use this to route to a 404 page, and configure your web server to return a true 404 for that page. Here's an example using vue-router & nginx:

Your router.js


import Vue from 'vue'
import Router from 'vue-router'
import HomePage from '@/components/HomePage'
import About from '@/components/About'
import Contact from '@/components/Contact'
import NotFound from '@/components/NotFound'

Vue.use(Router)
export default new Router({
  base: process.env.ROUTER_BASE,
  routes: [
    {
      path: '/',
      name: 'HomePage',
      component: HomePage
    },
    {
      path: '/about',
      name: 'About',
      component: About
    },
    {
      path: '/contact',
      name: 'Contact',
      component: Contact
    },
    {
      path: '/404',
      name: 'NotFound',
      component: NotFound
    },
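    {
      // Catch-all: force a full page load of /404 so the web server
      // can answer that request with a real 404 status
      path: '*',
      beforeEnter (to, from, next) {
        window.location = 'https://example.com/404'
      }
    }
  ],
  mode: 'history'
})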

Nginx Config:

In the appropriate server {} block


    location / {
        # First attempt to serve the request as a file,
        # then fall back to the vue app shell
        try_files $uri /index.html;
    }
    location = /404 {
        # Return 404 status code, route to the vue app shell
        error_page 404 /index.html;
    }

What happens now is that when a bot (or visitor) hits a non-existent page, it's redirected to the /404 route, which is configured in nginx to return a 404. The contents of the NotFound component are displayed, which should explain that the page is not found, in traditional 404 format. Whilst this is a very manageable solution, it does have the downside that when Google hits the non-existent page, it still gets a 200, and it needs to render the page and follow the redirect to the /404 address.
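To make that two-step flow easier to picture, here's a rough model of it in plain JavaScript. It's a simulation rather than real HTTP: the route list mirrors the router.js above, and nginxStatus and traceRequest are hypothetical helper names I've made up for the illustration.

```javascript
// Model of the nginx + vue-router setup described above (a simulation, not real HTTP).
// The route list mirrors router.js; the helper names are hypothetical.
const appRoutes = ['/', '/about', '/contact', '/404']

// Status nginx answers with: only the exact /404 location returns a 404,
// every other URL receives the app shell with a 200.
function nginxStatus (path) {
  return path === '/404' ? 404 : 200
}

// Follow a request through nginx and then the client-side router.
function traceRequest (path) {
  const hops = [{ path, status: nginxStatus(path) }]
  if (!appRoutes.includes(path)) {
    // The catch-all route sets window.location to /404, causing a second request.
    hops.push({ path: '/404', status: nginxStatus('/404') })
  }
  return hops
}
```

Calling traceRequest('/no-such-page') gives two hops, a 200 for the original URL and then a 404 for /404, which is exactly the extra work a crawler has to do before it sees the real status.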

You can take this a little further if there are pages you know have been removed, and you really want to make sure that Google gets the 404 message upfront, by adding them to your nginx configuration file. As an example, say you killed off a page, /i-love-cats; you could add it like so:


    location / {
        # First attempt to serve the request as a file,
        # then fall back to the vue app shell
        try_files $uri /index.html;
    }
    location = /i-love-cats {
        # Return 404 status code, route to the vue app shell
        error_page 404 /index.html;
    }
    location = /404 {
        # Return 404 status code, route to the vue app shell
        error_page 404 /index.html;
    }

Google is now getting a 404 as soon as it hits the URL. Naturally, this does nothing for any other missing page, and the list can soon become unwieldy.
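If the list of removed pages does start to grow, one way to keep it manageable is to generate the location blocks from a plain list rather than hand-editing the config. The following is only a sketch of that idea: goneLocations is a hypothetical helper of my own, and it assumes simple paths made of URL-safe characters.

```javascript
// Hypothetical helper: build nginx "location =" blocks for a list of removed paths.
// Assumes simple paths (letters, digits, - . _ ~ and /), and rejects anything else
// so stray characters can't end up in the generated config.
function goneLocations (paths) {
  return paths
    .map(path => {
      if (!/^\/[A-Za-z0-9\-._~/]*$/.test(path)) {
        throw new Error('Unsupported path: ' + path)
      }
      return [
        `location = ${path} {`,
        '    # Return 404 status code, route to the vue app shell',
        '    error_page 404 /index.html;',
        '}'
      ].join('\n')
    })
    .join('\n')
}
```

Printing goneLocations(['/i-love-cats', '/i-love-dogs']) gives you blocks you could paste (or include) into the server {} block, in the same shape as the hand-written one above.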

Another option for smaller sites with a set, known and rarely changing number of URLs is to flip the above logic in nginx, so you are specifying the URLs that are OK to pass to the app shell, and defaulting all others to a normal, static HTML 404 page:


    location = / {
        # First attempt to serve the request as a file,
        # then fall back to the vue app shell
        try_files $uri /index.html;
    }
    location = /about {
        # First attempt to serve the request as a file,
        # then fall back to the vue app shell
        try_files $uri /index.html;
    }
    location = /contact {
        # First attempt to serve the request as a file,
        # then fall back to the vue app shell
        try_files $uri /index.html;
    }
    location = /404 {
        # Return 404 status code, route to the vue app shell
        error_page 404 /index.html;
    }

Redirects

Redirects can be handled with a combination of the routing in your SPA & your web server. If you wanted to redirect the URL /i-love-cats to /i-love-all-creatures you could use something like this in the router:


    {
      path: '/i-love-all-creatures',
      name: 'LoveAllCreatures',
      // Assumes LoveAllCreatures is imported like the other components
      component: LoveAllCreatures
    },
    {
      path: '/i-love-cats',
      redirect: '/i-love-all-creatures'
    }

That takes care of the everyday user of the site, but if you want the full benefit of what a 301 offers to search engines, team it up with a matching redirect in your web server. Again, in nginx, this could look like:


    rewrite ^/i-love-cats$ https://example.com/i-love-all-creatures permanent;

    location = / {
        # First attempt to serve the request as a file,
        # then fall back to the vue app shell
        try_files $uri /index.html;
    }
    location = /about {
        # First attempt to serve the request as a file,
        # then fall back to the vue app shell
        try_files $uri /index.html;
    }
    location = /contact {
        # First attempt to serve the request as a file,
        # then fall back to the vue app shell
        try_files $uri /index.html;
    }
    location = /404 {
        # Return 404 status code, route to the vue app shell
        error_page 404 /index.html;
    }

SSR & the Power of Reverse Proxy

I'm not going to cover how to set up things like Nuxt, Next.js & Angular Universal. There's great documentation on these directly from the projects themselves, and a quick Google search will reveal a host of brilliant tutorials, so there's no point re-inventing that wheel right now. I will quickly cover my preferred way of configuring the serving of them. Get your project running on an internal port, something like 127.0.0.1:3000, then use a web server such as nginx or Apache to serve this to the world through a reverse proxy setup. Here's a basic example for nginx, in the appropriate server {} block:


    location / {
        # Look to see if file exists, if not fall back to proxy
        try_files $uri @ssrproxy;
    }
    location @ssrproxy {
        proxy_pass http://127.0.0.1:3000;
        proxy_redirect                      off;
        proxy_set_header Host               $host;
        proxy_set_header X-Real-IP          $remote_addr;
        proxy_set_header X-Forwarded-For    $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto  $scheme;
    }

Here's something similar in Apache, in the appropriate VirtualHost block:


### Proxy to SSR Rules  ###
  RewriteEngine on
  RewriteCond %{DOCUMENT_ROOT}/$1 -f [OR]
  RewriteCond %{DOCUMENT_ROOT}/$1 -d
  RewriteRule (.*) - [L]
  RewriteRule ^/(.*)$ http://127.0.0.1:3000/$1 [P,QSA,L]
  ProxyPassReverse / http://127.0.0.1:3000/
### end proxy rules ###

Because you are now proxying requests to the server side rendering project, it can now pass along the correct status codes. Again, take a look at the individual projects to find out how that's done, but both Next.js and Nuxt handle 404s pretty much out of the box. Redirects can be a little more complex, but are all achievable; again, check out the documentation from the communities to find out how best to manage this for your project. I will, however, share an example of putting all this together with Nuxt, using an external data source to control these.
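To show what "passing along the correct status codes" means in practice, here's a bare-bones Node server standing in for the SSR app behind the proxy. It's purely illustrative and nothing like what Nuxt or Next.js do internally: the page and redirect tables, resolve and createServer are all hypothetical names and data I've assumed for the sketch.

```javascript
const http = require('http')

// Hypothetical content and redirect tables; a real SSR framework derives these itself.
const pages = new Map([['/', 'Home'], ['/about', 'About']])
const redirects = new Map([['/i-love-cats', '/i-love-all-creatures']])

// Decide the status code on the server, before any HTML is sent.
function resolve (path) {
  if (redirects.has(path)) return { status: 301, location: redirects.get(path) }
  if (pages.has(path)) return { status: 200, body: pages.get(path) }
  return { status: 404, body: 'Page not found' }
}

// Build (but don't start) the server that the reverse proxy would point at.
function createServer () {
  return http.createServer((req, res) => {
    const { pathname } = new URL(req.url, 'http://127.0.0.1')
    const result = resolve(pathname)
    if (result.location) {
      res.writeHead(result.status, { Location: result.location })
      return res.end()
    }
    res.writeHead(result.status, { 'Content-Type': 'text/html; charset=utf-8' })
    res.end(`<h1>${result.body}</h1>`)
  })
}
```

Starting it with createServer().listen(3000, '127.0.0.1') would put it where the proxy configs above expect the app to be, and whatever status it sets is the one the bot receives.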

Nuxt & the Power of Underscore

If your site is going to be large and needs to be simply manageable, you'll need a CMS. This backend can be something tailored to a single page application front end, or something more traditional like WordPress, Magento or even something more bespoke.

These CMS systems are built to make management easy, and adding new pages and products is much easier if it's managed by these, rather than something more static based on hard coded routes. It's here that Nuxt really shines: its dynamic routing system is simple and easy to understand. For more of an overview, take a look at the official documents here. There's a hidden gem in this: Unknown Dynamic Nested Routes. This allows you to fall back to a single file and use it as your own router, passing the URL path as a slug and then using the resulting API response to show the data, do a redirect or return a 404.

Here's how it works. Nuxt defines routes based on the files in the pages directory, for example:


  /pages
    /category
      _id.vue
    index.vue

https://example.com/ would match the index.vue file, https://example.com/category/456 would match the /category/_id.vue file, and the 456 would be available as a parameter named id.

This is great if you know the structure and parameters that you will be working with, but what if you want something more global? That's where the _.vue file comes in:


  /pages
    index.vue 
    _.vue

In this configuration, https://example.com/ would still match index.vue, as the routing is a fall back system, but anything else would match the _.vue file, with the path part available as a parameter too. The downside is that you are now responsible for routing this bit.
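As a rough mental model of that fall back order, here's a tiny sketch. It is a simplification and not Nuxt's real router: matchPage is a hypothetical function, and it only handles exact page matches plus the _.vue catch-all, exposing the path as a pathMatch parameter.

```javascript
// Simplified model of the fall back order described above, not Nuxt's real router.
// explicitPages maps a URL path to a page file, e.g. { '/': 'index.vue' }.
function matchPage (path, explicitPages) {
  // A page file that matches the URL exactly always wins.
  if (Object.prototype.hasOwnProperty.call(explicitPages, path)) {
    return { file: explicitPages[path], params: {} }
  }
  // Everything else lands on _.vue, with the path (minus the leading slash) as pathMatch.
  return { file: '_.vue', params: { pathMatch: path.replace(/^\//, '') } }
}
```

So matchPage('/', { '/': 'index.vue' }) resolves to index.vue, while matchPage('/category/456', { '/': 'index.vue' }) falls through to _.vue with a pathMatch of category/456 for you to route yourself.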

Here's an example site I've thrown together using this methodology: https://routing-demo.tamethebots.com/. Its page structure is the same as the above example, and it leverages the _.vue file for all pages other than the homepage, calling a simple PHP / MySQL API to either show a category, an individual detail page, a redirect, or if not found, a 404. You can find the code here on GitHub: https://github.com/dwsmart/gpiggy

A Category Page: https://routing-demo.tamethebots.com/good
An Individual Page: https://routing-demo.tamethebots.com/naughty/baron-von-squeak
301: https://routing-demo.tamethebots.com/terrible
404: https://routing-demo.tamethebots.com/asdafamlaknsfl

The logic behind this looks like this (see it in the git example here):


  asyncData({ app, params, error, redirect }) {
    return app.$axios
      .$post('https://routing-demo.tamethebots.com/api/route.php', {
        route: params.pathMatch
      })
      .then(function(response) {
        if (response.route.type === 'redirect') {
          // Send a permanent redirect to the new location
          redirect(301, '/' + response.route.target)
        } else if (response.route.type === 'cat') {
          // Category page
          return {
            type: response.route.type,
            title: response.route.meta_title,
            description: response.route.meta_description,
            cat: response.cat,
            pig: response.pig
          }
        } else if (response.route.type === 'pig') {
          // Individual detail page
          return {
            type: response.route.type,
            title: response.route.meta_title,
            description: response.route.meta_description,
            pig: response.pig
          }
        }
      })
      .catch(e => {
        // Any failed lookup is treated as a missing page
        error({ statusCode: 404, message: 'Page not found' })
      })
  }

asyncData is called when rendering a page server side, so the data set here still gets rendered before sending to the browser. Here, it's using the params.pathMatch value to send the URL path to our PHP backend, i.e. https://routing-demo.tamethebots.com/good would send a value of good, and https://routing-demo.tamethebots.com/naughty/gerald would send naughty/gerald.

The API looks this up in a table (the MySQL dump and routing file can be found here) and returns a JSON array. We then parse this to find out what type of URL this is, and the associated data. We then use this to perform a redirect, show the appropriate component, or if there is nothing found, return a 404 error.
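The real backend is PHP and MySQL, but the idea is simple enough to sketch in JavaScript with an in-memory table. Everything here is an assumption for illustration: the rows are invented and are not the demo site's actual data, lookupRoute and toOutcome are hypothetical names, and returning null for a missing row is just one way the "not found" case could be signalled.

```javascript
// In-memory stand-in for the routing table; the rows are invented for illustration.
const routeTable = new Map([
  ['good', { type: 'cat', meta_title: 'Good' }],
  ['naughty/gerald', { type: 'pig', meta_title: 'Gerald' }],
  ['old-page', { type: 'redirect', target: 'new-page' }]
])

// Look a path up and wrap it in the same { route: ... } shape asyncData reads.
function lookupRoute (path) {
  const route = routeTable.get(path)
  return route ? { route } : null
}

// Turn a lookup result into the outcome the page should produce.
function toOutcome (response) {
  if (!response) return { statusCode: 404 }
  if (response.route.type === 'redirect') {
    return { statusCode: 301, location: '/' + response.route.target }
  }
  return { statusCode: 200, type: response.route.type, title: response.route.meta_title }
}
```

toOutcome(lookupRoute('old-page')) gives a 301 pointing at /new-page, a known category or detail path gives a 200 with its type, and anything not in the table ends up as a 404.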

Still want to have something different on a route? That's fine! As it falls back to the _.vue, adding cart.vue to the pages directory would mean that component is loaded at /cart.

It's a really simple way to manage your URLs at scale in a search engine friendly way.

About the Author:

Dave Smart

Technical SEO Consultant at Tame the Bots.

@tamethebots.com // @dwsmart@seocommunity.social // https://tamethebots.com/about-dave
