The Hidden Dangers of Using Free Site Scan Tools

Site scan tools are becoming more and more popular. There are hundreds of services and free online entry points where you can gain insight and specific information on your website. While often helpful, these tools can lead the unsuspecting professional into needless anxiety.

Pages of recommendations, suggestions, and waterfalls of data are all good things. In many instances, they can prove critical to improving your ROI when followed up on correctly. Each site scan tool promises actionable results and delivers lists of improvement suggestions, but therein lies the danger. The results you receive are not always true for your specific business, hardware, or software setup.

How do you reconcile the dozens of scores, grades, and code algorithms that promise better performance? You don’t, unless you’re interested in creating your own software like we did. Instead, you take each site scan result and interpret what it means to you, specifically.

There is no one score, scan, or grade that wholly defines your online business. It’s a combination of things that defines your site’s true health.

It’s Not About The Numbers

If you focus too much on a single site scan tool or measurement, you can quickly fall prey to the false promise of Absolute Scoring.

Absolute Scoring: Calculating the overall score of the submission as a sum or average of its individual scores for all test cases.

Absolute Scoring doesn’t take into account your software setup, your host, or the CMS you are using. There are over 100 content management systems, and that count does not include SaaS and custom-defined systems. So you can see how the results from any one site scan can quickly lead the unsuspecting site owner on a wild goose chase.
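To make the definition concrete, here is a minimal Python sketch of the difference between an absolute score and one read in context. The rule names and numbers are made up for illustration, and `contextual_score` is our own hypothetical helper, not something any scan tool provides.

```python
from statistics import mean


def absolute_score(rule_scores):
    """Average every rule score equally, regardless of the site's setup."""
    return mean(rule_scores.values())


def contextual_score(rule_scores, not_applicable):
    """Average only the rules that actually apply to this site."""
    relevant = [score for rule, score in rule_scores.items()
                if rule not in not_applicable]
    return mean(relevant)


# Hypothetical rule scores (0-100), for illustration only.
scores = {
    "fewer_http_requests": 40,
    "expires_headers": 50,
    "use_cdn": 0,
    "avoid_redirects": 70,
    "compress_components": 100,
}

print(absolute_score(scores))
# Suppose the CDN rule is a false negative for this site, so we set it aside.
print(contextual_score(scores, not_applicable={"use_cdn"}))
```

Setting one rule aside is a deliberately crude way to add context. The point is only that the same raw data can give a very different picture once you know which rules matter for your setup.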

But these scans reflect the standards of the web, right?

Yes and no. One set of site scan standards doesn’t fit every website that gets run through it. I don’t mean to say that you shouldn’t use site scanning tools. On the contrary, we use many site scanners and they are essential to finding the things that can make the most impact on your site’s performance and user experience.

What is important is that the results from any single scan are read in context. Reading a report and then rushing off to change everything you see isn’t wise. You wouldn’t decide to start construction in your home without taking into account your location, zoning restrictions, cost, your schedule, and other things. The same goes for your web property.

Site Scan Results: At A Glance

Here is a great example of what I’m talking about. Recently we heard from a client who uses GTmetrix to monitor their site performance in between our weekly reports. They were upset about the YSlow grade they noticed on their own site scan and asked what we could do to help. This particular client is a large technology consulting company, so looking good to users from a website performance standpoint is crucial.

Side Note: We LOVE GTmetrix. We use it all the time. They offer a free site scan and amazing insights into what’s happening on your site…do you see the BUT coming? BUT the results are absolute in the free version, which is what this client was looking at.

One great thing about this scan is that it provides recommendations for increasing your score based on what it finds. When we took a look at this particular list of recommendations, we noticed four severely low grades. The four items listed in the image below were really hurting their YSlow score. But should they be? Let’s look.

YSlow uses 23 testable rules to determine a website’s performance score. You can see an itemized list of these, along with links to learn more about each, here at YSlow. These are static rules, and the things they look for may not be something your site needs or something present in your specific website setup. Below I’ll evaluate each result.

Site Scan Results: Time to Evaluate

There is much to say about each of these recommendations in general, so I’ll limit my responses to the theme of this post: the client’s particular setup greatly influences how the results should be interpreted.

1. Make Fewer HTTP Requests.

This is a little tricky, as this is how WordPress works inherently. Each plugin comes with its own set of scripts and often its own CSS. So, it’s a case of YSlow asking “the threshold for HTTP requests is x, does this site meet that?”

There is no way for us to say “this is WordPress so score accordingly”. What we CAN do, however, is try to make YSlow happy by using a plugin that loads the resources differently. This doesn’t always work, as it can mess with the default loading of certain code types, but many developers have tackled this issue with plugins.

RESULT: Accurate find by YSlow rules. Not helpful for this client, especially considering their PageSpeed score was a B here. 
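A fixed threshold check like the one described above can be sketched in a few lines of Python. This is an illustration only, not how YSlow is implemented: the `ResourceCounter` class, the sample markup, and the threshold values are all assumptions of ours.

```python
from html.parser import HTMLParser


class ResourceCounter(HTMLParser):
    """Count external scripts and stylesheets referenced by a page."""

    def __init__(self):
        super().__init__()
        self.scripts = 0
        self.stylesheets = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("src"):
            self.scripts += 1
        elif tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower().split():
            self.stylesheets += 1


def exceeds_threshold(html, max_requests):
    """Return True if the page references more files than the fixed limit."""
    counter = ResourceCounter()
    counter.feed(html)
    return counter.scripts + counter.stylesheets > max_requests


# Hypothetical page where two plugins each ship one script and one stylesheet.
page = """
<html><head>
<link rel="stylesheet" href="/wp-content/plugins/a/style.css">
<link rel="stylesheet" href="/wp-content/plugins/b/style.css">
<script src="/wp-content/plugins/a/a.js"></script>
<script src="/wp-content/plugins/b/b.js"></script>
<script>var inline = true;</script>
</head><body></body></html>
"""

print(exceeds_threshold(page, max_requests=3))
```

Notice that the check has no idea the page runs on WordPress. It only compares a count against a number, which is exactly why the grade needs a human to read it in context.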

2. Add Expires Headers.

The results listed “There are 5 static components without a far-future expiration date.”

All of these are live data connections that the client is loading on their homepage to either track visitors or display live data: Google Analytics, Facebook, etc.

So why the low grade? YSlow is pre-programmed to look at the structure of data sources, not the URLs. It doesn’t know what those links are serving, only that they don’t match the structure it was told to look for with Expires headers.

RESULT: Accurate find by YSlow rules. Not helpful for this client. 
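When we read a list like this, the first thing we do is separate the files the client controls from the third-party ones they cannot set headers on. Here is a rough Python sketch of that triage. The function name, the URLs, and `example.com` are placeholders we made up, not the client’s real data.

```python
from urllib.parse import urlparse


def split_by_ownership(flagged_urls, site_host):
    """Split flagged URLs into first-party and third-party lists."""
    first_party, third_party = [], []
    for url in flagged_urls:
        host = (urlparse(url).hostname or "").lower()
        # Treat the site itself and any of its subdomains as first-party.
        if host == site_host or host.endswith("." + site_host):
            first_party.append(url)
        else:
            third_party.append(url)
    return first_party, third_party


# Illustrative list of components a scan might flag.
flagged = [
    "https://www.google-analytics.com/analytics.js",
    "https://connect.facebook.net/en_US/sdk.js",
    "https://www.example.com/wp-content/themes/demo/logo.png",
]

fixable, out_of_our_hands = split_by_ownership(flagged, "example.com")
print(fixable)
print(out_of_our_hands)
```

Only the first-party list is something you can fix on your own server. In this client’s case, every flagged component fell into the third-party list.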

3. Use a Content Delivery Network (CDN).

The scans rated our client a 0 on this. I ran my own scan in my registered account and was able to return a grade of ‘C’. This is because I was able to customize the scan to look for the CDN that is included in their hosting (which YSlow is not programmed to look for by default!).

The scan lists a dozen URLs that should be using a CDN, but the URLs are actually already using one!

Why doesn’t YSlow see this? Because YSlow doesn’t care that it says “cdn” in the URL it’s returning. It only knows that the file is not being served by one of the CDN hosts on the list it was programmed to search for. (In the scan above, I created an account so that I could tell YSlow where the client’s CDN was hosted. That’s why it’s happier with this feature than in the original scans.)

RESULT: Inaccurate find. This client is on one of our preferred hosts, Pagely, and does have a CDN.
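The list-based check described above can be sketched like this in Python. It is a simplified illustration, not YSlow’s actual code, and the hostnames (`cdn.example-provider.net`, `cdn.clientsite.example`) are placeholders we invented.

```python
from urllib.parse import urlparse

# Hypothetical built-in list of CDN hostnames the scanner knows about.
DEFAULT_CDN_HOSTS = {"cdn.example-provider.net"}


def uses_known_cdn(url, cdn_hosts):
    """Return True only if the URL's host matches an entry in the list."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == known or host.endswith("." + known)
               for known in cdn_hosts)


asset = "https://cdn.clientsite.example/wp-content/uploads/hero.jpg"

# With the default list the asset is flagged, even though "cdn" is in the URL.
print(uses_known_cdn(asset, DEFAULT_CDN_HOSTS))

# Adding the host's CDN to the list, as a registered account allows, fixes it.
custom_hosts = DEFAULT_CDN_HOSTS | {"cdn.clientsite.example"}
print(uses_known_cdn(asset, custom_hosts))
```

The asset did not change between the two calls. Only the list did, which is why the same site can score a 0 in one scan and a ‘C’ in another.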

4. Avoid URL Redirects.

We dug into the results here and noticed that they were being unfairly penalized for a redirect that should not be happening. The high-level view is that one of the plugins was putting in a double slash, e.g. yoursite.com//twoslashes.

RESULT: Great find!
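If you want to hunt for this kind of problem in your own links, a small Python check is enough. This is a sketch under our own assumptions: the function names are ours, and yoursite.com is the same placeholder used above.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def has_double_slash(url):
    """Return True if the path part of the URL contains '//'."""
    return "//" in urlsplit(url).path


def collapse_slashes(url):
    """Rewrite the URL with repeated slashes in the path collapsed to one."""
    parts = urlsplit(url)
    clean_path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit(parts._replace(path=clean_path))


link = "https://yoursite.com//twoslashes"
if has_double_slash(link):
    print("Fix this link:", link, "->", collapse_slashes(link))
```

The check looks only at the path, so the `//` after `https:` is never mistaken for a problem. Fixing the link at its source (here, the plugin that generated it) removes the redirect entirely instead of hiding it.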

Site Scan Results: Scores

A score of 100% on one test is not a realistic picture of your site performance in general. We use GTmetrix often because it’s the most comprehensive on a snapshot level each week, but you really need context to see the big picture. So it’s a great tool, but as you can see above, it must be used with caution, and you shouldn’t rely too much on any single score.

The idea is not always to aim for 100 (as that 100 is relative to your goals, competition, and tools). It’s to look at what you can do better today than you did yesterday, and to do it better than your competition.

In Conclusion

We value many of the same site scan tools serious site owners do, and we use them to determine a small part of a bigger picture of site health. For more detail on our specific approach to holistic site health, or about website health tools in general, you can read more about it.