Friday, 29 October 2010

Identifying your mobile visitors from web stats

    As mobile browsers have moved from gimmick to mainstream over the last few years, the job of a web developer has had to evolve to serve their users' needs. With full-featured mobile browsers replacing the cut-down early offerings we might have to worry less about our mobile users than we used to, but we still have to ensure that they can use our sites with few problems.
    The problem with mobile platforms from a developer's perspective is that there are so many of them in use. Testing a web site on the desktop is comparatively easy: once you've made it work in IE and Firefox, you're unlikely to find any issues in Chrome, Safari or Opera, so once you've given it the once-over on a Mac there's not much left to do. By comparison it is almost impossible to have one of each of the plethora of mobile phone platforms, so once you've looked at the iPhone, Android and Opera browsers, with maybe Windows Mobile and Blackberry as well (assuming you are fortunate enough to know owners of all those handsets), you have little idea how the rest of your mobile users will experience your offering. In my case I have an Android phone and a Series 40 Nokia clamshell, and I borrow a friend's iPhone 3G when I need to test that platform.
    To shed some light on the matter you need to know the scale of the problem, so you go to your web stats or analytics package and look at the browser/OS combinations. If you're lucky, your package will be capable of recognising smartphones, so you'll quickly see stats for visitors with iPhones, Android phones and maybe Blackberries. But a cursory glance at the detected user agents should tell you that this is only the tip of the iceberg. A quick look at mine reveals a huge list of feature phones, some from well-known manufacturers like Nokia or Samsung, and others I've never seen and in most cases never heard of.
    On my site I finally settled on looking at screen resolutions. I decided anything smaller than 640x480 had to be some kind of mobile device, so I added those numbers to those of a few well-known larger-screened devices, the iPhone 4's 960x640 pixel display being a good example. In the end I found that around a tenth of a percent of my site's visitors were mobile users. This was fewer than I expected, because the sheer number of different mobile user agents had led me to believe there would be more, but it tallied with the figures for browsers and OSs so I found it believable.
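    The heuristic above can be sketched in a few lines. This is a minimal illustration, not my actual stats code; the pixel-count threshold, the known-device list and the visit figures are all assumptions made up for the example.

```python
# Classify analytics screen resolutions as mobile: anything below 640x480
# in pixel count, plus an allowlist of known larger-screened mobile devices.
KNOWN_MOBILE_RESOLUTIONS = {(960, 640), (640, 960)}  # e.g. iPhone 4, either orientation

def looks_mobile(resolution: str) -> bool:
    """Classify a resolution string like '320x480' as mobile or not."""
    try:
        width, height = (int(n) for n in resolution.lower().split("x"))
    except ValueError:
        return False  # ignore malformed entries
    if (width, height) in KNOWN_MOBILE_RESOLUTIONS:
        return True
    return width * height < 640 * 480

# Made-up visit counts keyed by resolution, as a stats package might report them.
visits = {"1024x768": 9500, "320x480": 7, "960x640": 3}
mobile = sum(n for res, n in visits.items() if looks_mobile(res))
total = sum(visits.values())
mobile_share = 100.0 * mobile / total  # percentage of visits from mobile devices
```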

Friday, 8 October 2010

A company: "Them", or "It"?

Consider the following phrase: 
"Since it is the market leader it can be assumed that what works in relation to Google will also work for its competitors
     Google, like any other company, is an entity. "It". But Google is also a collection of offices full of people. "Them". 
     "Google are writing software that..." or "Google is writing software that..."? My inbuilt English language parser tells me to use the former. Is it leading me astray?

Thursday, 7 October 2010

Is there a relationship between content volume and traffic?

    My exercise in future web traffic prediction last month must have caused some interest among its target audience, because I've been asked for more. This time with a twist.
    The question: "If we add a load of extra pages to a web site, what effect will that have on the traffic?". How long was that piece of string again?
    An impossible question to answer. It depends on factors too numerous to quantify and any figures I come up with can't be trusted, I said. We know, they said, but give it a go anyway.
    So I thought about it and decided on the well-known theoretical model of a long tail web site: one with many pages, each of which scores on its own search term and brings in only a few visitors, but which taken together generate a very large total number of visits. In a long tail site, visits are proportional to the number of pages, so all I had to do was work out the average number of visits per existing page and multiply that by the number of new pages for an estimate of the extra traffic.
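    The arithmetic is trivial, which is rather the point; all the uncertainty lives in the assumption of proportionality. With made-up figures (none of these numbers are from the real site):

```python
# Long-tail estimate: extra traffic = (visits per existing page) x (new pages).
existing_pages = 2000
monthly_visits = 50000  # hypothetical current traffic across all pages
new_pages = 500

visits_per_page = monthly_visits / existing_pages  # 25 visits per page per month
extra_visits = visits_per_page * new_pages         # estimated additional monthly visits
```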
    My problem was that the figure I reached looked far too optimistic. Believable, I guess, but only just: an upper error margin for my new traffic estimate.
    And therein lies the fatal flaw in web traffic prediction. As I remarked last time, the further I go into the future, the greater the error bar on my figures. By adding yet another estimate on top of an existing estimate, all I am doing is increasing that error bar to the point at which the figure becomes meaningless.
    Still, it's an interesting exercise, if only to create some pretty graphs.

Saturday, 2 October 2010

A little experiment in hiding words in plain sight

    Yesterday at work I decided to try a little experiment. My desk is next to a busy thoroughfare, with a lot of people who I'd rate as fitting my target audience passing me every day. I printed out a QR code encoded with the phrase "Does anyone respond to QR codes? Email me if you're one of them" and my work email address on a piece of A4 and stuck it on to the office partition facing my passing colleagues.
    QR codes have interested me ever since I first read about them a few years ago. Beyond the steganographic appeal of hiding text in plain sight, they offer an interface between printed media and the online world, two areas that otherwise rarely meet. And since most mobile phones now have the capability to read them, they should be far more common than they are.
    Hence my experiment. Employees of a large publishing company are likely to have at least heard of QR codes and also to possess smartphones, but how many will respond to one in their everyday environment?
    I'll be leaving my QR code up for a month. I'll be very disappointed if nobody responds to it.

Monday, 27 September 2010

RSS feed keyword analysis for the fun of it

    What do you do when the recession hits and you are made redundant?

    When it happened to me last year, I wrote an RSS feed keyword trend analyser in my new-found free time. Over a year and several million keywords and phrases later I can find associated keywords and phrases and plot graphs for almost anything that's been in the UK mainstream news. Like this one, showing the fortunes of three Labour party leaders over the past few weeks.
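    The core of such a tool is very simple; the value is in running it over millions of items for a year. As a flavour of the idea, here is a minimal sketch that counts keywords from RSS item titles. The feed, the stopword list and the function names are invented for illustration, and the real analyser tracks counts over time rather than in one batch.

```python
# Count keyword frequencies from the <title> elements of an RSS feed.
import re
import xml.etree.ElementTree as ET
from collections import Counter

# A made-up fragment of an RSS feed, standing in for a real fetched one.
RSS = """<rss><channel>
  <item><title>Miliband wins Labour leadership race</title></item>
  <item><title>Blair book launch draws crowds</title></item>
  <item><title>Labour leadership result announced</title></item>
</channel></rss>"""

STOPWORDS = {"the", "a", "in", "of", "wins", "draws"}  # illustrative only

def keyword_counts(rss_xml: str) -> Counter:
    """Tally non-stopword words appearing in the feed's item titles."""
    titles = [t.text for t in ET.fromstring(rss_xml).iter("title")]
    words = (w for title in titles for w in re.findall(r"[a-z]+", title.lower()))
    return Counter(w for w in words if w not in STOPWORDS)

counts = keyword_counts(RSS)
```

Store a Counter like this per day (or per hour) and the trend graphs fall out of plotting one keyword's count across the stored buckets.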

[Graph: keyword trends over recent weeks for Blair, Brown and Miliband]

    You can clearly see Tony Blair's book launch as the blue hump in the middle, and Ed Miliband's election as party leader in green on the right. Meanwhile Gordon Brown bumps along in the obscurity of his Scottish constituency as the red line. Funny, that: the colours were allocated at random by my graphing library, yet Blair got the Tory blue.
    As a search engine marketer's tool it's of limited use unless you really are looking at up-to-the-minute trends for very fast moving content. But as a toy, or for finding collocated words and phrases for newsworthy themes, it's shaping up pretty well.
    I'll be dipping back in to this particular well of words again on here from time to time, both from the tech side and just for the joy of playing with some words.

Saturday, 25 September 2010

Predicting future web site traffic

    Recently I had the unenviable task of making an attempt to predict the traffic levels likely to be seen on a web site in the few months following a piece of search engine marketing work. Unenviable because it's a "how long is a piece of string?" question, impossible to answer with the certainty usually demanded by those who ask it. I gave it my best shot and thought it worth recording here how I did it.
    Web site traffic is cyclical. That is to say that the traffic pattern seen on a site over a given period in one year is likely to be mirrored in the same period in the following year. These same cycles can be seen on different sites in the same sector, so if one site selling pies sees a traffic pattern it is likely that another pie site will see the same pattern over the same period.
    The site in question has not been online for long enough to have gathered statistics for this period in a previous year. This meant that for the purposes of this exercise I had to look elsewhere in the same industry to establish the likely traffic patterns for the next few months.
    A competitor graph was created using the compete.com competitor tracking service. Such services can only be seen as estimates of a site's true traffic levels, but they do seem to get the trends right. Three sites from the same industry were selected for their similar traffic levels. The graph axes were extended a few months into the future, and the traffic patterns for the two competitor sites from the same period in the previous year were pasted onto the end of their traces for this year. The trace for that period from the site with the worst pattern was pasted onto the end of our site's trace for the months to be predicted. This formed the baseline of our predicted traffic: in other words, what we thought might happen if no promotional activity took place.
    A new predicted trace for our site was then created by applying a 10% per month increase to the baseline trace, forming an upper trace increasingly divergent from the baseline. 10% was a figure plucked from the air as a realistically achievable upper limit target. The result was a roughly triangular shaded area between the 10% and baseline curves into which our future traffic should fall.
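    The upper trace compounds month on month, which is what makes the band between the two traces widen over time. A quick sketch, with invented baseline figures (the real ones are not reproduced here):

```python
# Baseline: next few months' predicted visits, taken from last year's pattern.
baseline = [10000, 11000, 10500, 12000]  # made-up monthly visit figures

# Upper trace: 10% growth compounded per month on top of the baseline.
upper = [round(v * 1.10 ** (month + 1)) for month, v in enumerate(baseline)]

# The predicted traffic is expected to fall in the band between the two traces.
band_widths = [u - b for u, b in zip(upper, baseline)]
```

Note how `band_widths` grows with each month: that widening band is the error bar discussed below.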
    Finally a left-hand y-axis was created to show estimated Google Analytics visitor figures. Visitor figures from services like compete.com usually significantly under-represent the true values, so using the known Analytics figures from this year as a reference, the Analytics estimates were calculated from their ratios to the corresponding compete.com figures.
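    The scaling step amounts to one ratio. A sketch with made-up numbers (the real figures and the exact ratio are not shown here):

```python
# Months where both compete.com estimates and real Analytics figures are known.
compete_known = [3000, 3300, 3100]    # compete.com's estimates (hypothetical)
analytics_known = [9000, 9900, 9300]  # actual Analytics visits, same months

# Overall scaling ratio between the two sources.
ratio = sum(analytics_known) / sum(compete_known)

# Apply it to the predicted compete.com-scale trace to label the left-hand axis.
compete_predicted = [3500, 3800]
analytics_predicted = [round(v * ratio) for v in compete_predicted]
```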
    So what did this graph tell us? In three months' time our site could be receiving a traffic figure midway between the baseline and 10% traces. Which sounds impressive until you realise that the two traces represent the error on that figure: about 20%. Hardly accurate.
    What it really tells us is this: predicting the future is an inexact science, and the further into the future we gaze, the greater the error with which we see it. In fact the graph may already have been flawed at the point of its creation: compete.com works in complete months, and the Analytics figures so far this month are not as good as those for the last complete month might lead us to hope.
    As an exercise though it was still worth pursuing. It is always worth knowing what the web traffic cycles are in any industry and for all its inaccuracy this method still gives some idea of what we might expect. I just wouldn't stake my career on it, that's all.

Wednesday, 22 September 2010

My compliments to the cook: SEO vs. SEM

    When I was a small child I attended a primary school in an English village. Summers were long and hot, there were jumpers for goalposts and our school meals were awful. They were the creations of the school cook, a rather nice lady whose culinary output was probably stunted by a poor budget and the dead hand of Ministry of Education dieticians. It was with great surprise then when I moved to secondary school that I found the meals were rather good, worth looking forward to in fact, for they were assembled not by a cook, but a chef. With a white hat and all, very impressive.
    My profession is usually referred to as search engine optimisation, often represented by the initialism SEO. You will rarely see either in my personal lexicon, instead I prefer search engine marketing.
    My reasons for this are twofold: to give a sense of the wider task involved in helping a web site to increase its visibility in the search engines through legitimate means, and to differentiate myself from the work of the blackhats in the gutter of my industry. A few years ago, while contracting as a quality rater for the large search engine you probably use daily, I spent a lot of time following up keyword-stuffed link farms, valueless spam blogs and hidden or misleading rubbish from people who definitely refer to themselves as being in the SEO business, so for me the distinction is an important one. I'm lucky enough now to work in-house at a large publishing business and need never ply my trade further afield, so I see no reason to associate myself with the term SEO.
    Looking at a Google Insights search comparing the two terms I find I'm at least not entirely alone. Search engine marketing is used about half as much as search engine optimisation (or optimization for a US search), but it's still a significant enough term for me to be able to describe myself thus without blank looks. Because as with the school catering staff of 1980s Oxfordshire, I'd rather be a chef than a cook.

Sunday, 19 September 2010

All blogs have to start somewhere

Keyword
  1. a word which acts as the key to a cipher or code
  2. a word or concept of great significance
Geek
  1. an unfashionable or socially inept person
  2. [usually with modifier] a knowledgeable and obsessive enthusiast
    This is obviously one of those cases when you wish you hadn't looked a word up in the dictionary. I'm a search engine specialist by trade, so "A knowledgeable and obsessive enthusiast for words of great significance" doesn't sound too bad. I'm not so sure about "An unfashionable or socially inept person" though.
    I enjoy my job, and in doing it over the years I have frequently encountered words, phrases, techniques and bits of code that have made me think at a tangent to what I am being paid to do. These tangents sometimes stick around in my head for a while, and this blog represents a long-overdue outlet for them.