Sports Personality controversy and what it tells us about tender evaluation (part 2)

Yesterday we looked at how the shortlist for the UK’s Sports Personality of the Year (SPOTY) ended up without a single woman and suggested that the judging panel was not well chosen or briefed. Our interest is in how this reads across to procurement and supplier selection processes / decisions.

With the list of judges in place, the problem of averaging came into play. The votes from the 27 judges were added up, and the 10 highest-scoring names made up the shortlist. This had the advantage of getting rid of Berbatov and similar outliers (see yesterday), but it introduced a new problem: the averaging effect.

We often have similar problems with bid evaluation, where the same thing can happen if the marks from individual evaluators are simply averaged. One evaluator thinks a bid, or a particular response, is worth 9/10; the other makes it 1/10; and we end up with an average of 5/10.

Now the one thing we can say with some certainty is that 5/10 is not the “correct” score for the bid! It appears to be either great or rubbish. Either the two evaluators have interpreted the response very differently, or one of them has made an error. We need to find out what the issue is, and arrive at the “correct” score for the bid. So consensus scoring is the answer to this and most of the other problems identified here.
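
To make the point concrete, here is a minimal Python sketch of how an evaluation lead might spot cases like this before any average is accepted. The function name `review_scores` and the threshold of 3 marks are our own illustrative assumptions, not part of any standard method; the idea is simply that a wide gap between evaluators should trigger a consensus discussion rather than a blended score.

```python
def review_scores(scores, max_spread=3):
    """Return (average, needs_consensus) for one response marked out of 10.

    max_spread is a hypothetical tolerance: if the gap between the highest
    and lowest mark exceeds it, the average should not be trusted.
    """
    average = sum(scores) / len(scores)
    spread = max(scores) - min(scores)
    return average, spread > max_spread


# The example from the text: one evaluator gives 9/10, the other 1/10.
avg, needs_consensus = review_scores([9, 1])
print(avg, needs_consensus)

# Two evaluators who broadly agree.
avg_close, needs_consensus_close = review_scores([7, 8])
print(avg_close, needs_consensus_close)
```

With marks of 9 and 1 the average is 5, but the 8-mark gap flags the response for discussion; marks of 7 and 8 pass through without one.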

Imagine if this group of markers for SPOTY had got together. The four Manchester footballers would have been laughed out of the running immediately. Then someone would have mentioned Chrissie Wellington – four-time Ironman triathlon world champion and the toughest sportswoman in the world.

Chrissie Wellington - one tough lady

I’m sure others who maybe didn’t know her would have said, “Wow! Of course she should be on the list!” The absence of women would also have been noted and perhaps addressed.

And that brings us into a tricky area in terms of evaluation, for the public sector in particular. Let’s say that you want to take 5 suppliers through to tender from a pre-qualification process. Brilliant Inc is the top scorer. In 5th place comes Boring plc, well behind Brilliant. The two are very much the same type of firm, with a similar approach, profile and style; Boring are just not as good. But in your PQQ they narrowly outscored Maverick Ltd, who came 6th. Maverick are a very different firm, with some weaknesses, but would offer a very different option to Brilliant Inc at tender stage.

Who do you select?

I would suggest the best thing for the organisation is to take Maverick Ltd through rather than Boring plc (as well as Brilliant of course and numbers 2, 3 and 4 on the ranking). Why? Well, Boring plc stand little or no chance of winning the work, as they are a pale imitation of Brilliant Inc, whereas Maverick Ltd might just come through, and at least by offering a different approach they may inform the bidding process in a positive manner. Another option would be to include both Maverick and Boring.

We have the same parallel with SPOTY. No offence to him, as he seems like an excellent chap, but Luke Donald cannot win. He’s up against two other golfers, both Major winners – McIlroy who has the youth thing going for him, and Darren Clarke, Open champion at 42 after 20 years of trying, who has the back-story and huge popular support. If a golfer wins, it ain’t gonna be Donald. Sorry Luke. So why waste a short list place on him? Because of the evaluation system.

This is tricky in the public sector, where the number 5 in the scoring trumps the number 6. But you can try to ask questions that give the opportunity for some originality to come through even at PQQ stage. And when you come to the borderline cases, there may be a little flexibility, and the consensus discussion can explore that. In the private sector, where one is less constrained, look to get a balanced, varied tender list, not necessarily just the top 5 suppliers based on raw scores.

And our SPOTY winner prediction? In the absence of Ms Wellington, Mark Cavendish.


Voices (6)

  1. Plan Bee:

    Oooh A lot of worthy and smart comments.

    I’d just like to add:

    Go Cav!

  2. RJ:

    The two dangers I see with all score-based evaluations, especially in very complex (e.g. major capital or IT projects) or subjective (e.g. professional services) work, are that either one factor will completely dominate, as in the example above, or that all factors will average out, thus giving very marginal differences between vendors. There are, in my opinion, several key strategies in addition to consensus scoring that can help to avoid this:

    1. Identify as many criteria as possible that are genuine “Go/No Go” criteria, i.e. if you don’t meet the standard, you’re out. Examples would be quality thresholds, acceptance of key commercial principles, ability to meet lead times etc. Exclude these baseline factors from the scoring process – this accentuates the differences between the bidders.
    2. Identify the criteria that don’t warrant a score at all (I’ve lost count of the number of tender evaluations I’ve seen that create an automatic score for the supplier providing their address and contact details).
    3. Don’t score individual questions; group them together under overall headings (e.g. quality, account management, customer service, cost, commercial terms).
    4. Articulate in a short sentence what you are looking for overall under each heading (e.g. are you looking for the top quality or just an acceptable threshold?). Particularly in more subjective areas, this forces the assessors to set out and justify their opinions of bidders and to clarify their “gut feelings” which so often drive decisions on areas such as expertise, account management, service quality etc.
    5. Weight and sense check the criteria before you do the assessment to ensure that you don’t get examples such as the one given above.

    I’ve not got too much public sector experience but this approach did work in a recent OJEU tender for legal services so I think it can be applied across all areas. Comments welcomed, though.

  3. bitter and twisted:

    I’m suspicious of mixing quality and price in one evaluation like that but can’t put my finger on quite why.

  4. flog:

    Being practical, I do not believe that a public sector tender should be advertised to state that, in a 2-stage process, you will shortlist X; rather you will shortlist at least X and not more than Y which gives a degree of flexibility. When shortlisting – using a scoring approach to reach a list of ranked applicants, you can then look for a ‘natural break’ in the scores i.e. one where a mark up or down on a question isn’t going to change the order.

    I was doing a very simple tender evaluation case study in a course today. With an ongoing emphasis on price (due to current economic pressures), one group opted for a 30:70 quality:price ratio with quality determined using 4 sub-criteria (weighted 5:5:10:10). A was the lowest price, B was 25% higher than A and C was 50% higher than A. In the responses to the 4 sub-criteria, A didn’t provide any response to the first two sub-criteria (hence scored 0) and still came out the winner because of the heavy weighting attributed to price. They were surprised and it was a worthwhile exercise as the ‘penny dropped’ regarding their need to think through and understand the impact (or potential impact) of their chosen approach to tender evaluation.

  5. Dr Gordy:

    I recall one example where someone who couldn’t win a public sector tender progressed through the PQQ to tender stage. From what I recall that bidder opportunistically went to the Ombudsman. The Ombudsman reviewed the whole process and agreed that bidder should not have won. However, the Ombudsman found a weakness in the process and that the winner should not have been awarded the contract either. That then led to the second placed bidder, who the Ombudsman found to have been unjustly treated, having a claim for loss of profits, along with the 3rd, and the 4th, and the 5th!
    Part of the problem with the market at present is that, in public procurement, the stakes are so high that challenges are now to be expected, so following your logic, which I agree with, is likely to end in an unsuccessful defence of a challenge.

  6. Rob:

    ‘Personality’ should be replaced with a slightly abridged version: ‘person’.

    They’re stretching the imagination a bit with the first, when one reflects upon some of the shortlisted contenders….
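
The case study in flog's comment can be worked through in a few lines of Python. What follows is a hypothetical reconstruction, not the course's actual model: the 30:70 split and the 5:5:10:10 sub-weights come from the comment, but the price formula (lowest price divided by bid price, times 70), the reading of the third bidder as C at 50% above A, the prices themselves and all the quality marks are our assumptions. B and C are given full quality marks, and A full marks on the two sub-criteria it did answer.

```python
QUALITY_WEIGHTS = [5, 5, 10, 10]  # sub-criteria weights from the comment (sum to 30)
PRICE_WEIGHT = 70                 # price share of the 30:70 split


def total_score(quality_marks, price, lowest_price):
    """Weighted total out of 100.

    quality_marks: one value between 0 and 1 per sub-criterion.
    Price scoring is an assumed formula: the lowest bid gets the full 70,
    dearer bids get proportionally less.
    """
    quality = sum(w * m for w, m in zip(QUALITY_WEIGHTS, quality_marks))
    price_score = PRICE_WEIGHT * lowest_price / price
    return quality + price_score


# Hypothetical bids: (quality marks, price). Prices follow the comment's ratios.
bids = {
    "A": ([0, 0, 1, 1], 100),  # no response to the first two sub-criteria
    "B": ([1, 1, 1, 1], 125),  # 25% dearer than A, assumed full quality marks
    "C": ([1, 1, 1, 1], 150),  # 50% dearer than A, assumed full quality marks
}

lowest = min(price for _, price in bids.values())
totals = {name: total_score(marks, price, lowest)
          for name, (marks, price) in bids.items()}
winner = max(totals, key=totals.get)
print(totals, winner)
```

Under these assumptions A still wins despite ignoring two of the four quality questions, because the 14 price points it gains over B outweigh the 10 quality points it gives up. With lower marks for A on the questions it did answer the result could flip, which is exactly why the weighting needs a sense check before the evaluation starts.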
