I’ve spent much of the last two months diving deep into how you buy AI tools as part of a report to help the UK Government better buy Generative AI. (Full details below, as well as an invitation to the launch event for anyone who is London-based.) It’s all too easy for external consultants to say you “just” use a piece of software and all your problems will be fixed, without realising that that “just” includes many, often tortuous stages of Market Intelligence, Evaluation, Approvals, Business Cases, Sandboxed Trials, more Approvals, Roll Out and Behaviour Change. These are fundamentally human processes, and focusing overly on the technology will inevitably lead to gaps - and disappointment. As always, this newsletter aims to focus on the “How” and the “So What?” for organisations, rather than the cool things AI tools themselves can do. And, as ever, I’m keen to hear your thoughts on how we are getting on.
Enjoy the Olympics when they start next week.
James
If you only read one thing
Trying well and buying well are key. Most organisations' AI strategic response is currently to eke out productivity gains and capability improvements, rather than commit - yet - to bigger projects. This is best done using tools that other people have built, and using them well. It’s a strategy more aligned with British Cycling’s theory of 1% improvements that led to so much Olympic success, rather than transformative business model change. And it’s a sensible one for many to adopt.
This isn’t to negate the importance of assessing how competitors or new entrants might be able to disrupt your value proposition. We still expect many knowledge and service industries to be significantly reshaped before the end of the decade. But doing “something”, even relatively small, is an important first step, and with productivity gains in double-digit percentages, the impact on the bottom line is not to be ignored. Our advice to organisations looking to respond this way tends to fall along three lines.
Buy well. Allocate more resources to horizon scanning, evaluating market offerings and in due course onboarding. Being a sophisticated buyer of anything - but particularly software - requires time and effort. If you haven’t allocated any budget to your corporate IT function for evaluating the new range of tools, then don’t expect much to change. These teams were busy managing ongoing contracts before and adding GenAI software is a big new ask. Resource them appropriately and ask them to find supplier-built software which can have a quick bottom line impact.
Try well. Give employees across a wide range of teams access to general purpose tools (like ChatGPT for Enterprise and Microsoft Copilot). They’re best placed to uncover “latent expertise” in the models - things that the AI tools are capable of doing but most people hadn’t yet realised. AI tools might still be software, but getting the best from them requires more communication and domain-specific skills than technical ones. That means learning to wield them is best done by an appropriately incentivised and diverse group of people who can experiment, rather than by leaving responsibility solely with the IT department.
Don’t build, unless you already do it well. Tech companies have been complaining for years that the salaries of competent AI engineers are sky high and the market for talent is fierce. It’s hard enough for tech giants and startups to compete, with equity packages that offer significant upside. For more mature organisations, this type of equity compensation isn’t so appealing, leaving you disadvantaged in the war for talent. Unless you already have a team you’re confident can deliver in-house products, don’t bank on being able to hire one.
Marginal Gains. Double Digit Productivity Gains. Latent Expertise. Hiring Engineers is Hard.
Contents
If You Only Read One Thing
Trying and Buying Well
Contents
What Is GenAI Good For?
Going the extra mile, where it pays off
Passing exams
Not so good: Asking questions and getting people to expand
How To Successfully Integrate GenAI With Existing Organisations
Case Study: Goldman Sachs CIO Learnings
Building products means embedding them in checks and balances
Our Recent Work
Computers aren’t supposed to be able to do that
5th August - Report Launch
Zooming Out
Character.ai - The biggest AI product you might not have heard of
Automation Tales
Mapping AI Generated Images
Learning More
AWS’ GenAI Partner Consultancy Requirements
State of AI 1H24 Survey
The Lighter Side
What Is GenAI Good For?
Going the extra mile, where it pays off. Focusing only on doing the same but cheaper misses the point. The winners of new technology adoption have historically been the firms (incumbent or new) who were able to do more, to offer something new. This month we’ve seen a few great examples of how LLMs are being used to do the “nice to haves” that previously dropped off the To Do List, with positive results.
Turning job descriptions into “love letters” which attract applicants who will build the organisational culture you want. JDs as a Secret Weapon. Prompt to Use. Via Ethan Mollick.
Turning customer service into a concierge service. A six-fold increase in support requests as people are able to get help with things they’d never have asked before. Substack reports they now help people with marketing and growth questions, not just technical ones. Substack chatbot.
Updating documentation and code comments - the job that nobody except LLM assistants likes getting around to. Claude.
Adding additional information to datasets, like turning job descriptions (or any free text field) to industry tags (or any well defined category) for further analysis. ONS.
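The last pattern above - mapping a free-text field onto a fixed set of categories - can be sketched as a prompt-based classifier. A minimal sketch follows; everything in it (the tag list, the helper names, the keyword fallback that stands in for a real LLM call) is an illustrative assumption, not the ONS’s actual method:

```python
# Sketch: enrich free-text records with a category tag.
# `llm` is a placeholder callable (prompt -> reply); the keyword
# fallback below lets the sketch run without any API access.

INDUSTRY_TAGS = ["Finance", "Healthcare", "Retail", "Other"]

def build_prompt(text: str) -> str:
    tags = ", ".join(INDUSTRY_TAGS)
    return (
        f"Classify the following job description into exactly one of "
        f"these industry tags: {tags}.\n"
        f"Reply with the tag only.\n\nJob description: {text}"
    )

def classify(text: str, llm=None) -> str:
    """Return an industry tag for a free-text job description."""
    if llm is not None:
        reply = llm(build_prompt(text)).strip()
        # Guard against the model inventing a tag outside the list.
        return reply if reply in INDUSTRY_TAGS else "Other"
    # Crude keyword fallback standing in for the model call.
    lowered = text.lower()
    if any(w in lowered for w in ("bank", "trading", "audit")):
        return "Finance"
    if any(w in lowered for w in ("clinic", "nurse", "patient")):
        return "Healthcare"
    if any(w in lowered for w in ("store", "merchandis", "checkout")):
        return "Retail"
    return "Other"

rows = [
    {"jd": "Equity trading analyst at an investment bank"},
    {"jd": "Ward nurse supporting patient care"},
]
for row in rows:
    row["industry"] = classify(row["jd"])
```

The useful design point is the guard rail: whatever the model replies, the output is forced back into your well-defined category list, which is what makes the enriched column safe for downstream analysis.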
Passing exams, undetected. In a recent paper looking at Psychology undergraduates, AI-generated answers were inserted into student exam responses, without the knowledge of examiners. 94% of submissions went undetected and outperformed student responses by 0.5 grades. Take this as your periodic reminder that automated detection doesn’t work - the only way to control this is to control the conditions for submission. Read across to job applications and any other written processes you are responsible for reviewing. Paper.
Not So Good: Asking Questions. You can tell when someone is asking you questions off a script. It feels qualitatively different to being asked questions by someone who is engaged, listening and who wants to learn more. An engaged person pushes the boundaries of conversation rather than closing things down. They spot what hasn’t been said - the negative space - and steer conversation there if they deem it important. This skill of eliciting information from someone is vital to a whole raft of jobs - not just diagnostic jobs (doctors, detectives or journalists), but all relationship-building roles, from product management to sales. AI Chatbots, it turns out, are pretty bad at it. You can mitigate this by directly instructing them to probe, but the effect only seems to last a short time, and even then it swings too far the other way. Asking the right questions, pushing neither too hard nor too soft, is highly nuanced and not easy to instruct or demonstrate in training. It relies on myriad contextual clues and often a relationship developed over time. Once again, this should signal a note of caution as Customer Service Chatbots become the most popular use of Generative AI. You might be conducting the transaction that the customer asks for, or that you pre-programmed in advance. But are you really asking the right questions? How will you know? Tweet.
How To Successfully Integrate GenAI With Existing Organisations
Case Study: Goldman Sachs CIO Shares Learnings & Strategy. If you have 20 minutes to spare and are considering your own organisation’s technical responses then this interview with Marco Argenti is worth a watch in full. A few highlights:
They don’t rely on one model provider - they have worked to build a platform which orchestrates a range of models from different providers, much like their multi-cloud strategy. This avoids lock-in, adds resilience with fall back options and (once built) allows quick allocation of tasks to the most suitable supplier.
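The orchestration idea above can be sketched as a simple router with ordered fallbacks. This is a minimal illustration, not Goldman’s actual platform; the provider names and routing table are invented for the example:

```python
# Sketch: route tasks to a preferred model provider, falling back
# to alternatives on failure (the resilience property described above).

from typing import Callable

class ModelRouter:
    def __init__(self):
        # task -> ordered list of (provider name, callable) preferences
        self.routes: dict[str, list[tuple[str, Callable[[str], str]]]] = {}

    def register(self, task: str, provider: str, call: Callable[[str], str]):
        self.routes.setdefault(task, []).append((provider, call))

    def run(self, task: str, prompt: str) -> str:
        """Try the most suitable provider first; fall back on failure."""
        errors = []
        for provider, call in self.routes.get(task, []):
            try:
                return call(prompt)
            except Exception as exc:
                errors.append(f"{provider}: {exc}")
        raise RuntimeError(f"All providers failed for {task!r}: {errors}")

# Stub providers standing in for real model APIs.
def flaky_provider(prompt: str) -> str:
    raise TimeoutError("provider unavailable")

def backup_provider(prompt: str) -> str:
    return f"summary of: {prompt}"

router = ModelRouter()
router.register("summarise", "provider-a", flaky_provider)
router.register("summarise", "provider-b", backup_provider)
result = router.run("summarise", "quarterly results")
```

Because tasks (not applications) are the routing key, reallocating work to a newly superior supplier is a one-line change to the routing table - the “quick allocation” benefit mentioned above.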
Frame tasks as 5 second, 5 minute, 5 day or 5 month tasks. Models are really good at the first two now. Autocomplete this sentence (5 seconds) or produce a structure for this article, or specific piece of code (5 minutes). The next target is working over longer periods of time - but much quicker - where models will hopefully do 98% of the job and surface the parts of the task that need human intervention.
He estimates a whopping 20% time saving on coding tasks for equivalent quality, and that technical staff spend 50% of their time coding. Net productivity gain: 20% of 50%, i.e. 10%.
Will this lead to layoffs? He is sceptical. In his decades in the industry, IT spend has never gone down. Instead, he foresees IT team growth levelling off, with teams asked to produce more with less.
Goldman is thinking about two waves of gains. In wave 1 comes knowledge worker productivity gains, of the sort we regularly talk about. Wave 2 will allow testing and discarding of hypotheses very quickly (as the manual work required to test them is largely automated) - and consequently the generation of new trading strategies and profits.
Building useful products means embedding them in checks and balances. We have previously discussed the characteristics of tasks which you’ll want to keep a human-in-the-loop for, versus those that can be safely automated. In the middle is a wealth of developing work on hybrid systems which improve an operation without fully automating it. A survey this month showed that feedback loops from users are the most popular way of controlling the accuracy of model outputs. Language is evolving for talking consistently about this. Should a task be “AI with Human Approval” or just an “AI Recommendation”? This Delegation Framework is helpful for shaping organisational conversations and makes clear just how many options there are. Asking whether an AI can do something is the wrong question - ask how to get the best results from the overall system instead. AI Digest - Feb 24. Popular methods of control - Survey. Delegation Framework. Compound Systems.
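One way to make those organisational conversations concrete is to encode the delegation level as an explicit setting per task. A minimal sketch, with level names invented for illustration (they are not the framework’s own terms):

```python
# Sketch: delegation levels between "AI Recommendation" and full
# automation, made explicit so each task's setting can be discussed
# and audited rather than left implicit.

from enum import Enum

class Delegation(Enum):
    HUMAN_ONLY = 1            # AI not involved
    AI_SUGGESTS = 2           # AI recommendation; human decides and acts
    AI_WITH_APPROVAL = 3      # AI drafts the action; human must approve
    AI_ACTS_HUMAN_AUDITS = 4  # AI acts; humans review samples afterwards
    AI_ONLY = 5               # fully automated

def needs_human_before_action(level: Delegation) -> bool:
    """True when a person must sign off before anything happens."""
    return level.value <= Delegation.AI_WITH_APPROVAL.value

# Example: a task register an organisation might maintain.
task_register = {
    "draft customer reply": Delegation.AI_WITH_APPROVAL,
    "tag support tickets": Delegation.AI_ACTS_HUMAN_AUDITS,
}
```

The point is less the specific levels than having a shared vocabulary: “can the AI do this?” becomes “which row of the task register is this, and who signs off?”.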
Our Recent Work
Computers aren’t supposed to be able to do that. How closely do you have time to follow what AI tools can now do? How do you make decisions about your own job, or those you manage, without a view of what computers can do? Read the full blog.
5th August Launch Event: Buying Generative AI in Government. We’ve teamed up with PUBLIC to address many of the procurement challenges facing government teams as they try to take advantage of new tools and services. Many of these challenges will be common across any large, established organisation. If you’d like to learn more then sign up here.
A quick intro…
Buying rapidly evolving technology is challenging - especially for government. Teams must grapple with new suppliers and funding models, emerging best practice and novel pitfalls. Government must avoid getting locked into products that quickly become outdated, whilst giving suppliers enough assurance that participating is worth their while.
While GenAI doesn’t require abandoning all we have learnt about buying technology, it does present some distinct challenges when compared to other technologies that government might procure. This report is a guide for those looking to buy in the face of these challenges.
Zooming Out
Character.ai claims peak traffic of 20% of Google search volume. It is probably the biggest AI-powered service that you haven’t heard of, unless you have teenage children. Regardless of scepticism about the actual user numbers, there are enough reports to show that this app is heavily used amongst teens in several countries. There are lots of parenting and societal concerns that come from children developing pseudo-social relationships with bots online. Youth trends are also, historically, excellent leading indicators of mainstream internet adoption. Character.ai Announcements. Traffic Stats.
From content producer, to AI manager. Two sobering accounts of changing job roles from a Copywriter and Software Developer. Copywriting. Developer.
“At the center of everything lies classical statues of celebrities & anime characters.” Mapping requests of AI image generators and the source images they were trained on (it’s a lot of kitchens). Link.
Learning More
AWS’ Partner Network Requirements. Take a look at what Amazon demands of suppliers who are doing GenAI implementation work on their behalf. A useful starting point for evaluating suppliers or partners who are building tools for your organisation. Link.
State of AI 1H24 from Retool. Use cases, ROI and bottlenecks. Link.
The Lighter Side
Memes vs Misinformation. The UK Election was the latest to pass without significant GenAI misinformation (contrary to widely held fears). Like the Indian and Indonesian elections, clearly fake but funny videos and images had far more reach. Sadly the videos get taken down so quickly they’re hard to share, but here’s Keir Starmer playing Minecraft if you’re quick. Link.
How AI stole the sparkles emoji. Link.
A picture says a thousand words, but both can be summarised to three key bullet points with ChatGPT. Link.