There’s a lot happening right now.
This newsletter focuses on the impact of AI at work. Two years ago, in February 2023, ChatGPT was on the cover of Time Magazine for the first time and, for many people, that started this conversation with a bang. This month is the first since then to have felt similar, judging by the number of inbound questions, especially from people who don’t usually follow the space (see If You Only Read One Thing below).
Outside the tech world, arguably even more has been happening in the last month. Perhaps that is why the two things that have stuck with me most (included below - of course) have had so little to do with technology and its impact on the world.
Firstly, the increasing use of the phrase “go touch grass”. As Merriam-Webster puts it: “to participate in normal activities in the real world especially as opposed to online experiences and interactions”. Secondly, the outgoing Surgeon General’s parting prescription: a need for more community.
What follows in the newsletter is the usual mix (in fact a bumper crop) of case studies, economics papers and insightful forecasts about how our work life will change. My sincere hope is that reading this saves you many more hours than doing the research yourself. If you end up with a free 10 minutes as a result then I can tell you that (at least in London, as I type this) it's sunny outside. Or just have a go with the latest toy from OpenAI - that’s fun too!
See you next month
James
If You Only Read One Thing
What does the appearance of DeepSeek mean? The Chinese open-source model that shot to prominence a few weeks ago triggered more inbound questions than I’ve received at any point since the launch of ChatGPT. But three weeks on, the news has passed and the chatter has subsided. So what does it actually mean?
Industry watchers will tell you that this jump in capability was surprising but not wildly unforeseeable. Costs are falling sharply and open-source models are getting much better. DeepSeek is the next point on the curve, not a new paradigm. It’s clear that there were real technical breakthroughs, but also that the existence of western frontier models was helpful as a proof of concept (at least) and may have been a direct source of training data. Surprising? Well, it’s much easier to learn science from an academic textbook than it is to do the research yourself. I’d argue this is similar.
I’m much more interested in the “since the launch of ChatGPT” bit of the equation though. You’re reading this because you care, because you take at least some interest in how AI tools are evolving. That’s not true of the vast majority of people (even very serious professionals). Those people - sadly - don’t read me, or Twitter, or Ethan Mollick. But when the BBC leads with a story about AI they take note. And the DeepSeek moment has given many people their first excuse to try AI since the original ChatGPT moment two years ago, when the best model available was still GPT-3.5. And boy oh boy have they been surprised by what they have seen. Seeing what a frontier-ish reasoning model is capable of is a huge shock if the last thing you tried was an early ChatGPT. These people - many of them senior managers in non-technical industries - have in the last month seen what those following the space have known for a while: this AI thing is quite cool.
Change happens at the pace of adoption, not discovery. To take an example from my time in SF, it's measured in parents being willing to send their children to school in self-driving cars, not in whether those cars exist. Collective moments where society goes “wow, this is good” are what allow regular people - not you or me - to adopt new technologies, without needing to consider whether they were wrong for failing to do so in the past. They provide a window of opportunity.
So, if you’re the driver of change in your organisation (and if you’re reading this there is a good chance you are), then what does DeepSeek mean? On the one hand, not much in terms of the frontier. On the other hand, it could well mean that your moment is now!
Adoption: A lesson from self-driving cars. DeepSeek commentary. Unsurprising: DeepSeek Censorship. BoomTimes for ChatGPT.
Contents
If You Only Read One Thing
DeepSeek, China, and Trying Again
What Is GenAI Good For?
Speeding Up Things You Wanted To Do / Wish People Would Stop Doing
Healthy Scepticism
Multi-step Research Tasks
Investment Banking & Teaching
How To Successfully Integrate GenAI With Existing Organisations
Recruitment Case Studies
Call Centre Case Study
Moving Beyond Chatbots
Reasons for Low Adoption
AI Companies Doing Integration Work
Our Recent Work
Tractable Problems
Metrics and Organisational Performance
Building Resilience to Technological Change
AI Agents for Research: User Guide
Zooming Out
Data Residency
Incumbents Being Challenged
Good Work in Government
Robotics
Labour Market Impact
Learning More
Which AI to Use
Prompt Writing Advice
Courses
Agents
Imagining What Change Looks Like
The Lighter Side
What Is GenAI Good For?
Speeding up tasks that you wanted to do anyway
Writing comments on other people’s social media posts (if you work in marketing). Link.
Applying for jobs (if you’re a candidate). Link.
Navigating call trees, so you can speak to an actual human. Link.
Speeding up things that you really wish other people would do less of
Writing inane self-promoting comments on other people’s social media posts (if you don't work in marketing). Link.
Applying for jobs (if you’re trying to screen the applications). Link.
Hacking ways through deviously constructed call trees (if you run a call centre). Link.
Exercise for the reader: Is it possible to identify similar winners and losers for all cases where AI is used for automation?
Healthy Scepticism: What GenAI is Not Good For. Two excellent analyses model what specific opposition can look like. Dos: Closely detail the tasks human workers are undertaking, and their purpose. Cross-reference these against what a new system can do. The most effective opposition is often of the form: I know it might look like X task is only a small part of the job, but it's very important and AI can’t automate that yet. Don’ts: Make your arguments depend on resource inputs or general capability limitations, given how quickly these are changing; Assume that because ChatGPT doesn’t do something, no AI tool can; Go too hard on hallucinations, since they’re becoming less frequent and many tasks can cope with an error rate once turned into a proper process. 85 Things AI Can’t Do in Recruitment. Toward a sensible AI-skepticism.
A case in point: AI models have dramatically improved their ability to carry out multi-step research tasks, over several minutes, without losing their way. Google was first to the party with their new tool: Deep Research (which we wrote about last month). OpenAI followed up in recent weeks with their latest release which is called (checks notes): deep research. The naming continues to be awful, but the results are impressive. Coupled with another product release (OpenAI’s Operator, which makes strides in navigating web pages) we are much closer to true delegation, of the form: Here is the problem, go away and research the best answer and come back and tell me when it’s done. The way we search for information has changed: for the first time, consuming the answer can easily take longer than searching for the information. What is the right way to respond? We for one are still figuring it out. Analysis 1. Analysis 2. Examples of Operator Mode.
Briefly!
Investment Banking. Goldman Sachs has rolled out its GS AI assistant to 10,000 employees. Link.
Teaching. In a randomised controlled trial at Harvard, students who were taught by an online AI tutor learned more, and self-reported higher engagement, than those in an active learning class, even with an experienced and motivated teacher. Paper. News. Analysis.
How To Successfully Integrate GenAI With Existing Organisations
Recruitment Case Study: Using AI to open up new possibilities, rather than automate existing work. We often say the real winners will be those who use technology to do new things, not old ones better. This month we enjoyed a rundown of the ways in which AI-powered interviews are enhancing candidate experience, in ways which may compensate for the loss of the human touch, and few of which would have been obvious in advance. These include:
Immediate interview scheduling, no waiting until a recruiter is free.
Consistent evaluation across candidates - and the ability to go back and reassess old candidates (with consent) if recruiting guidelines change or new roles come up.
Reduced misunderstanding due to accents - from either party.
Increased ability to assess answers to technical questions (relative to non-technical interviews).
The study was done by a VC firm, so season with a pinch of salt accordingly, but the premise, that benefits can outstrip simple cost reductions, is a good one. Similarly, one sourcing expert used a flagship reasoning model to write a database search for possible candidates to drive a forklift truck. The result: a massively expanded search that identified many candidates who would have been missed, but with high levels of precision that would otherwise be hard to match in such a large pool. In the past, devising and writing such an involved query would not have been worth the time it took: now a well-thought-through request can be minutes (and $0.50 of computing costs) away. AI Interviewers. Enhanced Search.
Call Centre Case Study. Tobias Zwingmann is still well worth reading. He recently shared the roadmap he developed with a call centre company, with a key focus on Orchestration (making sure AI initiatives build on each other, aren’t isolated) and Sequencing (manageable steps in a logical order: each step makes the next one easier and more valuable). Link.
Answering Customer Service queries, but not as a chatbot. Included here because (1) it’s a great case study of thinking outside the box about what new AI models can be used for and (2) the architecture is so similar to work that FCDO released almost a year ago (and which we wrote about in March 2024). It’s often easy to assume that Governments are slow on the uptake with new technology, so it’s important to call out when they’re actually leading the pack. Case Study. FCDO Latest.
Reasons for low employee adoption: ineffective training and scepticism from direct managers. Recent studies find that training leads to employees using AI more, but not necessarily displaying more skills when they do. We continue to believe that effective use of AI has more to do with management skills than technical ones, so perhaps it's unsurprising that more senior employees show higher levels of assessed AI skills. Where managers - often middle managers - disapprove of the use of AI tools, the impacts are stark. Expertise (as measured by the same study) halves, whilst reported scepticism grows 3-5x. Link.
Model builders are increasingly doing the work of integration for potential clients (climbing the value chain in the process). Working out what to do with an AI model to make a business run better is no easy business. Last month we talked about AI firms poaching top talent from industry; this month there’s evidence of flow the other way (h/t Dylan), with OpenAI recruiting for engineers to work at clients, not the mothership. This follows a trend of industry specialisation: ChatGPT for civil servants, in the US National Laboratories, and for biochemistry. What next? Specialised services for cybersecurity, accountancy and law? If the service is provided by OpenAI itself it certainly becomes more compelling. Whether they have the expertise to do the messy business of integration into operating businesses remains to be seen, but it's certainly one answer to open-source models that keep compressing their margins (and see Robotics below for another!). Forward Deployed Engineers. ChatGPTGov. Nuclear Security Clearance. GPT-4b (biology).
Our Recent Work
Tractable problems: the benefits of starting with mundane improvements. Link.
Metrics and organisational performance: lessons from Digital Twins. Link.
Building Resilience to Technological Change. Link.
AI Agents for Research: User Guide. Link.
Zooming Out
Data Residency in Europe. OpenAI is taking steps to make it as easy as possible for compliance teams to say yes. Link.
Incumbents being challenged. How do incumbents respond to a new kid on the block? One answer comes courtesy of Nokia’s archive, from when Apple was on the up. See DeepSeek and OpenAI above. Link.
Good work in Government. Two fascinating reads from the UK Government. State of Digital Government Review. AI Playbook.
Robotics is still bubbling under the surface. OpenAI is hiring robotics engineers and has filed a bunch of trademark applications covering humanoid robots. Moving from models to devices is potentially one way for the big model providers to move up the value chain and escape the fierce competition at the level of tokens. Hiring. Trademarks. Learning more: π0.
We are working on a longer piece of work about the state of robotics development - please reach out if you’d be interested in reading (or contributing!)
Freelancer Demand. Big data analysis of the impacts on work. Labour demand for complementary skills grew, but substitutable skills (writing, translation) saw demand fall 20-50%, with short-term work impacted most. Paper.
Learning More
Which AI to use. An updated guide from the always spot-on Ethan Mollick. Link.
Good prompt writing. Specific advice changes regularly. A useful heuristic: What would a good manager do? Link.
Courses!
I met Mary at an excellent Parliamentary Round Table organised by UKAI and was really impressed with the personalised training she has designed to run with front line teams. (h/t to Tim Flagg for the invite which led us to meet)
I met Catherine two years ago when she kindly gave me advice on going self-employed. Since then she (and a team at the University of Cambridge) has built a wonderful, open-to-all course for academics who want to learn how AI will impact their work and how they can start using it to speed up their research. Link.
Computer Use Agents: More resources than is practical to read. Link.
Imagining what change looks like. Two nice models. My favourite question to ask with these is what in here doesn’t feel believable? Why? Gianni Giacomelli: World of Work. Sahar Mor: Agents and Commerce.
The Lighter Side
Problems other people have. If you’ve read this far then this couldn’t possibly apply to you, right?
Automated talking heads. Where labour shortages were really holding us back. Link.
LLMs playing Pictionary. Link. Courtesy of a longer thread of LLMs playing games. Link.
AI, Chemistry and Magic. Sounds like quite the birthday party. Link.
A dose of community, once daily until symptoms improve. The Surgeon General’s parting prescription. Link.