Large Language Models: A Series
Building products on LLMs and AI generally.
October 31, 2024
The four phases of automated evals for LLM-powered features.
I gave a talk version of this article at the first Infer meetup earlier this month. Let’s say you want to build an LLM-powered app. With a modern model and common-sense prompting, it’s easy to get a demo going with reasonable results. Of course, before going live, you test various...
9 min read →
October 2, 2024
Next week, we’ll be kicking off a new speaker series in Vancouver called Infer. The goal of the meetup is to bring together folks who are doing great AI engineering work, so we can learn from one another.
The format will be familiar to folks who have attended my previous meetups: two speakers, often one of whom will be visiting from out of town, with time to chat afterward. Events will happen roughly every two months, whenever we have compelling topics lined up.
If you’re building LLM-powered apps in Vancouver, you can subscribe to our event on Luma. There are still a few spots open for our first “beta” event on October 9th, and we’ll be hosting another during NeurIPS in December.
There’s something electric about getting smart people who are working in a rapidly-changing field in a room together. I recommend it.
August 16, 2024
A wild startup appears.
Last month, I started full-time on a new startup. It’s early days, but we’re having a lot of fun. A startup, fundamentally, is a search for a repeatable, scalable business model. You rapidly try things, run experiments, learn, and iterate your theories about how to build a useful product that...
2 min read →
July 31, 2024
If – and when – GPT-5 might eat your lunch
Lately I’ve been working with a lot of teams and founders that are building products on top of LLMs. It’s a lot of fun! To be an AI product engineer today is to constantly ask new questions that impact how you build products. Questions like: “Is there a way we...
5 min read →
May 31, 2024
A path to continued model improvement.
I often see a misconception when people try to reason about the capability of LLMs, and in particular how much future improvement to expect. It’s frequently said that LLMs are “trained on the internet,” and so they’ll always be bad at producing content that is rare on the web....
5 min read →
January 10, 2024
A curious design constraint signals an ambitious future.
This morning, OpenAI launched the GPT Store: a simple way to browse and distribute customized versions of ChatGPT. GPTs – awkwardly named to solidify OpenAI’s claim to the trademark “GPT” – consist of a custom ChatGPT prompt, an icon, and optionally some reference data or hookups to external APIs. In...
5 min read →
June 30, 2023
Techniques for building products on LLMs today.
Modern instruction-tuned language models, or LLMs, are the latest tool in software engineers’ toolboxes. Joining classics like databases, networking, hypertext, and async web applications, we now have a new enabling technology that seems wickedly powerful, but whose best applications aren’t yet clear. ChatGPT lets you poke at those possibilities. You...
11 min read →
March 15, 2023
A wild large-context LLM appears.
One month ago, I wrote about the limits of 4K-token AI models, and the wild capabilities and costs that large-context language models may one day have. Today, OpenAI not only debuted GPT-4 with a doubled 8K-token limit, but demoed and began trials of a version that supports...
2 min read →
February 16, 2023
The problem and opportunity of language model context.
It has been a wild week in AI. By now, we’re getting used to the plot twist that rather than the cold Spock-like AIs of science fiction, large language models tend to be charismatic fabulists with a tenuous understanding of facts. Into that environment, last week Microsoft launched a Bing...
13 min read →