Behind the appealing design of CapWords
CapWords certainly leverages a lot of cutting-edge technology for an app that was partially invented by a 3-year-old.
Sure, CapWords was technically launched by Ace Lee, who created the China-based HappyPlan Tech company that won the 2025 Apple Design Award for Delight and Fun. But the concept for the app — an AI-powered language learning utility that teaches words through animated stickers — originated from a more junior associate.
“The idea came from a very simple moment with my daughter,” says Lee. “Every day on the walk home from kindergarten, she’d point at things and ask me, ‘How do you say this in English?’ One day, she pointed at a road sign and asked, ‘What’s this called?’ I was stuck and had no idea. I opened a translation app, and it replied ‘Signpost’ in this kind of robotic voice. She just quietly said, ‘Oh.’ And I could tell something was missing.”
The missing element: Connection. Generally speaking, kids aren’t big into cold, robotic voices. And Lee realized that on all their walks, his daughter wasn’t really reacting to the facts; she was responding to the warmth of their natural connection.
CapWords
Available on: iPhone, iPad
Team size: 3
Based in: Beijing
Awards: Apple Design Award winner (2025), App Store Award winner (2025)
That’s why CapWords prioritizes a sense of wonder and discovery. Snap a photo of an item — a coffee mug, traffic cone, cupcake, signpost, anything — and the app uses AI to transform the photo into a sticker while telling you the name and pronunciation of each item. Stickers are collected right in the app. And that active engagement — clicking the picture, watching the animation, receiving the feedback — increases memory retention for users aged 3 and up. “CapWords is grounded in real-world physics: sound, touch, and sight cues,” says Clu Soh, a close friend of Lee who assisted in the app's development. “That’s why it works so well.”
In fact, the stickers were another key contribution to the roadmap from Lee’s daughter. “Since she was 2, she’s been sticking them everywhere: on our fridge, the sofa, even my face,” laughs Lee. “And she did it so seriously, like she was creating art. That gave me the idea to ‘peel’ things off from the real world and collect them.”
“What is this? Please tell me”
That peeling process presented the team’s first technical challenge. “We started by researching models that would run on devices and clip items out of photos, but each model had its problems,” says Soh. “Either we had to pre-download images or download them when users launched the app, which took time and storage. Plus, sometimes the items didn’t have clear edges, so we couldn’t cut them out cleanly.”
CapWords founder Ace Lee spends a little time on the real-life stickers that provide the foundation for his app’s language learning philosophy.
Luckily, they found a quick solution in VisionKit. “It worked really well without needing to integrate big models inside the app,” says Soh. “That’s how we peeled items off easily.”
The next step was figuring out how to identify the items in each sticker — a problem that the team kicked over to the just-launched ChatGPT-4. “We didn’t even have an app yet,” says Soh, “but we kept feeding ChatGPT the different items cut out by VisionKit and asking, ‘What is this? Please tell me in Chinese, French, Spanish — all the languages we want to learn.’”
With the foundations falling into place, the team began taking advantage of other Apple frameworks. “We used AVAudioEngine to play the audio and Neural Voice to make it sound more natural. Spatial recognition from iOS APIs helped a lot with the flash card features. And of course, with CloudKit, we can sync user data to iPhone and iPad.”
What’s more, CapWords doesn’t store any user-generated imagery; photos are transmitted to an AI model for one-time recognition and then deleted. No images are saved locally, and nothing is uploaded to a CapWords server (for the very good reason that CapWords doesn’t have one).
Snap a picture of an everyday item, like a mug, and CapWords will leverage AI to turn it into a sticker and share its definition and pronunciation.
But CapWords’s Apple Design Award came for Delight and Fun — something that’s not always easy to achieve in an educational app. And that’s due in large part to the moment when a photo becomes a sticker.
“When you capture an image and the app processes it, you confirm whether this is the object you want to collect. That confirmation step actually gives the system time to get results from the API,” says Soh. “On the backend, we’re splitting the process across different stages: Capture → remove background → confirm → display the result.”
But Soh says that most users don’t notice, because while this is happening, the app is displaying a “micro-animation” that keeps them engaged while the gears are turning behind the scenes.
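As a rough illustration of that staging, here is a minimal Python sketch of how a confirmation step can hide recognition latency. It is not CapWords's actual code, which the article does not show. The recognition request starts as soon as the background is removed, and its result is awaited only after the user confirms. Every function name, delay, and return value below is a hypothetical stand-in.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Tuple


async def remove_background(photo: str) -> str:
    """Hypothetical stand-in for on-device subject cutout."""
    await asyncio.sleep(0.01)
    return f"sticker({photo})"


async def recognize(sticker: str) -> str:
    """Hypothetical stand-in for the slow remote recognition request."""
    await asyncio.sleep(0.05)
    return "mug"  # placeholder label, not a real model result


async def user_confirms() -> bool:
    """Hypothetical stand-in for the user tapping confirm during the animation."""
    await asyncio.sleep(0.05)
    return True


async def collect(
    photo: str, confirm: Callable[[], Awaitable[bool]]
) -> Optional[Tuple[str, str]]:
    # Stage 1 and 2: capture, then remove the background.
    sticker = await remove_background(photo)

    # Start recognition right away so it runs while the user decides.
    recognition = asyncio.create_task(recognize(sticker))

    # Stage 3: the confirmation step doubles as a loading window.
    if not await confirm():
        recognition.cancel()  # the user rejected the cutout; drop the request
        return None

    # Stage 4: display the result, which is often ready by now.
    word = await recognition
    return sticker, word


if __name__ == "__main__":
    print(asyncio.run(collect("mug.jpg", user_confirms)))
```

Because recognition starts before the confirmation prompt appears, the total wait is roughly the longer of the two delays rather than their sum, which is the effect Soh describes.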
Remarkably, the whole process, from the initial signpost conversation to the shipped app, took four months. And while the app is certainly impressive from a technical standpoint, it’s that attention to wonder and curiosity that really makes it shine; it truly sees the world through a child’s eyes. Soh says he’s even heard the highest of praise: A friend told him that his daughter stopped playing Pokémon Go in favor of CapWords. “That wasn’t something we planned,” he laughs, “but it showed us that CapWords isn’t just for kids or adults; it’s something families can do together. Parents and children can explore their surroundings, capture words, and review them as a shared activity.”
“It’s a privilege knowing our app helps people rediscover the joy of learning languages,” says CapWords founder and buckwheat noodles fan Ace Lee.
CapWords has truly been bringing families together since Day 1. “We’ve heard that it feels like being an alien living on Earth, trying to collect and name every object. Another called CapWords ‘the warmest and most humane AI I’ve ever used,’” says Lee. “It’s a privilege knowing our app helps people rediscover the joy of learning languages, not through textbooks, but through everyday life.”
Originally published December 4, 2025