GPT-4 Vision Featured

AI's Quantum Leap: Mimicking Attenborough's Voice

Brandon Aviram

Nov 18, 2023 • 1 min read

Charlie Holtz at Replicate pulled a mind-bending stunt using GPT-4 Vision (GPT-4V) and ElevenLabs' voice cloning tech to fake David Attenborough's voice. This tech wizardry got so good that it can narrate someone's life in real-time, mimicking Attenborough's iconic storytelling mojo.

The magic happens with a simple Python script that captures images from Holtz's webcam every five seconds. These are interpreted by GPT-4V, a souped-up version of OpenAI's language model with a knack for visual inputs. It generates text in Attenborough's narrative style, which ElevenLabs' AI voice tool, loaded with Attenborough's speech samples, transforms into speech. Witness this groundbreaking tech firsthand:

David Attenborough is now narrating my life

Here's a GPT-4-vision + @elevenlabsio python script so you can star in your own Planet Earth: pic.twitter.com/desTwTM7RS
— Charlie Holtz (@charliebholtz) November 15, 2023

This fusion of tech is not just slick; it's groundbreaking. The AI interprets visual cues, translates them into language, and then speaks in a voice nearly indistinguishable from Attenborough's. We are witnessing a giant leap in AI's ability to understand and articulate the world around us in a human-like way.

The implications are huge: think supercharged AI assistants, innovative storytelling tools, and immersive interactive entertainment. We're venturing into what was once purely the realm of sci-fi. The AI David Attenborough project is more than a viral sensation; it's a vivid showcase of just how quickly and profoundly AI technology is evolving.

Subscribe for more insights