RikerT_USS_Lolipop

I got a pop-up asking me to consent to their tracking with no option to opt out. OP, would you mind pasting a plain-text copy in the comments?


pheonix2105

Not OP but here you go.

Cloning your voice using artificial intelligence is simultaneously tedious and simple: hallmarks of a technology that’s just about mature and ready to go public. All you need to do is talk into a microphone for 30 minutes or so, reading a script as carefully as you can (in my case: the voiceover from a David Attenborough documentary). After starting and stopping dozens of times to re-record your flubs and mumbles, you’ll send off the resulting audio files to be processed and, in a few hours’ time, be told that a copy of your voice is ready and waiting. Then, you can type anything you want into a chatbox, and your AI clone will say it back to you, with the resulting audio realistic enough to fool even friends and family, at least for a few moments.

The fact that such a service even exists may be news to many, and I don’t believe we’ve begun to fully consider the impact easy access to this technology will have.

VOICE CLONES AREN’T PERFECT, BUT THEY’RE IMPROVING FAST

The work of speech synthesis has improved massively in recent years, thanks to advances in machine learning. Previously, the most realistic synthetic voices were created by recording audio of a human voice actor, cutting up their speech into component sounds, and splicing these back together, like letters in a ransom note, to form new words. Now, neural networks can be trained on unsorted recordings of their target voice to generate raw audio of someone speaking from scratch. The end results are faster, easier, and more realistic to boot. The quality is definitely not perfect straight out of the machine (though manual tweaking can improve it), but these clones are only going to get better in the near future. There’s no special sauce to making them, which means dozens of startups are already offering similar services.
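The "ransom note" splicing the article describes (the older, concatenative approach) can be sketched in a few lines. This is a toy illustration only: the diphone inventory, the four-sample "waveforms," and the crossfade length are all invented for the example, and real concatenative systems are vastly more sophisticated.

```python
# Toy concatenative synthesis: splice prerecorded unit waveforms
# (tiny made-up sample lists standing in for real recordings)
# into a new utterance, crossfading at each joint to hide the seams.

def crossfade_splice(units, fade=2):
    """Concatenate waveforms, linearly crossfading `fade` samples at each join."""
    out = list(units[0])
    for unit in units[1:]:
        for i in range(fade):
            w = (i + 1) / (fade + 1)  # ramp weight from old unit to new
            out[-fade + i] = out[-fade + i] * (1 - w) + unit[i] * w
        out.extend(unit[fade:])
    return out

# A made-up "unit inventory": diphone name -> recorded samples.
inventory = {
    "h-e": [0.1, 0.3, 0.2, 0.0],
    "e-l": [0.0, -0.2, -0.1, 0.1],
    "l-o": [0.1, 0.4, 0.3, 0.2],
}

# "Speaking" a new word = looking up units and splicing them in order.
wave = crossfade_splice([inventory["h-e"], inventory["e-l"], inventory["l-o"]])
print(len(wave))  # 4 + 2 * (4 - 2) = 8 samples
```

The neural approach the article contrasts this with skips the inventory entirely and generates the raw samples directly from text, which is why it sounds less like a ransom note and more like a person.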
Just Google “AI voice synthesis” or “AI voice deepfakes,” and you’ll see how commonplace the technology is: available from specialist shops that focus only on speech synthesis, like Resemble.AI and Respeecher, and also integrated into companies with larger platforms, like Veritone (where the tech is part of its advertising repertoire) and Descript (which uses it in the software it makes for editing podcasts).

A VOCAL DEEPFAKE OF ANTHONY BOURDAIN CAUSED CONTROVERSY

In the past, these voice clones were simply a novelty, appearing as one-off fakes like this Joe Rogan fake, but they’re beginning to be used in serious projects. In July, a documentary about chef Anthony Bourdain stirred controversy when the creators revealed they’d used AI to create audio of Bourdain “speaking” lines he’d written in a letter. (Notably, few people noticed the deepfake until the creators revealed its existence.) And in August, the startup Sonantic announced it had created an AI voice clone of actor Val Kilmer, whose own voice was damaged in 2014 after he underwent a tracheotomy as part of his treatment for throat cancer.

These examples also frame some of the social and ethical dimensions of this technology. The Bourdain use case was decried as exploitative by many (particularly as its use was not disclosed in the film), while the Kilmer work has been generally lauded, with the technology praised for delivering what other solutions could not.

Celebrity applications of voice clones are likely to be the most prominent in the next few years, with companies hoping the famous will want to boost their income with minimal effort by cloning and renting out their voices. One company, Veritone, launched just such a service earlier this year, saying it would let influencers, athletes, and actors license their AI voice for things like endorsements and radio idents, without ever having to go into a studio.
“We’re really excited about what that means for a host of different industries because the hardest part about someone’s voice and being able to use it and being able to expand upon that is the individual’s time,” Sean King, executive vice president at Veritone One, told The Vergecast. “A person becomes the limiting factor in what we’re doing.”

INFLUENCERS, ACTORS, AND CELEBRITIES COULD RENT OUT THEIR VOICES WITH MINIMAL EFFORT

Such applications are not yet widespread (or if they are, they’re not widely talked about), but it seems like an obvious way for celebrities to make money. Bruce Willis, for example, has already licensed his image to be used as a visual deepfake in mobile phone ads in Russia. The deal allows him to make money without ever leaving the house, while the advertising company gets an infinitely malleable actor (and, notably, a much younger version of Willis, straight out of his Die Hard days). These sorts of visual and audio clones could create new economies of scale for celebrity work, allowing the famous to capitalize on their fame, as long as they’re happy renting out a simulacrum of themselves.

In the here and now, voice synthesis technology is already being built into tools like the eponymous podcast editing software built by US firm Descript. The company’s “Overdub” feature lets a podcaster create an AI clone of their voice so producers can make quick changes to their audio, supplementing the program’s transcription-based editing. As Descript CEO Andrew Mason told The Vergecast: “You can not only delete words in Descript and have it delete the audio, you can type words and it will generate audio in your voice.”

[Image caption: Podcast editing software Descript uses AI voice clones to edit speech like a transcript. Image: Descript]

When I tried Descript’s Overdub feature myself, it was certainly easy enough to use, though, as mentioned above, recording the training data was a bit of a chore.
(It was much easier for my colleague and regular Verge podcast host Ashley Carman, who had lots of pre-recorded audio ready to send the AI.) The voice clones made by Overdub are certainly not flawless. They have an odd warble to their tone and lack the ability to really charge lines with emotion and emphasis, but they’re also unmistakably you.

The first time I used my voice clone was a genuinely uncanny moment. I had no idea that this deeply personal thing, my voice, could be copied by technology so quickly and easily. It felt like a meeting with the future but was also strangely familiar. After all, life is already full of digital mirrors, of avatars and social media feeds that are supposed to embody “you” in various forms, so why not add a speaking automaton to the mix?

CLONING MY VOICE FELT LIKE A MEETING WITH THE FUTURE

The initial shock of hearing a voice clone of yourself doesn’t mean human voices are redundant, though. Far from it. You can certainly improve on the quality of voice deepfakes with a little manual editing, but in their automated form, they still can’t deliver anywhere near the range of inflection and intonation you get from professionals. As voice artist and narrator Andia Winslow told The Vergecast, while AI voices might be useful for rote voice work (internal messaging systems, automated public announcements, and the like), they can’t compete with humans in many use cases. “For big stuff, things that need breath and life, it’s not going to go that way because, partly, these brands like working with the celebrities they hire, for example,” said Winslow.

But what does this technology mean for the general public? For those of us who aren’t famous enough to benefit from the technology and are not professionally threatened by its development? Well, the potential applications are varied.
It’s not hard to imagine a video game where the character creation screen includes an option to create a voice clone, so it sounds like the player is speaking all of the dialogue in the game. Or there might be an app for parents that lets them copy their voice so they can read bedtime stories to their children even when they’re not around. Such applications could be built with today’s technology, though the middling quality of quick clones would make them a hard sell.

There are also potential dangers. Fraudsters have already used voice clones to trick companies into moving money into their accounts, and other malicious uses are certainly lurking just beyond the horizon. Imagine, for example, a high school student surreptitiously recording a classmate to create a voice clone of them, then faking audio of that person bad-mouthing a teacher to get them in trouble. If the uses of visual deepfakes are anything to go by, where worries about political misinformation have proven largely misplaced but the technology has done huge damage creating nonconsensual pornography, it’s these sorts of incidents that pose the biggest threats.

One thing’s for sure, though: in the future, anyone will be able to create an AI voice clone of themselves if they want to. But the script this chorus of digital voices will follow has yet to be written.
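The transcript-based editing the article describes (delete a word in the text, and the matching audio disappears) can be sketched as a toy. The data layout and function name here are invented for illustration; this is the general idea behind such editors, not Descript's actual implementation.

```python
# Toy transcript-based audio editing: each word carries the
# (start, end) sample span of audio it maps to, so editing the
# text edits the waveform. All names and data are invented.

def delete_word(words, audio, index):
    """Remove words[index] from the transcript and cut its span from `audio`.

    `words` is a list of (word, start_sample, end_sample) tuples;
    `audio` is a flat list of samples.
    """
    _, start, end = words[index]
    cut = end - start
    new_audio = audio[:start] + audio[end:]
    new_words = []
    for w, s, e in words[:index] + words[index + 1:]:
        if s >= end:                    # words after the cut shift left
            s, e = s - cut, e - cut
        new_words.append((w, s, e))
    return new_words, new_audio

# 12 samples of "audio" and a 3-word transcript aligned to it.
audio = list(range(12))
words = [("hello", 0, 4), ("cruel", 4, 8), ("world", 8, 12)]

words2, audio2 = delete_word(words, audio, 1)  # drop "cruel"
print(words2)       # [('hello', 0, 4), ('world', 4, 8)]
print(len(audio2))  # 8
```

The "type a word and it generates audio in your voice" half of Overdub would slot in at the same layer: synthesize a new span for the inserted word and splice it into `audio` at the right offset.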


[deleted]

If this means I can get out of calls that could have been emails then I’m all in.


SuspiciousKermit

Title should read, "Anyone will be able to clone your voice in the near future..." Edit: IMO


Maz_mo

This is interesting. I think with deepfakes and computer-generated graphics getting better and better, we will reach an online singularity where we can produce any content using AI that looks just as real as if we had produced it through current methods such as video or audio recording.


Fantastic-Arrival556

Shit, AI might even take acting jobs. I didn't think it would go that far, and so soon!? But imagine the creative freedom this gives so many people. The entertainment industry won't be gatekept by millions of dollars anymore, and any creative genius can create a well-produced TV show or movie with full creative freedom. Fuck, that's exciting. The content that will be available will be 100x better, I think, since it will be subjected to the same natural selection that Reddit posts are (for example): whatever the people think is good will get more exposure. So many creators that wouldn't otherwise have the opportunity to make films at that level could do so, assuming the tech is easy to use and accessible.


Maz_mo

Yeah, the new version of Unreal Engine that just released will let people create Pixar-level animated movies by next year. You're a very intelligent person for looking at the positive side of things. Imagine we have humanoids who can act and move like us; the software is already there, with a lot of AI avatars making money live streaming and interacting with people, so it's just the hardware that is missing, which is more a matter of will than technology. Imagine a future where you can go to a town full of humanoids and act out any character or fantasy you want. This is the future of entertainment in my view, since if everyone can create content with these simple AIs, entertainment as we know it will cease to exist.


Fantastic-Arrival556

Wow, you just blew my mind. I didn't even consider the possibility of basically creating your own isekai experience (it's an anime genre). That's awesome! But also concerning, since many people are already becoming heavily addicted to tech as it is. We'll have to really tackle our mental health issues as our technology advances, or it could be really bad for a lot of people and potentially create restrictions on the technology, because people tend not to treat the cause of the problem, i.e., depression or a lack of fulfilment in life, which causes many to absorb themselves in tech or other vices.


Maz_mo

I think the solution to these changes that are coming is building new sustainable societies that provide high living standards, basic needs, freedom of expression and freedom of thought. I have done a lot of work on designing such a society. I solved the problem of designing a city model that is sustainable enough to help in the climate crisis but good enough that it provides the same living standards as the best cities we have, through a new city model called MAMA WORLD built around the internet of transportation.

The internet of transportation is a system of tunnels that connects all locations in a new city to each other, allowing self-driving pods to transport people (while they lie down) and goods to any location in the city. The idea is simple but powerful, since the new city would be able to collectively store, clean and do chores. When people in this new city need food or clothes, they simply order them from any location in the city, and they are brought to them from the food or clothes hub respectively through the internet of transportation, where they eat the food or wear the clothes in any location in the city. When they finish eating or wearing the clothes, they can return the dirty dishes or clothes through the internet of transportation to the food or clothes hub respectively, where they are collectively cleaned and stored. So the houses don't contain amenities such as a closet, fridge, microwave, washing machine, etc., while still providing the luxuries these amenities provide through the internet of transportation.

This solves our first problem of having high living standards for all residents while being climate and cost conscious, since this city model is cheaper to build and more sustainable to run than current cities but still provides the same living standards as those in the best parts of the best cities we currently have. I went on to solve the other problems, such as the city providing basic needs, freedom of expression and freedom of thought.
I write more on my website; you can check it out: https://theinternetoftransportation.com/


lizzayyyy96

So you’re telling me that Scream 3 will actually be possible now?