Voice cloning is now here. But did we ever ask for it?
Plus more on GPT-4o’s new competitor, OpenAI’s security breach, and robot workers in Japan.
Welcome to this edition of Loop!
To kick off your week, we’ve rounded up the most important technology and AI updates that you should know about.
HIGHLIGHTS
Why Microsoft’s GraphRAG is important for companies
Japan’s humanoid robot that maintains their train lines
How Moshi is GPT-4o’s new competitor
… and much more
Let's jump in!
1. How to clone someone’s voice with just 5 seconds of audio
CAMB has released a tool that now lets you clone anyone’s voice with just 5 seconds of audio. Other companies, such as ElevenLabs, are working on similar tools.
MARS5 is their new text-to-speech model, and it supports two methods: you can generate a fast but shallow clone, or a slower and much higher-quality version.
During testing, I’ve found the audio quality to be a bit hit-and-miss - but it’s something that will get better over time.
Of course, there are plenty of positive and negative ways this technology could be used - with CAMB mostly focusing on the positives, such as instant language translation and breaking down barriers between people.
However, there could be much bigger consequences for ordinary citizens.
If you’ve ever uploaded a video of yourself speaking online - which most of us have on social media - this could allow fraudsters to target your friends and family.
While some criminals try to steal thousands from people by pretending to be their bank, others will send voice messages to your family and urgently ask for money.
Yes, there will undoubtedly be some benefits to this technology - but it’s important that we’re fully aware of the damage it could cause as well.
2. SpaceX wants to launch rockets up to 120 times a year
The company has ambitious plans to launch its Starship rocket up to 120 times per year from two sites in Florida, which has led to concerns among its competitors.
Blue Origin and United Launch Alliance are concerned that these plans could disrupt their own launches in the area.
SpaceX’s CEO, Elon Musk, also wants the company to launch multiple rockets per day, although there’s a long way to go before that’s even remotely realistic.
The company is also working to increase its manufacturing capabilities and wants to produce key components on a daily basis.
3. Hackers gained access to OpenAI’s internal communications
While the hackers only gained access to their internal chat rooms, this news really highlights how vulnerable the US’ top AI companies are - especially to foreign agents.
Every company is vulnerable to attack, but these are no ordinary companies. Startups like OpenAI have collected huge amounts of data and created some of the most advanced AI models.
Some of this data is used to train their models. However, they also have access to the private data that’s uploaded by their enterprise customers.
While none of this data was accessed in the latest security breach, it raises questions about what we provide to these companies.
Understandably, many companies have rushed to use ChatGPT and have it analyse their data for insights. But given that OpenAI is still a pretty young startup, that also comes with its own risks.
The US is currently seen as the world leader in AI research, but its top startups also need to be wary of the threats they face from attackers.
4. Microsoft’s new GraphRAG tool is finally released
GraphRAG is a new approach for LLMs to extract information. It combines both knowledge graphs and RAG to more efficiently search documents and generate better answers.
I was recently at a conference talk by Neo4j’s CTO, who made the case that GraphRAG is a better way to model the real world.
This is because we can create links between topics and show how complex those relationships are.
When the Panama Papers journalists were given millions of documents, they used knowledge graphs to work out the links between hidden bank accounts and their wealthy owners.
Microsoft’s research demonstrates that GraphRAG does outperform RAG and generates answers that are more complete.
If you’re building your own Generative AI applications, you really need to explore GraphRAG and how it can improve your tool’s performance. I’ve included a link to the GitHub repo below.
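To make the idea a bit more concrete, here’s a toy sketch in Python of graph-based retrieval. It is not Microsoft’s implementation and doesn’t use the GraphRAG library - the triples, function names and prompt format are all made up for illustration. It simply shows the core idea: store facts as links between entities, follow those links out from the entity in the question, and hand the connected facts to an LLM as context.

```python
from collections import defaultdict

# Hypothetical toy data: (subject, relation, object) triples.
# In a real system these would be extracted from your documents.
TRIPLES = [
    ("Account A", "owned_by", "Shell Co"),
    ("Shell Co", "controlled_by", "Person X"),
    ("Account B", "owned_by", "Person Y"),
]


def build_graph(triples):
    """Index the triples by subject so we can follow outgoing links."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def retrieve_facts(graph, start, max_hops=2):
    """Follow links out from `start` for up to `max_hops` steps,
    collecting each fact we pass through as a short sentence."""
    facts = []
    frontier = [start]
    seen = {start}
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for relation, target in graph.get(node, []):
                facts.append(f"{node} {relation} {target}")
                if target not in seen:
                    seen.add(target)
                    next_frontier.append(target)
        frontier = next_frontier
    return facts


def build_prompt(question, facts):
    """Assemble the retrieved facts into context for an LLM (illustrative format)."""
    context = "\n".join(f"- {fact}" for fact in facts)
    return f"Context:\n{context}\n\nQuestion: {question}"


graph = build_graph(TRIPLES)
facts = retrieve_facts(graph, "Account A")
prompt = build_prompt("Who ultimately controls Account A?", facts)
print(prompt)
```

Notice that the unrelated fact about Account B never reaches the prompt, while the two-step chain from Account A to Person X does. Following those multi-step links is something that a plain keyword or vector search over separate documents can easily miss.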
5. Apple’s Phil Schiller is joining OpenAI’s board
Apple has appointed Phil Schiller to represent it on OpenAI's nonprofit board. He will act as an observer in their board meetings, but will not be able to vote.
Schiller is one of Apple’s most senior leaders and has been an executive there since Steve Jobs returned in 1997.
The position will give him a good insight into how OpenAI operates and their broader ambitions going forward.
It follows on from Apple’s announcement that ChatGPT will be integrated into iOS and macOS, as they release dozens of new AI features for their users.
Interestingly, Microsoft also has an observer role on the board - but cannot vote either.
This is a pretty unusual step for Apple, but it highlights just how important it is for them to work with the emerging AI startups.
6. Japan will use humanoid robots to maintain train lines
West Japan Railway is introducing a huge humanoid robot to maintain its train network. It can chop down trees and re-paint surfaces.
Just imagine WALL-E, but he’s 12 metres tall instead and one of his arms is a chainsaw…
It’s attached to a truck and can carry objects that weigh up to 40kg. To control it, the operator uses both a VR headset and joysticks.
Since Japan is facing an ageing population and worker shortages, the company is hoping that the robot can be modified for other tasks as well.
It’s an interesting robot and there’s no doubt that this is where the industry is going, although it is a change compared to what we’ve seen before.
Most robotics companies in the US are focused on humanoid robots that will work in factories and warehouses. It’s pretty rare that we see a robot that is attached to a truck and is still operated by a human.
Regardless, it’s fascinating to see where else this technology could be used.
7. GPT-4o has a new competitor
Moshi is a real-time voice assistant and aims to be a rival to GPT-4o, which was delayed by OpenAI last week.
It’s powered by the Helium 7B large language model, offers different accents and supports 70 different speaking styles.
Kyutai is the startup behind the technology and they say that their AI assistant can handle two audio streams simultaneously. This allows it to both listen and talk at the same time.
To fine-tune the model, they used over 100,000 examples of dialogue. These audio files were created using AI, thanks to Text-to-Speech (TTS) technology.
They then worked with a professional voice artist to improve the Moshi voice over time.
Interestingly, it’s an open-source model and can be run on your local computer - without the need to host it in the cloud.
🌍 Google's carbon emissions are increasing as it invests in AI
🎮 Meta to bring Generative AI to metaverse games
🎨 Figma turns off its AI feature, after it seemed to copy Apple's Weather app
🔍 Amazon faces more EU scrutiny over their algorithms and ads
🤖 MIT develops a robot to pack your groceries
🚀 Meta unveils AI that can make 3D textured models quicker than ever
🌄 3D environments can now be created from blurry images
🛡️ Cloudflare can block AI bots, scrapers and crawlers from your website
Altrove
The French startup has just raised $4 million to boost their work, which uses artificial intelligence to identify new materials.
It follows similar advances from the teams at DeepMind, Microsoft, and Orbital Materials - who are all working to solve similar issues.
Altrove is particularly focusing on rare earth elements, which are really difficult to source and often come from China.
To do this, the company creates a list of interesting materials, then uses their AI model to suggest different compounds, and finally tests each of them.
The hope is that they will speed up the discovery process and lead to new breakthroughs. It could also reduce global reliance on China for these rare elements.
This Week’s Art
Loop via Midjourney V6
We’ve covered quite a bit this week, including:
How you can clone someone’s voice with MARS5
Why SpaceX wants to launch rockets 120 times a year
OpenAI’s latest security breach
Microsoft’s new GraphRAG tool and the performance improvements
Phil Schiller’s new role on OpenAI’s board
How Japan will use humanoid robots to maintain train lines
GPT-4o’s new competitor, called Moshi
And how Altrove are using AI models to create new materials
Have a good week!
Liam
Share with Others
If you found something interesting in this week’s edition, feel free to share this newsletter with your colleagues.
About the Author
Liam McCormick is a Senior AI Engineer and works within Kainos' Innovation team. He identifies business value in emerging technologies, implements them, and then shares these insights with others.