Voice cloning is now here. But did we ever ask for it?

Plus more on GPT-4o’s new competitor, OpenAI’s security breach, and robot workers in Japan.

Welcome to this edition of Loop!

To kick off your week, we’ve rounded up the most important technology and AI updates that you should know about.

HIGHLIGHTS

  • Why Microsoft’s GraphRAG is important for companies

  • Japan’s humanoid robot that maintains their train lines

  • How Moshi is GPT-4o’s new competitor

  • … and much more

Let's jump in!

Top Stories

1. How to clone someone’s voice with just 5 seconds of audio

CAMB has released a tool that now lets you clone anyone’s voice with just 5 seconds of audio. Other companies, such as ElevenLabs, are working on similar tools.

MARS5 is their new text-to-speech model, and it supports two methods. You can generate either a fast but shallow clone, or a slower but much higher-quality version.

During testing, I’ve found the audio quality to be a bit hit-and-miss - but it’s something that will get better over time.

Of course, there are plenty of positive and negative ways this technology could be used - with CAMB mostly focusing on the positives, such as instant language translation and breaking down barriers between people.

However, there could be much bigger consequences for ordinary citizens.

If you’ve ever uploaded a video online of you speaking - which most of us have on social media - this could allow fraudsters to target your friends and family.

While some criminals try to steal thousands from people by pretending to be their bank, others will send voice messages to your family and urgently ask for money.

Yes, there will undoubtedly be some benefits to this technology - but it’s important that we’re fully aware of the damage it could cause as well.

2. SpaceX wants to launch rockets up to 120 times a year

The company has ambitious plans to launch its Starship rocket up to 120 times per year from two sites in Florida, which has worried its competitors.

Blue Origin and United Launch Alliance are concerned that these plans could disrupt their own launches in the area.

SpaceX’s CEO, Elon Musk, also wants the company to launch multiple rockets per day, although there’s a long way to go before that’s even remotely realistic.

His company is also working to increase its manufacturing capabilities and wants to produce key components on a daily basis.

3. Hackers gained access to OpenAI’s internal communications

While the hackers only gained access to OpenAI’s internal chat rooms, this news really highlights how vulnerable the US’s top AI companies are - especially to foreign agents.

Every company is vulnerable to attack, but these are no ordinary companies. Startups like OpenAI have collected huge amounts of data and created some of the most advanced AI models.

Some of this data is used to train their models. However, they also have access to the private data that’s uploaded by their enterprise customers.

While none of this data was accessed in the latest security breach, it raises questions about what we provide to these companies.

Understandably, many companies have rushed to use ChatGPT and have it analyse their data for insights. But given that OpenAI is still a pretty young startup, that also comes with its own risks.

The US is currently seen as the world leader in AI research, but its top startups also need to be wary of the threats they face from attackers.

4. Microsoft’s new GraphRAG tool is finally released

GraphRAG is a new approach that helps LLMs extract information. It combines knowledge graphs with retrieval-augmented generation (RAG) to search documents more efficiently and generate better answers.

I was recently at a conference talk by Neo4j’s CTO, who made the case that GraphRAG is a better way to model the real world.

This is because we can create links between topics and show how complex those relationships are.

When the Panama Papers journalists were given millions of leaked documents, they used knowledge graphs to work out the links between hidden bank accounts and their wealthy owners.

Microsoft’s research demonstrates that GraphRAG does outperform RAG and generates answers that are more complete.

If you’re building your own Generative AI applications, you really need to explore GraphRAG and how it can improve your tool’s performance. I’ve included a link to the GitHub repo below.
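To make the idea a bit more concrete, here is a minimal Python sketch of graph-based retrieval. It is not Microsoft’s GraphRAG code or API. The entities, the `TRIPLES` list and the function names (`build_graph`, `retrieve_facts`, `build_prompt`) are all made up for illustration, and the real tool goes much further, for example by using an LLM to build the graph from raw documents. The sketch only shows the core trick: start from the entities named in a question, follow the links in the graph, and hand the connected facts to a model as context.

```python
from collections import deque

# Hypothetical toy knowledge graph, stored as (subject, relation, object) triples.
TRIPLES = [
    ("Alice", "owns", "Shell Co A"),
    ("Shell Co A", "holds account at", "Bank X"),
    ("Bob", "directs", "Shell Co A"),
    ("Carol", "owns", "Shell Co B"),
]


def build_graph(triples):
    """Index each triple under both of its entities, so links can be followed either way."""
    graph = {}
    for triple in triples:
        subject, _, obj = triple
        graph.setdefault(subject, []).append(triple)
        graph.setdefault(obj, []).append(triple)
    return graph


def retrieve_facts(graph, question, max_hops=2):
    """Collect the facts within max_hops links of any entity named in the question."""
    # Naive entity matching: a plain substring check (an assumption for this sketch).
    start = [entity for entity in graph if entity.lower() in question.lower()]
    seen = set(start)
    queue = deque((entity, 0) for entity in start)
    facts = []
    while queue:
        entity, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for triple in graph[entity]:
            if triple not in facts:
                facts.append(triple)
            subject, _, obj = triple
            for neighbour in (subject, obj):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append((neighbour, depth + 1))
    return facts


def build_prompt(question, facts):
    """Turn the retrieved facts into context that could be sent to an LLM."""
    lines = [f"- {subject} {relation} {obj}" for subject, relation, obj in facts]
    return "Known facts:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"


graph = build_graph(TRIPLES)
question = "Which bank is linked to Alice?"
facts = retrieve_facts(graph, question)
print(build_prompt(question, facts))
```

The question never mentions Shell Co A, but following the links from Alice still surfaces the Bank X account, while the unrelated facts about Carol are left out. A plain keyword search over separate documents would struggle to make that kind of connection.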

5. Apple’s Phil Schiller is joining OpenAI’s board

Apple has appointed Phil Schiller to represent it on OpenAI's nonprofit board. He will act as an observer in their board meetings, but will not be able to vote.

Schiller is one of Apple’s most senior leaders and has been an executive there since Steve Jobs returned in 1997.

The position will give him a good insight into how OpenAI operates and their broader ambitions going forward.

It follows on from Apple’s announcement that ChatGPT will be integrated into iOS and macOS, as they release dozens of new AI features for their users.

Interestingly, Microsoft also has an observer role on the board - but cannot vote either.

This is a pretty unusual step for Apple, but it highlights just how important it is for them to work with the emerging AI startups.



Closer Look

Japan will use humanoid robots to maintain train lines

Robot cutting down a tree

West Japan Railway is introducing a huge humanoid robot to maintain its train network. It can chop down trees and repaint surfaces.

Just imagine WALL-E, but he’s 12 metres tall instead and one of his arms is a chainsaw…

It’s attached to a truck and can carry objects that weigh up to 40kg. To control it, the operator uses both a VR headset and joysticks.

Since Japan is facing an ageing population and worker shortages, the company is hoping that the robot can be modified for other tasks as well.

It’s an interesting robot and there’s no doubt that this is where the industry is going, although it is a change from what we’ve seen before.

Most robotics companies in the US are focused on humanoid robots that will work in factories and warehouses. It’s pretty rare that we see a robot that is attached to a truck and is still operated by a human.

Regardless, it’s fascinating to see where else this technology could be used.



Announcement

GPT-4o has a new competitor

Moshi responding to questions

Moshi is a real-time voice assistant and aims to be a rival for GPT-4o, which was delayed by OpenAI last week.

It’s powered by the Helium 7B large language model, offers different accents and supports 70 different speaking styles.

Kyutai is the startup behind the technology and they say that their AI assistant can handle two audio streams simultaneously. This allows it to both listen and talk at the same time.

To fine-tune the model, they used over 100,000 examples of dialogue. These audio files were created using AI, thanks to Text-to-Speech (TTS) technology.

They then worked with a professional voice artist to improve the Moshi voice over time.

Interestingly, it’s an open-source model and can be run on your local computer - without the need to host it in the cloud.



Byte Sized Extras

🌍 Google's carbon emissions are increasing as it invests in AI

🎮 Meta to bring Generative AI to metaverse games

🎨 Figma turns off its AI feature, after it seemed to copy Apple's Weather app

🔍 Amazon faces more EU scrutiny over their algorithms and ads

🤖 MIT develops a robot to pack your groceries

🚀 Meta unveils AI that can make 3D textured models quicker than ever

🌄 3D environments can now be created from blurry images

🛡️ Cloudflare can block AI bots, scrapers and crawlers from your website

Startup Spotlight
Mineral in a person's hand

Altrove

The French startup has just raised $4 million to boost their work, which uses artificial intelligence to identify new materials.

It follows similar advances from the teams at DeepMind, Microsoft, and Orbital Materials, which are all working on the same problem.

Altrove is particularly focusing on rare earth elements, which are really difficult to source and often come from China.

To do this, the company creates a list of interesting materials, then uses their AI model to suggest different compounds, and finally tests each of them.

The hope is that they will speed up the discovery process and lead to new breakthroughs. It could also reduce global reliance on China for these rare elements.



This Week’s Art

Sunrise over London

Loop via Midjourney V6



End note

We’ve covered quite a bit this week, including:

  • How you can clone someone’s voice with MARS5

  • Why SpaceX wants to launch rockets 120 times a year

  • OpenAI’s latest security breach

  • Microsoft’s new GraphRAG tool and the performance improvements

  • Phil Schiller’s new role on OpenAI’s board

  • How Japan will use humanoid robots to maintain train lines

  • GPT-4o’s new competitor, called Moshi

  • And how Altrove are using AI models to create new materials

Have a good week!

Liam

Share with Others

If you found something interesting in this week’s edition, feel free to share this newsletter with your colleagues.

About the Author

Liam McCormick is a Senior AI Engineer and works within Kainos' Innovation team. He identifies business value in emerging technologies, implements them, and then shares these insights with others.