🦾 Tesla has revealed their new humanoid robot

Hello,

Welcome to this edition of Loop! We aim to keep you informed about technology advances, without making you feel overwhelmed.

To kick off your week, we’ve rounded up the most important technology and AI updates that you should know about.

In this edition, we’ll explore:

- How Agility are using LLMs to communicate with robots
- DeepMind’s new discoveries in mathematics
- Google’s generative AI models for healthcare
- … and much more

Let's jump in!

Top Stories

1. DeepMind debuts Imagen 2, an image generator that can even create text and logos [Link]

The enhanced model offers improved image quality and new features, such as the ability to render text and logos in multiple languages. Imagen 2 will also use SynthID, which adds an invisible watermark to images made with the generator - allowing them to be more easily identified in the future. Based on the images that have been released so far, there looks to be a huge improvement in how humans are rendered - previous results looked good, but not quite right.

Perhaps unsurprisingly, DeepMind has been tight-lipped about the dataset they used to train the model, given there are growing calls for artists to be compensated and the potential for legal challenges down the line. If you’re a developer and want to try it out, you can access the model by using Google’s Vertex AI service.

2. OpenAI signs a deal with Axel Springer, will license their news articles for model training [Link]

OpenAI has entered into an agreement with Berlin-based media company Axel Springer to use the publisher's content for training its generative AI models. The deal allows ChatGPT to provide summaries of select Axel Springer articles, even those behind paywalls, with proper attribution and links to full articles. In exchange, Axel Springer will receive payments from OpenAI under a multi-year agreement that supports Axel Springer's AI-driven ventures.

It’s important to remember that huge tech companies, like OpenAI and Google, are facing legal issues for using copyrighted material to create their AI models. On the other side, media publishers are fighting to remain financially viable. Traditional media companies have faced declining physical sales over the last decade, which has pushed them to chase digital views instead - driving them to use increasingly sensational headlines and click-bait.

With the rise of Generative AI, they face a new problem - tech companies worth trillions of dollars are copying their news articles and using them to train new AI tools, without giving them any compensation. This also reduces traffic to the media company’s website, which then cuts into the ad revenue they badly need to survive. While this partnership between OpenAI and Axel Springer isn’t an easy move for either party, it’s a necessary one.

3. Runway start research into General World Models [Link]

The company has indicated that they aim to broaden their research significantly over the next few years. Currently, they focus on AI video generation. But they want to widen that scope to include multi-modal models that can better understand the world around us.

Essentially, this announcement from Runway aims to do a few things. They want to signal to future investors that they plan to compete with giants like Google and OpenAI on the technology front. They also want to attract top AI researchers from those companies, who might not have previously considered joining Runway given its narrow focus on video generators.

While the announcement itself doesn’t have much detail, it gives a good insight into the behind-the-scenes conversations they’re having with investors and talent - along with showing that multi-modal development isn’t something that can only be funded by the mega tech companies.

4. Agility are using large language models to communicate with robots [Link]

Over the last year, there has been a lot of research into how Large Language Models (LLMs) can be used to control robots. Microsoft has explored using ChatGPT to control drones. Meta, Google, and Microsoft have looked at using LLMs to command robotic arms.

And now Agility have used LLMs to control their robot, known as Digit, which is already being tested in Amazon’s warehouses. If you missed our summary on Digit a few weeks back, you can read it here.

In their demo, Digit is given natural language commands to perform different tasks, such as picking up a specific colored box and moving it to a designated location. Given that robots will likely be working alongside humans, especially in a warehouse environment, they will need to understand and follow human instructions - which can vary widely in phrasing from person to person. It’ll be interesting to see how this research progresses in the coming year.

5. Google unveils MedLM, a family of healthcare-focused generative AI models [Link]

The company is venturing into the healthcare sector with MedLM, a family of generative AI models designed for medical applications, which is available to a select number of Google Cloud customers. MedLM includes a larger model for complex tasks and a smaller, fine-tunable model for a broader range of tasks.

HCA Healthcare is already testing MedLM for drafting patient notes and BenchSci are looking at how it could be used to enhance biomarker research. However, there will be concerns about hallucinations - which could lead to people being provided with incorrect medical information by the model.

Announcement

Tesla unveils its latest humanoid robot, Optimus

Tesla has released a new demo video of its prototype humanoid robot, Optimus, which seems to have taken a good step forward compared to the previous generation. The robot can be seen walking, crouching, and delicately manipulating objects - which shows the improvements they’ve been able to achieve with hand and arm movements.

The company says it has reduced the weight of Optimus by 10kg (22lb) and has shown the bot carefully picking up and handling an egg, with what it describes as “tactile sensing on all fingers.” They’ve also shown the bot walking through a Tesla facility, with Cybertrucks parked nearby, and performing squats in a gym.

The goal for the bipedal autonomous humanoid is to replace humans in the performance of “unsafe or boring tasks”, but there are no immediate plans for commercialising the bot. Tesla has asserted that no CGI was used in their video, but we should still be a little skeptical until there’s a live demo.

You can see the full video on YouTube.

Announcement

DeepMind use LLMs to make new discoveries in mathematics

Google DeepMind's FunSearch, a method built around a large language model, has made a significant breakthrough in pure mathematics by discovering new solutions to the long-standing “cap set problem”. This achievement marks the first time that a large language model has been used to make a discovery on an open scientific problem - producing new, verifiable knowledge that wasn’t previously in the model’s training data.

FunSearch works by sketching out problems in Python and then using a language model called Codey to suggest potential solutions, which are evaluated and refined through iterative feedback.

Essentially, the model will suggest lots of different ideas. A separate program will then check if they’re correct and provide feedback. The best ideas are then selected and used in the next round, which leads to a self-improvement loop.
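To make that suggest, check, and select loop a bit more concrete, here is a minimal sketch in Python. It is not DeepMind’s code: the language model is swapped for a hypothetical propose function that just makes small random changes, the toy task is matching a list of target numbers, and every name and parameter is made up for illustration.

```python
import random

def evaluate(candidate, target):
    """The separate checker: score a candidate, higher is better (0 is perfect)."""
    return -sum(abs(c - t) for c, t in zip(candidate, target))

def propose(parent, rng):
    """Stand-in for the language model (an assumption): tweak one value of a parent idea."""
    child = list(parent)
    i = rng.randrange(len(child))
    child[i] += rng.choice([-1, 1])
    return child

def search(target, rounds=200, pool_size=5, children_per_round=20, seed=0):
    """Repeat: suggest many ideas, score them, keep the best for the next round."""
    rng = random.Random(seed)
    pool = [[0] * len(target)]  # start from one naive idea
    for _ in range(rounds):
        children = [propose(rng.choice(pool), rng) for _ in range(children_per_round)]
        # Rank old and new ideas together, so the best score can never get worse.
        ranked = sorted(pool + children, key=lambda c: evaluate(c, target), reverse=True)
        pool = ranked[:pool_size]
    return pool[0]

best = search([3, -2, 5])
print(best, evaluate(best, [3, -2, 5]))
```

Because the best ideas are carried into every new round, the top score only ever stays the same or improves, which is the self-improvement part of the loop.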

Interestingly, the approach seems to be quite versatile. Researchers used FunSearch to tackle another difficult problem in mathematics, known as the bin packing problem. This involves trying to pack items into as few bins as possible and has a lot of downstream applications - such as managing data centers, or more efficiently packaging items for shipping. FunSearch was able to outperform human-designed solutions to the problem, and its method was faster too.
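If you haven’t come across bin packing before, the sketch below shows the kind of simple, human-written rule that such work gets compared against. It uses “first fit”, a classic textbook heuristic, and is not the method FunSearch discovered; the function name and the example numbers are our own.

```python
def first_fit(items, capacity):
    """Put each item into the first bin with enough room; open a new bin if none fits."""
    bins = []
    for item in items:
        if item > capacity:
            raise ValueError(f"item {item} is larger than the bin capacity {capacity}")
        for contents in bins:
            if sum(contents) + item <= capacity:
                contents.append(item)
                break
        else:
            # No existing bin had room, so start a new one.
            bins.append([item])
    return bins

packed = first_fit([4, 8, 1, 4, 2, 1], capacity=10)
print(len(packed), packed)  # 2 [[4, 1, 4, 1], [8, 2]]
```

Items are handled one at a time in the order they arrive, so the rule is quick but can waste space. Finding better rules for deciding which bin to use is where the improvement comes from.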

If you want to read more, you can check out the announcement on their website.

Byte-Sized Extras

❌ OpenAI suspends ByteDance’s account after it used GPT to train its own AI model [Link]

🚕 Cruise, the self-driving company, cuts 24% of their workforce [Link]

🚕 Waymo, who are Cruise’s main competitor, launch a new robotaxi pickup service at Phoenix airport [Link]

📸 Instagram introduces GenAI-powered background editing tool [Link]

🚨 Microsoft disrupts a huge cybercrime operation, which sells fraudulent accounts to a notorious hacking gang [Link]

Startup Spotlight

Leonardo AI

This is a Generative AI startup that’s focused on creative industries, such as game development, advertising, and fashion design. Typically, today’s AI generators will create completely different images even if the prompt remains the same. This can be frustrating for anyone, but it is a much bigger issue for creatives who need that consistency.

Using their platform, you can save, edit, and build multiple assets in the same style - which allows them to be reused. They also have an AI Canvas feature, which allows users to draw on an image and have the AI model turn the drawing into an asset. For example, drawing a yellow ball in the sky will tell the model to generate an image of the sun.

Leonardo AI is based in Sydney and has already raised $31 million from investors, despite only being founded in 2022. The platform has seen rapid growth, with over 7 million users already - who have generated over 700 million images.

If you want to find out more about Leonardo, you can check out their website.

This Week’s Art

Prompt: An ultra-realistic, ultra-wide image of Papai Noel, traditionally known as Santa Claus, in a snowy outdoor setting. He is busily filling his sleigh with colorful, wrapped gifts under the bright, shimmering moonlight. The scene should be expansive, showing a wide view of the picturesque snowy landscape, with soft, glistening snow covering the ground and surrounding pine trees. The moonlight casts a gentle, silvery glow over the entire scene, enhancing the magical and festive atmosphere. Papai Noel, dressed in his classic red and white outfit, exhibits a jolly expression as he prepares for his Christmas journey.
Platform: DALL-E 3

End Note

While many people are making plans for the Christmas holidays, the new product announcements haven’t really slowed down. This week we’ve looked at Tesla’s humanoid robot, Google’s unveiling of Imagen 2, OpenAI’s deal with Axel Springer, Agility using LLMs to communicate with their robots, Runway broadening into multi-modal research, Google’s MedLM for healthcare, Leonardo’s generative AI tools for creatives, and DeepMind’s work on using LLMs for solving mathematical problems.

Have a good week and see you again in 2024!

Liam

Feedback

How did we do this week?

Your feedback helps us to create the best newsletter possible.

Share with Others

If you found something interesting, feel free to share this newsletter with your colleagues.

About the Author

Liam McCormick is a Senior Software Engineer and works within Kainos' Innovation team.
He identifies business value in emerging technologies, implements them, and then shares these insights with others.
