
📈 Researchers show that AI agents increase LLM performance

Plus more on Google’s release of Gemini Ultra, Disney’s partnership with Epic Games, and the EU’s action on deepfakes.

Image - Loop relaxing in space

Welcome to this edition of Loop!

To kick off your week, we’ve rounded up the most important technology and AI updates that you should know about.

HIGHLIGHTS

  • Disney and Epic Games join forces on the “metaverse”

  • Google’s Gemini chatbot can now compete with ChatGPT

  • How AI is changing the way we analyse satellite imagery

  • … and much more

Let's jump in!

Image of Loop character reading a newspaper
Image title - Top Stories

1. Google launches their most powerful LLM, Gemini Ultra

Finally, Google’s closest competitor to GPT-4 has been released - following the initial release of their less powerful models back in December.

The Bard chatbot has also been rebranded to use the Gemini name, as Google hopes to shake off any previous comparisons that customers made between Bard and ChatGPT.

If you’re on Android, you can replace the Google Assistant with Gemini or use the dedicated app that’s being launched.

If you want access to Gemini Ultra you’ll need to pay for the new $20 Google One subscription, similar to OpenAI’s ChatGPT Plus.

The clear advantage that Google has here is their huge ecosystem. In their announcement, they showed how you could ask Gemini to “plan a date night and find a nice sushi restaurant”.

It then sends the request to Google Maps and shows a list of options, which you can narrow down with further questions.

Compare that to OpenAI, who have been forced to rely on third-party plugins and the custom GPT store. These often work fine, but community-built offerings will inevitably pale in comparison to what Google can build into its own ecosystem.

Image divider - Loop

2. Apple’s new model uses text prompts to edit images

The company’s researchers have developed MGIE, which can edit photos through simple text prompts - without the need for traditional photo editing software.

The model can perform different tasks like cropping, resizing, flipping, and adding filters - alongside more complex edits, such as adding new objects into the original image.

For example, they show how the model could make the sky look bluer or add vegetable toppings to a pizza to make it appear healthier.

To achieve this, MGIE will first decode what the user wants and then “imagine” what the end result would look like.

While this isn’t a fully fledged feature on iOS just yet, it gives us a glimpse into what might be coming down the line. It’s especially relevant given Tim Cook’s comments last week, when he said that more generative AI research will be unveiled “later this year”.

Image divider - Loop

3. EU drafts new guidelines to limit political deepfakes

These new guidelines are aimed at huge tech platforms - such as Facebook, Google, TikTok, and X - and hope to reduce the risk of AI being used to interfere in elections.

The EU recommends that these platforms be able to detect AI content, clearly label that content when it’s shown, and modify their moderation systems to spot malicious campaigns.

Given that this is a year where 75% of the world’s democracies will hold elections, the EU is keen to prevent hostile states - and individuals - from using AI to interfere in the democratic process.

Time will tell if this actually helps. But given that anyone can access AI content generators, which often don’t have any form of watermarking, this seems to be an almost impossible task.

Image divider - Loop

4. OpenAI will start adding watermarks to DALL-E 3 images

Speaking of watermarks, OpenAI are finally adding C2PA’s watermark standard to images that were made using DALL-E 3.

This initiative hopes to make it easier for others to verify where content has originated from. C2PA includes many of the top technology companies, such as Adobe, Microsoft, and Google.

Adobe has been especially active in promoting watermarks, which are already being used within their Firefly image editor.

However, they aren’t the silver bullet we might hope for. Some watermarks can be evaded by removing metadata from the image, or by simply taking a screenshot.

Google has introduced their own standard that can reduce these risks, but identifying AI-generated content is a very tricky task - and as these models improve over time, it’s only going to get more difficult.

Image divider - Loop

5. Disney wants to build an ‘entertainment universe’ with Fortnite, invests $1.5 billion

The company has recently been under pressure from activist investors, like Nelson Peltz, who claims that Disney has “woefully underperformed” compared to its competitors.

The new partnership between Epic Games and Disney is intentionally vague, but will likely involve Disney characters from Marvel, Star Wars, and Pixar being featured within Epic’s future games.

Epic have regularly spoken about their desire to create a gaming “metaverse” and this tie-up with Disney could help make it a reality.

To put this into context, Disney are facing mounting pressure in the streaming wars - with Netflix introducing games as a way to lure new subscribers and stop others from leaving.

Given this, and the pressure from Peltz, Disney is paying $1.5 billion for an equity stake in Epic and to strengthen ties between the two companies.



Image title - Closer Look

Using AI agents can dramatically improve LLM performance

Image - research project's setup

A very interesting paper has shown that we can increase Large Language Model (LLM) performance by using multiple AI agents and a sampling-and-voting approach.

This meant the LLMs were more capable of solving complex problems, which is a common limitation today.
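The sampling-and-voting approach is simple at its core: ask several independent agents the same question, then keep the answer that appears most often. Here’s a minimal sketch in Python - the `ask_llm` stub is hypothetical, standing in for a real LLM API call:

```python
from collections import Counter

def ask_llm(prompt: str, agent_id: int) -> str:
    # Hypothetical stand-in for a real LLM API call. To mimic sampling
    # noise, one agent in three returns a wrong answer here.
    return "41" if agent_id % 3 == 0 else "42"

def sample_and_vote(prompt: str, n_agents: int = 10) -> str:
    # Query several independent agents with the same prompt...
    answers = [ask_llm(prompt, i) for i in range(n_agents)]
    # ...then take a majority vote over their answers.
    return Counter(answers).most_common(1)[0][0]

print(sample_and_vote("What is 6 * 7?"))  # prints "42", the majority answer
```

Even with a third of the agents answering incorrectly, the vote still recovers the right answer - which is essentially why performance keeps improving as you add more agents.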

Previously, we had to use a mixture of techniques to try and get around this - such as Chain of Thought pipelines that give the model more time to “think” and process the information.

However, there’s a reduced need for these techniques when agents are used instead.

Image - performance results graph

Interestingly, this research shows that complex tasks saw the largest performance increases.

There’s a lot of potential here, especially as companies start to move away from simple LLM use cases - such as chatbots and document processing - and instead focus on bigger, more difficult tasks.

I’ve been creating AI agents for several months now and have seen first-hand the improved reasoning they can achieve when allocated a specific role.

This often makes the model more focused on solving the task and prevents it from going off-script. If you want to read more about the research, you can check it out below.


Image title - Closer Look

Meta’s AI model is changing how we analyse satellite imagery

Gif - satellite detections

When Meta released their Segment Anything Model (SAM) almost a year ago, it was a huge advancement for computer vision applications.

But while the initial release got a lot of attention from the industry, the tools that have since been built on top of SAM have gone largely unnoticed.

The most interesting way SAM is currently being used is to analyse satellite imagery.

Previously, we needed huge models to detect cars, buildings, ships, and aeroplanes from above. But thanks to ESRI’s release of Text-SAM, that’s no longer the case.

We now have the ability to type text commands, such as “plane” or “building”, and this SAM-related tool can immediately highlight where the object is within the image.

It’s fascinating to see and makes you feel like you’re actually in an episode of CSI - simply describe what you’re looking for and click search.

Accessing satellite imagery has gradually become cheaper over time, but there’s still some way to go. In the past, it was only accessible to nation states with huge budgets - but that’s starting to change.

Increasingly, private companies are downloading this imagery and using AI to analyse it.

We can use these images to reveal new insights into carbon emissions from buildings and industrial factories, or to take a much wider view and calculate congestion at a city-wide level.

Now that we can use text prompts to spot objects from above, these use cases are more accessible than ever before.



Image title - Byte Sized Extras

📱 Apple explores the release of a foldable iPhone

📉 Snap’s stock falls by 30% after Q4 results

💰 Adam Neumann is trying to buy WeWork back again

🇬🇧 UK Government announces $120 million plan for responsible AI research

🔉 FCC officially declares that AI-voiced robocalls are illegal

🪪 Entrust to buy Onfido, an ID verification startup, for more than $400 million

🤖 Meta will expand labelling of AI-generated imagery, ahead of elections

🇺🇸 Chinese hackers have been in US critical infrastructure for at least five years

Image of Loop character with a cardboard box
Image title - Startup Spotlight
Gif - detections in waste centre

Greyparrot

This is a startup that uses computer vision to provide recycling centres with better analytics and make them more efficient.

Greyparrot’s cameras are positioned above the waste conveyor belt and can detect the different items flowing through it - such as metals, plastics, electronics, or fibres.

This is critical for the waste company, since they can spot if valuable materials have been placed into the wrong area and recover them.

The data is also important for when the material is being re-sold, as it gives them insights into the batch’s quality.

As we start to see recycling targets increase over time, technology like this will be important for companies to optimise their processes and demonstrate that they’re complying with regulations.

Greyparrot are based in the UK and have recently seen Bollegraaf Group, a huge recycling company in the Netherlands, invest almost $13 million in the startup.

As part of the deal, engineers from Bollegraaf’s AI unit will be working to support Greyparrot going forward.



This Week’s Art

Image - space ship taking off as an astronaut watches

Loop via Midjourney V6



Image title - End note

There’s been plenty to talk about this week, from deepfakes to AI agents, but there’s no doubt that Google’s release of Gemini has been the main talking point.

Gemini’s integration with Google’s products, which we all use on a daily basis, could help shift the dial away from OpenAI.

That is, provided that OpenAI don’t release a new agent framework anytime soon…

Lots has been covered this week, such as:

  • Google’s release of Gemini Ultra and their updated chatbot

  • Apple’s research on how to edit photos with text commands

  • EU’s guidelines on how platforms should reduce the risk of deepfakes

  • OpenAI’s plan to add watermarks to DALL-E 3 images

  • Disney’s tie-up with Epic Games as part of a “metaverse”

  • How AI agents can dramatically improve LLM performance

  • The ways Text-SAM could change how we analyse satellite imagery

  • And how Greyparrot are using computer vision to make recycling more efficient

Have a good week!

Liam



Feedback

Image of Loop character waving goodbye

Share with Others

If you found something interesting in this week’s edition, feel free to share this newsletter with your colleagues.

About the Author

Liam McCormick is a Senior Software Engineer and works within Kainos' Innovation team. He identifies business value in emerging technologies, implements them, and then shares these insights with others.