🤖 Microsoft reveals new work on AI agents
Plus more of DeepMind's robotics research, Stanford's robot that can clean up, and Apple's Vision Pro.
Welcome to this edition of Loop!
To kick off your week, we've rounded up the most important technology and AI updates that you should know about.
HIGHLIGHTS
Stanford's work on creating a household robot
OpenAI's app store for custom GPTs
Google's new generative AI models for retailers
… and much more
Let's jump in!
1. Apple Vision Pro headset will be released next month
This marks Apple's first major product launch since the Apple Watch in 2014. Apple unveiled the headset last June, and it will launch on February 2nd - with pre-orders opening on January 19th.
At launch, it will support over a million iOS and iPadOS apps - along with apps that have been specially designed to take advantage of the headset's features. At $3,499 it won't be accessible for most people, but Apple hopes to demonstrate how spatial computing could change the way we interact with technology.
It's a significant move by the company, which has been rumoured to be developing an AR/VR headset for several years. Those who tried it at the preview event have praised the device for its visuals and superior user experience.
To control the headset, you simply look at an element and then tap your fingers to click on it - no additional joysticks are needed. But there are plenty of scenarios where you might need more fine-grained control, such as gaming or tasks at work. To accommodate these, the headset supports game controllers, keyboards, and trackpads - which opens up a lot of different use cases.
2. DeepMind is researching how to train robots with LLMs
AutoRT allows DeepMind to deploy robots and then instruct them to gather training data in new environments. Essentially, the system acts as a manager for the robot and tells it what to do.
They achieved this using a Large Language Model (LLM), a Visual Language Model (VLM) and a Robot Transformer (RT). The VLM takes in video footage and tries to understand what the environment looks like for the robot. The LLM then gives it a task to do, with the RT finally controlling the robot and completing the task.
That was a lot of acronyms, but essentially this project looked at how we could use large AI models to control robots and train them for new tasks. The team at DeepMind has also been researching how we can safely restrict robots and prevent them from causing harm to humans.
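For a rough sense of how those pieces fit together, here is a minimal sketch of an AutoRT-style loop. The three functions are hypothetical stand-ins for the VLM, LLM, and Robot Transformer - this is not DeepMind's actual code or API.

```python
# Hypothetical stand-ins for the three models described above - not DeepMind's code.

def describe_scene(frame: str) -> str:
    # VLM: summarise what the robot's camera currently sees.
    return f"a {frame}"

def propose_task(scene: str) -> str:
    # LLM: suggest a task suited to the scene (this is where safety rules could screen tasks).
    return f"tidy up {scene}"

def execute_task(task: str) -> list[str]:
    # RT: low-level policy that turns the task into robot actions.
    return [f"action for: {task}"]

def collect_episode(frame: str) -> dict:
    scene = describe_scene(frame)
    task = propose_task(scene)
    actions = execute_task(task)
    # The logged episode becomes new training data gathered in the new environment.
    return {"scene": scene, "task": task, "actions": actions}

print(collect_episode("table with a spilled drink and a sponge"))
```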
This is a key area for all the major tech companies. Google, Meta, and Microsoft have been researching new ways we can train robots for several years now - and a lot of their research has revolved around using Generative AI.
But they're not the only players here: Tesla and Agility are developing their own humanoid robots. Just recently, a company called 1X, which is backed by OpenAI, raised $100 million for its own research into this area. It's likely we'll see a lot more announcements around this in the coming year.
3. OpenAI releases its app store for custom GPTs
OpenAI has launched the GPT Store, which allows users to access versions of ChatGPT that have been lightly customised by other people. It's only available to subscribers on their premium plan, with apps in categories such as writing, research, programming, and education.
You don't actually need any programming experience to make your own GPTs - simply type in what you want the GPT to do and then upload any files it should use as a knowledge base. All of the GPTs are free to use, but OpenAI plans to pay developers via a revenue-sharing program.
This will be based on user engagement, which raises questions about how it will shape future development on the OpenAI app store. We've already seen what happened with social media platforms as they relentlessly prioritised engagement over everything else. OpenAI won't want to become fixated on the same metric and make the same mistake.
4. Walmart hopes that AI can do your shopping for you
The company is exploring a new AI feature, which would learn from your individual shopping habits and the broader trends of other shoppers who are similar to you.
It would then anticipate your needs and automatically create a grocery list - without you having to lift a finger. At the moment, shoppers need to manually look through and select items.
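As an illustration of the basic idea, a predicted list could be as simple as surfacing items that recur across your past baskets. The sketch below is just an assumption about how such a feature might work, not Walmart's actual system.

```python
from collections import Counter

# Toy example: suggest items that appear in at least half of the past baskets.
past_baskets = [
    ["milk", "eggs", "bread"],
    ["milk", "bananas", "bread"],
    ["milk", "eggs", "coffee"],
]

def predict_list(baskets: list[list[str]], threshold: float = 0.5) -> list[str]:
    counts = Counter(item for basket in baskets for item in basket)
    return [item for item, n in counts.items() if n / len(baskets) >= threshold]

print(predict_list(past_baskets))  # ['milk', 'eggs', 'bread']
```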
While the idea is still in development, it signals how Walmart aims to broaden its offering. It's certainly one of the most innovation-focused retailers and regularly explores the art of the possible. In the last few days, Walmart has announced that it's expanding its drone delivery system to 1.8 million more homes.
5. Google launches new generative AI tools for retailers
Google is making a push into the retail sector with its newly unveiled GenAI products, which aim to further personalise the online shopping experience and help retailers to produce more marketing content.
One of these tools is Conversational Commerce, which allows retailers to integrate GenAI-powered agents into their websites and apps. These agents will use a chat interface to speak with shoppers and offer product suggestions. If the bot needs to be customised further, the retailer can do this by providing their own data.
Additionally, retailers can use Google's Catalog and Content Enrichment tool to automatically generate product descriptions and metadata, categorise items, and even produce new product images.
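From the retailer's side, catalogue enrichment boils down to feeding structured product attributes into a generative model and getting draft copy back. The sketch below uses a hypothetical generate() placeholder rather than the actual Google Cloud API.

```python
def build_prompt(product: dict) -> str:
    # Turn structured attributes into an instruction for the model.
    attrs = ", ".join(f"{k}: {v}" for k, v in product.items())
    return (
        "Write a short product description and suggest three categories "
        f"for this item. Attributes: {attrs}"
    )

def generate(prompt: str) -> str:
    # Placeholder for a call to whichever generative model the retailer uses.
    return f"[model output for prompt: {prompt[:50]}...]"

product = {"name": "Trail Runner 2", "type": "running shoe", "weight": "240 g"}
print(generate(build_prompt(product)))
```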
Stanford's work on household robots
Several researchers from Stanford have released their latest work on Mobile Aloha, with the aim of creating a general-purpose robot.
It is able to autonomously complete several tasks - such as cleaning up a spilled drink, pressing the button to call an elevator and then entering it, opening cabinets and placing items inside, or even pushing chairs forward.
Just 50 demonstration videos were used to train the robot on those tasks, which shows how quickly it was able to pick up these new skills. The robotic arms are attached to a stand that can be moved around the room, and a computer is placed on top to control the robot.
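Training from a handful of recorded demonstrations is essentially imitation learning: the policy is optimised to reproduce the actions a human operator took. Below is a generic behaviour-cloning loop in PyTorch to illustrate the idea - it is not the Mobile Aloha authors' actual architecture or training code.

```python
import torch
import torch.nn as nn

# A toy policy: maps an observation vector to joint commands for two robot arms.
policy = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 14))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)

# Each demonstration pairs an observation with the action the human operator took.
demos = [(torch.randn(32), torch.randn(14)) for _ in range(50)]

for epoch in range(10):
    for obs, expert_action in demos:
        loss = nn.functional.mse_loss(policy(obs), expert_action)  # match the demo
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```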
It's worth noting that there are several demo videos of Mobile Aloha. Some of these showcase the autonomous features described above.
However, some people online have been confused by the other demos, which feature teleoperation - with a human actually controlling the robot's movements. It seems these teleoperation demos were included to demonstrate what other use cases could be possible with the robot.
Despite some initial confusion around the demos, it's a fascinating project and gives a good insight into some of the research work being done at the top universities.
Microsoft releases a new framework for AI agents
TaskWeaver is an open-source framework from Microsoft, which allows you to create your own agents that specialise in data analysis.
The project is similar to Microsoftās previous work with AutoGen, which also allows you to create AI agents. However, TaskWeaver is much more focused on analysing data and complex information.
An AI agent is essentially a bot that aims to figure out all the steps needed to complete a task. Then, it will try to carry out the task and implement any code needed. These agents can be completely autonomous, or you can be more involved and guide them on how to complete the task.
Let's say that you're a business analyst. You need to analyse some data about your company's sales, which is stored in a SQL database. With TaskWeaver, you can simply ask the assistant to identify any anomalies in the data.
The AI agent will then fetch the data you asked for, check for the anomalies you specified, and visualise the results - and this is all done through a chatbot interface.
To achieve this, TaskWeaver uses three components - a Planner, a Code Generator, and a Code Executor. These work together to understand your request, break it down into smaller tasks, generate the necessary code, and then execute it.
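To make that flow concrete, here is a minimal sketch of a planner / code generator / code executor loop applied to the sales-anomaly request above. The function bodies are hypothetical stubs, not TaskWeaver's actual API.

```python
def plan(request: str) -> list[str]:
    # Planner: break the request into smaller steps.
    return ["load the sales data", "flag anomalies", "plot the results"]

def generate_code(step: str) -> str:
    # Code Generator: turn a step into a runnable snippet (stubbed here).
    return f"result = run('{step}')"

def execute(code: str) -> str:
    # Code Executor: run the generated snippet and return its output.
    return f"executed: {code}"

def handle(request: str) -> list[str]:
    return [execute(generate_code(step)) for step in plan(request)]

for output in handle("Find anomalies in last quarter's sales"):
    print(output)
```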
It's an interesting tool and something that should be useful for a lot of different roles - from data and business analysts, to support staff that analyse customer data, to marketing teams that use data to improve their advertising campaigns.
OpenAI fights back against the New York Times' copyright lawsuit
Hyundai says its electric air taxi business will take flight in 2028
Walmart debuts generative AI search feature at CES
Duolingo cuts 10% of its contractor workforce, as the company instead looks to AI
OpenAI changes policy to allow military applications
Shield AI raises another $300M, scaling valuation to $2.8B
Volkswagen is bringing ChatGPT into its cars and SUVs
DeepMind says images that can trick computers can also trick humans
Source: @alexcarliera on X
Luma AI
This is a Generative AI startup that's attracted quite a bit of attention recently. It lets users quickly create 3D objects through simple text prompts.
While text-to-3D isn't a huge area right now, that will change with the launch of Apple's Vision Pro this year. We're going to see huge growth in the number of apps and websites that support 3D experiences, as more people adopt what Apple likes to call "spatial computing".
Luma AI has recently raised $43 million in funding, as it seeks to double its team's headcount in the coming year. It has just released a new 3D generation model, called Genie, which can generate an object in under 10 seconds - and the results look pretty promising.
If you want to try it out, you can check out their post below.
This Week's Art
Loop via Midjourney V6
While OpenAI's release of the GPT Store has captured a lot of headlines, there have been plenty of other stories recently that are worth taking note of.
We've covered a lot this week:
Apple's plans to release the Vision Pro next month
DeepMind's robotics & generative AI research
OpenAI's app store for custom GPTs
How Walmart hopes that AI can do your shopping for you
Google's GenAI tools for retailers
Stanford's work on household robots
Microsoft's new framework for AI agents
And Luma's ambitions for text-to-3D models
We're increasingly seeing how the big tech companies are using LLMs to manage robots and then assign them different tasks. DeepMind's research is particularly exciting, but they have been involved in this area for some time now - it'll be interesting to see how they actually start to apply this going forward.
Have a good week!
Liam
Feedback
Share with Others
If you found something interesting in this week's edition, feel free to share this newsletter with your colleagues.