When I asked the next-generation AI voice assistant to subscribe to WIRED, it accessed the correct web page and found a subscription form. Virtual assistants like Siri and Alexa will become even more powerful if equipped with AI.

Compared to the latest AI-powered chatbots like ChatGPT and Google Bard, voice assistants like Siri, Alexa, and Google Assistant can do far less. However, if these traditional voice assistants absorb the advances of the recent generative AI boom, things will certainly get interesting.

Next-generation voice assistant powered by AI

To see what’s next, I decided to try out an experimental AI voice assistant called vimGPT. When I asked it to subscribe to the US version of WIRED, it immediately got to work and showed off its impressive skills. It found the correct web page and accessed the online subscription form. If it had had my credit card information, it would have completed the request.

Although this task would hardly test a human’s intelligence, accessing the open web and purchasing something is far more complex and difficult than the tasks typically performed by Siri, Alexa, or Google Assistant. (Setting things up and getting sports results is technology from 10 years ago.) It requires not only understanding instructions and searching the Internet to find the correct site, but also correctly navigating the relevant pages and forms.

The AI assistant I tested correctly accessed the subscription page for the US version of WIRED and found the subscription form. Along the way, it was probably impressed by the fact that you can enjoy all of WIRED’s interesting and insightful articles for just $1 a month. However, since it did not have my credit card information, it was unable to complete the task.

Change your computer experience

vimGPT uses Chromium, Google’s open-source browser that does not store user information. After a few tries, I found this AI assistant to be very good at finding funny cat videos and cheap flights.

vimGPT is not a commercial product in development, but an experimental open-source program built single-handedly by developer Ishan Shah. That said, Apple and Google are likely working on similar efforts to improve Siri and other voice assistants.

vimGPT is built on GPT-4V, a multimodal version of OpenAI’s famous language model (capable of handling different types of information, such as audio and images as well as text). By analyzing the user’s instructions, it can determine what to click or type more reliably than text-only models, which must decipher complex HTML to understand a website’s content.
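The loop described above (look at the page, decide what to click or type, act, repeat) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not vimGPT’s actual code: `Action`, `FakeBrowser`, `run_agent`, and `scripted_model` are hypothetical names, the "browser" merely records actions, and the "model" is a canned script standing in for a multimodal model such as GPT-4V.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One step chosen by the model. All field names are hypothetical."""
    kind: str          # "click", "type", or "done"
    target: str = ""   # label of the page element to act on
    text: str = ""     # text to enter when kind == "type"


# A "model" is any function mapping (instruction, screenshot, history) to an Action.
Model = Callable[[str, bytes, List[Action]], Action]


class FakeBrowser:
    """Stand-in for a real browser: it only records what the agent does."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def screenshot(self) -> bytes:
        # A real agent would capture pixels here; this returns placeholder bytes.
        return f"page after {len(self.log)} actions".encode()

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            self.log.append(f"click {action.target}")
        elif action.kind == "type":
            self.log.append(f"type {action.text!r} into {action.target}")
        else:
            raise ValueError(f"unsupported action: {action.kind}")


def run_agent(instruction: str, model: Model, browser: FakeBrowser,
              max_steps: int = 10) -> List[Action]:
    """Observe, decide, act, and repeat until the model says it is done."""
    history: List[Action] = []
    for _ in range(max_steps):
        action = model(instruction, browser.screenshot(), history)
        history.append(action)
        if action.kind == "done":
            break
        browser.apply(action)
    return history


def scripted_model(instruction: str, screenshot: bytes,
                   history: List[Action]) -> Action:
    """Canned stand-in for a multimodal model; ignores its inputs."""
    script = [
        Action("click", "Subscribe"),
        Action("type", "Email", "reader@example.com"),
        Action("done"),
    ]
    return script[min(len(history), len(script) - 1)]


if __name__ == "__main__":
    browser = FakeBrowser()
    run_agent("subscribe to WIRED", scripted_model, browser)
    print(browser.log)
```

The `max_steps` cap is a deliberate choice in this sketch: an assistant that gets stuck on a page stops after a fixed number of actions instead of looping forever.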

“I think in a year, the experience of using a computer will be a far cry from what it is now,” says Shah, who developed vimGPT in just a few days. “You’ll be clicking less and chatting more when using most applications. These assistants will become an essential part of your web experience.”

Complete complex tasks online

Shah is not alone in thinking that the next logical step for chatbots like ChatGPT is AI assistants that can use a computer to access the web. Ruslan Salakhutdinov, a professor at Carnegie Mellon University, believes that Siri and other voice assistants are on the verge of being completely overhauled to take advantage of AI. Salakhutdinov was Apple’s director of AI research from 2016 to 2020.

However, there are still many failures. In experiments, researchers at Carnegie Mellon University found that AI assistants completed complex tasks about 16% of the time, whereas humans can complete these tasks 88% of the time.

Many failures are mundane. For example, an assistant may have trouble navigating a website and become stuck. However, some failures look more like malfunctions. For example, an AI assistant mistakenly added dozens of items to a user’s cart, or added people the user didn’t want to interact with as friends on social media. It might actually be a good thing that we can’t give payment information to vimGPT yet.

Get smarter in a simulated environment

One reason the virtual environment built at Carnegie Mellon University is valuable is that no real harm is done when an AI assistant runs amok. Collecting data on such incidents can help researchers understand exactly how well AI assistants perform certain tasks and where they make mistakes.

Salakhutdinov says that an AI assistant allowed to move freely around an environment like VisualWebArena can actively learn from its successes and failures. This is similar to how game simulations train the machine learning algorithms that play the games. This method has led to superior AIs such as Alphabet’s AlphaGo, which defeated the world Go champion.

I tried the next-generation voice assistant equipped with AI and was surprised by its capabilities.

