Do you find it difficult to keep up with the latest ML research? Are you overwhelmed with the massive amount of papers about LLMs, vector databases, or RAGs?
In this post, I will show how to build an AI assistant that mines this large amount of information easily. You’ll ask it your questions in natural language and it’ll answer according to relevant papers it finds on Papers With Code.
On the backend side, this assistant will be powered with a Retrieval Augmented Generation (RAG) framework that relies on a scalable serverless vector database, an embedding model from VertexAI, and an LLM from OpenAI.
On the front-end side, this assistant will be integrated into an interactive and easily deployable web application built with Streamlit.
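To make the retrieval step concrete before we get into the details, here is a minimal sketch of the idea, with a toy in-memory index standing in for the serverless vector database and a hypothetical `embed` function standing in for the VertexAI embedding model (a real setup would call the embedding API instead):

```python
import numpy as np

# Hypothetical stand-in for the embedding model (VertexAI in this post);
# it maps text to a deterministic unit vector for illustration only.
def embed(text: str, dim: int = 8) -> np.ndarray:
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

class InMemoryIndex:
    """Toy vector store: cosine-similarity top-k search over unit vectors."""
    def __init__(self):
        self.texts, self.vectors = [], []

    def add(self, text: str) -> None:
        self.texts.append(text)
        self.vectors.append(embed(text))

    def search(self, query: str, k: int = 2) -> list[str]:
        q = embed(query)
        scores = np.array(self.vectors) @ q  # cosine similarity (unit vectors)
        top = np.argsort(scores)[::-1][:k]
        return [self.texts[i] for i in top]

# Index a few paper abstracts, retrieve the most relevant ones,
# and stuff them into a prompt for the LLM.
index = InMemoryIndex()
for abstract in [
    "Retrieval Augmented Generation grounds LLM answers in documents.",
    "Vector databases store embeddings for similarity search.",
    "Streamlit builds interactive Python web apps.",
]:
    index.add(abstract)

context = index.search("How do I ground an LLM in papers?", k=2)
prompt = "Answer using only this context:\n" + "\n".join(context)
# `prompt` would then be sent to the LLM (OpenAI in this post).
```

The production pipeline follows the same shape: embed the query, fetch the nearest paper abstracts from the vector database, and condition the LLM's answer on them.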
Every step of this process will be detailed below with an accompanying source code that you can reuse and adapt👇.
Ready? Let’s dive in 🔍.
If you’re interested in ML content, detailed tutorials, and practical tips from the industry, follow my newsletter. It’s called The Tech Buffet.
Apple was the first major tech company to launch a voice assistant back in 2011. But a common criticism of Siri is that it’s struggled to compete with Amazon’s Alexa, Google’s Assistant and, more recently, ChatGPT-powered assistants.
If rumors are right, however, Siri is expected to receive a big brain power boost this year, thanks mostly to AI. The integration of large language models, the technology behind ChatGPT, is poised to transform Siri into what one leaker is envisioning as the “ultimate virtual assistant.”
Read more: Best iPhone to Buy for 2024
In December, Apple published research showing it can make LLM AI models run on-device in a similar way that Qualcomm and MediaTek have done for their chips in Android phones. This may indicate that Siri will get a long-awaited overhaul that iPhone fans have been waiting for, including the ability to chat like ChatGPT.
Only Apple knows what’s next for the iPhone and its other products, but here’s how Siri could change in the iPhone 16.
Siri could improve follow-up requests
Imagine you ask Siri when the Olympics are taking place. It quickly spits out the correct dates in the summer of this year. But if you follow that up with, “Add it to my calendar,” the virtual assistant tends to respond imperfectly with “What should I call it?” The answer to that question would be obvious to us humans. Even when I responded, “Olympics,” Siri replied, “When should I schedule it for?”
The reason Siri tends to falter is that it lacks contextual awareness, which limits its ability to follow a conversation the way a human can. However, that could change in June of this year, when Apple is rumored to unveil improvements to Siri via iOS 18.
The iPhone maker is training Siri (and the iPhone’s Spotlight search tool) on large language models to improve the virtual assistant’s ability to answer more questions accurately, according to the October edition of Mark Gurman’s Bloomberg newsletter Power On. A large language model is a specific kind of AI that excels at understanding and producing natural language. With advancements in LLMs, Siri is likely to become more skilled at processing the way people speak. This should not only allow Siri to understand more complex and nuanced questions, but also provide more accurate responses. All in all, Siri is expected to become a more context-aware and powerful virtual assistant.
Siri may get better at executing multistep tasks
Apart from understanding people better, Siri is also expected to become more capable and efficient in the coming months. Apple plans to use large language models to make Siri smarter, according to a September report from the Information. The article described how Siri might respond to simple voice commands for more complex tasks, such as turning a set of photos into a GIF and then sending them to one of your contacts, which would be a significant step forward in Siri’s capabilities.
Watch this: iOS 17 Brings Big Changes to Old Habits: Live Voicemail, AirDrop and Siri
Read more: Did You Know Siri Can Do This?
Siri may improve its interactions with the Messages app (and other apps)
Apart from answering questions, the next version of Siri could become better at automatically completing sentences, according to a Bloomberg report published in October.
Thanks to LLMs, which are trained on troves of data, Siri is expected to up its predictive text game. Beyond that, Apple is rumored to be planning to add AI to as many Apple apps as possible, which could even include a feature in the Messages app to craft complex messages.
Apple never talks specifics about products before they launch. Since Apple usually unveils new iPhone software features at WWDC in June, we’ll likely know more about iPhone AI plans then.
Editors’ note: CNET is using an AI engine to help create some stories. For more, see this post.
Artificial intelligence is everywhere, whether you realize it or not. It’s behind the chatbots you talk to online, the playlists you stream and the personalized ads that somehow know exactly what you’ve been craving. Now it’s taking on a more public persona: Think Meta AI, showing up in apps like Facebook, Messenger and WhatsApp; or Google’s Gemini, working in the background across the company’s platforms; or Apple Intelligence, just now starting a slow rollout.
AI has a long history, going back to a conference at Dartmouth in 1956 that first discussed artificial intelligence as a thing. Milestones along the way include ELIZA, essentially the first chatbot, developed in 1964 by MIT computer scientist Joseph Weizenbaum, and 2004, when Google’s autocomplete first appeared.
Then came 2022 and ChatGPT’s rise to fame. Generative AI developments and product launches have accelerated rapidly since then, including Google Bard (now Gemini), Microsoft Copilot, IBM Watsonx.ai and Meta’s open-source Llama models.
Let’s break down what generative AI is, how it differs from “regular” artificial intelligence and whether gen AI can live up to the hype.
Generative AI in a nutshell
At its core, generative AI refers to artificial intelligence systems that are designed to produce new content based on patterns and data they’ve learned. Instead of just analyzing numbers or predicting trends, these systems generate creative outputs like text, images, music, videos and software code.
Some of the most popular generative AI tools on the market include ChatGPT, Dall-E, Midjourney, Adobe Firefly, Claude and Stable Diffusion.
Foremost among its abilities, ChatGPT can craft human-like conversations or essays based on a few simple prompts. Dall-E and Midjourney create detailed artwork from a short description, while Adobe Firefly focuses on image editing and design.
The AI that’s not generative AI
However, not all AI is generative. While gen AI focuses on creating new content, traditional AI excels at analyzing data and making predictions. This includes technologies like image recognition and predictive text. It is also used for novel solutions in science, medical diagnostics, weather forecasting, fraud detection and financial analyses for forecasting and reporting. The AI that beat human grand champions at chess and the board game Go was not generative AI.
These systems might not be as flashy as gen AI, but classic artificial intelligence is a huge part of the technology we rely on every day.
How generative AI works
Behind the magic of generative AI are large language models and advanced machine learning techniques. These systems are trained on massive amounts of data, such as entire libraries of books, millions of images, years of recorded music and data scraped from the internet.
AI developers, from tech giants to startups, are well aware that AI is only as good as the data you feed it. If it’s fed poor-quality data, AI can produce biased results. It’s something that even the biggest players in the field, like Google, haven’t been immune to.
The AI learns patterns, relationships and structures within this data during training. Then, when prompted, it applies that knowledge to generate something new. For instance, if you ask a gen AI tool to write a poem about the ocean, it’s not just pulling prewritten verses from a database. Instead, it’s using what it learned about poetry, oceans and language structure to create a completely original piece.
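The learn-patterns-then-generate idea can be illustrated with a toy bigram model, a drastic simplification of an LLM invented here for illustration: it records which words follow which in its training text, then samples new word sequences from those statistics rather than copying stored sentences.

```python
import random
from collections import defaultdict

# A tiny "training set" about the ocean.
corpus = "the ocean waves roll and the ocean wind sings and the waves sing".split()

# Learn the patterns: which words tend to follow which.
follows = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev].append(nxt)

def generate(start: str, length: int = 6, seed: int = 0) -> str:
    """Sample a new sequence from the learned statistics."""
    random.seed(seed)
    out = [start]
    for _ in range(length):
        options = follows.get(out[-1])
        if not options:  # no learned continuation; stop early
            break
        out.append(random.choice(options))
    return " ".join(out)

print(generate("the"))
```

The output is a fresh word sequence assembled from learned transitions, not a verse retrieved from storage; real LLMs do the same thing with billions of parameters and whole-document context instead of word pairs.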
It’s impressive, but it’s not perfect. Sometimes the results can feel a little off. Maybe the AI misunderstands your request, or it gets overly creative in ways you didn’t expect. It might confidently provide completely false information, and it’s up to you to fact-check it. Those quirks, often called hallucinations, are part of what makes generative AI both fascinating and frustrating.
Generative AI’s capabilities are growing. It can now understand multiple data types by combining technologies like machine learning, natural language processing and computer vision. The result is called multimodal AI that can integrate some combination of text, images, video and speech within a single framework, offering more contextually relevant and accurate responses. ChatGPT’s Advanced Voice Mode is an example, as is Google’s Project Astra.
Gen AI comes with challenges
There’s no shortage of generative AI tools out there, each with its unique flair. These tools have sparked creativity, but they’ve also raised many questions besides bias and hallucinations — like, who owns the rights to AI-generated content? Or what material is fair game or off-limits for AI companies to use for training their language models — see, for instance, The New York Times lawsuit against OpenAI and Microsoft.
Other concerns — no small matters — involve privacy, job displacement, accountability in AI and AI-generated deepfakes. Another issue is the impact on the environment because training large AI models uses a lot of energy, leading to big carbon footprints.
The rapid ascent of gen AI in the last couple of years has accelerated worries about the risks of AI in general. Governments are ramping up AI regulations to ensure responsible and ethical development, most notably the European Union’s AI Act.
Generative AI in everyday life
Many people have interacted with chatbots in customer service or used virtual assistants like Siri, Alexa and Google Assistant — which now are on the cusp of becoming gen AI power tools. That, along with apps for ChatGPT, Claude and other new tools, is putting AI in your hands.
Meanwhile, according to McKinsey’s 2024 Global AI Survey, 65% of respondents said their organizations regularly use generative AI, nearly double the figure reported just 10 months earlier. Industries like health care and finance are using gen AI to streamline business operations and automate mundane tasks.
Generative AI isn’t just for techies or creative people. Once you get the knack of giving it prompts, it has the potential to do a lot of the legwork for you in a variety of daily tasks. Let’s say you’re planning a trip. Instead of scrolling through pages of search results, you ask a chatbot to plan your itinerary. Within seconds, you have a detailed plan tailored to your preferences. (That’s the ideal. Please always fact-check its recommendations.) A small business owner who needs a marketing campaign but doesn’t have a design team can use generative AI to create eye-catching visuals and even ask it to suggest ad copy.
Generative AI is here to stay
There hasn’t been a tech advancement that’s caused such a boom since the internet and, later, the iPhone. Despite its challenges, generative AI is undeniably transformative. It’s making creativity more accessible, helping businesses streamline workflows and even inspiring entirely new ways of thinking and solving problems.
But perhaps what’s most exciting is its potential, and we’re just scratching the surface of what these tools can do.
The dizzying explosion of generative artificial intelligence platforms has been the big business story of the past year, but how they’ll make money and how smart companies can use them wisely are the questions that will dominate the next 12 months.
“Students and executives are no longer asking whether we should adopt AI—but rather, when and how to do so,” says Andy Wu, the Arjun and Minoo Melwani Family Associate Professor of Business Administration at Harvard Business School.
Wu’s recent case study and background note, AI Wars and the Generative AI Value Chain, offer a crash course in ChatGPT, Bard, and other AI chatbots—as well as the dueling tech titans behind them—and probe the strategic dilemmas ahead for innovators and users. The public’s fascination with the human-like aspects of chatbots may be overshadowing more fundamental questions about how companies can profit from AI, Wu says.
I think the basic economics of generative AI are being overlooked.
In an interview, Wu discusses the challenging economics of AI, how business models are likely to differ from traditional software models, and some of the potentially painful tradeoffs ahead for companies such as Google, Microsoft, and others. Wu collaborated on the case study with HBS research associate Matt Higgins; HBS doctoral student Miaomiao Zhang; and Massachusetts Institute of Technology doctoral student Hang Jiang.
Ben Rand: What did you find most surprising in preparing this case and why?
Andy Wu: I think the basic economics of generative AI are being overlooked. There are significant unanswered questions in terms of how people will actually make money with this technology. Google and OpenAI and others can’t lose money in perpetuity. But it’s not yet obvious to anyone exactly how this will be monetized. At minimum, I can tell you that we are going to need new business models, and the integration of generative AI is going to transform how we monetize software.
Rand: How so?
Wu: Our notions of fixed costs and variable costs are different here than they were for any other form of computing we’ve lived through in the past. The key insight is that the variable cost of delivering generative AI to an end user is not zero, which means we can’t necessarily be handing out future software-as-a-service applications containing generative AI for free to anyone, or even as a paid subscription without usage limits, as we are used to today. Usage pricing is going to be much more important.
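Wu’s point about nonzero variable cost can be made concrete with some back-of-the-envelope arithmetic. The numbers below (per-request inference cost, subscription price) are hypothetical, chosen only to illustrate why an unlimited-use flat subscription breaks down once every generation carries a real marginal cost, while usage pricing keeps margin positive at any volume.

```python
# Hypothetical model: flat subscription vs. usage pricing when each
# generative-AI request carries a nonzero marginal (inference) cost.

def margin_subscription(price: float, requests: int, cost_per_request: float) -> float:
    """Monthly profit per user on an unlimited flat subscription."""
    return price - requests * cost_per_request

def margin_usage(price_per_request: float, requests: int, cost_per_request: float) -> float:
    """Monthly profit per user when billing per request."""
    return requests * (price_per_request - cost_per_request)

if __name__ == "__main__":
    plan_price, infer_cost = 20.0, 0.02  # hypothetical: $20/month plan, $0.02 per request
    for n in (100, 500, 2000):
        sub = margin_subscription(plan_price, n, infer_cost)
        use = margin_usage(0.03, n, infer_cost)
        print(f"{n:>5} requests/month: subscription margin ${sub:7.2f}, usage margin ${use:7.2f}")
```

Under these made-up numbers, a heavy user (2,000 requests) turns the flat plan unprofitable, while per-request billing scales with usage, which is one way to read why the generative AI companies Wu describes gravitate toward usage pricing.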
A second distinction is that a significant portion of the core technology is open source, and a lot of the data being used to train these models is public data and may be copyrighted but is publicly available online. The barriers to entry for AI are not as high as it may seem. So many companies will be in the game, at least for specific vertical AI models and applications.
Rand: Is it too soon to tell which business model will emerge?
Wu: The companies are still trying to figure it out. But I think by their actions, we can get some hints about the direction we’re going to go. The generative AI companies out there are actually pricing on a usage model, which says to me that they don’t think they can make the subscription model work economically today.
Rand: Which companies are in the best position right now?
Wu: A real standout right now is Meta, in terms of fighting hard for a prominent position on the open-source side with their LLaMA model. Prior to last year, many would have assumed that Google would have been the putative leader in the open-source part of the market. Microsoft also deserves a lot of credit for making a decision to work with OpenAI and getting access to leading technology that they can integrate both into their applications and as a way to sell cloud computing services.
But what’s interesting here is that none of the big tech providers are in the business of selling the actual model itself. Amazon largely offers its cloud customers the open-source models that others have made. Meta is largely handing its model out for free (with some limits), and Microsoft outsourced much of the core technology to OpenAI. Looking at these decisions together, they are making a real, albeit subtle, statement about what to avoid, which is actually trying to directly monetize the core technology—the AI model—itself.
The challenge we face right now with AI is it’s very possible that the actual invention of the technology itself is not what people will make money on. It will transform the world, but the money isn’t made on the thing that enabled the transformation.
The issue is, we normally think of intellectual property as being copyrighted or not copyrighted.
Rand: What role will regulation play, do you think?
Wu: I understand the interest of regulators, given the risks of this technology. But it’s going to be very difficult for regulators to come up with a comprehensive policy that controls things in the way that they would aspire to control them. That comes down to one principal factor, and that is that the barrier to entry is not that high. There are already a significant number of open-source models that you or I could build on.
So, let’s say we want to block AI from generating hate speech. To the extent that there is a market for hate speech, some entrepreneur will be able to build that model. It’s hard to exactly figure out how you would block it. If there is a market, someone will figure out how to do it.
Rand: Are there some areas where regulation may be useful?
Wu: Copyright law is one area they can address. The issue is, we normally think of intellectual property as being copyrighted or not copyrighted. But I like to teach my students that we’re in a new world. There’s copyrighted and not copyrighted, and then there’s also public and private. And so, the issue right now is that you can have copyrighted data that is also public. For instance, any newspaper publisher that allows their news articles to be indexed by search engines has put their intellectual property in this situation of being copyrighted but also public.
What do those creators do about their data? You can say it’s copyrighted, and say other people can’t use it, but you can’t really run around proving that all these different models are being trained using your data. This is something that regulators will have to clarify and also something that companies are taking note of themselves, particularly in the music space and image space.
To the extent that maybe you don’t want an AI offering now, but you want one five years from now, the effort of building a centralized data store for all that data will be important.
Rand: With so much to consider, how can managers stay on top of the developments in AI? They seem to be changing so fast.
Wu: I would advise managers not to play for the next year. Play for the next 10 years. The idea is that you want to have people inside your company who are on top of the different technologies and experimenting with different things. You need to give them a pathway to communicate with the CEO and top management team about which technologies to invest in.
The corollary to that is the integration of data across business units in a company. It’s not being done well enough right now, based on my experience. Companies already have a fairly complicated portfolio of different databases and enterprise products that store their data. That data increasingly needs to be kept track of, and, ideally, integrated. And so, to the extent that maybe you don’t want an AI offering now, but you want one five years from now, the effort of building a centralized data store for all that data will be important.
Rand: What are some best practices of things companies need to be doing now?
Wu: If you’re a company thinking about implementing AI, there are different levels of sophistication to consider. You could wait for other companies to develop the relevant applications, or you could buy an API and build your own application, or you could actually train your own model and then build your own application. And I think companies need to begin the process now of identifying what level of sophistication they want. For example, one early leader in this process is Bloomberg, where they have already gone ahead and built BloombergGPT, a large language model tailored for financial tasks. They used their own proprietary data to fine-tune an open-source model. For a company like Bloomberg, providing financial insights is mission critical, and so they cannot wait around for someone else to develop that AI model and application.
Feedback or ideas to share? Email the Working Knowledge team at [email protected].