On the backend, this assistant is powered by a Retrieval-Augmented Generation (RAG) framework that combines a scalable serverless vector database, an embedding model from Vertex AI, and an LLM from OpenAI.
On the front-end side, this assistant will be integrated into an interactive and easily deployable web application built with Streamlit.
Every step of this process will be detailed below with an accompanying source code that you can reuse and adapt👇.
Ready? Let’s dive in 🔍.
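Before the detailed walkthrough, here is a minimal sketch of the retrieval step at the heart of RAG. The toy bag-of-words "embeddings" below are stand-ins for a real embedding model (in this project, Vertex AI), and the assembled prompt would ultimately be sent to an OpenAI model; the function and variable names are illustrative, not the project's actual code.

```python
# Toy RAG retrieval sketch: embed documents, rank them against the
# query by cosine similarity, and build a grounded prompt.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; the real pipeline would call a
    # Vertex AI embedding model and a vector database instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Return the k documents most similar to the query.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "Streamlit lets you build data apps in pure Python.",
    "A vector database stores embeddings for similarity search.",
    "RAG grounds an LLM answer in retrieved context.",
]
context = retrieve("how does RAG ground answers?", docs)[0]
prompt = f"Answer using this context:\n{context}\n\nQuestion: ..."
```

The design point is simply that retrieval happens before generation: the query selects relevant context, and only then is the LLM asked to answer with that context in its prompt.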
AI has a long history, going back to a 1956 conference at Dartmouth where the term "artificial intelligence" was coined. Milestones along the way include ELIZA, essentially the first chatbot, developed in 1964 by MIT computer scientist Joseph Weizenbaum, and Google's autocomplete, which first appeared in 2004.
Then came 2022 and ChatGPT’s rise to fame. Generative AI developments and product launches have accelerated rapidly since then, including Google Bard (now Gemini), Microsoft Copilot, IBM Watsonx.ai and Meta’s open-source Llama models.
Let’s break down what generative AI is, how it differs from “regular” artificial intelligence and whether gen AI can live up to the hype.
Generative AI in a nutshell
At its core, generative AI refers to artificial intelligence systems that are designed to produce new content based on patterns and data they've learned. Instead of just analyzing numbers or predicting trends, these systems generate creative outputs like text, images, music, videos and software code.
Foremost among its abilities, ChatGPT can craft human-like conversations or essays based on a few simple prompts. Dall-E and Midjourney create detailed artwork from a short description, while Adobe Firefly focuses on image editing and design.
ChatGPT / Screenshot by CNET
The AI that’s not generative AI
However, not all AI is generative. While gen AI focuses on creating new content, traditional AI excels at analyzing data and making predictions. This includes technologies like image recognition and predictive text. It is also used for novel solutions in science, medical diagnostics, weather forecasting, fraud detection and financial analyses for forecasting and reporting. The AI that beat human grand champions at chess and the board game Go was not generative AI.
These systems might not be as flashy as gen AI, but classic artificial intelligence is a huge part of the technology we rely on every day.
How generative AI works
Behind the magic of generative AI are large language models and advanced machine learning techniques. These systems are trained on massive amounts of data, such as entire libraries of books, millions of images, years of recorded music and data scraped from the internet.
AI developers, from tech giants to startups, are well aware that AI is only as good as the data you feed it. If it’s fed poor-quality data, AI can produce biased results. It’s something that even the biggest players in the field, like Google, haven’t been immune to.
The AI learns patterns, relationships and structures within this data during training. Then, when prompted, it applies that knowledge to generate something new. For instance, if you ask a gen AI tool to write a poem about the ocean, it’s not just pulling prewritten verses from a database. Instead, it’s using what it learned about poetry, oceans and language structure to create a completely original piece.
It’s impressive, but it’s not perfect. Sometimes the results can feel a little off. Maybe the AI misunderstands your request, or it gets overly creative in ways you didn’t expect. It might confidently provide completely false information, and it’s up to you to fact-check it. Those quirks, often called hallucinations, are part of what makes generative AI both fascinating and frustrating.
Generative AI’s capabilities are growing. It can now understand multiple data types by combining technologies like machine learning, natural language processing and computer vision. The result is called multimodal AI that can integrate some combination of text, images, video and speech within a single framework, offering more contextually relevant and accurate responses. ChatGPT’s Advanced Voice Mode is an example, as is Google’s Project Astra.
Gen AI comes with challenges
There's no shortage of generative AI tools out there, each with its unique flair. These tools have sparked creativity, but they've also raised many questions beyond bias and hallucinations: Who owns the rights to AI-generated content? What material is fair game, or off-limits, for AI companies to use when training their language models? See, for instance, The New York Times lawsuit against OpenAI and Microsoft.
Other concerns — no small matters — involve privacy, job displacement, accountability in AI and AI-generated deepfakes. Another issue is the impact on the environment because training large AI models uses a lot of energy, leading to big carbon footprints.
The rapid ascent of gen AI in the last couple of years has accelerated worries about the risks of AI in general. Governments are ramping up AI regulations to ensure responsible and ethical development, most notably the European Union’s AI Act.
Generative AI in everyday life
Many people have interacted with chatbots in customer service or used virtual assistants like Siri, Alexa and Google Assistant — which now are on the cusp of becoming gen AI power tools. That, along with apps for ChatGPT, Claude and other new tools, is putting AI in your hands.
Meanwhile, according to McKinsey’s 2024 Global AI Survey, 65% of respondents said their organizations regularly use generative AI, nearly double the figure reported just 10 months earlier. Industries like health care and finance are using gen AI to streamline business operations and automate mundane tasks.
Generative AI isn’t just for techies or creative people. Once you get the knack of giving it prompts, it has the potential to do a lot of the legwork for you in a variety of daily tasks. Let’s say you’re planning a trip. Instead of scrolling through pages of search results, you ask a chatbot to plan your itinerary. Within seconds, you have a detailed plan tailored to your preferences. (That’s the ideal. Please always fact-check its recommendations.) A small business owner who needs a marketing campaign but doesn’t have a design team can use generative AI to create eye-catching visuals and even ask it to suggest ad copy.
Generative AI is here to stay
There hasn’t been a tech advancement that’s caused such a boom since the internet and, later, the iPhone. Despite its challenges, generative AI is undeniably transformative. It’s making creativity more accessible, helping businesses streamline workflows and even inspiring entirely new ways of thinking and solving problems.
But perhaps what’s most exciting is its potential, and we’re just scratching the surface of what these tools can do.
How about users? I’m a European who will be able to take advantage of all these DMA-related benefits. I already know I don’t want sideloading on iPhone (or Android, for that matter). But interoperability seems like the dumbest requirement of the DMA, a feature I don’t want to take advantage of in WhatsApp or any competing instant messaging app that might be labeled a gatekeeper.
Meta’s explanation of how WhatsApp interop will work is also the best explanation for the unnecessary interoperability requirement. Why go through all this trouble to fix something that wasn’t broken in the first place?
What is interoperability?
Meta explained in a detailed blog post all the work behind making WhatsApp and Facebook Messenger compatible with competing chat apps that ask to be supported.
That's what interop hinges on. First, a WhatsApp rival must want its app to work with Meta's chat platforms. Even then, it's up to the WhatsApp/Messenger user to choose whether to enable the functionality.
Meta says it wants to preserve end-to-end WhatsApp encryption after interop support arrives. It’ll push WhatsApp and Messenger’s Signal encryption protocol for third-party chat apps. Other alternatives can be accepted if they’re at least as good as Signal.
How will it work?
Meta has been working for two years to implement the changes required by the DMA. But things will not just work out of the box starting Thursday. A competing service must ask for interop support and then wait at least three months for Meta to deploy it.
It might take longer than that for WhatsApp and Messenger to support that service. Rinse and repeat for each additional chat app that wants to work with WhatsApp.
That’s a lot of work right there, both for Meta and WhatsApp competitors. I can’t see how any of this benefits the user. The interop chat experience isn’t worth it to me. Here’s what you’ll get in the first year. Because yes, the DMA has specific requirements in place for what features interop chats should offer:
Interoperability is a technical challenge – even when focused on the basic functionalities as required by the DMA. In year one, the requirement is for 1:1 text messaging between individual users and the sharing of images, voice messages, videos, and other attached files between individual end users. In the future, requirements expand to group functionality and calling.
Thankfully, the DMA also focuses on privacy and security. That’s why WhatsApp and Messenger will focus on ensuring that chats remain end-to-end encrypted. I’ll note that Messenger end-to-end encryption started rolling out months ago, and it might not be available in all markets.
A screenshot from WhatsApp beta 2.24.6.2 shows you can disable interoperability and choose which third-party apps to chat with. Image source: WABetaInfo
Meta’s blog does a great job explaining what’s going on under the hood with interop chats between WhatsApp and third-party apps. It underlines all the massive work and resources Meta is deploying for this.
I’m actually kind of in awe of Meta’s willingness to comply with these DMA provisions. All this effort makes me wonder what Meta can gain from the whole interoperability thing. Maybe the endgame is converting even more users to WhatsApp and Messenger, but I digress. After all, it’s not like Meta could avoid complying with the DMA.
I'll also say that Meta doesn't seem to restrict interoperability to the European Union, as Apple does with iPhone sideloading. Or, at least, restrictions aren't the focus of this blog, though the title clarifies it's about chats in Europe: "Making messaging interoperability with third parties safe for users in Europe."
The obvious warning
While Meta also explains how encryption and user authentication will work, it acknowledges that it's not in full control. Therefore, it can't promise the same level of security and privacy for WhatsApp interop chats as for WhatsApp-to-WhatsApp chats:
It’s important to note that the E2EE promise Meta provides to users of our messaging services requires us to control both the sending and receiving clients. This allows us to ensure that only the sender and the intended recipient(s) can see what has been sent, and that no one can listen to your conversation without both parties knowing.
While we have built a secure solution for interop that uses the Signal Protocol encryption to protect messages in transit, without ownership of both clients (endpoints) we cannot guarantee what a third-party provider does with sent or received messages, and we therefore cannot make the same promise.
[…] users need to know that our security and privacy promise, as well as the feature set, won’t exactly match what we offer in WhatsApp chats.
If you care about WhatsApp interoperability, you should read the entire blog post at this link. Then promptly disable the feature once WhatsApp informs you that interop support is ready.
There is a hugely environmentally destructive side to the tech industry. While it has played a big role in reaching net zero, giving us smart meters and efficient solar, it's critical that we turn the spotlight on its environmental footprint. Large language models such as ChatGPT are some of the most energy-guzzling technologies of all. Research suggests, for instance, that about 700,000 litres of water could have been used to cool the machines that trained GPT-3 at Microsoft's data facilities. It is hardly news that the tech bubble's self-glorification has obscured the uglier sides of this industry, from its proclivity for tax avoidance to its invasion of privacy and exploitation of our attention span. The industry's environmental impact is a key issue, yet the companies that produce such models have stayed remarkably quiet about the amount of energy they consume – probably because they don't want to spark our concern.
Google's global datacentre and Meta's ambitious plans for a new AI Research SuperCluster (RSC) further underscore the industry's energy-intensive nature, raising concerns that these facilities could significantly increase energy consumption. Additionally, as these companies aim to reduce their reliance on fossil fuels, they may opt to base their datacentres in regions with cheaper electricity, such as the southern US, potentially exacerbating water consumption issues in drier parts of the world. Before making big announcements, tech companies should be transparent about the resource use required for their expansion plans.
Furthermore, while minerals such as lithium and cobalt are most commonly associated with batteries in the motor sector, they are also crucial for the batteries used in datacentres. The extraction process often involves significant water usage and can lead to pollution, undermining water security. The extraction of these minerals is also often linked to human rights violations and poor labour standards. Trying to achieve one climate goal of limiting our dependence on fossil fuels can compromise another goal, of ensuring everyone has a safe and accessible water supply.
Moreover, when significant energy resources are allocated to tech-related endeavours, it can lead to energy shortages for essential needs such as residential power supply. Recent data from the UK shows that the country's outdated electricity network is holding back affordable housing projects. This will only get worse as households move away from using fossil fuels and rely more on electricity, putting even more pressure on the National Grid. In Bicester, for instance, plans to build 7,000 new homes were paused because the electricity network didn't have enough capacity.
In an era where we expect businesses to do more than just make profits for their shareholders, governments need to evaluate the organisations they fund and partner with, based on whether their actions will result in concrete successes for people and the planet. In other words, policy needs to be designed not to pick sectors or technologies as “winners”, but to pick the willing by providing support that is conditional on companies moving in the right direction. Making disclosure of environmental practices and impacts a condition for government support could ensure greater transparency and accountability. Similar measures could promote corporate accountability in global mineral supply chains, enforcing greater human rights compliance.
In navigating the intersection of technological advancement and environmental sustainability, policymakers face the challenge of cultivating less extractive business models. This is not just about adopting a piecemeal approach; it's about taking a comprehensive, systemic view, empowering governments to build the needed planning and implementation capacity. Such an approach should eschew outdated top-down methods in favour of flexible strategies that integrate knowledge at all levels, from local to global. Only by adopting a holistic perspective can we effectively mitigate the significant environmental impacts of the tech industry.
Ultimately, despite the unprecedented wave of innovation since the 1990s, we have consistently overlooked the repercussions of these advances on the climate crisis. As climate scientists anticipate that global heating will exceed the 1.5C target, it’s time we approach today’s grand challenges systemically, so that the solution to one problem does not exacerbate another.