In December, Apple published research showing it can make LLM AI models run on-device in a similar way that Qualcomm and MediaTek have done for their chips in Android phones. This may indicate that Siri will get a long-awaited overhaul that iPhone fans have been waiting for, including the ability to chat like ChatGPT.
Only Apple knows what’s next for the iPhone and its other products, but here’s how Siri could change in the iPhone 16.
Siri could improve follow-up requests
Imagine you ask Siri about when the Olympics are taking place. It quickly spits out the correct dates in the summer of this year. But if you follow that up with, “Add it to my calendar,” the virtual assistant tends to respond imperfectly with “What should I call it?” The answer to that question would be obvious to us humans. Even when I responded, “Olympics,” Siri replied, “When should I schedule it for?”
The reason Siri tends to falter is that it lacks contextual awareness. That limits its ability to follow a conversation like a human can. However, that could change in June of this year, when Apple is rumoured to unveil improvements to Siri via iOS 18.
The iPhone maker is training Siri (and the iPhone’s Spotlight search tool) on large language models in order to improve the virtual assistant’s ability to answer more questions accurately, according to the October edition of Mark Gurman’s Bloomberg newsletter PowerOn. A large language model is a specific kind of AI that excels at understanding and producing natural language. With advancements in LLMs, Siri is likely to become more skilled at processing the way people speak. This should not only allow Siri to understand more complex and nuanced questions, but also provide accurate responses. All in all Siri is expected to become a more context-aware and powerful virtual assistant.
Siri may get better at executing multistep tasks
Apart from understanding people better, Siri is also expected to become more capable and efficient in the coming months. Apple plans to use large language models to make Siri smarter, according to a September report from the Information. The article detailed an example explaining how Siri might respond to simple voice commands for more complex tasks, such as turning a set of photos into a GIF and then sending them to one of your contacts, which would be a significant step forward in Siri’s capabilities.
Watch this: iOS 17 Brings Big Changes to Old Habits: Live Voicemail, AirDrop and Siri
Siri may improve its interactions with the Messages app (an other apps)
Apart from answering questions, the next version of Siri could become better at automatically completing sentences, according to a Bloomberg report published in October.
Thanks to LLMs, which are trained on troves of data, Siri is expected to up its predictive text game. Beyond that, Apple is rumored to be planning to add AI to as many Apple apps as possible which could even include a feature in the Messages app to craft complex messages.
Apple never talks specifics about products before they launch. Since Apple usually unveils new iPhone software features at WWDC in June, we’ll likely know more about iPhone AI plans then.
Editors’ note: CNET is using an AI engine to help create some stories. For more, see this post.
I Took 600+ Photos With the iPhone 15 Pro and Pro Max. Look at My Favorites
AI has a long history, going back to a conference at Dartmouth in 1956 that first discussed artificial intelligence as a thing. Milestones along the way include ELIZA, essentially the first chatbot, developed in 1964 by MIT computer scientist Joseph Weizenbaum, and 2004, when Google’s autocomplete first appeared.
Then came 2022 and ChatGPT’s rise to fame. Generative AI developments and product launches have accelerated rapidly since then, including Google Bard (now Gemini), Microsoft Copilot, IBM Watsonx.ai and Meta’s open-source Llama models.
Let’s break down what generative AI is, how it differs from “regular” artificial intelligence and whether gen AI can live up to the hype.
Generative AI in a nutshell
From talking fridges to iPhones, our experts are here to help make the world a little less complicated.
At its core, generative AI refers to artificial intelligence systems that are designed to produce new content based on patterns and data they’ve learned. Instead of just analyzing numbers or predicting trends, these systems generate creative outputs like text, images music, videos and software code.
Foremost among its abilities, ChatGPT can craft human-like conversations or essays based on a few simple prompts. Dall-E and Midjourney create detailed artwork from a short description, while Adobe Firefly focuses on image editing and design.
ChatGPT / Screenshot by CNET
From talking fridges to iPhones, our experts are here to help make the world a little less complicated.
The AI that’s not generative AI
However, not all AI is generative. While gen AI focuses on creating new content, traditional AI excels at analyzing data and making predictions. This includes technologies like image recognition and predictive text. It is also used for novel solutions in science, medical diagnostics, weather forecasting, fraud detection and financial analyses for forecasting and reporting. The AI that beat human grand champions at chess and the board game Go was not generative AI.
These systems might not be as flashy as gen AI, but classic artificial intelligence is a huge part of the technology we rely on every day.
How generative AI works
Behind the magic of generative AI are large language models and advanced machine learning techniques. These systems are trained on massive amounts of data, such as entire libraries of books, millions of images, years of recorded music and data scraped from the internet.
AI developers, from tech giants to startups, are well aware that AI is only as good as the data you feed it. If it’s fed poor-quality data, AI can produce biased results. It’s something that even the biggest players in the field, like Google, haven’t been immune to.
The AI learns patterns, relationships and structures within this data during training. Then, when prompted, it applies that knowledge to generate something new. For instance, if you ask a gen AI tool to write a poem about the ocean, it’s not just pulling prewritten verses from a database. Instead, it’s using what it learned about poetry, oceans and language structure to create a completely original piece.
ChatGPT / Screenshot by CNET
It’s impressive, but it’s not perfect. Sometimes the results can feel a little off. Maybe the AI misunderstands your request, or it gets overly creative in ways you didn’t expect. It might confidently provide completely false information, and it’s up to you to fact-check it. Those quirks, often called hallucinations, are part of what makes generative AI both fascinating and frustrating.
Generative AI’s capabilities are growing. It can now understand multiple data types by combining technologies like machine learning, natural language processing and computer vision. The result is called multimodal AI that can integrate some combination of text, images, video and speech within a single framework, offering more contextually relevant and accurate responses. ChatGPT’s Advanced Voice Mode is an example, as is Google’s Project Astra.
Gen AI comes with challenges
There’s no shortage of generative AI tools out there, each with its unique flair. These tools have sparked creativity, but they’ve also raised many questions besides bias and hallucinations — like, who owns the rights to AI-generated content? Or what material is fair game or off-limits for AI companies to use for training their language models — see, for instance, the The New York Times lawsuit against OpenAI and Microsoft.
Other concerns — no small matters — involve privacy, job displacement, accountability in AI and AI-generated deepfakes. Another issue is the impact on the environment because training large AI models uses a lot of energy, leading to big carbon footprints.
The rapid ascent of gen AI in the last couple of years has accelerated worries about the risks of AI in general. Governments are ramping up AI regulations to ensure responsible and ethical development, most notably the European Union’s AI Act.
Generative AI in everyday life
Many people have interacted with chatbots in customer service or used virtual assistants like Siri, Alexa and Google Assistant — which now are on the cusp of becoming gen AI power tools. That, along with apps for ChatGPT, Claude and other new tools, is putting AI in your hands.
Meanwhile, according to McKinsey’s 2024 Global AI Survey, 65% of respondents said their organizations regularly use generative AI, nearly double the figure reported just 10 months earlier. Industries like health care and finance are using gen AI to streamline business operations and automate mundane tasks.
Generative AI isn’t just for techies or creative people. Once you get the knack of giving it prompts, it has the potential to do a lot of the legwork for you in a variety of daily tasks. Let’s say you’re planning a trip. Instead of scrolling through pages of search results, you ask a chatbot to plan your itinerary. Within seconds, you have a detailed plan tailored to your preferences. (That’s the ideal. Please always fact-check its recommendations.) A small business owner who needs a marketing campaign but doesn’t have a design team can use generative AI to create eye-catching visuals and even ask it to suggest ad copy.
ChatGPT / Screenshot by CNET
Generative AI is here to stay
There hasn’t been a tech advancement that’s caused such a boom since the internet and, later, the iPhone. Despite its challenges, generative AI is undeniably transformative. It’s making creativity more accessible, helping businesses streamline workflows and even inspiring entirely new ways of thinking and solving problems.
But perhaps what’s most exciting is its potential, and we’re just scratching the surface of what these tools can do.
Artificial intelligence is everywhere, whether you realize it or not. It’s behind the chatbots you talk to online, the playlists…
OpenAI introduced ChatGPT in November 2022, sparking a tremendous amount of interest in artificial intelligence. ChatGPT gained so much attention that generative AI (GenAI) became a dominant theme in the tech world in 2023.
Microsoft backed OpenAI at the start of 2023 by pledging a multimillion-dollar, multiyear investment to accelerate OpenAI’s development of its AI technology.
Google made its GenAI move in March 2023 with Bard. In February 2024, Google rebranded Bard as Gemini when it debuted an improved version of the AI chatbot.
ChatGPT and Gemini are largely responsible for the considerable buzz around GenAI, which uses data from machine learning models to answer questions and create images, text and videos. OpenAI and Google are continuously improving the large language models (LLMs) behind ChatGPT and Gemini to give them a greater ability to generate human-like text.
GenAI is still rapidly evolving, and models don’t always return correct answers. Despite the common occurrence of AI hallucinations — wrong answers generated by AI — in both ChatGPT and Gemini, the tools are being adopted by businesses and consumers seeking to automate time-consuming tasks.
What is ChatGPT?
ChatGPT is the AI-powered chatbot that made GenAI the hot technology of 2023. According to OpenAI CEO Sam Altman, ChatGPT reached 1 million users within five days of its release on Nov. 30, 2022.
Generative Pre-trained Transformer, the model ChatGPT is based on, finds patterns within data sequences. Its AI language model produces responses to user queries and serves as the interface that lets users communicate with the language model. As of May 2024, GPT-4o is an available default in the free version of ChatGPT. Users can still choose to use GPT-3.5, which was the previous default. A more robust access to GPT-4o as well as GPT-4 is available in the paid subscription versions of ChatGPT Plus, ChatGPT Team and ChatGPT Enterprise. GPT-4 was generally considered the most advanced GenAI model when it became available, but Google Gemini Advanced provided it with a formidable rival.
Popular applications for ChatGPT include content generation of emails, social media posts and blogs; text summarization; language translation; code generation; learning and education; building virtual assistants; simulation and training; research assistance; and building games and other entertainment applications.
ChatGPT is multimodal, meaning users can use images and voice to prompt the chatbot. ChatGPT Voice — available on iOS and Android phones — lets users hold conversations with ChatGPT, which can respond in one of five AI-generated voices.
ChatGPT and ChatGPT Plus are targeted at individual users. The free version of ChatGPT is available through web browsers and mobile devices. Developers can also embed ChatGPT APIs in their software applications for their users to access.
ChatGPT Plus costs $20 per user, per month. The full version of GPT-4o, used in ChatGPT Plus, responds faster than previous versions of GPT; is more accurate; and includes features such as advanced data analysis. GPT-4o can also create more detailed responses and is faster at tasks such as describing photos and writing image captions. And while GPT-3.5 was only trained on data up to January 2022, GPT-4o has been trained on data up to October 2023.
Another advantage of a ChatGPT Plus subscription is that it guarantees ChatGPT access even during peak usage times. Response times for free ChatGPT are limited by bandwidth and availability. ChatGPT Plus also provides integrated access to OpenAI’s Dall-E 3 text to image GenAI model.
OpenAI sells ChatGPT Team and ChatGPT Enterprise to businesses. ChatGPT Team is available for $25 per user, per month billed annually. It includes everything in ChatGPT Plus but allows more messages during a defined time limit. It can also share GPTs with other workers, has a faster response time than ChatGPT Plus and includes an admin console. ChatGPT Enterprise has unlimited high-speed access to GPT-4; more advanced administration, customer support and analytics capabilities; expanded content windows for longer inputs; and has the fastest response time of all the ChatGPT versions. ChatGPT Enterprise pricing varies depending on usage.
What is Google Gemini?
Gemini is Google’s GenAI model that was built by the Google DeepMind AI research library. The Gemini AI model powered Google’s Bard GenAI tool that launched in March 2023. Google rebranded Bard as Gemini in February 2024, several months after launching Gemini Advanced based on its new Ultra 1.0 LLM foundation. In May 2024, Google first offered users of Gemini Advanced access to the newer Gemini 1.5 Pro model.
Gemini is designed to retrieve information as a simple answer, similar to the way smart assistants like Alexa and Siri work. It uses LLMs to reply to prompts with information it has already learned or can retrieve from other Google services.
Google Gemini is multimodal — it understands audio, video and computer code as well as text. Google has paused Gemini’s image generation feature because of inaccuracies, however. Google’s statement disclosing the pause pledged to re-release an improved image generation feature soon.
Gemini’s capabilities are integrated into Google’s search engine and available in Google Workspace apps such as Docs, Gmail, Sheets, Slides and Meet. Gemini for Google Workspace is the new name for Duet AI for Google Workspace, which was Google’s answer to the Microsoft Copilot AI assistant. Google Gemini is available through an app on Android phones and in the Google app on iOS.
Gemini Advanced is part of the Google One AI Premium plan subscription service that costs $19.99 per month in the United States. Google One AI Premium also includes 2 TB of storage.
Gemini Advanced is a more powerful AI version than Gemini Pro, which remains available for free. Gemini Advanced with Gemini Pro 1.5 provides a large context window of 1 million tokens, enabling analysis of larger data sets.
Google suggests Gemini Pro and its AI capabilities is the better choice for development, research and creation tasks, and if you’re looking for a free chatbot. It brings AI to simple tasks for personal use. For those willing to pay the subscription fee, Google recommends Gemini Advanced for professional applications, more demanding workflows, enhanced performance and more cutting-edge capabilities. Google Advanced will also include early access to new features.
Gemini Nano, another part of the Google Gemini family, is used in devices such as Google’s Pixel 8 Pro smartphones.
A snapshot of the differences between ChatGPT and Gemini.
What are the main differences between Gemini and ChatGPT?
ChatGPT and Google Gemini have become increasingly similar. Both have a free service, a nearly identically priced subscription service, and similar interfaces and use cases. The differences are largely under the hood — in their language models.
They’re also used for many similar functions, and work by users typing in a query to get a response. Both raise privacy concerns about how user data can be used. However, they differ in their training models, data sources, user experiences and how they store data.
Training models
ChatGPT is built on OpenAI’s GPT-3.5 or GPT-4. Gemini has three sizes: Gemini Pro for a wide range of tasks, Gemini Ultra for highly complex tasks, and Gemini Nano for mobile devices. Gemini Pro 1.5, which powers the subscription Gemini Advanced version, is faster and more advanced than the model used for the free Gemini service.
Data sources
The main difference between ChatGPT and Gemini is the data sources used to train their LLMs. GPT-4o uses predefined data that goes up to October 2023. Gemini draws on data pulled from the internet in real time. It is tuned to select data chosen from sources that fit specific topics such as coding or the latest scientific research.
User experience
ChatGPT users can log onto the free ChatGPT with any email account. ChatGPT also includes an API that developers can use to integrate OpenAI LLMs into third-party software. It lacks a Save button, but users can copy and paste answers from ChatGPT into another application. It does have an Archive button that can list previous responses in ChatGPT’s left-hand pane for quick retrieval.
Because ChatGPT is text-based, it can’t include images, videos, charts or links in its answers. It also lacks the ability to search the internet.
Because of OpenAI’s close partnership with Microsoft, ChatGPT can be used through Windows apps such as Word, Excel, PowerPoint and Outlook. Also, Microsoft’s Copilot AI assistants use the GPT-4 language model.
Gemini Pro’s interface gives users a chance to like or dislike a response, opt to modify the size or tone of the response, share or fact-check the response, or export it to Google Docs or Gmail. Gemini also has a “review other drafts” option that shows alternate versions of its answer. Gemini also lets users upload images, but its ability to create images is on hold until Google improves that feature.
Data storage and privacy
Both ChatGPT and Google Gemini store user data.
ChatGPT stores all prompts and queries entered. Users can review previous conversations through its archive feature. Although users can delete responses and conversations, the chatbot might continue to use these responses in its LLM for training. This raises privacy concerns when users enter personal data or proprietary information. OpenAI also discloses that ChatGPT gathers geolocation data, network activity, contact details such as email addresses and phone numbers, and device information.
According to OpenAI’s privacy policy, it collects any personal information a user provides. This includes account information such as name, contact information, payment card information and transaction history. OpenAI also might disclose geolocation data to third parties such as vendors and service providers, and to law enforcement agencies if required to do so by law.
OpenAI said the user retains ownership rights of input data and owns the output, but it “may use Content to provide, maintain, develop, and improve our Services, comply with applicable law, enforce our terms and policies, and keep our Services safe.”
Gemini stores conversations in a user’s Google account for 18 months, but users can change the retention period to three months or 36 months in their activity settings. Gemini conversations can also appear in searches, raising privacy concerns.
Google discloses that it collects conversations, location, feedback and usage information. The Google Privacy Policy claims Google uses collected data to develop, provide, maintain and improve services, and to provide personal services such as content and ads. Customers can delete information from their account using My Google Activity, or by deleting Google products or their Google accounts.
Google said it will share information to third parties with user consent and law enforcement when required.
Which chatbot is better?
There is a bit of a GenAI arms race going on now, with OpenAI and Google making updates to their models. Google has been especially aggressive, perhaps because ChatGPT came out first and Gemini must play catch-up. With each new version of the LLMs, Google and OpenAI make significant gains over their previous versions.
Generally, ChatGPT is considered the best option for text-based tasks while Gemini is the best choice for multimedia content. However, there are other considerations, as noted in earlier sections of this article. Users can try the free versions to determine which works better for them.
There have been several in-depth reviews about the chatbots worth noting:
Researchers from Carnegie Mellon University and BerriAI benchmarked Gemini Pro against GPT-3 and GPT-4 on 10 diverse language tasks with the goal of providing an impartial in-depth analysis. They found Gemini’s strengths included performance on long, complex reasoning chains and translating into non-English languages. On the downside, it struggled with mathematical reasoning — especially with large numbers — showed bias on multiple choice questions and aggressive content filtering blocked many responses. In summary, the researchers concluded Gemini Pro did not match GPT-3 and GPT-4, but “exhibits strengths in handling complexity and reasoning depth.”
Ethan Mollick, an associate professor who studies AI at the Wharton School of the University of Pennsylvania, performed what he called “tasting notes” of Gemini Advanced vs. GPT-4. Mollick concluded that Gemini Advanced is the first advanced AI model that can compete with GPT-4. He said each has its strengths and weaknesses — for example, GPT-4 uses code in a more sophisticated way and is better at hard verbal tasks while Gemini is better at explanations and search. But both “are weird and inconsistent and hallucinate more than you would like.”
Bernard Marr, a futurist and author of Generative AI in Practice, pointed out in a Forbes article that ChatGPT is designed to be more conversational while Gemini processes information and automates tasks more efficiently. Marr’s conclusion after using ChatGPT and Gemini is that ChatGPT-4 is the more powerful chat interface but “Gemini is closing the gap …”.
Neither ChatGPT nor Gemini are perfect, and their developers admit that. Both generate hallucinations and even warn users of that in their responses.
Both of the chatbots include a disclaimer on the bottom of their prompt screens. Gemini’s reads: “Gemini may display inaccurate info, including about people, so double-check its responses.” ChatGPT advises: “ChatGPT can make mistakes. Consider checking important information.”
The Gemini FAQ on Google’s website offers this valuable advice that can apply to all AI tools:
Gemini can’t replace important people in your life, like family, friends, teachers or doctors.
Gemini can’t do your work for you.
Gemini can’t make important life decisions for you.
Generative AI alternatives
GenAI is a fast-moving technology. Besides the updates to ChatGPT and Google Gemini, other companies are working on AI projects. These include AI21 Labs’ Wordtune, Anthropic’s Claude, Glean, Jasper, Open Assistant and Writesonic’s Chatsonic. China’s Baidu search engine uses AI with an application called Ernie Bot. Many productivity applications and SaaS products also incorporate GenAI assistants.
Comparison of ChatGPT vs. Gemini responses
We asked ChatGPT 3.5 and Google Gemini Pro the same requests and prompts to see how their responses would compare. The results are as follows:
Idea generation
Prompt: What are the five hottest IT trends an IT professional should know about?
ChatGPT’s idea generation response to the five hottest IT trends.
Gemini’s idea generation response to the five hottest IT trends.
Thoughts: ChatGPT’s answers were more general while Gemini drilled down into specific areas — for example, generative AI vs. AI/ML and cybersecurity mesh vs. cyber security. ChatGPT’s inability to reference data past January 2022 limits its effectiveness when looking for trending information. Gemini snuck in a few extras under “Bonus trends.”
Creating content
Prompt: Write a two-paragraph summary explaining cyber-resiliency challenges.
ChatGPT’s content generation response to explain cyber-resiliency challenges.
Gemini’s content generation response to explain cyber-resiliency challenges.
Thoughts: Both did a good job of explaining and summarizing a complex issue in two paragraphs, but Gemini included more specifics about the challenges and what can be done about them.
Planning
Prompt: What are the best cloud computing conferences to attend?
ChatGPT’s planning response for the best cloud computing conferences to attend.
Gemini’s planning response for the best cloud computing conferences to attend.
Thoughts: ChatGPT listed more conferences, but its list was a bit dated as several of its conferences have been renamed. Gemini offered greater detail and broke its list into specific areas of expertise.
Developer assistance
Prompt: List 10 frequently used SQL queries for querying a PostgreSQL database.
ChatGPT’s response for developer assistance on frequently used SQL queries for querying a PostgreSQL database.
Gemini’s response for developer assistance on frequently used SQL queries for querying a PostgreSQL database.
Thoughts: The lists were similar, although they used different terms in some cases. A nice feature was the code embedded in the responses. We shortened Gemini’s response to fit on one page, but its longer version included embedded code.
Dave Raffo is an independent IT analyst and journalist. He previously worked as a senior analyst at The Futurum Group and Evaluator Group, covering integrated systems, software-defined storage, container storage, public cloud storage and as-a-service offerings. He previously worked at TechTarget from 2007 to 2021 as executive news director and editorial director for its storage coverage, and he was a technology journalist for 30 years.
OpenAI introduced ChatGPT in November 2022, sparking a tremendous amount of interest in artificial intelligence. ChatGPT gained so much…