July 16, 2025
Agentic AI Hallucinations: Outputs that Sound Right but are Wrong
by
Justin Wayman

What’s the current AI landscape?

The use of both gen AI and agentic AI continues to build momentum. According to a recent McKinsey survey, 78% of respondents say their organizations use AI in at least one business function, up from 75% in early 2024 and 55% a year earlier.

Startups and organizations large and small have been moving quickly since early last year, and the technology continues to evolve, with agentic AI widely viewed as the next level of AI innovation.

This momentum comes on the heels of the digital transformation trend of the early 2000s, which resulted in widespread adoption of cloud services, data analytics, and digital twins in newly data-driven businesses. So it’s somewhat surprising to see organizations adopting AI hit some of the same bumps in the road that marked those digital transformation projects.

Our experience working with clients, along with news from the front lines, reveals four best practices that can go a long way toward mitigating problems at the introductory stage:

  • Craft a detailed roadmap.
  • Set clear targets and get organization-wide buy-in.
  • Start with smaller, well-defined projects that have a clear scope, a set timeline, and measurable outcomes.
  • Build the capability early in order to scale smoothly.

As much as there’s been momentum, we’ve also seen more cautious enterprise investment. AI-based projects are now being evaluated more deliberately. Some businesses worry they have underestimated the difficulty and expense of integrating AI into existing workflows and legacy infrastructure. Others fear reputational risks related to bias or misinformation.

This is particularly true of agentic AI. In a recent article for Wired, Paresh Dave defined agents as AI programs that can act mostly independently, allowing companies to automate tasks such as answering customer questions or paying invoices. He noted that agents can perceive, reason, and act autonomously, but instances of misinformation or “creative” responses still occur.

Meanwhile, models’ reasoning abilities continue to improve, allowing them to autonomously take actions and complete complex tasks across workflows. This is a profound step forward. In 2023, for example, an AI bot could support call center representatives by synthesizing and summarizing large volumes of data, including voice messages, text, and technical specs, to suggest responses to customer queries. Today, an AI agent can converse with a customer and then carry out the follow-up actions itself, such as processing a payment, checking for fraud, and completing a shipping action.
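To make that concrete, here’s a minimal sketch of what such an agent loop might look like. Everything in it is illustrative: the Order record, the tool functions, and the plan_next_step() policy stand in for what a production agent would delegate to a reasoning model and real back-end services.

```python
# A minimal, hypothetical sketch of the customer-service agent loop
# described above. The tools and the plan_next_step() policy are
# illustrative, not a real vendor API.
from dataclasses import dataclass

@dataclass
class Order:
    order_id: str
    amount: float
    paid: bool = False
    shipped: bool = False

def check_fraud(order: Order) -> bool:
    # Placeholder check; a real agent would call a fraud-scoring service.
    return order.amount < 10_000

def process_payment(order: Order) -> None:
    order.paid = True

def ship_order(order: Order) -> None:
    order.shipped = True

def plan_next_step(order: Order) -> str | None:
    # Stand-in for the model's planning step: pick the next action from
    # the current state, or stop when the workflow is complete.
    if not check_fraud(order):
        return "escalate"
    if not order.paid:
        return "pay"
    if not order.shipped:
        return "ship"
    return None

def run_agent(order: Order) -> None:
    while (step := plan_next_step(order)) is not None:
        if step == "escalate":
            print(f"Order {order.order_id}: flagged for human review")
            return
        {"pay": process_payment, "ship": ship_order}[step](order)
        print(f"Order {order.order_id}: completed step '{step}'")

run_agent(Order(order_id="A-1001", amount=129.99))
```

The point of the loop structure is that the agent plans, acts, observes the new state, and plans again, rather than executing a fixed script.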

Software companies are embedding agentic AI capabilities into their core products. For example, Salesforce’s Agentforce is a new layer on its existing platform that enables users to easily build and deploy autonomous AI agents to handle complex tasks across workflows, such as simulating product launches and orchestrating marketing campaigns. Marc Benioff, Salesforce co-founder, chair, and CEO, describes this as providing a “digital workforce where humans and automated agents work together to achieve customer outcomes.” 

What about hallucinations? 

One of the more serious problems with agents and gen AI that has gained recent attention is the tendency toward hallucinations: false, misleading, or entirely fabricated outputs that appear plausible or authoritative. In Kubrick’s 1968 film 2001: A Space Odyssey, the onboard computer HAL engaged in a form of cognitive hallucination, with disastrous consequences.

The implications go beyond embarrassment and pose serious business risks:

  • Financial Risk and Liability: Businesses deploying AI that hallucinates can face lawsuits, compliance violations, and regulatory penalties.
  • Reputational Damage: A single high-profile hallucination such as a fake legal case or false product claim can severely erode customer trust and brand credibility.
  • Operational Disruption: Hallucinations can lead to costly errors in day-to-day operations. Business leaders relying on AI-generated summaries or insights may act on false data.

How do we fix this?

Agentic hallucinations happen when language models generate outputs that sound right but are wrong: authoritative in tone but thoroughly inaccurate. If the original training data contains gaps or contradictions, the AI agent may fill in the blanks creatively. (This tendency toward creative improvisation is close enough to human behavior to further ignite AI’s critics.)

More often than not, though, agentic hallucinations occur because the agent is operating in a domain without the correct datasets to support it, making it more likely to encounter queries or tasks for which it was never trained or fine-tuned.

TechCrunch and others have suggested that hallucinations are driving businesses toward more specialized or vertical AI models focused on narrower domains as a way to improve reliability. Narrowing an agent’s vertical domain and business use case(s) is one of the most effective ways to reduce hallucinations. Another key safeguard is making sure the agent is working with the correct datasets for that vertical or use case.
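One common pattern for keeping an agent inside its vertical is retrieval-augmented generation: ground every answer in documents retrieved from a curated corpus, and decline to answer when nothing relevant is found. The sketch below is a minimal illustration under those assumptions; the toy corpus, the word-overlap scoring, and the threshold are all placeholders for real embeddings and vector search.

```python
# Minimal sketch of retrieval-grounded answering. The corpus, the scoring
# function, and the threshold are placeholders, not a specific vendor API.
# The key idea: if no document clears the relevance threshold, decline
# instead of letting the model fill the gap "creatively".

CORPUS = {
    "returns": "Products may be returned within 30 days with a receipt.",
    "shipping": "Standard shipping takes 3-5 business days.",
}

def score(query: str, doc: str) -> float:
    # Toy relevance score: word overlap. Real systems use embeddings.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def retrieve(query: str, threshold: float = 0.2) -> str | None:
    best_doc = max(CORPUS.values(), key=lambda doc: score(query, doc))
    return best_doc if score(query, best_doc) >= threshold else None

def answer(query: str) -> str:
    context = retrieve(query)
    if context is None:
        # Out-of-domain query: refuse rather than hallucinate.
        return "I don't have verified information on that."
    return f"Based on our records: {context}"

print(answer("How long does shipping take?"))   # grounded answer
print(answer("What is your CEO's salary?"))     # out of domain: refusal
```

The design choice worth noting is the refusal branch: a narrowly scoped agent that says “I don’t know” is doing exactly what the vertical-model argument above recommends.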

According to Tim Wolters, Socialgist CTO, “Because hallucinations often stem from generic or bot-influenced data, we focus on sourcing clean, nuanced signals from long-tail sources. This enables grounded agentic reasoning that's relevant, trustworthy, and domain-specific.”

Our user interface lets customers control the specificity and scale of data ingestion based on their own use cases. The result is efficient, precise, curated, and structured data that can inform use cases such as product development and design, customer support for specific products, predictive analytics for ecommerce product placement, long-form content preferences, and more. “Our UI enables control and specificity, so your analysts will spend more time analyzing, and a lot less time on data cleanup,” says Socialgist CRO Justin Wayman.
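As a purely hypothetical illustration of that kind of control, an ingestion configuration might look something like the sketch below; the field names and values are ours for this example, not Socialgist’s actual schema.

```python
# Hypothetical ingestion configuration illustrating the kind of
# specificity and scale controls described above. Field names are
# illustrative only, not Socialgist's actual schema.
ingestion_config = {
    "use_case": "customer_support",           # scopes the vertical domain
    "sources": ["product_forums", "app_store_reviews"],
    "languages": ["en", "ja"],
    "filters": {
        "exclude_bot_traffic": True,          # drop low-fidelity signals
        "min_post_length": 40,                # skip trivial posts
    },
    "volume": {"max_posts_per_day": 50_000},  # scale control
}
```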

These use cases, plus thousands more, rely on precision: precision enabled by the vast amount of data Socialgist has accrued over more than 20 years. And as bot traffic proliferates across the internet, separating the wheat from the chaff becomes paramount, making our long-tail aggregation of small, niche, or infrequent events and products all the more critical.

Giant general-purpose LLMs like Meta’s Llama are prone to hallucinations, in part because of the proliferation of bot-generated content in their training data, especially from platforms like Facebook. Llama also ships with open weights, allowing anyone to fine-tune and deploy it, which, without careful data curation, can be a recipe for disaster.

Socialgist CTO Tim Wolters says, “We’ve engineered our platform for scale from day one. In contrast to LLMs that struggle with noisy bot-generated data, we ensure high-fidelity, human-verified sources—down to niche communities and industry-specific forums.”
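To illustrate what separating signal from bot noise can mean in practice, here’s a simple heuristic filter sketch. The signals and thresholds are illustrative assumptions, not Socialgist’s pipeline; production systems combine many more features, often with learned models.

```python
# A simple, hypothetical heuristic for filtering likely bot posts before
# the data ever reaches an agent. Signals and thresholds are illustrative.
from dataclasses import dataclass

@dataclass
class Post:
    author: str
    text: str
    posts_last_hour: int
    duplicate_count: int  # times this exact text was seen elsewhere

def looks_like_bot(post: Post) -> bool:
    if post.posts_last_hour > 30:        # inhuman posting rate
        return True
    if post.duplicate_count > 5:         # copy-pasted spam
        return True
    if len(set(post.text.split())) < 3:  # near-empty or repeated words
        return True
    return False

posts = [
    Post("alice", "The new firmware fixed my pairing issue.", 2, 0),
    Post("promo_bot", "BUY NOW BUY NOW BUY NOW", 120, 48),
]
clean = [p for p in posts if not looks_like_bot(p)]
print([p.author for p in clean])  # ['alice']
```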

Moving forward

To that end, we’ve just announced our partnership with Snowflake, a cloud-native data platform that powers analytics, AI, and data collaboration for enterprises. Through our work with Snowflake and early analytics partners, we’re entering the next phase: allowing customers to select and tune datasets tailored to their specific use cases, from campaign analysis to customer support automation. AI is only as smart as the data you give it; we make sure that data is real, relevant, and ready.
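For teams consuming such data inside Snowflake, access reduces to a SQL query against a shared dataset. The sketch below uses the standard snowflake-connector-python client; the credentials, database, table, and column names are hypothetical placeholders.

```python
# Hypothetical example of querying a shared social dataset in Snowflake
# via the standard snowflake-connector-python client. The connection
# parameters, database, table, and column names are illustrative only.
import snowflake.connector

conn = snowflake.connector.connect(
    account="your_account",   # placeholder credentials
    user="your_user",
    password="your_password",
    warehouse="ANALYTICS_WH",
    database="SOCIAL_DATA",   # hypothetical shared database
    schema="PUBLIC",
)
try:
    cur = conn.cursor()
    cur.execute(
        """
        SELECT source, posted_at, text
        FROM app_store_reviews  -- hypothetical table
        WHERE product = %s AND posted_at >= DATEADD(day, -7, CURRENT_DATE)
        ORDER BY posted_at DESC
        LIMIT 100
        """,
        ("acme-widget",),
    )
    for source, posted_at, text in cur.fetchall():
        print(source, posted_at, text[:80])
finally:
    conn.close()
```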

“As enterprises lean into AI and real-time decisioning, they need more than structured logs or survey data. They need to understand how real people are thinking and talking,” said Koki Uchiyama, CEO of Socialgist. “Through Snowflake, we’re unlocking seamless access to the social web, from App Store reviews to global influencer conversations, so teams can analyze, learn, and act in minutes in a way that was impossible prior to the AI data cloud.”

Sources: McKinsey, The State of AI (2025); Gartner, Hype Cycle (2025); Harvard Business Review on agentic AI; Wired; Hootsuite; MIT Technology Review; TechCrunch