
A Look Inside Uniphore’s AI Research and Innovation Strategy

There’s a reason Uniphore is at the forefront of artificial intelligence: our engineering team is constantly questioning, rethinking, and pushing what is possible with AI.

We recently sat down with Roberto Pieraccini, Vice President, Chief Scientist, and Head of AI Frontier, and Andreas Stolcke, Distinguished AI Scientist and Vice President of AI at Uniphore, to discuss their team’s critical role, guiding principles, and notable achievements. They also shed light on exciting current developments and addressed the hype surrounding the latest industry trend: neuro-symbolic AI.

Read on to learn how Uniphore’s AI leaders approach research and innovation to ultimately provide more value to customers.

The difference between AI engineering and AI research

The terms “AI engineering” and “AI research” are often used interchangeably within the AI industry. However, Roberto Pieraccini argues that this is more than just an oversimplification—it’s fundamentally wrong. “There is a huge difference between AI engineering and research,” he explains. “AI engineering applies existing methods to build practical, reliable, and scalable systems. We do a lot of that at Uniphore. This is very important because it creates value for our customers.”

While AI research—which Pieraccini categorizes as either “applied” or “basic” research—is fundamentally different from AI engineering, he stresses that it’s essential for advancing AI science and, ultimately, its engineered applications. “Applied AI research tries to solve limitations in current AI systems by experimenting with new ideas that are not yet fully proven,” he explains. “Basic AI research [involves] developing new theories, models, and algorithms to advance and disrupt the field.”

Large language models (LLMs) are perhaps the most recent example of this. Before the rise of the transformer architecture, basic research drove AI systems forward, first through statistical methods and later through neural networks and deep learning. That changed in 2017, when Google researchers published their landmark paper, “Attention Is All You Need,” introducing the blueprint for modern transformers like BERT and GPT. Since then, applied researchers and AI engineers have experimented with and developed increasingly sophisticated LLMs and real-world applications.

However, Pieraccini cautions against putting LLMs on too high a pedestal. While these models represent a milestone in AI, the path forward continues, and newer innovations, with milestones of their own, will inevitably eclipse even the mighty LLM.

“Are LLMs the solution to everything? No,” he says. “LLMs are the current best solution for the problems we have; however, they have a lot of limitations. There will be new models that come out in a few years that will likely disrupt LLMs.”

Where is AI providing the biggest value right now?

Small Language Models (SLMs) are among the most exciting areas of AI development right now—and among the most promising in terms of value generation. SLMs are streamlined, production-proven derivatives of LLMs. Those developed and deployed by Uniphore are currently delivering strong results at a fraction of the cost of full-scale models. They’re also being used to power agentic workflows, enabling automated problem-solving with minimal human intervention and highly attractive operating costs.

How do they generate value? As we explained in an earlier article, Uniphore’s unique approach to SLMs leverages retrieval augmented fine-tuning (RAFT) to enable faster responses, higher accuracy, and greater cost savings.

In addition to SLMs, AI scientists on Pieraccini’s team continue to explore what’s possible with LLMs, forging new applications and strengthening others throughout the enterprise.

These applications are where AI provides the greatest value to business users today, explains Andreas Stolcke. He gives six examples where Uniphore specifically has created value through its differentiated approach to AI (and LLMs):

  • AI agents
  • AI safety guardrails
  • Call summarization
  • Question answering from documents
  • Conversation facts
  • Speech recognition

AI agents

AI agents are redefining enterprise AI as we know it. Agentic AI represents a paradigm shift in which the role of AI moves from insight generation to autonomous action, or agency. “[Today], we can train LLMs to carry out actions in the real world to automate certain tasks,” says Stolcke. “They can basically act as a planning agent that devises a series of steps to achieve a larger goal and then executes them by calling out to a tool interface.”

While agentic AI is exciting in its own right, there’s one area that Stolcke finds particularly thrilling: agentic dialogue models. “We developed agentic systems that leverage the reasoning and context awareness of LLMs to enable autonomous human-machine natural-language dialogues, whether via text or voice,” he says. “We actually have a system that acquires the information needed and then calls on tools that perform actions on your behalf.”
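The plan-then-execute loop Stolcke describes can be sketched in a few lines. This is an illustrative toy, not Uniphore’s implementation: the hard-coded `plan` function stands in for an LLM planner, and the tool names (`lookup_order`, `send_email`) and their arguments are hypothetical.

```python
# Toy sketch of an agentic loop: a planner devises steps toward a goal,
# and the agent executes each step by calling out to a tool interface.

# Hypothetical tool registry: name -> callable.
TOOLS = {
    "lookup_order": lambda order_id: {"order_id": order_id, "status": "shipped"},
    "send_email": lambda to, body: f"sent to {to}",
}

def plan(goal: str) -> list[tuple[str, dict]]:
    """Stand-in planner: in a real system, an LLM would devise these steps."""
    return [
        ("lookup_order", {"order_id": "A123"}),
        ("send_email", {"to": "customer@example.com",
                        "body": f"Update on: {goal}"}),
    ]

def run_agent(goal: str) -> list:
    """Execute each planned step through the tool interface, collecting results."""
    return [TOOLS[name](**args) for name, args in plan(goal)]
```

The key design point is the separation of concerns: the planner only decides *which* tools to call and with what arguments, while the tool interface performs the real-world actions.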

AI safety guardrails

“LLMs are very powerful—too powerful, really, for specific business uses, so you need to prevent them from answering certain questions,” states Stolcke. The solution: AI safety guardrails. While other guardrail solutions are themselves based on LLM queries, Uniphore’s approach employs smaller models based on the BERT architecture to detect material that could harm the user or the company deploying LLM-based systems, thereby enforcing Responsible AI guidelines.

“These guardrail models have to be very lightweight—otherwise you couldn’t run them on every input without multiplying your total compute overhead,” he explains. “That’s why we like to use transformer models based on BERT that are fine-tuned to the guardrail function and are just as effective or even better at that than LLMs, but with much lower computational cost. That’s one of the ways we innovate for effective, scalable AI use in business settings.”
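The guardrail pattern itself—run a lightweight classifier over every input before it reaches the LLM—can be sketched as follows. The keyword check below is only a stand-in for a fine-tuned BERT classifier; the function names and blocked-topic list are illustrative, not Uniphore’s actual policy.

```python
# Minimal sketch of input gating: every request passes a cheap safety
# check before the expensive LLM call is made.

# Illustrative blocked topics; a deployed system would use a fine-tuned
# classifier (e.g. BERT-based), not keyword matching.
BLOCKED_TOPICS = {"medical advice", "legal advice", "competitor pricing"}

def guard(text: str) -> bool:
    """Return True if the input is safe to forward to the LLM."""
    lowered = text.lower()
    return not any(topic in lowered for topic in BLOCKED_TOPICS)

def answer(question: str, llm=lambda q: f"LLM answer to: {q}") -> str:
    """Gate every request through the guardrail before calling the model."""
    if not guard(question):
        return "I'm sorry, I can't help with that topic."
    return llm(question)
```

Because the guard runs on every input, its cost dominates only if it is heavy—which is exactly why a small fine-tuned classifier beats a second LLM call for this job.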

Call summarization

Call summaries play a critical role in helping companies interpret customer interactions and extract actionable insights. AI can deliver significant ROI by automating call summarization—a task that human agents typically perform during call wrap-up.

As we explored in an earlier article, Uniphore’s innovative approach to call summarization overcomes the limitations of traditional methods by using LLMs to create textual summaries of calls as they happen.

“Summarizing a conversation or a document is essentially a rewriting of those token streams into a more compact form,” explains Stolcke. “LLMs are good at token stream rewriting and can learn to do that from examples. So, summarization has seen huge improvements in the past decade, essentially because of LLMs.”

Question answering from documents

With LLMs, business users can “ask” documents questions, and the model will extract the relevant information to answer the user’s question using a process called retrieval-augmented generation (RAG). By combining LLMs with RAG (with built-in factuality checking to prevent hallucinations), Uniphore sharpens the accuracy—and usefulness—of these models considerably.

“We chunk and index all the documents informed by their semantic content, which allows us to retrieve them based on similarity to a topic or a set of keywords,” Stolcke explains. “Then we feed the LLM with the excerpts of documents we think are relevant—plus the question you’re trying to answer—and then you prompt the LLM to generate an answer that’s relevant to the question.”
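The chunk–retrieve–prompt pipeline Stolcke outlines can be sketched with a toy similarity measure. A production system would retrieve by semantic embeddings, not the word-overlap score below, and all names here are illustrative.

```python
# Toy sketch of the RAG pipeline: chunk documents, retrieve the chunks
# most similar to the question, and assemble them into an LLM prompt.

def chunk(text: str, size: int = 20) -> list[str]:
    """Split a document into fixed-size word chunks for indexing."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def similarity(a: str, b: str) -> float:
    """Toy similarity: Jaccard overlap of lowercased word sets.
    A real system would compare embedding vectors instead."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def retrieve(chunks: list[str], question: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    return sorted(chunks, key=lambda c: similarity(c, question), reverse=True)[:k]

def build_prompt(excerpts: list[str], question: str) -> str:
    """Assemble retrieved excerpts and the question into an LLM prompt."""
    context = "\n---\n".join(excerpts)
    return f"Answer using only these excerpts:\n{context}\n\nQuestion: {question}"
```

The LLM then sees only the retrieved excerpts plus the question, which grounds its answer in the documents and limits the room for hallucination.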

Conversation facts

Conversation facts are specific insights from interactions with humans that are relevant to an enterprise’s business or operations. For example, Conversation Insights Agent users can ask—in plain language—things like, “What was the concern a customer called in with? Was the issue resolved? Was the agent helpful?” and the agent will surface the relevant insights. No keywords, tags, or technical expertise required.

“It is a form of summarization that is more targeted,” says Stolcke. In this case, LLMs ingest the conversation transcript and generate the answer based on how the question was asked. “It’s a rewriting of information, based on contextual information, into a form that fulfills a certain need.”

Speech recognition

At Uniphore, speech recognition is in our DNA. After all, some of our earliest innovations were in the field of speech recognition. Traditionally performed with statistical models, speech recognition has advanced significantly with the advent of neural networks and—most recently—speechLLMs. A speechLLM is trained to convert audio-encoded information into textual form and follow instructions in a way similar to an LLM.

SpeechLLMs are among the areas Stolcke considers to be at the frontier of AI research: “With speechLLMs, we unify the processing of spoken language—the acoustic modality—with the text modality. We also collapse the usual pipeline that goes from speech to transcript to semantic interpretation. […] That’s technology that can overcome the propagation of errors introduced in multi-stage pipelined systems, while also reducing latency. This approach has been investigated in the lab for a few years, but we’re right at the forefront of making that work in operational AI systems.”

“Finally, there is a growing effort to advance the tuning of LLMs — or more precisely SLMs — so they better reflect the desired outputs and reduce undesired behaviors, without requiring large amounts of annotated training data,” adds Pieraccini. “This approach is known as Reinforcement Learning, or RL. RL theory was developed in the 1990s and models how humans — and animals in general — learn. We don’t learn simply by reading vast amounts of text (and animals certainly can’t); we learn through positive and negative reinforcement signals that tell us when things go well or not.”

“There are many forms of RL that have been applied to LLMs, most famously using human feedback as the reinforcement signal. But the trend today is to automate this process and remove humans from the loop. At Uniphore, we are exploring RL as a powerful way to significantly enhance the capabilities of SLMs on targeted tasks, and we are already starting to see promising results.”
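The reinforcement idea—shift probability mass toward responses that earn positive reward—can be illustrated with a toy multiplicative-weights loop. Real RL fine-tuning updates the parameters of a model against a learned or human reward signal; this sketch only captures the feedback dynamic, and every name in it is illustrative.

```python
# Toy illustration of learning from a reinforcement signal: responses
# that earn positive reward gain weight, so they are sampled more often.
import math
import random

# Candidate responses with equal initial preference weights.
weights = {"helpful answer": 1.0, "unhelpful answer": 1.0}

def reward(response: str) -> float:
    """Stand-in reward signal: positive for desired behavior, negative otherwise.
    In RLHF this would come from human feedback or a learned reward model."""
    return 1.0 if response == "helpful answer" else -1.0

def sample(rng: random.Random) -> str:
    """Sample a response in proportion to its current weight."""
    r = rng.uniform(0, sum(weights.values()))
    for response, w in weights.items():
        r -= w
        if r <= 0:
            return response
    return response  # numerical edge case

def train(steps: int = 200, lr: float = 0.1, seed: int = 0) -> None:
    """Multiplicatively reinforce responses according to the reward they earn."""
    rng = random.Random(seed)
    for _ in range(steps):
        choice = sample(rng)
        weights[choice] *= math.exp(lr * reward(choice))
```

After training, the desired response dominates—the same qualitative effect RL tuning aims for in an SLM, achieved there by gradient updates rather than a lookup table.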

What’s next?

These frontier AI research topics—speechLLMs and agentic dialogue models—offer a hint at where AI is headed next, particularly within the enterprise space. “These are all applications where LLMs or generative AI really simplify the overall complexity of the system by combining multiple steps into a single model,” Stolcke says. “As businesses increasingly turn to AI—and computationally intensive LLMs in particular—they’ll need innovative ways to control the growing complexity and cost.”

“Many now sense that transformer-based deep neural networks—and LLMs as we know them—as well as AI agents built on top of them, may be approaching a performance wall,” he adds. “This is driving a growing search for new approaches that can enhance LLM capabilities by grounding them in additional knowledge and assisting them in the execution of complex workflows.”

One promising direction, known as neuro-symbolic AI, integrates symbolic representations with neural computation. “While LLMs already operate over symbolic inputs and outputs, we can further constrain and guide their behavior using explicit knowledge structures, such as rules or knowledge graphs, as well as orchestration frameworks such as programming code,” explains Stolcke. “These hybrid approaches are attracting increasing interest from researchers and are beginning to show meaningful improvements.”

The Orby team, which recently joined Uniphore, is a leader in neuro-symbolic methods. Their extensive research and domain expertise are enabling Uniphore to develop practical applications that will solve real-world business challenges and drive customer value. That’s the ultimate goal, according to Stolcke and Pieraccini—and one their team never tires of pursuing.