
Progress Update: On-Prem AI Agent Integration with Ataccama ONE

  • July 11, 2025

Happy to share a quick update with the community on the progress we’ve made with our on-premise AI agent integration into Ataccama ONE.

 

We’ve now developed and tested four distinct use cases that support users across various data governance tasks. All of these are accessible through a lightweight chat-based interface that runs alongside the Ataccama UI, without modifying the core product.

Here’s a brief overview of what’s working today:

Data Quality Insights

The agent detects data quality issues and provides root cause analysis with natural-language explanations. When needed, it also offers recommendations for resolving common problems, making DQ monitoring faster and more intuitive.

AI-Assisted Data Catalog Navigation

Users can search and explore the Ataccama catalog using natural language. The agent retrieves relevant metadata and provides contextual recommendations, helping non-technical users find and understand data without needing to know how the catalog is structured.

Compliance & Sensitive Data Discovery

The agent helps compliance teams identify and classify sensitive data across systems. It offers real-time audit visibility and generates draft reports to support regulatory readiness, reducing manual review and documentation.

Metadata Enrichment

By analyzing table and column structures, the agent auto-generates business-friendly descriptions for metadata elements. It doesn’t stop at suggestions — it can also write these enriched descriptions directly into the Ataccama metadata repository, updating the relevant description fields of tables and attributes. This significantly reduces manual documentation effort and improves metadata consistency across the platform.
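For anyone curious what that write-back can look like, here is a minimal sketch. The endpoint URL, mutation name, and field names are hypothetical placeholders for illustration, not the actual Ataccama ONE GraphQL schema.

```python
import requests

# Hypothetical write-back sketch. The endpoint, mutation name, and field
# names below are illustrative placeholders; the real contract comes from
# the Ataccama ONE GraphQL schema.
ONE_GRAPHQL_URL = "https://ataccama-one.internal/graphql"  # assumed internal endpoint

UPDATE_DESCRIPTION = """
mutation UpdateDescription($id: ID!, $description: String!) {
  updateCatalogItem(id: $id, patch: { description: $description }) {
    id
    description
  }
}
"""

def write_description(item_id: str, description: str, token: str) -> dict:
    """Persist an LLM-generated description onto a table or attribute."""
    resp = requests.post(
        ONE_GRAPHQL_URL,
        json={"query": UPDATE_DESCRIPTION,
              "variables": {"id": item_id, "description": description}},
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```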

 

All use cases are powered by our own on-prem LLM and fully isolated from external cloud services. We’ve designed the logic to integrate with Ataccama via available interfaces, making the experience seamless while keeping the platform intact.

Looking forward to hearing thoughts from the community and continuing to share our journey as it evolves.

2 replies

  • Data Pioneer
  • December 17, 2025

Can you share some technical details on how you achieved this?


  • Author
  • Data Voyager
  • December 18, 2025

Hi,

Our main goal from the beginning was to add AI capabilities around Ataccama ONE without modifying the core product itself. We wanted the solution to be safe, upgrade-friendly, and fully on-prem.

At a high level, we implemented everything as an external companion layer rather than a native customization.

First, we built a separate, lightweight chat-based UI that runs alongside the Ataccama ONE interface. From a user perspective, it feels like an AI assistant for Ataccama, but technically it’s an independent UI that understands user context and governance use cases. This gave us flexibility while keeping Ataccama untouched.

Behind that UI, we introduced an MCP server, which acts as the orchestration layer. The MCP server hosts our on-prem LLM and exposes a controlled set of AI tools. It’s responsible for deciding what the model can access, how it reasons, and how requests are executed. This keeps the solution predictable, auditable, and enterprise-ready.
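To make that more concrete, a heavily simplified skeleton of such a server could look like the sketch below. It uses the reference MCP Python SDK; the tool names and stub bodies are our illustration here, not the production code.

```python
from mcp.server.fastmcp import FastMCP

# Minimal MCP server sketch: tool names and bodies are illustrative
# placeholders, not the actual implementation.
mcp = FastMCP("ataccama-governance-agent")

@mcp.tool()
def analyze_data_quality(catalog_item_id: str) -> str:
    """Summarize data quality results for one catalog item."""
    # The real version would call the Ataccama GraphQL API and post-process results.
    return f"DQ summary for {catalog_item_id} (stub)"

@mcp.tool()
def search_catalog(query: str, limit: int = 10) -> list[str]:
    """Natural-language search over catalog metadata."""
    # The real version would translate the query into a catalog search call.
    return []

if __name__ == "__main__":
    mcp.run()  # stdio transport by default; the chat UI connects as an MCP client
```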

Inside the MCP server, we implemented multiple purpose-built tools, each aligned with a specific data governance scenario such as data quality analysis, catalog exploration, sensitive data discovery, and metadata enrichment. These tools interact with Ataccama using available internal APIs (GraphQL-based in our case), in the same way other system components do. The LLM itself never accesses Ataccama directly; it can only invoke these predefined tools with strict input/output contracts and permission checks.
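A rough sketch of one such tool contract, with hypothetical names and a pydantic-style schema standing in for our actual validation layer:

```python
from pydantic import BaseModel, Field

class DQInsightRequest(BaseModel):
    """Strict input contract: the model can only supply validated fields."""
    catalog_item_id: str = Field(pattern=r"^[0-9a-f-]{36}$")  # UUID-shaped ids only

class DQInsightResponse(BaseModel):
    """Strict output contract returned to the model."""
    issue_summary: str
    suggested_fixes: list[str]

def run_dq_insight(req: DQInsightRequest, user_roles: set[str]) -> DQInsightResponse:
    # The permission check lives inside the tool, never in the model's hands.
    if "dq-analyst" not in user_roles:
        raise PermissionError("caller lacks the dq-analyst role")
    # Here the real tool would query Ataccama's GraphQL API (see the earlier
    # sketch) and distill the results into the response contract.
    return DQInsightResponse(issue_summary="no issues found (stub)", suggested_fixes=[])
```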

Another important aspect is that we work in a scenario-driven way, not just free-form chat. Each use case has its own logic, context, and allowed actions, which is why the assistant feels governance-aware rather than like a generic chatbot.
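Conceptually, that scenario registry boils down to something like this (names are illustrative):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scenario:
    """A governance scenario: its own context and allow-listed tools."""
    name: str
    system_context: str
    allowed_tools: frozenset[str]

SCENARIOS = {
    "dq_insights": Scenario(
        name="dq_insights",
        system_context="You analyze data quality results and explain root causes.",
        allowed_tools=frozenset({"analyze_data_quality"}),
    ),
    "metadata_enrichment": Scenario(
        name="metadata_enrichment",
        system_context="You draft business-friendly descriptions for metadata.",
        allowed_tools=frozenset({"search_catalog", "write_description"}),
    ),
}

def tools_for(scenario_key: str) -> frozenset[str]:
    """Only a scenario's allow-listed tools are registered for the model."""
    return SCENARIOS[scenario_key].allowed_tools
```

The point is that the orchestrator only exposes a scenario's allow-listed tools to the model for a given conversation, which is what makes the assistant feel governance-aware rather than like a generic chatbot.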

Everything runs fully on-prem and fully isolated — the LLM, MCP server, UI, and integration layer. No data leaves the environment, and no external cloud services are involved.