Databricks is stepping up its competition with Snowflake with a new SQL-based AI function that automatically parses and structures documents. Part of the company’s Agent Bricks suite, the function is designed to simplify handling of PDFs, images and Office files.
The key advance is that organizations can extract text, tables and figures from documents, then search, analyze and integrate that content directly in their data platform, without relying on traditional OCR pipelines.
An AI feature that turns documents into database-ready assets
The new capability, named ai_parse_document, is currently in public preview. It supports formats such as PDF, JPG, PNG, DOCX and PPTX. The function extracts content along with spatial metadata so the original document’s layout and structure are preserved after conversion.
This enables organizations to:
- treat documents as tables
- use SQL for analysis, search and indexing
- automate pipelines that continuously ingest new documents
- integrate content with vector search and Unity Catalog
For companies that previously built custom OCR tools and maintained homegrown code, this can mean significant savings in time and cost.
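As a rough illustration of the SQL workflow described above, the following Python sketch builds the kind of query a pipeline might run. It assumes that files are read as binary via `read_files` with `format => 'binaryFile'` and that the resulting `content` column is passed to `ai_parse_document`. The helper name `build_parse_query` and the volume path are hypothetical, and because the function is in public preview, the exact signature and output schema should be checked against the current Databricks documentation.

```python
def build_parse_query(source_dir: str) -> str:
    """Build a SQL statement that runs ai_parse_document over every file in source_dir.

    Assumption: files are read as binary with read_files(..., format => 'binaryFile'),
    which exposes `path` and `content` columns.
    """
    if "'" in source_dir:
        # Keep the sketch simple: refuse paths that would break the SQL string literal.
        raise ValueError("source_dir must not contain single quotes")
    return (
        "SELECT path, ai_parse_document(content) AS parsed\n"
        f"FROM read_files('{source_dir}', format => 'binaryFile')"
    )


if __name__ == "__main__":
    # Hypothetical Unity Catalog volume path, used only for illustration.
    print(build_parse_query("/Volumes/main/default/incoming_docs"))
```

In a Databricks notebook, the returned string could be passed to `spark.sql(...)` and the result written to a table, which is one way to treat documents as tables and query them with ordinary SQL.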
Why this matters for businesses
Unstructured data has long been a bottleneck in data platforms. PDF reports, images and presentations often contain valuable information but are hard to use in analytics and AI systems. Databricks aims to change that by making documents as easy to work with as database tables.
Three trends are driving this shift:
- AI-native data platforms are becoming the standard
- SQL interfaces simplify adoption across teams
- Built-in functions promise cost savings over complex, custom pipelines
Databricks also highlights that cost and performance are crucial factors for organizations managing millions of documents.
A direct response to Snowflake’s AI push
Snowflake recently introduced its own document interpretation features under the name Agentic Document Analytics. While both vendors offer similar tools, Databricks emphasizes cost advantages and a pure SQL workflow.
Analysts believe Databricks’ solution reduces the need for manual integrations and suits organizations that already rely heavily on SQL.
Impact on Nordic IT companies
For IT providers in the Nordics, this development has several implications:
1. More efficient document management
Technical specifications, white papers and PDF reports become automatically structured and searchable.
2. New analytics opportunities
Information previously locked in PDFs can be integrated into dashboards and BI tools.
3. Lower development costs
There is less need for separate OCR tools and bespoke code.
4. Stronger AI pipelines
RAG systems, chatbots, search engines and internal knowledge bases gain richer, more usable data.
5. Increased competition in the data platform race
The choice between Snowflake and Databricks becomes even more strategic and AI-centered.
Conclusion
Databricks’ SQL-based AI feature for document interpretation marks a step toward the next generation of data platforms. The boundary between structured and unstructured data is blurring, giving organizations new ways to leverage documents and files in their data and AI workflows.
The launch also intensifies competition with Snowflake and is likely to influence how Nordic companies choose their data platforms going forward.