These are the proud winners of the Enterprise RAG Challenge

The wait is over! The Enterprise RAG Challenge has brought together some of the brightest minds in the AI world to push the boundaries of Retrieval-Augmented Generation (RAG). After an exciting competition, we are now proud to present the top solutions and their developers.

The Winners of Both Categories at a Glance

The Enterprise RAG Challenge featured two different tracks, providing participants with distinct opportunities to showcase their skills.

The Regular Track: This track was open to all, regardless of the technology used. Participants could submit their best Retrieval-Augmented Generation (RAG) solutions, leveraging various models, embeddings, and retrieval methods. The goal was to encourage creative and efficient approaches that fully exploit the potential of RAG. 

The IBM watsonx Track: This track focused on developing outstanding RAG solutions built on IBM watsonx. It provided a unique comparison, showcasing which approaches could extract the most from IBM watsonx and deliver the most precise answers.

Regular Track - Winners

In the Regular Track, a tie at the top meant two champions share first place, followed by a single second-place winner!

1st Place: Emil Shagiev – Efficient semantic search with LLM-supported optimization

1st Place: Ilia Ris – Smart combination of embeddings & LLMs for precise answers

2nd Place: Team Hopeless – High precision through multi-stage processing, consistency checks, and LLM-supported re-ranking

IBM watsonx Track - Winners

The IBM watsonx Track has proven that security and quality do not require compromises. It is possible to achieve outstanding results with models that can run locally!

1st Place: Ilia Ris – Smart combination of embeddings & LLMs for precise answers

2nd Place: A. Rasskazov & V. Kalesnikau – Intelligent retrieval, adaptive matching & LLM-driven answer generation

3rd Place: Team Nightwalkers – Efficient retrieval & LLM-powered answer generation

The Official Winner Announcement Video

Watch the full live broadcast, where Rinat Abdullin (Head of Innovation and Machine Learning at TIMETOACT GROUP Austria) announces the winners and presents their groundbreaking approaches.

Insights into the Winners of the Regular Track

1st Place: Emil Shagiev

Efficient Semantic Search with LLM-Optimized Processing

Models: gpt-4o-mini-2024-07-18, gpt-4o-2024-08-06, o3-mini-2025-01-31

  • Query Expansion: Expands the search query to perform semantic search and maximize relevant results

  • Page-Based Retrieval: The system searches for relevant documents using fast and cost-effective LLMs.

  • Answer Generation: The best matches are extracted and formulated into a response using more powerful LLMs.

  • Final Answer Optimization: The answer is reviewed, refined, and only then presented to the user.
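The staged pipeline above can be sketched roughly as follows. All helper names are hypothetical, and simple word-overlap scoring stands in for the fast LLM relevance checks used in the actual submission:

```python
# Sketch of a query-expansion + page-based retrieval pipeline
# (hypothetical names, not Emil Shagiev's actual code).

def expand_query(query: str) -> list[str]:
    # In the real system an LLM proposes paraphrases; here we fake it
    # with simple variants to illustrate the flow.
    return [query, query.lower(), f"information about {query}"]

def score_page(page: str, queries: list[str]) -> int:
    # Stand-in for a fast, cost-effective LLM relevance check:
    # count query-term overlaps per page.
    terms = {w for q in queries for w in q.lower().split()}
    return sum(1 for w in page.lower().split() if w in terms)

def retrieve_pages(pages: list[str], query: str, k: int = 2) -> list[str]:
    # Expand the query, score every page, and keep the top-k matches;
    # a stronger LLM would then draft and refine the final answer.
    queries = expand_query(query)
    ranked = sorted(pages, key=lambda p: score_page(p, queries), reverse=True)
    return ranked[:k]

pages = [
    "Annual revenue of Acme Corp grew 12% in 2023.",
    "The weather report for Vienna.",
    "Acme Corp revenue and profit figures for fiscal 2023.",
]
top = retrieve_pages(pages, "Acme Corp revenue")
```

In the actual system, the extracted top pages are then handed to a more powerful model for answer generation and a final review pass.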

1st Place: Ilia Ris

Smart Combination of Embeddings & Reasoning for Precise Answers

Model: o3-mini-2025-01-31

  • PDF Analysis: Documents are processed using a heavily modified version of IBM's Docling library.

  • Dense Retrieval: The system searches for relevant information based on semantic similarity.

  • Router Pattern: First step in question answering flow picks the most suitable agent.

  • Parent Document Retrieval: Instead of analyzing only a single section, the full document context is considered.

  • LLM Reranking: Retrieved information is re-evaluated and reordered by the LLM.

  • Self-Consistency with Majority Vote: Multiple answer variations are generated, compared, and the most consistent one is selected.

  • Reasoning Patterns: Improve LLM accuracy within a single prompt by controlling its thinking process with Custom Chain-of-Thought and Structured Outputs.

  • Final Answer Generation: The optimized result is generated and delivered using o3-mini.
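The self-consistency step can be illustrated with a minimal sketch (assumed helper names, not Ilia Ris's actual code): the same question is answered several times, answers are lightly normalized, and the most frequent one wins.

```python
from collections import Counter

# Minimal sketch of self-consistency with majority vote.

def majority_vote(answers: list[str]) -> str:
    # Normalize lightly so trivially different spellings still agree.
    normalized = [a.strip().lower() for a in answers]
    winner, _ = Counter(normalized).most_common(1)[0]
    return winner

# Three sampled runs of the same question; two agree after normalization.
samples = ["42.5 million", "42.5 Million", "41 million"]
winner = majority_vote(samples)
```

The design intuition: sampling errors are rarely repeated the same way twice, so agreement across runs is a cheap signal of correctness.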

2nd Place: Team Hopeless

High Precision through Multi-Step Processing, Consistency Checking, and LLM-Supported Reranking

Model: gpt-4o-2024-08-06

  • Dynamic Structured Output: The system uses SEC EDGAR ontologies to systematically organize information. Structured Outputs help to drive Chain-of-Thought and ensure classification precision.

  • Query Expansion: The query is semantically expanded into the search space using CBOW similarity.

  • Majority Voting across Multiple Runs: Several answer variations are generated, compared, and the best one is selected.

  • Intelligent Chunking: The system processes text page by page instead of token-based, preserving better context coherence.
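The idea of letting Structured Outputs drive Chain-of-Thought can be sketched with a small schema (the fields below are hypothetical, not Team Hopeless's actual SEC EDGAR ontology): the model must fill a reasoning field before committing to a classification and final answer.

```python
from dataclasses import dataclass

# Sketch: a structured-output schema where reasoning comes first,
# forcing the model to "think" before it classifies and answers.

@dataclass
class FilingAnswer:
    reasoning: str  # generated first: step-by-step justification
    category: str   # e.g. an SEC EDGAR-style classification
    answer: str     # final value, produced only after the above

def parse_model_output(raw: dict) -> FilingAnswer:
    # In practice the LLM is constrained to emit JSON matching the
    # schema; here we just validate that the required keys exist.
    return FilingAnswer(**{k: raw[k] for k in ("reasoning", "category", "answer")})

out = parse_model_output({
    "reasoning": "The 10-K lists total revenue of $1.2B on page 45.",
    "category": "revenue",
    "answer": "$1.2B",
})
```

Because field order is fixed by the schema, the chain of thought is generated before the answer token by construction, which is what makes the classification more reliable.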

Insights into the Winners of the IBM watsonx Track

1st Place: Ilia Ris

Dense Retrieval, Intelligent Routing, LLM Reranking, Self-Consistency & Reasoning

Model: IBM watsonx Llama-3.3 70B

  • PDF Parsing: Uses a heavily modified Docling library from IBM to extract and structure text.

  • Dense Retrieval: Retrieves relevant information based on semantic similarity with IBM embedding models.

  • Router Pattern: First step in question answering flow picks the most suitable agent.

  • Parent Document Retrieval: Expands the context by retrieving larger document sections.

  • LLM Reranking: Reorders retrieved results with LLM to improve relevance.

  • Reasoning with Structured Outputs (SO):

    • Chain-of-Thought (CoT) – Enhances logical reasoning in responses.
    • Schema Repair – Fixes responses from models that lack constrained decoding support for Structured Outputs.
  • Self-Consistency with Majority Vote: Generates multiple responses and selects the most consistent one.

  • LLM: Uses Llama-3.3 70B (IBM watsonx) to drive this RAG implementation.
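The LLM-reranking step in this pipeline can be sketched as follows (hypothetical interface, not the submission's code): dense retrieval returns candidates cheaply, then a model re-scores each passage against the question and the order is rebuilt from those scores.

```python
# Sketch of LLM reranking: score_fn stands in for an LLM call that
# rates how relevant a passage is to the question.

def rerank(question: str, passages: list[str], score_fn) -> list[str]:
    # Score every candidate, then sort best-first.
    scored = [(score_fn(question, p), p) for p in passages]
    return [p for _, p in sorted(scored, key=lambda t: t[0], reverse=True)]

def toy_score(question: str, passage: str) -> int:
    # Toy stand-in: overlap of question words with the passage.
    q = set(question.lower().split())
    return len(q & set(passage.lower().split()))

ranked = rerank(
    "what was the net profit",
    ["The board met in June.", "Net profit was 3.4 million euros."],
    toy_score,
)
```

The split matters for cost: the embedding search narrows thousands of chunks to a handful, so the expensive LLM judgment is only paid for the shortlist.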

2nd Place: A. Rasskazov & V. Kalesnikau

Intelligent Retrieval, Adaptive Matching & LLM-Driven Answer Generation

Models: Combination of Llama-3, Granite Embeddings & GPT-4o-mini

  • Database Initialization: Generate the RAG model databases to store and retrieve relevant information.

  • Key Information Extraction: Identify essential details from the question, such as company, industry, metric, and currency.

  • Similarity Matching: Find the most similar question in the database based on the extracted key metrics.

  • Leverage Existing Knowledge: Use the answer from the similar question as a reference for the new query.

  • LLM-Powered Response Generation: Utilize the LLM model to refine or generate a new answer if necessary.

  • Final Answer Compilation: Gather all responses, structure them, and present the best result to the user.
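The similarity-matching step above can be sketched like this (hypothetical structures, not the team's code): key fields are extracted from the new question, and the stored question sharing the most fields supplies the reference answer.

```python
# Sketch of similarity matching over extracted key fields.

def extract_keys(question_fields: dict) -> tuple:
    # In the real system an LLM extracts company, metric, currency etc.;
    # here they are passed in already structured.
    return tuple(question_fields.get(k) for k in ("company", "metric", "currency"))

def most_similar(new_q: dict, database: list[dict]) -> dict:
    # Pick the stored entry that agrees with the new question on the
    # largest number of key fields.
    new_keys = extract_keys(new_q)
    def overlap(entry: dict) -> int:
        return sum(a == b for a, b in zip(new_keys, extract_keys(entry)))
    return max(database, key=overlap)

db = [
    {"company": "Acme", "metric": "revenue", "currency": "USD", "answer": "$10M"},
    {"company": "Globex", "metric": "profit", "currency": "EUR", "answer": "€2M"},
]
ref = most_similar({"company": "Acme", "metric": "revenue", "currency": "EUR"}, db)
```

The matched entry's answer then serves only as a reference: per the pipeline above, the LLM still refines or regenerates the final response if necessary.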

3rd Place: Team Nightwalkers

Efficient Retrieval & LLM-Powered Answer Generation

Models: DeepSeek-R1-Distill-Llama-70B and IBM Granite Embeddings

  • Vector Database Creation: Build a vector database using all-MiniLM-L6-v2 / IBM Granite Embeddings (107M, multilingual) to enable efficient semantic search.

  • Query Processing: When a question is asked, the system searches the database for the most relevant page and document.

  • Document Retrieval: The best-matching content is selected based on similarity scoring.

  • LLM Answer Generation: The retrieved information is sent to the IBM watsonx LLM (DeepSeek-R1-Distill-Llama-70B) to generate a structured and precise response.

  • Final Output: The AI-generated answer is refined and presented to the user.
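The vector-database step can be sketched in a self-contained way. The real system embeds pages with all-MiniLM-L6-v2 or IBM Granite embeddings; here a bag-of-words vector stands in so the retrieval logic runs without external models:

```python
import math
from collections import Counter

# Toy sketch of vector-database retrieval: embed pages once, then find
# the page whose vector is most similar to the query vector.

def embed(text: str) -> Counter:
    # Stand-in embedding: term-frequency bag of words.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(index: list[tuple[str, Counter]], query: str) -> str:
    # Return the page with the highest similarity score.
    qv = embed(query)
    page, _ = max(((p, cosine(qv, v)) for p, v in index), key=lambda t: t[1])
    return page

pages = ["annual report revenue figures", "board meeting minutes"]
index = [(p, embed(p)) for p in pages]
best = search(index, "revenue in the annual report")
```

In the actual pipeline the best-matching page is then forwarded to the watsonx LLM for answer generation rather than returned directly.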

Key Take-Aways from the Regular Track

  • LLM Reasoning patterns (based on Structured Outputs and Custom Chain of Thought) are used in almost all top submissions.

  • If you have a solid RAG architecture, you can get great results even with the smallest model. Ilia’s solution scored 109.3 (R: 81.1, G: 68.7) with llama-3.1 8b.

  • Majority Vote is an efficient way to improve system accuracy.

Key Take-Aways from the IBM watsonx Track

  • You don’t need to compromise safety and quality. It is possible to achieve great scores with models that you can run locally.

  • The IBM Docling library is a key differentiator for parsing business documents.

  • Context is king: accurate retrieval is essential for producing accurate answers.