So You are Building an AI Assistant?

So you are building an AI assistant for the business?

This is a popular topic in the companies these days. Everybody seems to be doing that.

While running AI Research in the last months, I have discovered that many companies in the USA and Europe are building some sort of AI assistants these days, mostly around enterprise workflow automation and knowledge bases.

There are common patterns in how such projects work most of the time. So let me tell you a story. This could help us to avoid possible known pitfalls, save some time and ship the product faster.
 

Building an AI assistant

AI assistant project at a company usually starts with an idea: “Let’s build a business assistant or copilot to solve a problem X”. Problems are usually the ones that require:

  • Answering questions based on the corporate knowledge bases

  • Understanding tables in business reports

  • Being able to run marketing research on the Internet

  • Working locally with sensitive data (e.g. banking, insurance or anything that goes into the data rooms)

Based on the internet research, engineering teams decide that current AI technology is mature enough. Surely, there are enough good success stories.

They pick a tool like:

  • LangChain - “Applications that can reason. Powered by LangChain… From startups to global enterprises, ambitious builders choose LangChain.”

  • LlamaIndex - “Turn your enterprise data into production-ready LLM applications”

  • QDrant - “powering thousands of innovative AI solutions at leading companies”

There is even an official NVIDIA project Chat with RTX that lets us upload our documents and start chatting with an assistant.

These are fun times that go by fast. Teams pick their tools and start working steadily, exchanging new terms and discoveries: vector databases, agents, text chunking and computing embeddings, function calling, tools and Chains of Thought.

First results of the assistant are really good - it is capable of providing knowledgeable and insightful answers in most of the cases. This proves that the prototype is good enough to go for a proper development and deployment.

A few months down the road, the system is deployed. It still works good in most of the cases. But that’s where people start realising that something is not quite right.

You see, the system sometimes also hallucinates. It used to hallucinate a bit during the testing, but after the deployment to the real users, things get a bit more worse.

You see, more specific the question is, higher is the chance that the answer be a nonsense. For some reason the specific questions were not asked before, not until the end users started playing with the system.

End users will be asking very specific questions about their daily jobs, questions they already know answer for. Some people would be even willing to go at great lengths to prove that AI doesn’t work at all. Fear of loosing jobs to AI has always been an important factor.

Hallucinations and mistakes in AI are never a good thing. What is the point of using a system that you can’t trust?

Fortunately, public resources provide many ways to improve the accuracy of a mis-behaving AI assistant.

  • Better prompting

  • Tweaking retrieval and fine-tuning embeddings

  • Query transformations (expansion)

  • Chunk reranking

  • Adding multiple agents and building routers

In the end architecture of an AI assistant could grow as complex as the one below. Yet, the it still will not be able to always answer simple questions. It will lack the accuracy and hallucinate from time to time.

 

IMAGE FROM: HTTPS://TOWARDSDATASCIENCE.COM/12-RAG-PAIN-POINTS-AND-PROPOSED-SOLUTIONS-43709939A28C

So what should the teams do at this point? So much effort already invested, and the accuracy is almost there. We probably just need to fine-tune a proper LLM and everything will be finally working.

At this point, I’m not willing to continue going further into this rabbit hole, even in telling a story. There ending is not happy there.

Been there, done that. Also have seen many companies follow this path.

Fortunately, this is just a story. So we can ask the question “What went wrong?”, then rewind the time a few months back and do things differently.

 

What went wrong and at which point?

Things went wrong at the very start. There are 3 assumptions that are commonly misjudged.

  • Public materials and articles tell the whole truth about how state-of-the-art AI systems actually work in business.

  • If we shred our documents into small chunks, then put them into a vector database, then AI will be able to magically make sense out of that.

  • We need to build an omnipotent AI assistant with complex architecture, in order to bring value to a business.

In my experience, all of these are wrong. The assumptions may be useful, if we want to mislead our competitition into going into a deep rabbit hole. But ultimately they are wrong.

We can significantly reduce the risk by testing and verifying these assumptions before committing to a bit project. Also by:

  • talking to the end users first;

  • asking questions and gathering data beforehand;

  • designing a system that is built to capture feedback and learn even after the first deployment.
     

Story with a happy ending

Here is one common and repeated path to a successfully deployed AI assistant:

  1. Start by asking the questions about the business. Identify process steps that could potentially be automated, based on the existing success stories. Prioritise cases that represent the “lowest hanging fruit”

  2. Talk to the process stakeholders, gain insights about how things really work. Look for the specific repeatable patterns. ChatGPT/LLMs are good at working with patterns.

  3. Establish a quality metric, before writing a single line of code. For example, come up with a list of 40 questions that the system should be able to answer correctly. Identify these correct answers beforehand. This will be our first evaluation dataset.

  4. Give a list of 20 questions with answers to the developers and ask them to build a prototype. Don’t ask for a fancy UI at this point, this will be just a waste of time. This can be a Jupyter notebook with Gradio interface that the team can show over a screen share.

  5. Ideally, developers will just cluster the questions together into categories. Then they will figure out a pipeline that prepares documents in such a way, that a ChatGPT prompt for this question category will be able to produce a correct answer.

  6. Once developers are satisfied with the quality, test the system prototype on the remaining 20 questions. This will be a wake-up call for many. Count the number of mistakes.

  7. Now give developers a little bit of time and let them adjust the system so that it can handle these 20 questions. Measure the time it took to adjust the system.

  8. Give developers 20 more questions they haven’t seen. Measure the time it takes them to fix them now, also count the number of mistakes after the iteration.

Repeat the last step a couple of times, still without investing into the complex UI and user experience (since these will just slow us down).

At this point we should have a trend: with each new iteration, the number of correctly answered questions increases, while the time it takes to make the system “learn” about that - stays the same or even decreases.

If numbers don’t follow the trend (a frequent case with RAG systems based on vector databases), then it is time to close the prototype.

If the trend persists - then we can continue it into the future - build a better user experience, deploy it to the end users, but tell the users this:
 

The system is designed to be learning over the time. So it is very important for us if you would rate answers of the system, when they are correct or incorrect. We will incorporate feedback back into the underlying architecture in a semi-automatic way. This way with each passing iteration the system will understand the specifics of your business much better.

 

Under the hood, your teams will continue following the same pattern they have done in the past:

  1. Take new rated questions and add them to the evaluation dataset.

  2. Figure out why doesn’t the AI Assistant handle the questions properly. Restructure prompts and data preparation until overall quality increases.

  3. Test new versions against the entire evaluation dataset. If things look good, deploy it to the end users to gather more data and edge cases.

  4. Repeat.

Sometime within this process, the teams might gather so much feedback, that they can experiment with fine-tuning models or even training ones. At this point, this will just be a controlled experiment - “Can we make the AI assistant better by training a custom model?”

If the numbers tell otherwise, we can get back to the original old-school process that actually works.
 

Summary

There is a lot of hype around AI and AI assistants these days. This hype is partially fuelled by the companies that sell shovels in this gold rush. All solution providers are partially incentivised to keep around problems to which they provide solutions.

This situation can waste time and energy.

You can save time and sometimes even ship products with AI/GPT faster, if you:

  • Treat everything with a grain of salt, including this blog post.

  • Learn about the mistakes and successes of the others before starting projects.

  • Start by gathering evaluation datasets from the end-users of the actual AI-systems.

  • Use customer feedback to measure quality of AI system from day one. Use it to drive the development.

  • Abort and reconsider the project, if the quality doesn’t improve over the time.

 

Image examples, so we get an idea how we should design the artwork for the blog post:

You can search for images on Google and copy them here, or add a self made sketch. We will then take it and create new shiny versions from it.

Contact

Christoph Hasenzagl
TIMETOACT GROUP Österreich GmbHContact
Rinat AbdullinRinat AbdullinBlog
Blog

Let's build an Enterprise AI Assistant

In the previous blog post we have talked about basic principles of building AI assistants. Let’s take them for a spin with a product case that we’ve worked on: using AI to support enterprise sales pipelines.

Rinat AbdullinRinat AbdullinBlog
Blog

Open-sourcing 4 solutions from the Enterprise RAG Challenge

Our RAG competition is a friendly challenge different AI Assistants competed in answering questions based on the annual reports of public companies.

Blog
Blog

SAM Wins First Prize at AIM Hackathon

The winning team of the AIM Hackathon, nexus. Group AI, developed SAM, an AI-powered ESG reporting platform designed to help companies streamline their sustainability compliance.

Blog
Blog

Third Place - AIM Hackathon 2024: The Venturers

ESG reports are often filled with vague statements, obscuring key facts investors need. This team created an AI prototype that analyzes these reports sentence-by-sentence, categorizing content to produce a "relevance map".

Blog
Blog

ChatGPT & Co: LLM Benchmarks for December

Find out which large language models outperformed in the December 2024 benchmarks. Stay informed on the latest AI developments and performance metrics.

Rinat AbdullinRinat AbdullinBlog
Blog

LLM Performance Series: Batching

Beginning with the September Trustbit LLM Benchmarks, we are now giving particular focus to a range of enterprise workloads. These encompass the kinds of tasks associated with Large Language Models that are frequently encountered in the context of large-scale business digitalization.

Aqeel AlazreeBlog
Blog

Part 1: Data Analysis with ChatGPT

In this new blog series we will give you an overview of how to analyze and visualize data, create code manually and how to make ChatGPT work effectively. Part 1 deals with the following: In the data-driven era, businesses and organizations are constantly seeking ways to extract meaningful insights from their data. One powerful tool that can facilitate this process is ChatGPT, a state-of-the-art natural language processing model developed by OpenAI. In Part 1 pf this blog, we'll explore the proper usage of data analysis with ChatGPT and how it can help you make the most of your data.

Blog
Blog

ChatGPT & Co: LLM Benchmarks for November

Find out which large language models outperformed in the November 2024 benchmarks. Stay informed on the latest AI developments and performance metrics.

Blog
Blog

ChatGPT & Co: LLM Benchmarks for September

Find out which large language models outperformed in the September 2024 benchmarks. Stay informed on the latest AI developments and performance metrics.

Blog
Blog

ChatGPT & Co: LLM Benchmarks for October

Find out which large language models outperformed in the October 2024 benchmarks. Stay informed on the latest AI developments and performance metrics.

Rinat AbdullinRinat AbdullinBlog
Blog

The Intersection of AI and Voice Manipulation

The advent of Artificial Intelligence (AI) in text-to-speech (TTS) technologies has revolutionized the way we interact with written content. Natural Readers, standing at the forefront of this innovation, offers a comprehensive suite of features designed to cater to a broad spectrum of needs, from personal leisure to educational support and commercial use. As we delve into the capabilities of Natural Readers, it's crucial to explore both the advantages it brings to the table and the ethical considerations surrounding voice manipulation in TTS technologies.

Felix KrauseBlog
Blog

AIM Hackathon 2024: Sustainability Meets LLMs

Focusing on impactful AI applications, participants addressed key issues like greenwashing detection, ESG report relevance mapping, and compliance with the European Green Deal.

Blog
Blog

Second Place - AIM Hackathon 2024: Trustpilot for ESG

The NightWalkers designed a scalable tool that assigns trustworthiness scores based on various types of greenwashing indicators, including unsupported claims and inaccurate data.

Matus ZilinskyBlog
Blog

Creating a Social Media Posts Generator Website with ChatGPT

Using the GPT-3-turbo and DALL-E models in Node.js to create a social post generator for a fictional product can be really helpful. The author uses ChatGPT to create an API that utilizes the openai library for Node.js., a Vue component with an input for the title and message of the post. This article provides step-by-step instructions for setting up the project and includes links to the code repository.

Rinat AbdullinRinat AbdullinBlog
Blog

Strategic Impact of Large Language Models

This blog discusses the rapid advancements in large language models, particularly highlighting the impact of OpenAI's GPT models.

Aqeel AlazreeBlog
Blog

Part 4: Save Time and Analyze the Database File

ChatGPT-4 enables you to analyze database contents with just two simple steps (copy and paste), facilitating well-informed decision-making.

Aqeel AlazreeBlog
Blog

Part 3: How to Analyze a Database File with GPT-3.5

In this blog, we'll explore the proper usage of data analysis with ChatGPT and how you can analyze and visualize data from a SQLite database to help you make the most of your data.

Workshop
Workshop

AI Workshops for Companies

Whether it's the basics of AI, prompt engineering, or potential scouting: our diverse AI workshop offerings provide the right content for every need.

Martin WarnungMartin WarnungBlog
Blog

Common Mistakes in the Development of AI Assistants

How fortunate that people make mistakes: because we can learn from them and improve. We have closely observed how companies around the world have implemented AI assistants in recent months and have, unfortunately, often seen them fail. We would like to share with you how these failures occurred and what can be learned from them for future projects: So that AI assistants can be implemented more successfully in the future!

Jörg EgretzbergerJörg EgretzbergerBlog
Blog

8 tips for developing AI assistants

AI assistants for businesses are hype, and many teams were already eagerly and enthusiastically working on their implementation. Unfortunately, however, we have seen that many teams we have observed in Europe and the US have failed at the task. Read about our 8 most valuable tips, so that you will succeed.

Rinat AbdullinRinat AbdullinBlog
Blog

5 Inconvenient Questions when hiring an AI company

This article discusses five questions you should ask when buying an AI. These questions are inconvenient for providers of AI products, but they are necessary to ensure that you are getting the best product for your needs. The article also discusses the importance of testing the AI system on your own data to see how it performs.

Felix KrauseBlog
Blog

License Plate Detection for Precise Car Distance Estimation

When it comes to advanced driver-assistance systems or self-driving cars, one needs to find a way of estimating the distance to other vehicles on the road.

Aqeel AlazreeBlog
Blog

Database Analysis Report

This report comprehensively analyzes the auto parts sales database. The primary focus is understanding sales trends, identifying high-performing products, Analyzing the most profitable products for the upcoming quarter, and evaluating inventory management efficiency.

Branche
Branche

Artificial Intelligence in Treasury Management

Optimize treasury processes with AI: automated reports, forecasts, and risk management.

TIMETOACT
Technologie
Headerbild zu IBM Watson Assistant
Technologie

IBM Watson Assistant

Watson Assistant identifies intention in requests that can be received via multiple channels. Watson Assistant is trained based on real-live requests and can understand the context and intent of the query based on the acting AI. Extensive search queries are routed to Watson Discovery and seamlessly embedded into the search result.

Sebastian BelczykBlog
Blog

Building and Publishing Design Systems | Part 2

Learn how to build and publish design systems effectively. Discover best practices for creating reusable components and enhancing UI consistency.

Sebastian BelczykBlog
Blog

Building A Shell Application for Micro Frontends | Part 4

We already have a design system, several micro frontends consuming this design system, and now we need a shell application that imports micro frontends and displays them.

Sebastian BelczykBlog
Blog

Building a micro frontend consuming a design system | Part 3

In this blopgpost, you will learn how to create a react application that consumes a design system.

Ian RussellIan RussellBlog
Blog

Introduction to Web Programming in F# with Giraffe – Part 2

In this series we are investigating web programming with Giraffe and the Giraffe View Engine plus a few other useful F# libraries.

Laura GaetanoBlog
Blog

5 lessons from running a (remote) design systems book club

Last year I gifted a design systems book I had been reading to a friend and she suggested starting a mini book club so that she’d have some accountability to finish reading the book. I took her up on the offer and so in late spring, our design systems book club was born. But how can you make the meetings fun and engaging even though you're physically separated? Here are a couple of things I learned from running my very first remote book club with my friend!

Bernhard SchauerBlog
Blog

Isolating legacy code with ArchUnit tests

Clear boundaries in code are important ... and hard. ArchUnit allows you to capture the structure your team agreed on in tests.

Ian RussellIan RussellBlog
Blog

Using Discriminated Union Labelled Fields

A few weeks ago, I re-discovered labelled fields in discriminated unions. Despite the fact that they look like tuples, they are not.

Ian RussellIan RussellBlog
Blog

Introduction to Partial Function Application in F#

Partial Function Application is one of the core functional programming concepts that everyone should understand as it is widely used in most F# codebases.In this post I will introduce you to the grace and power of partial application. We will start with tupled arguments that most devs will recognise and then move onto curried arguments that allow us to use partial application.

Sebastian BelczykBlog
Blog

Composite UI with Design System and Micro Frontends

Discover how to create scalable composite UIs using design systems and micro-frontends. Enhance consistency and agility in your development process.

Daniel PuchnerBlog
Blog

How we discover and organise domains in an existing product

Software companies and consultants like to flex their Domain Driven Design (DDD) muscles by throwing around terms like Domain, Subdomain and Bounded Context. But what lies behind these buzzwords, and how these apply to customers' diverse environments and needs, are often not as clear. As it turns out it takes a collaborative effort between stakeholders and development team(s) over a longer period of time on a regular basis to get them right.

Ian RussellIan RussellBlog
Blog

Ways of Creating Single Case Discriminated Unions in F#

There are quite a few ways of creating single case discriminated unions in F# and this makes them popular for wrapping primitives. In this post, I will go through a number of the approaches that I have seen.

Daniel WellerBlog
Blog

Revolutionizing the Logistics Industry

As the logistics industry becomes increasingly complex, businesses need innovative solutions to manage the challenges of supply chain management, trucking, and delivery. With competitors investing in cutting-edge research and development, it is vital for companies to stay ahead of the curve and embrace the latest technologies to remain competitive. That is why we introduce the TIMETOACT Logistics Simulator Framework, a revolutionary tool for creating a digital twin of your logistics operation.

Rinat AbdullinRinat AbdullinBlog
Blog

Part 1: TIMETOACT Logistics Hackathon - Behind the Scenes

A look behind the scenes of our Hackathon on Sustainable Logistic Simulation in May 2022. This was a hybrid event, running on-site in Vienna and remotely. Participants from 12 countries developed smart agents to control cargo delivery truck fleets in a simulated Europe.

Aqeel AlazreeBlog
Blog

Part 2: Data Analysis with powerful Python

Analyzing and visualizing data from a SQLite database in Python can be a powerful way to gain insights and present your findings. In Part 2 of this blog series, we will walk you through the steps to retrieve data from a SQLite database file named gold.db and display it in the form of a chart using Python. We'll use some essential tools and libraries for this task.

Christian FolieBlog
Blog

The Power of Event Sourcing

This is how we used Event Sourcing to maintain flexibility, handle changes, and ensure efficient error resolution in application development.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 5

Master F# asynchronous workflows and parallelism. Enhance application performance with advanced functional programming techniques.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F#

Dive into functional programming with F# in our introductory series. Learn how to solve real business problems using F#'s functional programming features. This first part covers setting up your environment, basic F# syntax, and implementing a simple use case. Perfect for developers looking to enhance their skills in functional programming.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 4

Unlock F# collections and pipelines. Manage data efficiently and streamline your functional programming workflow with these powerful tools.

Rinat AbdullinRinat AbdullinBlog
Blog

Innovation Incubator at TIMETOACT GROUP Austria

Discover how our Innovation Incubator empowers teams to innovate with collaborative, week-long experiments, driving company-wide creativity and progress.

Felix KrauseBlog
Blog

Part 1: Detecting Truck Parking Lots on Satellite Images

Real-time truck tracking is crucial in logistics: to enable accurate planning and provide reliable estimation of delivery times, operators build detailed profiles of loading stations, providing expected durations of truck loading and unloading, as well as resting times. Yet, how to derive an exact truck status based on mere GPS signals?

Laura GaetanoBlog
Blog

My Weekly Shutdown Routine

Discover my weekly shutdown routine to enhance productivity and start each week fresh. Learn effective techniques for reflection and organization.

Jonathan ChannonBlog
Blog

Tracing IO in .NET Core

Learn how we leverage OpenTelemetry for efficient tracing of IO operations in .NET Core applications, enhancing performance and monitoring.

Ian RussellIan RussellBlog
Blog

Introduction to Web Programming in F# with Giraffe – Part 1

In this series we are investigating web programming with Giraffe and the Giraffe View Engine plus a few other useful F# libraries.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 2

Explore functions, types, and modules in F#. Enhance your skills with practical examples and insights in this detailed guide.

Christian FolieBlog
Blog

Running Hybrid Workshops

When modernizing or building systems, one major challenge is finding out what to build. In Pre-Covid times on-site workshops were a main source to get an idea about ‘the right thing’. But during Covid everybody got used to working remotely, so now the question can be raised: Is it still worth having on-site, physical workshops?

Daniel PuchnerBlog
Blog

Make Your Value Stream Visible Through Structured Logging

Boost your value stream visibility with structured logging. Improve traceability and streamline processes in your software development lifecycle.

Laura GaetanoBlog
Blog

Using a Skill/Will matrix for personal career development

Discover how a Skill/Will Matrix helps employees identify strengths and areas for growth, boosting personal and professional development.

Rinat AbdullinRinat AbdullinBlog
Blog

Innovation Incubator Round 1

Team experiments with new technologies and collaborative problem-solving: This was our first round of the Innovation Incubator.

Rinat AbdullinRinat AbdullinBlog
Blog

Announcing Domain-Driven Design Exercises

Interested in Domain Driven Design? Then this DDD exercise is perfect for you!

Rinat AbdullinRinat AbdullinBlog
Blog

Using NLP libraries for post-processing

Learn how to analyse sticky notes in miro from event stormings and how this analysis can be carried out with the help of the spaCy library.

Rinat AbdullinRinat AbdullinBlog
Blog

Consistency and Aggregates in Event Sourcing

Learn how we ensures data consistency in event sourcing with effective use of aggregates, enhancing system reliability and performance.

Blog
Blog

My Workflows During the Quarantine

The current situation has deeply affected our daily lives. However, in retrospect, it had a surprisingly small impact on how we get work done at TIMETOACT GROUP Austria.

Rinat AbdullinRinat AbdullinBlog
Blog

Learning + Sharing at TIMETOACT GROUP Austria

Discover how we fosters continuous learning and sharing among employees, encouraging growth and collaboration through dedicated time for skill development.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 8

Discover Units of Measure and Type Providers in F#. Enhance data management and type safety in your applications with these powerful tools.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 9

Explore Active Patterns and Computation Expressions in F#. Enhance code clarity and functionality with these advanced techniques.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 10

Discover Agents and Mailboxes in F#. Build responsive applications using these powerful concurrency tools in functional programming.

Rinat AbdullinRinat AbdullinBlog
Blog

Process Pipelines

Discover how process pipelines break down complex tasks into manageable steps, optimizing workflows and improving efficiency using Kanban boards.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 11

Learn type inference and generic functions in F#. Boost efficiency and flexibility in your code with these essential programming concepts.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 12

Explore reflection and meta-programming in F#. Learn how to dynamically manipulate code and enhance flexibility with advanced techniques.

Bernhard SchauerBlog
Blog

ADRs as a Tool to Build Empowered Teams

Learn how we use Architecture Decision Records (ADRs) to build empowered, autonomous teams, enhancing decision-making and collaboration.

Peter SzarvasPeter SzarvasBlog
Blog

Why Was Our Project Successful: Coincidence or Blueprint?

“The project exceeded all expectations,” is one among our favourite samples of the very positive feedback from our client. Here's how we did it!

Nina DemuthBlog
Blog

They promised it would be the next big thing!

Haven’t we all been there? We have all been promised by teachers, colleagues or public speakers that this or that was about to be the next big thing in tech that would change the world as we know it.

Ian RussellIan RussellBlog
Blog

Creating solutions and projects in VS code

In this post we are going to create a new Solution containing an F# console project and a test project using the dotnet CLI in Visual Studio Code.

Jonathan ChannonBlog
Blog

Understanding F# Type Aliases

In this post, we discuss the difference between F# types and aliases that from a glance may appear to be the same thing.

Jonathan ChannonBlog
Blog

Understanding F# applicatives and custom operators

In this post, Jonathan Channon, a newcomer to F#, discusses how he learnt about a slightly more advanced functional concept — Applicatives.

Nina DemuthBlog
Blog

From the idea to the product: The genesis of Skwill

We strongly believe in the benefits of continuous learning at work; this has led us to developing products that we also enjoy using ourselves. Meet Skwill.

Felix KrauseBlog
Blog

Creating a Cross-Domain Capable ML Pipeline

As classifying images into categories is a ubiquitous task occurring in various domains, a need for a machine learning pipeline which can accommodate for new categories is easy to justify. In particular, common general requirements are to filter out low-quality (blurred, low contrast etc.) images, and to speed up the learning of new categories if image quality is sufficient. In this blog post we compare several image classification models from the transfer learning perspective.

Rinat AbdullinRinat AbdullinBlog
Blog

State of Fast Feedback in Data Science Projects

DSML projects can be quite different from the software projects: a lot of R&D in a rapidly evolving landscape, working with data, distributions and probabilities instead of code. However, there is one thing in common: iterative development process matters a lot.

Felix KrauseBlog
Blog

Part 2: Detecting Truck Parking Lots on Satellite Images

In the previous blog post, we created an already pretty powerful image segmentation model in order to detect the shape of truck parking lots on satellite images. However, we will now try to run the code on new hardware and get even better as well as more robust results.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 3

Dive into F# data structures and pattern matching. Simplify code and enhance functionality with these powerful features.

Ian RussellIan RussellBlog
Blog

So, I wrote a book

Join me as I share the story of writing a book on F#. Discover the challenges, insights, and triumphs along the way.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 7

Explore LINQ and query expressions in F#. Simplify data manipulation and enhance your functional programming skills with this guide.

Nina DemuthBlog
Blog

7 Positive effects of visualizing the interests of your team

Interests maps unleash hidden potentials and interests, but they also make it clear which topics are not of interest to your colleagues.

Daniel PuchnerBlog
Blog

How to gather data from Miro

Learn how to gather data from Miro boards with this step-by-step guide. Streamline your data collection for deeper insights.

Christian FolieBlog
Blog

Designing and Running a Workshop series: An outline

Learn how to design and execute impactful workshops. Discover tips, strategies, and a step-by-step outline for a successful workshop series.

Christian FolieBlog
Blog

Designing and Running a Workshop series: The board

In this part, we discuss the basic design of the Miro board, which will aid in conducting the workshops.

Ian RussellIan RussellBlog
Blog

Introduction to Functional Programming in F# – Part 6

Learn error handling in F# with option types. Improve code reliability using F#'s powerful error-handling techniques.

Rinat AbdullinRinat AbdullinBlog
Blog

Machine Learning Pipelines

In this first part, we explain the basics of machine learning pipelines and showcase what they could look like in simple form. Learn about the differences between software development and machine learning as well as which common problems you can tackle with them.

Rinat AbdullinRinat AbdullinBlog
Blog

Event Sourcing with Apache Kafka

For a long time, there was a consensus that Kafka and Event Sourcing are not compatible with each other. So it might look like there is no way of working with Event Sourcing. But there is if certain requirements are met.

Felix KrauseBlog
Blog

Boosting speed of scikit-learn regression algorithms

The purpose of this blog post is to investigate the performance and prediction speed behavior of popular regression algorithms, i.e. models that predict numerical values based on a set of input variables.

Chrystal LantnikBlog
Blog

CSS :has() & Responsive Design

In my journey to tackle a responsive layout problem, I stumbled upon the remarkable benefits of the :has() pseudo-class. Initially, I attempted various other methods to resolve the issue, but ultimately, embracing the power of :has() proved to be the optimal solution. This blog explores my experience and highlights the advantages of utilizing the :has() pseudo-class in achieving flexible layouts.