Designing an Event-Driven RAG Application

I’ve been exploring ways to integrate Retrieval-Augmented Generation (RAG) into my website, and I wanted to design a small, event-driven app to learn more about it. My goal was to create something simple yet practical for my site. Below is a Mermaid diagram I created to model the flow of my application. It highlights the main components of an event-driven RAG system, covering content updates, user queries, and background indexing: ...

May 5, 2025 · 3 min · TC

Enhancing Code Quality with MCP and ReSharper CLI

Introduction
Recently, I started learning about the Model Context Protocol (MCP) and wrote a small project to use in Visual Studio Code with the GitHub Copilot agent. After completing some tutorials, I wanted to create something more useful for my own tasks. I decided to write an MCP server that checks code quality and code coverage using ReSharper CLI, which is free and available with my Visual Studio subscription. This use case is perfect for me because it integrates seamlessly with my development workflow. ...

May 5, 2025 · 3 min · tc

Exploring Quantization Techniques for RAG Applications

Quantization has been on my mind lately as I explore ways to optimize my RAG (Retrieval-Augmented Generation) application. With so many options available, I wanted to break down the main techniques and share my thoughts on their strengths, trade-offs, and where they might fit best. Let’s dive in!
1. Scalar Quantization
What It Is: Scalar quantization simplifies things by treating each component of a vector independently. For example, a 32-bit floating-point value can be mapped down to an 8-bit integer using a defined range and step width. ...

May 5, 2025 · 4 min · tc
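
The scalar quantization idea in the excerpt above (a float mapped to an 8-bit integer using a defined range and step width) can be sketched in a few lines of plain Python. This is only an illustration, not the code from the post: the function names, the range of -1.0 to 1.0, and the sample vector are all assumptions made up for the example.

```python
def quantize(values, lo, hi):
    """Map floats in [lo, hi] to signed 8-bit codes in [-128, 127].

    lo and hi are an assumed, fixed value range (hypothetical here).
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    step = (hi - lo) / 255  # step width: 256 levels cover the range
    codes = []
    for v in values:
        clamped = min(max(v, lo), hi)  # values outside the range are clipped
        codes.append(round((clamped - lo) / step) - 128)
    return codes, step


def dequantize(codes, lo, step):
    """Recover approximate floats from the 8-bit codes."""
    return [lo + (c + 128) * step for c in codes]


# Each component is handled independently, as the excerpt describes.
vector = [-1.0, -0.25, 0.0, 0.5, 1.0]  # made-up sample values
codes, step = quantize(vector, lo=-1.0, hi=1.0)
restored = dequantize(codes, lo=-1.0, step=step)
```

Each code needs one byte instead of four, and the round trip loses at most half a step per component, which is the trade-off the post goes on to discuss.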

Fine-Tuning vs Retrieval-Augmented Generation (RAG)

When I first started exploring AI, I was eager to use my own data with large language models (LLMs). However, I faced a dilemma: should I fine-tune a model with my data or use Retrieval-Augmented Generation (RAG)? After diving into research, I discovered the strengths and challenges of each approach. Here’s what I learned:
Fine-Tuning
Fine-tuning involves retraining a pre-trained model on a specific dataset to adapt it to a particular domain or task. For example: ...

May 5, 2025 · 2 min · tc

Quantization in RAG Applications

While working on my RAG (Retrieval-Augmented Generation) application, I started thinking about performance and storage size, even though my dataset is small, so I decided to look into quantization. Quantization can be a useful technique in a RAG workflow, especially when dealing with high-dimensional embeddings. It reduces the precision of the embeddings, compressing them so that the memory footprint is lower and similarity searches can be faster, while preserving most of the semantic information. Let’s break down the concept and how I might integrate it into the app: ...

May 5, 2025 · 3 min · tc

Setting Up Elasticsearch for Your Blog Search

If you’re looking to add powerful search capabilities to your blog, Elasticsearch is a fantastic option. After some research, I’ve put together a simple guide to help you set up Elasticsearch for your site using Docker and .NET. Let’s get started!
1. Create a Docker Compose File
To run Elasticsearch in a Docker container, create a docker-compose.yml file with the following content:
    version: '3.7'
    services:
      elasticsearch:
        image: docker.elastic.co/elasticsearch/elasticsearch:8.17.4
        container_name: elasticsearch
        environment:
          - discovery.type=single-node
          - ES_JAVA_OPTS=-Xms512m -Xmx512m
        ulimits:
          memlock:
            soft: -1
            hard: -1
        volumes:
          - esdata:/usr/share/elasticsearch/data
        ports:
          - 9200:9200
        networks:
          - elastic
    networks:
      elastic:
        driver: bridge
    volumes:
      esdata:
        driver: local
2. Configure System Settings
Elasticsearch requires certain system settings to function properly. Run the following command to allow it to lock memory: ...

May 5, 2025 · 2 min · TC

Understanding Azure Service Bus Filters and Actions

Recently, I worked on a project where I used Azure Service Bus filters to optimize message handling. Below is a summary of what I learned and how these filters can be used effectively.
What are Azure Service Bus Filters?
Azure Service Bus topics provide a robust way to publish messages to multiple subscribers, following a pub/sub model. In this model, a topic acts like a queue that senders publish messages to, and each subscription to that topic receives its own copy of the messages, optionally narrowed by filters. Filters and actions are key features that give precise control over which messages subscribers receive and how they are processed. ...

May 5, 2025 · 2 min · TC
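
To make the pub/sub model in the excerpt above concrete, here is a small in-memory sketch of a topic whose subscriptions each carry a filter and an optional action. It is a plain Python model for illustration only and does not use the Azure SDK; the class names, the `rule` and `action` callables, and the sample message properties are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Subscription:
    """Hypothetical stand-in for a topic subscription."""
    name: str
    rule: Callable[[dict], bool]                      # filter: accept or skip
    action: Optional[Callable[[dict], dict]] = None   # optional property change
    inbox: list = field(default_factory=list)


class Topic:
    """Hypothetical in-memory topic, not the real service."""

    def __init__(self):
        self.subscriptions = []

    def subscribe(self, subscription):
        self.subscriptions.append(subscription)
        return subscription

    def publish(self, message):
        # Every subscription evaluates its own filter and, on a match,
        # receives its own copy of the message.
        for sub in self.subscriptions:
            if sub.rule(message):
                copy = dict(message)
                if sub.action is not None:
                    copy = sub.action(copy)
                sub.inbox.append(copy)


topic = Topic()
everything = topic.subscribe(Subscription("all", rule=lambda m: True))
urgent = topic.subscribe(Subscription(
    "urgent",
    rule=lambda m: m.get("priority") == "high",
    action=lambda m: {**m, "route": "express"},  # annotate matching messages
))

topic.publish({"order_id": 1, "priority": "low"})
topic.publish({"order_id": 2, "priority": "high"})
```

After the two publishes, the `all` subscription holds both messages, while `urgent` holds only the high-priority one with the extra `route` property added by its action. In the real service the rules are declared on the subscription rather than written as lambdas, so treat this purely as a mental model.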

What Are Embedding Models?

Embedding models are a cornerstone of modern AI, transforming complex data (such as words, sentences, or images) into numerical representations called embeddings. These embeddings are vectors in a multi-dimensional space, enabling machines to understand relationships between pieces of data. Here’s how they’re used across various fields:
Applications of Embedding Models
Natural Language Processing (NLP): Embeddings encode the meaning of words or sentences, powering tasks like sentiment analysis, machine translation, and question answering.
Recommendation Systems: By embedding user preferences and item characteristics, these models enhance recommendations based on similarities.
Image Recognition: Image embeddings identify objects or group similar images, making them essential for tasks like facial recognition.
Search Engines: Embeddings improve search accuracy by finding data with similar representations.
Clustering and Classification: They help identify patterns and group data efficiently, aiding in tasks like customer segmentation.
How Embedding Models Work
At their core, embedding models convert complex data into a format that computers can process and make decisions on. These models differ in several key aspects: ...

May 5, 2025 · 2 min · TC
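
The claim in the excerpt above that embeddings let machines "understand relationships between pieces of data" usually comes down to comparing vectors. Below is a minimal sketch using cosine similarity. The three-dimensional vectors are invented by hand for illustration; real embedding models produce much longer vectors, and the helper names are assumptions rather than anything from the post.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, embeddings):
    """Return the other key whose vector is closest to `word`."""
    others = (k for k in embeddings if k != word)
    return max(others, key=lambda k: cosine_similarity(embeddings[word], embeddings[k]))


# Hand-made toy vectors (hypothetical, not output from a real model).
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.75, 0.2],
    "car": [0.1, 0.2, 0.95],
}

nearest = most_similar("cat", toy_embeddings)
```

With these toy values, "kitten" scores far closer to "cat" than "car" does, which is the same nearest-neighbor comparison that search and recommendation use cases rely on.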

What is Retrieval-Augmented Generation (RAG)?

As part of a small AI project, I wanted to dive deeper into Retrieval-Augmented Generation (RAG) to understand its potential. Below is a summary of what I learned and why I chose to use it for my website.
What is RAG?
RAG stands for Retrieval-Augmented Generation. It’s a method used in AI to enhance the way large language models generate responses by incorporating external information. Here’s how it works in simple terms: ...

May 5, 2025 · 2 min · TC

Building a Local RAG System for My Static Hugo Blog

How I integrated Retrieval-Augmented Generation (RAG) functionality into my static Hugo site using C# and local LLMs to create a dynamic question-answering system.

April 24, 2025 · TC