Semantic Kernel Agents for AI Orchestration
Build intelligent AI agents using Microsoft Semantic Kernel with tool calling, memory, and multi-agent coordination in .NET.
Introduction
When I first started building our document intelligence pipeline, I had a single monolithic LLM call that tried to do everything — classify the document, extract entities, generate a summary, and answer follow-up questions. It worked fine on simple invoices. Then someone uploaded a 47-page legal contract and asked it to compare clauses with a previous version. The model hallucinated wildly, confused sections from different documents, and returned a confident but completely wrong answer. That failure pushed me toward Semantic Kernel’s agent-based approach, where each specialized agent does one thing well and an orchestrator coordinates them. The difference was night and day.
A user asks: “Find all documents about the Q3 budget, compare them to last year’s actuals, and summarize the key variances.”
This isn’t a simple search query—it requires multiple steps: search, retrieval, comparison, and synthesis. A single traditional API call can’t orchestrate those steps on its own. An AI agent can.
Agents go beyond simple question-answering by:
- Planning: Breaking complex requests into subtasks
- Tool calling: Using external functions (search, database queries, APIs)
- Memory: Maintaining context across interactions
- Reasoning: Deciding which tools to use and in what order
This reasoning-acting loop is the foundation of modern agent architectures, where the LLM interleaves chain-of-thought reasoning with concrete tool invocations.
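At its core, that loop alternates between an LLM "thought" and a tool invocation whose result is fed back into the context. The following is a purely conceptual sketch, not Semantic Kernel's internals; the `tools` dictionary and the fake single-step "reasoning" stand in for a real LLM call:

```csharp
// Conceptual reason-act loop (hypothetical abstractions, not SK internals)
var tools = new Dictionary<string, Func<string, string>>
{
    ["search"] = q => $"3 documents matched '{q}'"
};
var context = new List<string> { "User: compare Q3 budget to last year's actuals" };
var done = false;
while (!done)
{
    // In a real agent this is an LLM call that returns either a tool
    // request or a final answer; here we fake a single tool step.
    var wantsTool = context.Count == 1;
    if (wantsTool)
    {
        var observation = tools["search"]("Q3 budget");
        context.Add($"Observation: {observation}"); // feed the result back
    }
    else
    {
        context.Add("Answer: key variances summarized."); // final synthesis
        done = true;
    }
}
```

Semantic Kernel automates exactly this loop for you: the planner decides which registered function to call, invokes it, and appends the result to the chat history until the model produces a final answer.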
What We’ll Build
In this guide, we’ll use Microsoft Semantic Kernel to build:
- Search Agent: Semantic and keyword search over document collections
- Analysis Agent: Document comparison and entity extraction
- Summary Agent: Report generation and synthesis
- Agent Orchestrator: Coordinates agents for complex multi-step tasks
By the end, users can ask natural language questions that span multiple operations—and get coherent, synthesized answers.
Why Semantic Kernel?
Microsoft’s Semantic Kernel is the .NET-native way to build AI applications:
- First-class C# support: Strongly-typed plugins, dependency injection integration
- Model agnostic: Works with OpenAI, Azure OpenAI, Ollama, and others
- Production ready: Used in Microsoft 365 Copilot
- Plugin architecture: Clean separation between AI reasoning and business logic
Architecture Overview
Our agent system coordinates multiple specialized agents, each with access to specific tools. This multi-agent pattern — where each agent has a narrow focus and an orchestrator routes between them — draws from research on collaborative AI systems.
[AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation] — Wu, Q. et al. (Microsoft Research), 2023-08-16

flowchart TB
subgraph User["💬 User Interface"]
Query["🗣️ Natural Language Query"]
end
subgraph Orchestrator["🎯 Semantic Kernel Runtime"]
Planner["🧠 Planner\n(LLM Reasoning)"]
Memory["💾 Memory\n(Conversation + Semantic)"]
end
subgraph Agents["🤖 Specialized Agents"]
Search["🔍 Search Agent"]
Analysis["📊 Analysis Agent"]
Summary["📝 Summary Agent"]
end
subgraph Tools["🛠️ Plugins (Tools)"]
SearchTools["🔎 SemanticSearch\nKeywordSearch\nFilterDocs"]
AnalysisTools["📋 ReadDocument\nExtractEntities\nCompareContent"]
SummaryTools["✍️ GenerateText\nCreateReport"]
end
subgraph LLM["🦙 LLM Backend"]
Ollama["⚡ Ollama\n(Fast/Local)"]
OpenAI["🌐 OpenAI\n(Complex)"]
end
Query --> Planner
Planner --> Search
Planner --> Analysis
Planner --> Summary
Planner <--> Memory
Search --> SearchTools
Analysis --> AnalysisTools
Summary --> SummaryTools
Planner --> Ollama
Planner -.->|"Fallback"| OpenAI
classDef primary fill:#7c3aed,color:#fff
classDef secondary fill:#06b6d4,color:#fff
classDef db fill:#f43f5e,color:#fff
classDef warning fill:#fbbf24,color:#000
class Orchestrator secondary
class Agents,Tools primary
class LLM db
class User warning
Agent Coordination Flow:
- User Query: Natural language request enters the system
- Planner: LLM reasons about which agents/tools to invoke
- Agents: Execute their specialized plugins (search, analyze, summarize)
- Memory: Maintains conversation context and semantic recall
- Response: Synthesized answer returned to user
Agent Architecture Overview
Before diving into code, let’s understand how the pieces fit together:
flowchart TB
subgraph System["🤖 Agent System"]
User["💬 User Query"] --> Orch["🎯 Orchestrator\n(Semantic Kernel Agent Runtime)"]
Orch --> SA["🔍 Search Agent"]
Orch --> AA["📊 Analysis Agent"]
Orch --> SuA["📝 Summary Agent"]
SA --> |"SemanticSearch\nKeywordSearch\nFilterDocs"| Memory
AA --> |"ReadDocument\nExtractEntities\nCompareContent"| Memory
SuA --> |"GenerateText\nCreateReport\nSendEmail"| Memory
Memory["🧠 Memory Store\n(Conversation + Semantic)"]
end
classDef primary fill:#7c3aed,color:#fff
classDef secondary fill:#06b6d4,color:#fff
classDef db fill:#f43f5e,color:#fff
classDef warning fill:#fbbf24,color:#000
class System primary
The flow:
- User submits a natural language query
- Orchestrator uses LLM reasoning to determine which agents/tools to invoke
- Agents execute their tools (search, read, analyze, etc.)
- Results are combined and synthesized into a coherent response
Semantic Kernel Setup
First, configure Semantic Kernel with our LLM providers and plugins:
// Infrastructure/DependencyInjection.cs
public static IServiceCollection AddSemanticKernel(
this IServiceCollection services,
IConfiguration configuration)
{
services.AddSingleton(sp =>
{
var kernelBuilder = Kernel.CreateBuilder();
// Configure Ollama for local inference (fast, private)
var ollamaEndpoint = configuration["Ollama:Endpoint"]
?? "http://ollama.ai.svc.cluster.local:11434";
kernelBuilder.AddOllamaChatCompletion(
modelId: "llama3.2",
endpoint: new Uri(ollamaEndpoint));
// Add OpenAI for complex reasoning (fallback for hard tasks)
var openAiKey = configuration["OpenAI:ApiKey"];
if (!string.IsNullOrEmpty(openAiKey))
{
kernelBuilder.AddOpenAIChatCompletion(
modelId: "gpt-4o",
apiKey: openAiKey,
serviceId: "openai-complex");
}
// Inject our application services into Semantic Kernel
kernelBuilder.Services.AddSingleton(sp.GetRequiredService<IVectorStore>());
kernelBuilder.Services.AddSingleton(sp.GetRequiredService<IDocumentRepository>());
kernelBuilder.Services.AddSingleton(sp.GetRequiredService<IEmbeddingService>());
// Register plugins (tools the agent can use)
kernelBuilder.Plugins.AddFromType<SearchPlugin>("Search");
kernelBuilder.Plugins.AddFromType<DocumentPlugin>("Document");
kernelBuilder.Plugins.AddFromType<AnalysisPlugin>("Analysis");
return kernelBuilder.Build();
});
services.AddScoped<IAgentService, SemanticKernelAgentService>();
return services;
}
Key insight: Plugins are C# classes with methods decorated with [KernelFunction]. The LLM reads the descriptions and decides when to call them.
Plugin Definitions
Plugins expose your business logic as tools the AI can use. Each method describes what it does; the LLM decides when to call it.
Search Plugin
The search plugin provides semantic and keyword search capabilities:
// Application/Plugins/SearchPlugin.cs
public sealed class SearchPlugin
{
private readonly IVectorStore _vectorStore;
private readonly IEmbeddingService _embeddings;
private readonly IUserContext _userContext;
public SearchPlugin(
IVectorStore vectorStore,
IEmbeddingService embeddings,
IUserContext userContext)
{
_vectorStore = vectorStore;
_embeddings = embeddings;
_userContext = userContext;
}
[KernelFunction("semantic_search")]
[Description("Search documents by semantic meaning. Use when the user asks to find documents about a topic.")]
public async Task<string> SemanticSearchAsync(
[Description("The search query describing what to find")] string query,
[Description("Maximum number of results (default 5)")] int limit = 5)
{
var ownerId = _userContext.CustomId;
var queryVector = await _embeddings.GenerateAsync(query);
var results = await _vectorStore.SearchAsync(
queryVector,
ownerId,
limit: limit,
scoreThreshold: 0.6f);
if (results.Count == 0)
{
return "No documents found matching your query.";
}
var sb = new StringBuilder();
sb.AppendLine($"Found {results.Count} relevant passages:\n");
foreach (var (result, index) in results.Select((r, i) => (r, i)))
{
sb.AppendLine($"**Result {index + 1}** (Score: {result.Score:F2})");
sb.AppendLine($"Document: {result.DocumentId}");
sb.AppendLine($"Content: {result.Content.Truncate(300)}...");
sb.AppendLine();
}
return sb.ToString();
}
[KernelFunction("filter_documents")]
[Description("Filter documents by tags or date range.")]
public async Task<string> FilterDocumentsAsync(
[Description("Comma-separated tags to filter by")] string? tags = null,
[Description("ISO date for 'created after' filter")] string? createdAfter = null)
{
var ownerId = _userContext.CustomId;
var filters = new SearchFilters
{
Tags = tags?.Split(',', StringSplitOptions.TrimEntries).ToList(),
CreatedAfter = createdAfter != null
? DateTimeOffset.Parse(createdAfter)
: null
};
// Use a generic embedding to fetch filtered results
var genericVector = await _embeddings.GenerateAsync("document content");
var results = await _vectorStore.SearchAsync(
genericVector,
ownerId,
limit: 20,
scoreThreshold: 0.0f, // No semantic filtering
filters: filters);
var uniqueDocs = results
.DistinctBy(r => r.DocumentId)
.Take(10)
.ToList();
if (uniqueDocs.Count == 0)
{
return "No documents match the specified filters.";
}
return $"Found {uniqueDocs.Count} documents:\n" +
string.Join("\n", uniqueDocs.Select(d => $"- {d.DocumentId}"));
}
}
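Note that `Truncate` is not a BCL string method; the plugins in this guide assume a small custom extension along these lines (a sketch, with the name taken from its usage above):

```csharp
public static class StringExtensions
{
    // Return at most maxLength characters of the input, never throwing
    // when the string is already short enough.
    public static string Truncate(this string value, int maxLength) =>
        value.Length <= maxLength ? value : value[..maxLength];
}
```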
Document Plugin
// Application/Plugins/DocumentPlugin.cs
public sealed class DocumentPlugin
{
private readonly IDocumentRepository _documents;
private readonly IArchiveStorage _storage;
private readonly IUserContext _userContext;
public DocumentPlugin(
IDocumentRepository documents,
IArchiveStorage storage,
IUserContext userContext)
{
_documents = documents;
_storage = storage;
_userContext = userContext;
}
[KernelFunction("read_document")]
[Description("Read the full content of a specific document by its ID.")]
public async Task<string> ReadDocumentAsync(
[Description("The document ID (8-character CustomId)")] string documentId)
{
var ownerId = _userContext.CustomId;
var docId = CustomId.From(documentId);
var document = await _documents.GetByIdAsync(docId);
if (document == null || document.OwnerId != ownerId)
{
return $"Document '{documentId}' not found or access denied.";
}
// Read OCR content from storage
var contentPath = $"processed/{documentId}/content.md";
var content = await _storage.ReadTextAsync(contentPath);
return $"""
**Document: {document.Title}**
- ID: {document.Id}
- Status: {document.Status}
- Created: {document.CreatedAt:yyyy-MM-dd}
- Tags: {string.Join(", ", document.Tags)}
**Content:**
{content?.Truncate(2000) ?? "Content not available"}
""";
}
[KernelFunction("list_documents")]
[Description("List the user's documents with optional status filter.")]
public async Task<string> ListDocumentsAsync(
[Description("Filter by status: Pending, Processing, Ready, Failed")] string? status = null)
{
var ownerId = _userContext.CustomId;
DocumentStatus? statusFilter = status != null
? Enum.Parse<DocumentStatus>(status, ignoreCase: true)
: null;
var documents = await _documents.GetByOwnerAsync(ownerId, statusFilter);
if (documents.Count == 0)
{
return status != null
? $"No documents with status '{status}' found."
: "No documents found in your archive.";
}
var sb = new StringBuilder();
sb.AppendLine($"Found {documents.Count} documents:\n");
foreach (var doc in documents.Take(10))
{
sb.AppendLine($"- **{doc.Title}** ({doc.Id})");
sb.AppendLine($" Status: {doc.Status} | Created: {doc.CreatedAt:yyyy-MM-dd}");
}
if (documents.Count > 10)
{
sb.AppendLine($"\n...and {documents.Count - 10} more.");
}
return sb.ToString();
}
[KernelFunction("get_document_summary")]
[Description("Get a brief summary of a document's metadata without reading full content.")]
public async Task<string> GetDocumentSummaryAsync(
[Description("The document ID")] string documentId)
{
var ownerId = _userContext.CustomId;
var docId = CustomId.From(documentId);
var document = await _documents.GetByIdAsync(docId);
if (document == null || document.OwnerId != ownerId)
{
return $"Document '{documentId}' not found.";
}
return $"""
**{document.Title}**
- ID: {document.Id}
- Type: {document.MimeType}
- Size: {document.FileSize.ToHumanReadable()}
- Status: {document.Status}
- Tags: {string.Join(", ", document.Tags)}
- Created: {document.CreatedAt:yyyy-MM-dd HH:mm}
- Last Modified: {document.UpdatedAt:yyyy-MM-dd HH:mm}
""";
}
}
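Similarly, `FileSize.ToHumanReadable()` assumes a custom extension rather than a framework method. A possible sketch, assuming `FileSize` is a raw byte count (`long`):

```csharp
public static class FileSizeExtensions
{
    // Format a byte count as e.g. "1.4 MB" for display
    public static string ToHumanReadable(this long bytes)
    {
        string[] units = ["B", "KB", "MB", "GB", "TB"];
        double size = bytes;
        var unit = 0;
        while (size >= 1024 && unit < units.Length - 1)
        {
            size /= 1024;
            unit++;
        }
        return $"{size:0.#} {units[unit]}";
    }
}
```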
Analysis Plugin
// Application/Plugins/AnalysisPlugin.cs
public sealed class AnalysisPlugin
{
private readonly IVectorStore _vectorStore;
private readonly IEmbeddingService _embeddings;
private readonly IUserContext _userContext;
public AnalysisPlugin(
IVectorStore vectorStore,
IEmbeddingService embeddings,
IUserContext userContext)
{
_vectorStore = vectorStore;
_embeddings = embeddings;
_userContext = userContext;
}
[KernelFunction("compare_documents")]
[Description("Compare the content of two documents to find similarities and differences.")]
public async Task<string> CompareDocumentsAsync(
[Description("First document ID")] string documentId1,
[Description("Second document ID")] string documentId2)
{
var ownerId = _userContext.CustomId;
// Get embeddings for both documents
var doc1Vectors = await GetDocumentVectorsAsync(CustomId.From(documentId1), ownerId);
var doc2Vectors = await GetDocumentVectorsAsync(CustomId.From(documentId2), ownerId);
if (doc1Vectors.Count == 0 || doc2Vectors.Count == 0)
{
return "One or both documents not found or not indexed.";
}
// Calculate average similarity
var similarities = new List<float>();
foreach (var v1 in doc1Vectors)
{
foreach (var v2 in doc2Vectors)
{
similarities.Add(CosineSimilarity(v1, v2));
}
}
var avgSimilarity = similarities.Average();
var interpretation = avgSimilarity switch
{
> 0.9f => "nearly identical content",
> 0.7f => "highly similar topics",
> 0.5f => "moderately related",
> 0.3f => "loosely related",
_ => "different topics"
};
return $"""
**Document Comparison**
- Document 1: {documentId1} ({doc1Vectors.Count} chunks)
- Document 2: {documentId2} ({doc2Vectors.Count} chunks)
- Similarity Score: {avgSimilarity:P0}
- Interpretation: {interpretation}
""";
}
[KernelFunction("find_similar_documents")]
[Description("Find documents similar to a given document.")]
public async Task<string> FindSimilarDocumentsAsync(
[Description("Source document ID")] string documentId,
[Description("Number of similar documents to find")] int limit = 5)
{
var ownerId = _userContext.CustomId;
var docId = CustomId.From(documentId);
// Use the document's first chunk embedding as its representative vector
var sourceVector = await GetFirstChunkVectorAsync(docId, ownerId);
if (sourceVector == null)
{
return $"Document '{documentId}' not found or not indexed.";
}
var similar = await _vectorStore.SearchAsync(
sourceVector,
ownerId,
limit: limit + 10, // Get more to filter
scoreThreshold: 0.5f);
var filtered = similar
.Where(r => r.DocumentId != docId)
.DistinctBy(r => r.DocumentId)
.Take(limit)
.ToList();
if (filtered.Count == 0)
{
return "No similar documents found.";
}
var sb = new StringBuilder();
sb.AppendLine($"Documents similar to {documentId}:\n");
foreach (var doc in filtered)
{
sb.AppendLine($"- {doc.DocumentId} (Similarity: {doc.Score:P0})");
}
return sb.ToString();
}
private Task<IReadOnlyList<float[]>> GetDocumentVectorsAsync(
CustomId documentId, CustomId ownerId)
{
// Retrieving all stored vectors for a document would require
// extending the vector store interface; stubbed here for brevity.
return Task.FromResult<IReadOnlyList<float[]>>([]);
}
private Task<float[]?> GetFirstChunkVectorAsync(
CustomId documentId, CustomId ownerId)
{
// Stub: fetch the embedding of the document's first chunk.
return Task.FromResult<float[]?>(null);
}
private static float CosineSimilarity(float[] a, float[] b)
{
var dotProduct = a.Zip(b, (x, y) => x * y).Sum();
var magnitudeA = MathF.Sqrt(a.Sum(x => x * x));
var magnitudeB = MathF.Sqrt(b.Sum(x => x * x));
return dotProduct / (magnitudeA * magnitudeB);
}
}
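Because plugins are ordinary kernel functions, you can also invoke them directly without an LLM in the loop, which is useful for unit-testing a plugin in isolation. A sketch, assuming `kernel` was built as in the setup section and using the plugin/function names registered above:

```csharp
// Invoke a registered plugin function directly, bypassing the LLM
var arguments = new KernelArguments
{
    ["query"] = "Q3 budget variances",
    ["limit"] = 3
};
var result = await kernel.InvokeAsync("Search", "semantic_search", arguments);
Console.WriteLine(result.GetValue<string>());
```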
Agent Service
Implementation
// Application/Services/SemanticKernelAgentService.cs
public sealed class SemanticKernelAgentService : IAgentService
{
private readonly Kernel _kernel;
private readonly ILogger<SemanticKernelAgentService> _logger;
private const string SystemPrompt = """
You are a helpful document assistant for a personal knowledge management system.
Your capabilities:
1. Search documents using semantic search (Search.semantic_search)
2. Read specific documents (Document.read_document)
3. List and filter documents (Document.list_documents, Search.filter_documents)
4. Compare and analyze documents (Analysis.compare_documents, Analysis.find_similar_documents)
Guidelines:
- Always search before claiming no information exists
- Cite document IDs when referencing content
- If a query is ambiguous, ask for clarification
- Summarize results concisely
- Respect user privacy - only access their documents
Be helpful, accurate, and concise.
""";
public SemanticKernelAgentService(
Kernel kernel,
ILogger<SemanticKernelAgentService> logger)
{
_kernel = kernel;
_logger = logger;
}
public async IAsyncEnumerable<string> ChatAsync(
string userMessage,
ConversationHistory history,
[EnumeratorCancellation] CancellationToken ct = default)
{
var chatHistory = new ChatHistory(SystemPrompt);
// Add conversation history
foreach (var message in history.Messages)
{
chatHistory.Add(message.Role switch
{
ChatRole.User => new ChatMessageContent(AuthorRole.User, message.Content),
ChatRole.Assistant => new ChatMessageContent(AuthorRole.Assistant, message.Content),
_ => throw new ArgumentException($"Unknown role: {message.Role}")
});
}
chatHistory.AddUserMessage(userMessage);
var settings = new OpenAIPromptExecutionSettings
{
ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions,
Temperature = 0.3f,
MaxTokens = 2000
};
var chatService = _kernel.GetRequiredService<IChatCompletionService>();
await foreach (var chunk in chatService.GetStreamingChatMessageContentsAsync(
chatHistory,
settings,
_kernel,
ct))
{
if (chunk.Content != null)
{
yield return chunk.Content;
}
// Log tool calls
if (chunk.Metadata?.TryGetValue("ToolCalls", out var toolCalls) == true)
{
_logger.LogDebug("Tool call: {ToolCalls}", toolCalls);
}
}
}
public async Task<AgentResponse> ProcessAsync(
string userMessage,
ConversationHistory history,
CancellationToken ct = default)
{
var response = new StringBuilder();
// Tool-call tracking would require hooking SK's function-invocation
// filters; left empty in this sketch.
var toolsUsed = new List<string>();
await foreach (var chunk in ChatAsync(userMessage, history, ct))
{
response.Append(chunk);
}
return new AgentResponse
{
Content = response.ToString(),
ToolsUsed = toolsUsed
};
}
}
public sealed record AgentResponse
{
public required string Content { get; init; }
public required IReadOnlyList<string> ToolsUsed { get; init; }
}
public sealed class ConversationHistory
{
private readonly List<ChatMessage> _messages = [];
public IReadOnlyList<ChatMessage> Messages => _messages;
public void AddUserMessage(string content)
{
_messages.Add(new ChatMessage(ChatRole.User, content));
}
public void AddAssistantMessage(string content)
{
_messages.Add(new ChatMessage(ChatRole.Assistant, content));
}
public void Clear() => _messages.Clear();
}
public sealed record ChatMessage(ChatRole Role, string Content);
public enum ChatRole { User, Assistant }
The AutoInvokeKernelFunctions behavior tells Semantic Kernel to let the LLM call any registered plugin function and to invoke it automatically. Under the hood, this maps onto the model provider’s function-calling protocol (OpenAI-style tool calls).
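If you want the model to propose tool calls without executing them (for example, to require user approval before a function runs), Semantic Kernel also offers a non-auto mode. A sketch, using the same settings shape as above:

```csharp
var settings = new OpenAIPromptExecutionSettings
{
    // The model can request kernel functions, but nothing is invoked
    // until your code inspects and executes the calls itself.
    ToolCallBehavior = ToolCallBehavior.EnableKernelFunctions,
    Temperature = 0.3f,
    MaxTokens = 2000
};
```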
API Endpoint
// Api/Endpoints/Agent/ChatEndpoint.cs
public sealed class ChatEndpoint : Endpoint<ChatRequest, IAsyncEnumerable<string>>
{
private readonly IAgentService _agent;
private readonly IConversationStore _conversations;
public ChatEndpoint(
IAgentService agent,
IConversationStore conversations)
{
_agent = agent;
_conversations = conversations;
}
public override void Configure()
{
Post("/api/agent/chat");
Description(d => d.WithTags("Agent"));
}
public override async Task HandleAsync(ChatRequest req, CancellationToken ct)
{
var history = await _conversations.GetOrCreateAsync(req.ConversationId, ct);
HttpContext.Response.ContentType = "text/event-stream";
var assistantReply = new StringBuilder();
await foreach (var chunk in _agent.ChatAsync(req.Message, history, ct))
{
assistantReply.Append(chunk);
await HttpContext.Response.WriteAsync($"data: {chunk}\n\n", ct);
await HttpContext.Response.Body.FlushAsync(ct);
}
// Persist both sides of the exchange
history.AddUserMessage(req.Message);
history.AddAssistantMessage(assistantReply.ToString());
await _conversations.SaveAsync(history, ct);
}
}
public sealed record ChatRequest
{
public required string Message { get; init; }
public string? ConversationId { get; init; }
}
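A minimal console client for this endpoint might look like the following; the base URL is a placeholder, and error handling is omitted for brevity:

```csharp
using System.Net.Http.Json;

using var http = new HttpClient { BaseAddress = new Uri("https://localhost:5001") };
using var request = new HttpRequestMessage(HttpMethod.Post, "/api/agent/chat")
{
    Content = JsonContent.Create(new { message = "Summarize my Q3 budget documents" })
};
// Stream the response instead of buffering the whole body
using var response = await http.SendAsync(
    request, HttpCompletionOption.ResponseHeadersRead);
await using var stream = await response.Content.ReadAsStreamAsync();
using var reader = new StreamReader(stream);
while (await reader.ReadLineAsync() is { } line)
{
    if (line.StartsWith("data: "))
    {
        Console.Write(line["data: ".Length..]); // each SSE data chunk
    }
}
```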
Blazor Chat Component
@* Components/Agent/AgentChat.razor *@
@inject IAgentService AgentService
@inject IJSRuntime JS
@implements IAsyncDisposable
<div class="glass-card h-[600px] flex flex-col">
<header class="p-4 border-b border-white/10">
<h2 class="text-lg font-semibold">Document Assistant</h2>
</header>
<div class="flex-1 overflow-y-auto p-4 space-y-4" @ref="_scrollContainer">
@foreach (var message in _messages)
{
<div class="@GetMessageClasses(message)">
<div class="@GetBubbleClasses(message)">
@if (message.IsStreaming)
{
<span>@message.Content</span>
<span class="typing-indicator">▋</span>
}
else
{
@((MarkupString)Markdig.Markdown.ToHtml(message.Content))
}
</div>
</div>
}
</div>
<form @onsubmit="SendMessageAsync" class="p-4 border-t border-white/10">
<div class="flex gap-2">
<input
@bind="_input"
@bind:event="oninput"
placeholder="Ask about your documents..."
class="glass-input flex-1"
disabled="@_isLoading" />
<button
type="submit"
class="glass-button-primary"
disabled="@(_isLoading || string.IsNullOrWhiteSpace(_input))">
@if (_isLoading)
{
<span class="loading-spinner" />
}
else
{
<span>Send</span>
}
</button>
</div>
</form>
</div>
@code {
private readonly List<ChatMessageDisplay> _messages = [];
private readonly ConversationHistory _history = new();
private ElementReference _scrollContainer;
private string _input = "";
private bool _isLoading;
private async Task SendMessageAsync()
{
if (string.IsNullOrWhiteSpace(_input) || _isLoading) return;
var userMessage = _input.Trim();
_input = "";
_isLoading = true;
// Add user message
_messages.Add(new ChatMessageDisplay
{
Role = ChatRole.User,
Content = userMessage
});
// Add assistant placeholder
var assistantMessage = new ChatMessageDisplay
{
Role = ChatRole.Assistant,
Content = "",
IsStreaming = true
};
_messages.Add(assistantMessage);
try
{
await foreach (var chunk in AgentService.ChatAsync(
userMessage, _history))
{
assistantMessage.Content += chunk;
StateHasChanged();
await ScrollToBottomAsync();
}
assistantMessage.IsStreaming = false;
_history.AddUserMessage(userMessage);
_history.AddAssistantMessage(assistantMessage.Content);
}
catch (Exception ex)
{
assistantMessage.Content = $"Error: {ex.Message}";
assistantMessage.IsStreaming = false;
}
finally
{
_isLoading = false;
StateHasChanged();
}
}
private async Task ScrollToBottomAsync()
{
await JS.InvokeVoidAsync("scrollToBottom", _scrollContainer);
}
private static string GetMessageClasses(ChatMessageDisplay message) =>
message.Role == ChatRole.User
? "flex justify-end"
: "flex justify-start";
private static string GetBubbleClasses(ChatMessageDisplay message) =>
message.Role == ChatRole.User
? "glass-bubble-user max-w-[80%] p-3 rounded-xl"
: "glass-bubble-assistant max-w-[80%] p-3 rounded-xl prose prose-sm prose-invert";
public ValueTask DisposeAsync() => ValueTask.CompletedTask;
private sealed class ChatMessageDisplay
{
public ChatRole Role { get; init; }
public string Content { get; set; } = "";
public bool IsStreaming { get; set; }
}
}
Conclusion
Semantic Kernel agents provide:
| Feature | Benefit |
|---|---|
| Plugin System | Modular, testable tool definitions |
| Auto Tool Selection | LLM decides which tools to use |
| Streaming | Real-time response display |
| Memory | Conversation history management |
Combined with document search plugins, agents enable natural language interactions with your knowledge base.
Looking back at this journey, the biggest shift in my thinking was moving from “one LLM call to rule them all” to a system of specialized agents. The monolithic approach felt simpler at first, but it collapsed under real-world complexity. With Semantic Kernel’s plugin architecture, each agent has a clear contract, is independently testable, and can be swapped or upgraded without affecting the others. The orchestrator pattern gave us the composability we needed to handle increasingly complex user queries without rewriting the entire pipeline every time requirements changed.
The next frontier for us is adding persistent semantic memory across sessions, so the agent remembers not just the current conversation but what the user has asked about historically. That is where things get truly interesting.
Next Steps
- LangGraph State Collisions: Lessons Learned — state management pitfalls in another agent framework
- AI Strategy: Moving from Local Llama to OpenAI — the hybrid architecture powering our LLM backend
- Optimizing System Latency — reducing end-to-end response times