Running Local AI with .NET and Ollama

Cut cloud costs and improve privacy by running LLMs locally with Ollama and .NET. Learn how to build a local Log Analyzer.

Chris Malpass · 646 words · 4 mins

Cloud-based AI models like GPT-4 are powerful, but they come with trade-offs: latency, cost, and privacy. If you’re analyzing sensitive server logs or PII, sending that data to the cloud might be a non-starter.

In this post, we’ll build a Local Log Analyzer that runs entirely on your machine using Ollama and .NET.

Why Local AI?

  • Privacy: Your data never leaves your network.
  • Cost: Zero API fees, no matter how many tokens you process.
  • Latency: No network round-trips; speed depends entirely on your hardware.
  • Offline: Works on an air-gapped server or a plane.

The Stack

  1. Ollama: A lightweight tool to run models like Llama 3, Phi-3, or Mistral locally.
  2. Semantic Kernel: The .NET SDK to orchestrate the interaction.
  3. Llama 3 (8B): A powerful, efficient model that runs well on most modern laptops (requires ~8GB RAM).

Step 1: Setup Ollama

Download Ollama from ollama.com. Once installed, pull the model we’ll use:

ollama pull llama3

By default, Ollama starts a local API server at http://localhost:11434.

Step 2: The Code (Log Analyzer)

We’ll write a C# program that reads a raw, messy error log and asks the local AI to extract the key details into a clean format.

Prerequisites:

dotnet add package Microsoft.SemanticKernel
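Equivalently, you can add the reference directly to your .csproj (the floating version below is a placeholder; pin to the latest stable release in a real project):

```xml
<ItemGroup>
  <!-- "1.*" is illustrative; check NuGet for the current stable version -->
  <PackageReference Include="Microsoft.SemanticKernel" Version="1.*" />
</ItemGroup>
```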

The Code:

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

// 1. Configure the Kernel to talk to Ollama
// Ollama provides an OpenAI-compatible API, so we use the standard OpenAI connector.
var builder = Kernel.CreateBuilder();

builder.AddOpenAIChatCompletion(
    modelId: "llama3",               // The model name you pulled in Ollama
    apiKey: "ollama",                // Ollama ignores this, but the SDK requires a non-empty string
    endpoint: new Uri("http://localhost:11434/v1")); // The local API endpoint

var kernel = builder.Build();

// 2. Define our messy input data
var rawLogEntry = @"
[2025-12-04 14:22:11] [ERROR] [AuthService] Connection timeout while reaching DB_USERS (192.168.1.55). 
Retry count: 3. Exception: System.Net.Sockets.SocketException: Host is down.
";

// 3. Create the Prompt
// We ask the model to act as a parser.
var prompt = $@"
You are a system log analyzer. 
Analyze the following log entry and extract the Timestamp, Service Name, Error Type, and Root Cause.
Provide the output as a concise summary.

Log Entry:
{rawLogEntry}
";

// 4. Run it locally
Console.WriteLine("Analyzing log locally...");
var result = await kernel.InvokePromptAsync(prompt);

Console.WriteLine("\n--- Analysis Result ---");
Console.WriteLine(result);

Expected Output

Because this runs entirely on your machine, there are no network round-trips; how quickly the output appears depends only on your GPU/CPU.

--- Analysis Result ---
Timestamp: 2025-12-04 14:22:11
Service: AuthService
Error Type: Connection Timeout
Root Cause: The host DB_USERS (192.168.1.55) is down (SocketException).

Advanced Tips for Local Models

Context Window Management

Local models often have smaller context windows (e.g., 4k or 8k tokens) compared to cloud models (128k). If you’re analyzing huge log files, you’ll need to split them into chunks.
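A minimal chunking sketch is shown below. The `ChunkLog` helper and the ~4-characters-per-token heuristic are illustrative assumptions, not part of Semantic Kernel; for precise budgets you'd use a real tokenizer for your model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Pretend this came from a large file, e.g. File.ReadAllText("app.log").
var bigLog = string.Concat(Enumerable.Repeat("[ERROR] something went wrong\n", 2000));

// Each chunk can then be sent through kernel.InvokePromptAsync separately.
foreach (var chunk in ChunkLog(bigLog, maxTokens: 2000))
    Console.WriteLine($"Chunk: {chunk.Length} chars");

// Naive line-based chunker: splits a large log into pieces small enough
// to fit a model's context window, never breaking a line in half.
static List<string> ChunkLog(string text, int maxTokens)
{
    int maxChars = maxTokens * 4; // rough heuristic: ~4 characters per token
    var chunks = new List<string>();
    var current = new StringBuilder();

    foreach (var line in text.Split('\n'))
    {
        // Start a new chunk if adding this line would exceed the budget.
        if (current.Length + line.Length + 1 > maxChars && current.Length > 0)
        {
            chunks.Add(current.ToString());
            current.Clear();
        }
        current.Append(line).Append('\n');
    }
    if (current.Length > 0) chunks.Add(current.ToString());
    return chunks;
}
```

Splitting on line boundaries matters for logs: a log entry cut mid-line would lose the context the model needs to extract a root cause.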

Temperature Settings

For extraction tasks like this, you want the model to be precise, not creative. When creating your request, set the Temperature to 0.

// Requires: using Microsoft.SemanticKernel.Connectors.OpenAI;
var settings = new OpenAIPromptExecutionSettings { Temperature = 0 };
var result = await kernel.InvokePromptAsync(prompt, new(settings));

Hardware Requirements

  • 7B/8B Models (Llama 3, Mistral): Require ~8GB RAM. They run decently on CPU and fast on Apple Silicon or NVIDIA GPUs.
  • Phi-3 (3.8B): Requires ~4GB RAM. Runs great on almost anything.
  • 70B Models: Require ~48GB RAM. You’ll need a serious workstation or Mac Studio.
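The rough tiers above can be encoded as a quick rule-of-thumb helper. `SuggestModel` is a hypothetical function for illustration only, using the RAM estimates listed here; real requirements also depend on quantization level.

```csharp
using System;

// Illustrative helper (not part of any SDK): maps available RAM in GB
// to the largest model tier from the list above that should fit comfortably.
static string SuggestModel(int availableRamGb) => availableRamGb switch
{
    >= 48 => "llama3:70b", // serious workstation or Mac Studio
    >= 8  => "llama3",     // 8B class: most modern laptops
    >= 4  => "phi3",       // 3.8B class: runs on almost anything
    _     => "too little RAM; consider a smaller quantization or a cloud API"
};

Console.WriteLine(SuggestModel(16)); // prints "llama3"
```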

Running AI locally puts you in full control. Whether for privacy compliance or just building cool tools that work offline, the combination of Ollama and .NET is incredibly potent.

Further Reading