AURA

Offline AI developer assistant with voice commands and system automation.

Overview

Problem

Cloud-based AI assistants require constant internet access, raise data privacy concerns for enterprise code, and incur recurring API costs. Developers need an AI assistant that works completely offline with full control over their data.

Solution

AURA is a desktop AI assistant that runs entirely on local hardware, using Ollama for LLM inference. It provides voice commands, codebase-aware context via ChromaDB embeddings, and a modular plugin architecture for system automation tasks.

Tech Stack

Python, Ollama, Docker, PyQt6, ChromaDB, Git, Whisper

Architecture

Features

Fully offline AI inference using local LLMs via Ollama
Voice command interface with Whisper speech recognition
Codebase-aware context through ChromaDB vector embeddings
Modular plugin system for extensible automation
Git integration for commit summaries and code review
Desktop GUI built with PyQt6
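
To illustrate the voice command feature listed above: once speech recognition has produced a transcript, the text still has to be mapped to an action. The following is a minimal sketch, assuming a fixed phrase table and fuzzy matching with the standard library's difflib. The phrases, the action names, and the 0.6 cutoff are hypothetical and are not taken from AURA's code.

```python
import difflib
from typing import Optional

# Hypothetical phrase table: spoken phrase -> action name.
COMMANDS = {
    "summarize last commit": "git.summarize_commit",
    "review staged changes": "git.review_staged",
    "open project folder": "system.open_project",
}


def normalize(transcript: str) -> str:
    """Lowercase the transcript and drop punctuation that speech-to-text tends to add."""
    kept = "".join(ch for ch in transcript.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def match_command(transcript: str, cutoff: float = 0.6) -> Optional[str]:
    """Return the action for the closest known phrase, or None if nothing is close enough."""
    candidates = difflib.get_close_matches(
        normalize(transcript), list(COMMANDS), n=1, cutoff=cutoff
    )
    return COMMANDS[candidates[0]] if candidates else None
```

Returning None for unmatched input lets the caller ask the user to repeat the command rather than run a wrong action, which is one way to keep the error rate of the pipeline acceptable.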

Technical Decisions

Ollama for local LLM serving — supports multiple model families without GPU lock-in
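
A minimal sketch of talking to a local Ollama server over its REST API, using only the standard library. It assumes the server listens on Ollama's default port 11434 and that a model has already been pulled; the helper names are illustrative and not AURA's actual code.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if the server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for the /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

With the server running, a call such as `generate("llama3", "Explain this stack trace")` returns the completion as a string; "llama3" is an assumed model name, and any locally pulled model can be substituted.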

ChromaDB for persistent vector storage with fast similarity search on local disk
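
One way the indexing side could look, as a sketch under stated assumptions: source files are split into fixed-size line chunks and upserted into an on-disk Chroma collection. The chunk size, the ID scheme, the `./aura_index` directory, and the collection name are all hypothetical. The `chromadb` import is deferred so the chunking helpers run without the package installed.

```python
from typing import Iterator, Tuple


def chunk_lines(text: str, max_lines: int = 40) -> Iterator[Tuple[int, str]]:
    """Yield (start_line, chunk) pairs of at most max_lines lines each."""
    lines = text.splitlines()
    for start in range(0, len(lines), max_lines):
        yield start + 1, "\n".join(lines[start:start + max_lines])


def index_file(collection, path: str, text: str) -> int:
    """Upsert one file's chunks into a Chroma collection and return the chunk count."""
    chunks = list(chunk_lines(text))
    if chunks:
        collection.upsert(
            ids=[f"{path}:{start}" for start, _ in chunks],
            documents=[chunk for _, chunk in chunks],
            metadatas=[{"path": path, "start_line": start} for start, _ in chunks],
        )
    return len(chunks)


def open_collection(db_dir: str = "./aura_index"):
    """Open or create an on-disk collection. Requires `pip install chromadb`."""
    import chromadb  # imported lazily so the helpers above work without it

    client = chromadb.PersistentClient(path=db_dir)
    return client.get_or_create_collection(name="codebase")
```

Retrieval would then be a call like `collection.query(query_texts=["where is the config loaded?"], n_results=3)`. For a fully offline setup, the embedding model Chroma uses would need to be available locally, either cached in advance or supplied as a custom embedding function.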

PyQt6 over Electron to keep the stack Python-native and reduce memory overhead

Plugin architecture to allow users to extend functionality without modifying core

Challenges

Optimizing LLM inference speed on consumer hardware without a dedicated GPU
Building a reliable voice-to-command pipeline with acceptable error rates
Managing vector index updates efficiently as the codebase evolves
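
For the index-update challenge in the list above, one common approach is to keep a manifest of content hashes and embed only the files whose hash has changed. The sketch below assumes that approach; it is not confirmed to be what AURA does, and the function names and manifest shape are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


def file_digest(content: bytes) -> str:
    """Stable fingerprint of a file's content."""
    return hashlib.sha256(content).hexdigest()


def plan_reindex(
    manifest: Dict[str, str], files: Dict[str, bytes]
) -> Tuple[List[str], List[str], Dict[str, str]]:
    """Compare stored digests against current file contents.

    Returns (paths to embed again, paths to delete from the index, new manifest).
    """
    new_manifest = {path: file_digest(data) for path, data in files.items()}
    # New or modified files: digest missing from, or different in, the old manifest.
    changed = sorted(p for p, d in new_manifest.items() if manifest.get(p) != d)
    # Files that were indexed before but no longer exist.
    removed = sorted(p for p in manifest if p not in files)
    return changed, removed, new_manifest
```

Unchanged files are skipped entirely, so the cost of a refresh scales with the size of the edit rather than the size of the codebase.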

Outcomes

Published to PyPI as orkio — installable via pip install orkio
Zero cloud dependency — works on air-gapped machines
MCP server variant for IDE integration