# AURA
Offline AI developer assistant with voice commands and system automation.
## Overview

### Problem
Cloud-based AI assistants require constant internet access, raise data privacy concerns for enterprise code, and incur recurring API costs. Developers need an AI assistant that works completely offline with full control over their data.
### Solution
Created a desktop AI assistant that runs entirely on local hardware using Ollama for LLM inference. Features voice commands, codebase-aware context via ChromaDB embeddings, and a modular plugin architecture for system automation tasks.
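The offline inference loop can be sketched against Ollama's local REST API (served at `localhost:11434` by default). This is a minimal illustration, not AURA's actual client code; the model name is a placeholder, and the helper that builds the request payload is separated out so it can be inspected without a running server.

```python
import json
import urllib.request

# Ollama's default local endpoint; no traffic ever leaves the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt: str, model: str = "llama3") -> dict:
    """Build a non-streaming generation request for the local Ollama server."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt: str, model: str = "llama3") -> str:
    """Send the prompt to the locally running Ollama daemon and return its reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(prompt, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because the only HTTP traffic is to the local daemon, the same call path works on an air-gapped machine.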
## Tech Stack

- Python
- Ollama (local LLM inference)
- Whisper (speech recognition)
- ChromaDB (vector embeddings)
- PyQt6 (desktop GUI)

## Architecture

Voice input is transcribed by Whisper and routed either to a system-automation plugin or to the LLM. LLM requests are served locally by Ollama, with relevant code context retrieved from ChromaDB embeddings of the indexed codebase. The PyQt6 GUI ties the components together; the plugin system keeps automation tasks out of the core.
## Features
- Fully offline AI inference using local LLMs via Ollama
- Voice command interface with Whisper speech recognition
- Codebase-aware context through ChromaDB vector embeddings
- Modular plugin system for extensible automation
- Git integration for commit summaries and code review
- Desktop GUI built with PyQt6
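The codebase-aware context feature amounts to nearest-neighbour search over embedded code chunks. The sketch below illustrates the idea only, substituting a toy bag-of-words vector for a real embedding model; all function names are illustrative, not AURA's actual API.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system uses a learned embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k code chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

In the real pipeline the chunks and their embeddings live in a persistent ChromaDB collection on local disk, so the index survives restarts.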
## Technical Decisions
- **Ollama** for local LLM serving: supports multiple model families without GPU lock-in
- **ChromaDB** for persistent vector storage with fast similarity search on local disk
- **PyQt6** over Electron to keep the stack Python-native and reduce memory overhead
- **Plugin architecture** so users can extend functionality without modifying the core
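One common shape for this kind of plugin decision is a decorator-based registry that the core dispatches into, so plugins never require edits to core code. This is a sketch of the pattern under assumed names, not AURA's actual plugin interface.

```python
from typing import Callable

# Registry mapping plugin names to callables; populated at import time.
PLUGINS: dict[str, Callable[..., str]] = {}


def plugin(name: str):
    """Decorator that registers a function as a named automation plugin."""
    def wrap(fn: Callable[..., str]) -> Callable[..., str]:
        PLUGINS[name] = fn
        return fn
    return wrap


@plugin("open_terminal")
def open_terminal() -> str:
    # A real plugin would spawn a terminal; stubbed here for illustration.
    return "terminal opened"


def dispatch(name: str, *args) -> str:
    """The core invokes plugins only through the registry, never directly."""
    if name not in PLUGINS:
        raise KeyError(f"unknown plugin: {name}")
    return PLUGINS[name](*args)
```

New automation tasks then ship as separate modules that register themselves on import.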
## Challenges
- Optimizing LLM inference speed on consumer hardware without dedicated GPU
- Building reliable voice-to-command pipeline with acceptable error rates
- Managing vector index updates efficiently as the codebase evolves
## Outcomes
- Published to PyPI as `orkio`, installable via `pip install orkio`
- Zero cloud dependency: works on air-gapped machines
- MCP (Model Context Protocol) server variant for IDE integration