AURA

Offline AI developer assistant with voice commands and system automation.


Overview

Problem

Cloud-based AI assistants require constant internet access, raise data privacy concerns for enterprise code, and incur recurring API costs. Developers need an AI assistant that works completely offline with full control over their data.

Solution

Created a desktop AI assistant that runs entirely on local hardware using Ollama for LLM inference. Features voice commands, codebase-aware context via ChromaDB embeddings, and a modular plugin architecture for system automation tasks.
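The local-inference piece can be sketched with a call to Ollama's standard HTTP API (it listens on port 11434 by default and exposes `/api/generate`); this is an illustrative minimal client, not AURA's actual code, and assumes `ollama serve` is running with the named model already pulled:

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint

def build_payload(model: str, prompt: str) -> dict:
    """Assemble the JSON body for a non-streaming generate request."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_local_llm(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the completion.

    Requires a running `ollama serve` with the model pulled,
    e.g. `ollama pull llama3`.
    """
    body = json.dumps(build_payload(model, prompt)).encode()
    req = request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because everything goes over localhost, no prompt or code ever leaves the machine.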

Tech Stack

Python · Ollama · Docker · PyQt6 · ChromaDB · Git · Whisper

Architecture

[Architecture diagram: high-level architecture of AURA]

Features

  • Fully offline AI inference using local LLMs via Ollama
  • Voice command interface with Whisper speech recognition
  • Codebase-aware context through ChromaDB vector embeddings
  • Modular plugin system for extensible automation
  • Git integration for commit summaries and code review
  • Desktop GUI built with PyQt6
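The codebase-aware context feature amounts to chunking source files and adding them to a ChromaDB collection, which embeds documents with its default local model. A hedged sketch, assuming a `collection` obtained elsewhere (e.g. from `chromadb.PersistentClient(...).get_or_create_collection("codebase")`), with hypothetical names throughout:

```python
from pathlib import Path

def chunk_source(text: str, max_lines: int = 40) -> list[str]:
    """Split a source file into fixed-size line chunks for embedding."""
    lines = text.splitlines()
    return [
        "\n".join(lines[i:i + max_lines])
        for i in range(0, len(lines), max_lines)
    ]

def index_codebase(root: str, collection) -> None:
    """Add every Python file under `root` to a ChromaDB collection.

    When only `documents` are passed, ChromaDB computes the embeddings
    itself with its built-in local model, so indexing stays offline.
    """
    for path in Path(root).rglob("*.py"):
        for n, chunk in enumerate(chunk_source(path.read_text())):
            collection.add(ids=[f"{path}:{n}"], documents=[chunk])
```

At query time, `collection.query(query_texts=[prompt], n_results=5)` returns the nearest chunks, which are prepended to the LLM prompt as context.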

Technical Decisions

1. Ollama for local LLM serving — supports multiple model families without GPU lock-in
2. ChromaDB for persistent vector storage with fast similarity search on local disk
3. PyQt6 over Electron to keep the stack Python-native and reduce memory overhead
4. Plugin architecture to allow users to extend functionality without modifying core
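The plugin decision (point 4) is commonly implemented as a decorator-based registry, so third-party code can hook in without touching the core dispatcher. A minimal sketch with hypothetical names (the real AURA plugin API may differ):

```python
from typing import Callable

PLUGINS: dict[str, Callable] = {}

def plugin(name: str):
    """Register a function as a named automation plugin."""
    def register(fn: Callable) -> Callable:
        PLUGINS[name] = fn
        return fn
    return register

@plugin("git.status")
def git_status() -> str:
    # Hypothetical example plugin; a real one would shell out to git.
    return "clean"

def dispatch(command: str, *args):
    """Route a parsed voice command to its registered plugin."""
    if command not in PLUGINS:
        raise KeyError(f"no plugin registered for {command!r}")
    return PLUGINS[command](*args)
```

Users extend the assistant by decorating new functions; the core only ever calls `dispatch`, which is what keeps it unmodified.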

Challenges

  • Optimizing LLM inference speed on consumer hardware without dedicated GPU
  • Building reliable voice-to-command pipeline with acceptable error rates
  • Managing vector index updates efficiently as the codebase evolves
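The last challenge, keeping the vector index current, is typically handled by re-embedding only files whose content hash has changed between passes. A sketch under that assumption, with hypothetical helper names:

```python
import hashlib
from pathlib import Path

def file_digest(path: Path) -> str:
    """Content hash of a file, used to detect edits between index passes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def stale_files(root: str, seen: dict[str, str]) -> list[Path]:
    """Return files changed since the last pass.

    `seen` maps path -> previous digest and is updated in place, so only
    the returned files need their chunks re-embedded in the vector store.
    """
    changed = []
    for path in sorted(Path(root).rglob("*.py")):
        digest = file_digest(path)
        if seen.get(str(path)) != digest:
            seen[str(path)] = digest
            changed.append(path)
    return changed
```

Run on a timer or a filesystem watcher, this keeps index maintenance proportional to the edit rate rather than the codebase size.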

Outcomes

  • Published to PyPI as orkio — installable via pip install orkio
  • Zero cloud dependency — works on air-gapped machines
  • MCP server variant for IDE integration
