---
title: "LEANN: Natural Language Search for Your Local Files"
date: "2025-09-24"
tags: ["ai", "search", "mcp", "productivity", "development", "privacy"]
draft: true
---
Over the last six months, as my use of Claude Code has increased, I’ve been jumping between dozens of different codebases. I often switch between a few clients over the course of a month, and I might not touch an OSS project for weeks or months at a time. Each time I come back, I have to rediscover where things live and how the pieces fit together.
I recently discovered LEANN, a project that tackles this exact problem with a privacy-first approach to semantic search across local codebases, as well as other documents, emails, and messages. After testing it extensively in my consulting work, I can say this is significant - it’s the kind of practical AI application that solves real problems without compromising on privacy or requiring massive infrastructure investments.
LEANN: Privacy-First Semantic Search That Actually Works
LEANN provides a local-first solution that indexes your projects using natural language understanding. The key differentiators that make this compelling for consulting work:
Local processing: No code leaves your machine - critical for consulting work with sensitive client codebases. I’ve spent over a decade working with enterprise clients where data privacy isn’t negotiable, and this approach respects those constraints completely.
Semantic understanding: Search for “database connection pooling” and find relevant configs even if they don’t contain those exact words. This is transformative when working with legacy codebases where naming conventions vary wildly across different modules.
Cross-project intelligence: One index can span multiple repositories and project structures. In microservices architectures, understanding how services interact often requires searching across multiple repositories - LEANN’s cross-project indexing solves this problem elegantly.
Lightweight: Uses efficient embedding models that don’t require GPU clusters or cloud connectivity.
The significance here is that we get Google-quality search across our local development environment while maintaining complete privacy and control.
Real-World Setup: Getting LEANN Running in Minutes
The installation process uses modern Python toolchain management, which I appreciate after dealing with too many fragile Python environments over the years:
# Install with uv (the tool I recommend for all Python project management)
uv tool install leann-core --with leann
# Index a project - works with any file structure
leann build my-client-microservices --docs $(git ls-files)
# The magic: natural language queries against your codebase
leann search my-client-microservices "how do we handle database timeouts in the payment service"
This approach is superior to traditional IDE search because you’re building a semantic understanding of your entire project structure that persists across IDE restarts and works regardless of which editor you’re using. I can index a complex distributed system once and then ask conceptual questions about it for weeks.
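To make the "index once, across projects" idea concrete, here is a minimal sketch that gathers the tracked files of several repositories and hands them to a single `leann build` call. It assumes `--docs` accepts a list of file paths, as the `$(git ls-files)` form above suggests. The helper names, the `DRY_RUN` variable, and the example paths are hypothetical, and paths containing whitespace are not handled.

```shell
#!/bin/sh
# Sketch: one LEANN index spanning several git repositories.
# Helper names and DRY_RUN are illustrative, not part of LEANN.

# Read repo-relative paths on stdin and print them prefixed with the repo root.
prefix_paths() {
    repo_root=${1%/}            # drop a trailing slash, if any
    while IFS= read -r rel; do
        if [ -n "$rel" ]; then
            printf '%s/%s\n' "$repo_root" "$rel"
        fi
    done
}

# Print every tracked file of every given repository, one path per line.
list_tracked_files() {
    for repo in "$@"; do
        git -C "$repo" ls-files | prefix_paths "$repo"
    done
}

# Build (or just print, when DRY_RUN=1) a single index over all repositories.
build_combined_index() {
    index_name=$1
    shift
    files=$(list_tracked_files "$@")
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Word splitting of $files is intentional, mirroring $(git ls-files).
        echo "leann build $index_name --docs" $files
    else
        leann build "$index_name" --docs $files
    fi
}

# Example (hypothetical paths):
#   DRY_RUN=1 build_combined_index my-client-microservices ~/work/payments ~/work/users
```

Running with `DRY_RUN=1` first is a cheap way to see exactly which files would be indexed before committing to a long build. Very large repositories may exceed the shell's argument-length limit with this approach.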
Practical Examples: LEANN in Action
After indexing a typical microservices project, here’s what LEANN searches look like compared to traditional approaches:
Traditional grep approach:
# Looking for authentication code - misses semantic variations
grep -r "authenticate" src/
grep -r "auth" src/
grep -r "login" src/
# Results: lots of noise, missed relevant code using different terminology
LEANN semantic search:
# Natural language query finds conceptually related code
leann search my-client-microservices "user authentication and authorization"
# Results include files with:
# - JWT token validation in middleware/security.go
# - OAuth configuration in config/auth.yaml
# - User permission checks in services/user_service.py
# - Session management in handlers/session.js
The difference is striking. LEANN understands that “authentication” relates to tokens, sessions, permissions, and security middleware - even when those files don’t contain the exact word “authenticate.”
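For contrast, the closest keyword-only equivalent is a hand-maintained synonym list. The sketch below is not LEANN. It is a plain `grep` baseline, with an illustrative function name and pattern, that shows why keyword search depends on guessing the vocabulary in advance.

```shell
#!/bin/sh
# Keyword baseline: list files matching any term in a hand-written synonym list.
# Nothing here is semantic; a file is found only if it contains one of the terms.

files_matching() {
    pattern=$1   # extended regex, e.g. 'auth|login|token|session|permission'
    dir=$2
    # -r recurse, -i ignore case, -l print file names only, -E extended regex
    grep -rilE -- "$pattern" "$dir" | sort
}

# Example:
#   files_matching 'auth|login|token|session|permission' src/
```

A file that only talks about "credentials" or "identity" is silently missed until someone adds those words to the pattern. That is the gap a semantic index is meant to close.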
Finding Configuration Patterns:
# Traditional approach requires knowing exact config keys
grep -r "timeout" config/
grep -r "retry" config/
grep -r "circuit" config/
# LEANN finds resilience patterns conceptually
leann search my-client-microservices "how do we handle service failures and retries"
# Results span multiple config files and code implementations:
# - HTTP timeout configs in api_config.yaml
# - Database connection pool settings in db.yaml
# - Circuit breaker implementations in lib/resilience.go
# - Retry policies in message_handlers.py
Cross-Service Dependency Discovery:
# This is nearly impossible with traditional tools
leann search my-client-microservices "which services depend on the user service"
# LEANN finds:
# - Direct API calls in payment_service/client.go
# - Event subscriptions in notification_service/events.py
# - Database foreign key references in schema migrations
# - Docker compose service dependencies
# - Kubernetes service mesh configurations
Emergency Troubleshooting Scenarios: During a production incident, these searches become invaluable:
# Find all error handling for a specific failure mode
leann search my-client-microservices "database connection pool exhausted error handling"
# Locate monitoring and alerting setup
leann search my-client-microservices "how do we monitor memory usage and garbage collection"
# Find similar bug fixes from project history
leann search my-client-microservices "deadlock detection and recovery patterns"
Each query returns relevant code across your entire indexed project structure in under 200ms. Compare this to manually hunting through multiple repositories, documentation sites, and configuration files - the time savings compound quickly.
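Latency will vary with index size, hardware, and embedding model, so it is worth measuring on your own index. The following is a rough, portable sketch: because `date +%s` only has one-second resolution, it runs a command several times and reports the average in whole milliseconds. The function name is illustrative, and the index name is the one built earlier in this post.

```shell
#!/bin/sh
# Rough average wall-clock time of a command, in whole milliseconds.
# Accuracy improves with more runs because `date +%s` counts whole seconds.

avg_ms() {
    runs=$1
    shift
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard output; a failing run still counts toward the total time.
        "$@" >/dev/null 2>&1 || true
        i=$((i + 1))
    done
    end=$(date +%s)
    echo $(( (end - start) * 1000 / runs ))
}

# Example:
#   avg_ms 20 leann search my-client-microservices "database connection pool exhausted error handling"
```

Each run includes process startup and any model loading, so the result is an upper bound on the query time itself.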
Claude Code Integration: The MCP Server That Changes Everything
This is where LEANN becomes genuinely transformative for consulting work. The MCP (Model Context Protocol) server integration means Claude Code can automatically search and understand your local projects:
# One-time setup to connect LEANN with Claude Code
claude mcp add --scope user leann-server -- leann_mcp
Now when working with Claude Code, you can ask questions like:
- “Show me all the places we configure Redis timeouts”
- “Find examples of our error handling patterns for API calls”
- “Where do we implement circuit breakers in this microservices architecture”
Claude gets semantic search results from LEANN and can reason about your actual codebase, not just general programming knowledge. This creates a development environment that’s genuinely more intelligent - not because it’s connected to the cloud, but because it understands your local context better than any external service ever could.
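Before registering the server, it can help to confirm that both executables are on your `PATH`, since a missing `leann_mcp` binary may only surface later as a connection failure inside Claude Code. A small preflight sketch follows. The function names are illustrative, and the registration line is the same one shown above.

```shell
#!/bin/sh
# Preflight check before wiring LEANN into Claude Code.

# Succeed only if every named command is found on PATH; report the missing ones.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

register_leann_mcp() {
    require_cmds claude leann_mcp || return 1
    claude mcp add --scope user leann-server -- leann_mcp
    # List registered servers to confirm the entry is present.
    claude mcp list
}

# Example:
#   register_leann_mcp
```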
Why This Matters for Distributed Systems Practitioners
The business impact here is substantial. A senior consultant billing $200/hour who saves 30 minutes daily through better code discovery pays for this tool setup in less than a week while delivering faster client results.
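The arithmetic behind that claim is easy to check. The sketch below uses the figures from the paragraph above ($200/hour, 30 minutes saved per day). The two hours of setup time and the five-day week are placeholder assumptions, so substitute your own numbers.

```shell
#!/bin/sh
# Back-of-the-envelope payback calculation (integer arithmetic only).

# Dollars saved per week: rate * minutes_saved_per_day * days_per_week / 60
weekly_savings() {
    rate=$1
    minutes_per_day=$2
    days_per_week=$3
    echo $(( rate * minutes_per_day * days_per_week / 60 ))
}

# Working days until setup time is recovered: ceil(setup_minutes / minutes_saved_per_day).
# The hourly rate cancels out, since setup and savings are valued at the same rate.
break_even_days() {
    setup_hours=$1
    minutes_per_day=$2
    echo $(( (setup_hours * 60 + minutes_per_day - 1) / minutes_per_day ))
}

weekly_savings 200 30 5    # prints 500
break_even_days 2 30       # prints 4
```

Under these assumptions the setup pays for itself in four working days, which is consistent with "less than a week". A longer setup or smaller daily saving shifts the break-even accordingly.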
Knowledge Transfer: When onboarding new team members to complex distributed systems, they can ask natural language questions about the codebase instead of spending weeks reading documentation that may be outdated. I’ve seen teams reduce onboarding time from weeks to days using this approach.
Cross-Service Understanding: Traditional search tools make it painful to understand how services interact across repository boundaries. With LEANN’s cross-project indexing, you can ask “where do we call the user service from other microservices” and get results spanning your entire codebase.
Emergency Troubleshooting: During production incidents, LEANN allows you to quickly find relevant code patterns, configuration examples, or similar bug fixes across your entire project history. This can substantially reduce the time required to identify root causes.
Implementation Strategy and Next Steps
My recommendation for rolling this out effectively:
Start Small: Index one complex project first to validate the approach. I typically start with the most problematic repository - the one where developers consistently struggle to find things.
Team Adoption: Set up shared indexing strategies for team projects. Consider creating organization-wide indexes for common libraries and shared infrastructure code.
CI Integration: Consider automating index rebuilds as part of your deployment pipeline. Fresh indexes ensure search results stay current as codebases evolve.
Documentation Strategy: Use LEANN to bridge the gap between code and documentation. Index both your code repositories and your documentation sites together for comprehensive project understanding.
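For the CI integration step, one way to automate rebuilds is to re-index only when the repository has moved since the last indexing run. The sketch below records the last indexed commit in a stamp file. The stamp file name and function names are hypothetical. Whether re-running `leann build` on an existing index refreshes it or needs an extra flag is worth confirming with `leann build --help`.

```shell
#!/bin/sh
# Sketch: rebuild a LEANN index only when HEAD differs from the last indexed commit.

# Return 0 (rebuild needed) when the stamp file is missing or holds a different revision.
needs_rebuild() {
    stamp_file=$1
    current_rev=$2
    if [ ! -f "$stamp_file" ]; then
        return 0
    fi
    [ "$(cat "$stamp_file")" != "$current_rev" ]
}

# Run from the repository root, e.g. as a CI step or a post-merge hook.
refresh_index() {
    index_name=$1
    stamp_file=${2:-.leann-indexed-rev}
    rev=$(git rev-parse HEAD) || return 1
    if needs_rebuild "$stamp_file" "$rev"; then
        # Record the revision only if the build succeeded.
        leann build "$index_name" --docs $(git ls-files) && echo "$rev" > "$stamp_file"
    else
        echo "index $index_name already at $rev"
    fi
}

# Example:
#   refresh_index my-client-microservices
```

If the stamp file lives inside the repository, add it to `.gitignore` so it does not show up as an untracked change.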
The objective is to transform how distributed teams discover and understand code, moving from keyword-based archaeology to intelligent, context-aware search.
The Privacy Advantage
For consulting work, the local-first approach isn’t just nice-to-have - it’s essential. Client confidentiality agreements often prohibit sending code to external services, making cloud-based AI coding assistants unusable for sensitive projects. LEANN respects these constraints while still providing a modern AI-powered developer experience.
This approach also means your search performance isn’t dependent on internet connectivity or external service availability. The index stays local, searches are instant, and your development workflow remains uninterrupted regardless of network conditions.
Conclusion
LEANN represents exactly the kind of practical AI tooling that makes developers more effective without introducing privacy risks or infrastructure complexity. The combination of semantic search with Claude Code integration creates a development environment where finding relevant code becomes as natural as asking a question.
After using this extensively across multiple client engagements, I can confidently say this approach can save your organization significant money by reducing the time developers spend hunting through codebases. More importantly, it democratizes deep codebase knowledge across the entire team, reducing the expertise bottlenecks that often slow down distributed systems projects.
The project is actively maintained and available on GitHub. If you’re working with complex, multi-repository systems, I highly recommend giving LEANN a try - it might just change how your team approaches code discovery and understanding.
If you found this post helpful, please consider sharing it with your network. I'm also available to help you succeed with your distributed systems! Please reach out if you're interested in working with me, and I'll be happy to schedule a free one-hour consultation.