Deployment

This guide covers how to deploy Nevron to your own server using Docker Compose. This is the recommended method for running Nevron in a production environment.

Server Deployment with Docker Compose

Deploying Nevron to your server is straightforward with our Docker Compose setup. This approach allows you to run the entire application stack with minimal configuration.

Official Docker Image

Nevron is available as an official Docker image on Docker Hub:

docker pull axiomai/nevron:latest
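
For reproducible production deployments, consider pinning a specific version tag instead of latest (the tag below is a placeholder; check Docker Hub for the available tags):

docker pull axiomai/nevron:&lt;version&gt;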

You can also build the image locally if needed:

docker build -t axiomai/nevron:latest .

Deployment Steps

1. Create Required Volumes

First, create the necessary volume directories based on your configuration:

# Create base directories
mkdir -p volumes/nevron/logs

# For ChromaDB (if using ChromaDB as vector store)
mkdir -p volumes/nevron/chromadb

# For Qdrant (if using Qdrant as vector store)
mkdir -p volumes/qdrant/data
mkdir -p volumes/qdrant/snapshots

# For Ollama (if using local LLM)
mkdir -p volumes/ollama/models

Note: You only need to create volumes for the services you plan to use. For example, if you're using ChromaDB, you don't need the Qdrant volumes.
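
These host directories are bind-mounted into the containers by docker-compose.yml. As a rough illustration (the Qdrant and Ollama container paths are those images' documented defaults; the Nevron paths shown are assumptions, so check them against your compose file):

# Illustrative volume mappings in docker-compose.yml
services:
  nevron:
    volumes:
      - ./volumes/nevron/logs:/app/logs            # assumed in-container log path
      - ./volumes/nevron/chromadb:/app/chromadb    # assumed embedded ChromaDB path
  qdrant:
    volumes:
      - ./volumes/qdrant/data:/qdrant/storage
      - ./volumes/qdrant/snapshots:/qdrant/snapshots
  ollama:
    volumes:
      - ./volumes/ollama/models:/root/.ollama      # where Ollama keeps pulled models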

2. Configure Services in docker-compose.yml

Edit your docker-compose.yml file to enable or disable services based on your needs. You can disable a service by adding deploy: { replicas: 0 } to its configuration:

# Example: Disable Qdrant if using ChromaDB
qdrant:
  <<: *service_default
  image: qdrant/qdrant:latest
  deploy:
    replicas: 0  # This disables the Qdrant service
  # ... other configuration ...

# Example: Disable Ollama if using a third-party LLM
ollama:
  <<: *service_default
  image: ollama/ollama:latest
  deploy:
    replicas: 0  # This disables the Ollama service
  # ... other configuration ...
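
The <<: *service_default lines above merge in a shared YAML anchor defined near the top of the compose file. If you are writing your own file, a minimal sketch of such an anchor might look like this (the exact fields are assumptions; match them to your actual setup):

# Shared defaults merged into each service via <<: *service_default
x-service-default: &service_default
  restart: unless-stopped
  logging:
    driver: json-file
    options:
      max-size: "10m"
      max-file: "3"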

3. Configure Environment Variables

Create and configure your .env file:

cp .env.example .env

Edit the .env file to configure:

  1. Vector Store: Choose between ChromaDB and Qdrant

    # For ChromaDB
    MEMORY_BACKEND_TYPE=chroma
    
    # OR for Qdrant
    MEMORY_BACKEND_TYPE=qdrant
    

  2. LLM Provider: Choose between a local Ollama instance and a third-party LLM provider

    # For local Ollama
    LLAMA_PROVIDER=ollama
    LLAMA_OLLAMA_MODEL=llama3:8b-instruct
    
    # OR for third-party LLMs (choose one)
    OPENAI_API_KEY=your_key_here
    ANTHROPIC_API_KEY=your_key_here
    XAI_API_KEY=your_key_here
    # etc.
    

For a complete list of configuration options, refer to the Configuration documentation.
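
Putting it together, a minimal .env for a ChromaDB-plus-local-Ollama deployment might look like this (using only the variables shown in this guide):

# Minimal example: ChromaDB vector store with a local Ollama LLM
ENVIRONMENT=production
MEMORY_BACKEND_TYPE=chroma
LLAMA_PROVIDER=ollama
LLAMA_OLLAMA_MODEL=llama3:8b-instruct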

4. Start the Services

Launch the Nevron stack:

docker compose up -d
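
Once the containers are up, confirm that every enabled service is running:

docker compose ps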

5. Monitor Logs

View the logs to ensure everything is running correctly:

docker compose logs --follow
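
To follow a single service rather than the whole stack, append its name as defined in docker-compose.yml (the service name below is an assumption):

docker compose logs --follow nevron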

6. Stop Services (When Needed)

To stop all services:

docker compose down
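
This removes the containers but leaves your bind-mounted data under volumes/ intact. To pause the stack without removing the containers, use:

docker compose stop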

Configuration Details

Our Docker Compose setup includes:

  1. Service Definitions
     - Automatic restart policies
     - Proper logging configuration
     - Network isolation for security

  2. Volume Management
     - Persistent storage for logs and data
     - Configurable volume base directory

  3. Networking (see the sketch after this list)
     - Internal network for service communication
     - External network for API access

  4. Environment Configuration
     - Environment file support
     - Override capability for all settings
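
As an illustration of that network layout, services can be attached to an internal-only network for service-to-service traffic and an external-facing one for API access (the network names are assumptions; your compose file may differ):

# Illustrative network layout in docker-compose.yml
networks:
  internal:
    internal: true   # unreachable from outside the Compose network
  external:

services:
  nevron:
    networks:
      - internal
      - external
  qdrant:
    networks:
      - internal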

Production Considerations

When deploying to production, consider the following:

  1. Security
     - Use secure storage for API keys and sensitive data
     - Consider using Docker secrets for sensitive information
     - Implement proper network security rules

  2. Performance
     - Configure appropriate resource limits for containers (see the sketch after this list)
     - Monitor resource usage and adjust as needed
     - Consider using a dedicated server for high-traffic deployments

  3. Reliability
     - Set up health checks and automatic restarts
     - Implement proper backup strategies for memory backends
     - Use a production-grade process manager

  4. Monitoring
     - Set up proper logging and monitoring
     - Implement alerting for critical issues
     - Regularly check logs for errors or warnings

  5. Scaling
     - For high-load scenarios, consider scaling the vector database separately
     - Use a load balancer if deploying multiple Nevron instances
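
As a starting point for the resource-limit and health-check items above, here is a sketch (the CPU/memory values, health endpoint, and port are assumptions, and the check assumes curl is available in the image):

# Illustrative resource limits and health check for the Nevron service
services:
  nevron:
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 4g
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:8000/health || exit 1"]
      interval: 30s
      timeout: 5s
      retries: 3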

For optimal production deployments, we recommend:

  - Setting ENVIRONMENT=production in your configuration
  - Regular backups of memory storage
  - Using a reverse proxy (like Nginx or Traefik) for any exposed endpoints (see the sketch below)
  - Implementing proper monitoring and alerting
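
For the reverse-proxy recommendation, one option is to run Nginx as an additional Compose service in front of Nevron; a minimal sketch (the upstream port and config file are assumptions) might be:

# Illustrative reverse proxy added to docker-compose.yml
services:
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro   # your config, proxying to http://nevron:8000
    depends_on:
      - nevron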