In a recent project, an AI-powered chat application was successfully deployed from a local setup to Azure cloud infrastructure. The application, initially built using FastAPI, Docker, Postgres, Nginx, and llama.cpp, was adapted for cloud deployment to handle up to 1040 concurrent users with the current Azure credit limits. The cloud architecture leverages Azure Container Apps (ACA), which simplifies Kubernetes management, allowing for automatic scaling based on HTTP requests. The system uses Azure-managed PostgreSQL for database needs, ensuring scalability and reliability. Costs for this setup can range from $13 to $7000 monthly, depending on the hardware. The deployment process was streamlined using Azure CLI and infrastructure-as-code tools like Terraform. The app’s public URL was made memorable through DNS records, and future improvements include setting up CI/CD pipelines, enhancing monitoring, and optimizing the language model server for lower latency and higher concurrency.
Source: towardsdatascience.com
