PicoAI Assistant

Full-Stack Developer & System Architect · 2025

AI/ML · Full-Stack · System Architecture

An enterprise-grade, multi-agent AI chat platform with real-time streaming, tool execution, and RAG.

11+ MCP Tools

Overview

PicoAI Assistant is a three-tier platform (Next.js 15 frontend, Express.js gateway, Laravel 12 backend) that enables businesses to deploy multiple specialized AI assistants from a single self-hosted system. The gateway orchestrates real-time SSE streaming across multiple AI providers while the backend manages data, authentication, and an MCP server for extensible tool execution.

The Problem

Deploy multiple specialized AI assistants (e.g., e-commerce support, internal knowledge bot) from a single platform
Give AI agents access to live business data and external APIs — not just static training data
Maintain full control over agent behavior, tool access, and usage limits per user
Stream responses in real-time with rich content rendering (charts, product carousels, images)
Keep a self-hosted, privacy-first architecture with no dependency on third-party SaaS chat platforms

Solution Architecture

The gateway is stateless (no database) — all persistence routes through Laravel. AI streaming (SSE) happens at the gateway level, keeping the backend focused on data operations. Data flows from Frontend → Gateway (session cookie) → Laravel (Bearer token) → MySQL.
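As a rough illustration of the gateway-level streaming described above, the sketch below formats a single Server-Sent Events frame. The function name `sseFrame`, the event name, and the payload shape are assumptions for illustration; only the `event:`/`data:` line format and the response headers follow the SSE standard.

```typescript
// Response headers an SSE endpoint typically sends.
const SSE_HEADERS: Record<string, string> = {
  "Content-Type": "text/event-stream",
  "Cache-Control": "no-cache",
  Connection: "keep-alive",
};

// Format one SSE frame: an optional "event:" line, a "data:" line,
// and a blank line that terminates the frame.
// Hypothetical helper; the project's own framing code is not shown in the source.
function sseFrame(data: unknown, event?: string): string {
  const lines: string[] = [];
  if (event) lines.push(`event: ${event}`);
  // JSON.stringify escapes newlines, so the payload always fits on one data line.
  lines.push(`data: ${JSON.stringify(data)}`);
  return lines.join("\n") + "\n\n";
}
```

Because each frame is self-delimiting, a gateway can write frames to the open response as provider chunks arrive without holding any per-connection state beyond the stream itself.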

Client Browser → HTTPS → Nginx Reverse Proxy (SSL/TLS termination)

ai-chat-app (Next.js 15): React 19 UI, Zustand, React Query, Tailwind CSS, App Router
ai-gateway (Express.js): SSE Stream, AI Providers, MCP Client, Session Mgmt, Tool Exec
ai-core-service (Laravel 12): MySQL 8.0, Passport OAuth, MCP Server, RAG Pipeline, Livewire Admin

AI Providers: OpenAI, Claude, Ollama, Gemini
MySQL 8.0: Persistent Storage

Tech Stack

Frontend: Next.js 15, React 19, TypeScript, Zustand, React Query, Tailwind CSS, Recharts
Gateway: Express.js 5, TypeScript, OpenAI SDK, Anthropic SDK
Backend: Laravel 12, MySQL 8.0, Laravel Passport, Livewire 3
Infrastructure: Docker Compose, Nginx, MCP (JSON-RPC 2.0)

Key Features

Multi-Agent System

Each AI agent is independently configurable with its own model, instructions, temperature, max tokens, and tool assignments. Agents are linked to tools through a many-to-many relationship, so each assignment can carry its own priority and per-tool settings.
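One way to picture the agent-to-tool relationship is as a link record that carries the priority and settings for each assignment. The sketch below is an assumption about the shape of that data, not the project's schema: every type and field name is hypothetical, and treating a lower priority number as "first" is also an assumption.

```typescript
// Hypothetical tool catalog entry.
interface ToolRecord {
  id: number;
  name: string;
}

// Hypothetical link row between one agent and one tool (the many-to-many pivot).
interface AgentToolLink {
  toolId: number;
  priority: number; // assumption: lower number means higher priority
  settings: Record<string, unknown>;
}

// Hypothetical agent configuration mirroring the fields listed above.
interface AgentConfig {
  name: string;
  model: string;
  instructions: string;
  temperature: number;
  maxTokens: number;
  tools: AgentToolLink[];
}

interface ResolvedTool extends ToolRecord {
  settings: Record<string, unknown>;
}

// Resolve an agent's tools from the catalog, ordered by link priority.
// Links that point at a missing tool are skipped.
function resolveTools(agent: AgentConfig, catalog: ToolRecord[]): ResolvedTool[] {
  const byId = new Map<number, ToolRecord>();
  for (const tool of catalog) byId.set(tool.id, tool);

  const resolved: ResolvedTool[] = [];
  const links = [...agent.tools].sort((a, b) => a.priority - b.priority);
  for (const link of links) {
    const tool = byId.get(link.toolId);
    if (tool) resolved.push({ id: tool.id, name: tool.name, settings: link.settings });
  }
  return resolved;
}
```

Keeping the settings on the link rather than on the tool is what lets two agents share one tool while configuring it differently.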

Multi-Provider AI Streaming

A provider abstraction layer normalizes responses across OpenAI, Anthropic Claude, Ollama, and Gemini into a unified event stream. Adding a new provider requires implementing a single interface.
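A minimal sketch of what such an abstraction could look like follows. The interface name `AiProvider` appears in the architecture decisions; the event shapes, the `stream` method, and the `EchoProvider` stand-in are assumptions made for illustration and do not call any vendor SDK.

```typescript
// Hypothetical unified event type; the real event names are not given in the source.
type StreamEvent =
  | { type: "text"; delta: string }
  | { type: "tool_call"; id: string; name: string; args: Record<string, unknown> }
  | { type: "done" };

// Assumed shape of the single interface a provider implements.
interface AiProvider {
  readonly name: string;
  stream(prompt: string): AsyncGenerator<StreamEvent>;
}

// Stand-in provider that echoes the prompt word by word (no network calls).
class EchoProvider implements AiProvider {
  readonly name = "echo";

  async *stream(prompt: string): AsyncGenerator<StreamEvent> {
    for (const word of prompt.split(" ")) {
      yield { type: "text", delta: word + " " };
    }
    yield { type: "done" };
  }
}

// Consumer code depends only on the interface, never on a vendor SDK.
async function collectText(provider: AiProvider, prompt: string): Promise<string> {
  let output = "";
  for await (const event of provider.stream(prompt)) {
    if (event.type === "text") output += event.delta;
  }
  return output.trim();
}
```

Under this shape, a real provider would translate its SDK's native chunks into `StreamEvent` values inside `stream`, and everything downstream stays unchanged.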

MCP Integration

A full dual-role MCP implementation: Laravel MCP Server with 11+ auto-discovered tools (JSON-RPC 2.0 compliant), and Gateway MCP Client with 60-second session caching and tool format transformation.
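To make the client side more concrete, the sketch below builds a JSON-RPC 2.0 request for MCP's `tools/call` method and shows a small time-based cache of the kind a 60-second session cache might use. The helper names, the cache class, and the injected clock are assumptions; only the JSON-RPC envelope and the `tools/call` parameter names come from the respective specifications.

```typescript
// JSON-RPC 2.0 request envelope.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

let nextRequestId = 1;

// Build an MCP "tools/call" request. Tool names and arguments are caller-supplied.
function buildToolCall(name: string, args: Record<string, unknown>): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Hypothetical TTL cache; 60000 ms mirrors the 60-second caching described above.
class TtlCache<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number = 60000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, which keeps expiry testable
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

A short TTL like this trades a little staleness for fewer handshakes with the MCP server on every chat turn.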

RAG Pipeline

Type-aware document processing with strategy-based chunking, OpenAI text-embedding-3-large embeddings, MySQL JSON column storage, and cosine similarity search with configurable relevance thresholds.
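The retrieval step can be sketched as a brute-force cosine similarity scan with a relevance threshold. The project computes this in PHP; the version below is a TypeScript illustration of the same technique, and the type names, default threshold, and `topK` parameter are assumptions.

```typescript
// Hypothetical stored chunk: text plus its embedding vector.
interface Chunk {
  id: number;
  text: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for zero-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk, drop those below the threshold, return the best topK.
function searchChunks(
  query: number[],
  chunks: Chunk[],
  threshold: number = 0.75, // assumed default; the source only says "configurable"
  topK: number = 3
): Array<{ chunk: Chunk; score: number }> {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .filter((hit) => hit.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

A linear scan like this costs one pass over all stored vectors per query, which is the trade-off accepted in exchange for not running a separate vector database.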

External API Integration

A first-class Service → Tool → Endpoint model connects any HTTP API, with endpoint discovery, keyword filtering, authentication header injection, and streamed results.
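The Service → Tool → Endpoint layering might be modeled roughly as below, with keyword filtering to narrow the candidate endpoints and header injection when a request is built. All type names, field names, and the example URL used in any caller are hypothetical; the source does not describe the actual data model.

```typescript
// Hypothetical endpoint exposed by an external HTTP API.
interface ApiEndpoint {
  method: "GET" | "POST";
  path: string;
  description: string;
}

// Hypothetical service: a base URL plus an optional auth header to inject.
interface ApiService {
  baseUrl: string;
  authHeader?: { name: string; value: string };
}

// Hypothetical tool: groups endpoints of one service under a name the model can call.
interface ApiTool {
  name: string;
  service: ApiService;
  endpoints: ApiEndpoint[];
}

// Keep only endpoints whose path or description mentions at least one keyword.
function filterEndpoints(tool: ApiTool, keywords: string[]): ApiEndpoint[] {
  const needles = keywords.map((k) => k.toLowerCase());
  return tool.endpoints.filter((endpoint) => {
    const haystack = `${endpoint.path} ${endpoint.description}`.toLowerCase();
    return needles.some((needle) => haystack.includes(needle));
  });
}

// Build the outgoing request description, injecting the service's auth header if set.
function buildRequest(
  tool: ApiTool,
  endpoint: ApiEndpoint
): { url: string; method: string; headers: Record<string, string> } {
  const headers: Record<string, string> = { Accept: "application/json" };
  const auth = tool.service.authHeader;
  if (auth) headers[auth.name] = auth.value;
  // Strip trailing slashes from the base URL so the join never doubles them.
  const url = tool.service.baseUrl.replace(/\/+$/, "") + endpoint.path;
  return { url, method: endpoint.method, headers };
}
```

Keeping credentials on the service record means the model only ever sees tool and endpoint descriptions, never the secret itself.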

Rich Content Rendering

The chat interface renders product carousels (Embla), interactive charts (Recharts), AI-generated images (gpt-image-1), syntax-highlighted code blocks, and full Markdown.

OAuth2 SSO

Laravel Passport handles OAuth2 token issuance. The gateway exchanges authorization codes for tokens stored in httpOnly cookies. The frontend never sees the Bearer token.
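The cookie-to-Bearer hand-off can be sketched as two small steps: setting an httpOnly cookie after the code exchange, and turning that cookie back into an `Authorization` header on upstream calls. The cookie name `access_token`, the `SameSite=Lax` choice, and the helper names are assumptions; the source only states that tokens live in httpOnly cookies.

```typescript
// Hypothetical cookie name; the source does not name it.
const TOKEN_COOKIE = "access_token";

// Build a Set-Cookie header value. HttpOnly keeps the token out of reach of page scripts.
function buildTokenCookie(token: string, maxAgeSeconds: number): string {
  return [
    `${TOKEN_COOKIE}=${encodeURIComponent(token)}`,
    "HttpOnly",
    "Secure",
    "SameSite=Lax",
    "Path=/",
    `Max-Age=${maxAgeSeconds}`,
  ].join("; ");
}

// Parse a Cookie request header into a name/value map.
function parseCookies(header: string): Record<string, string> {
  const cookies: Record<string, string> = {};
  for (const part of header.split(";")) {
    const index = part.indexOf("=");
    if (index === -1) continue; // skip malformed segments
    const name = part.slice(0, index).trim();
    cookies[name] = decodeURIComponent(part.slice(index + 1).trim());
  }
  return cookies;
}

// Derive the Authorization header the gateway would send to the backend.
function upstreamAuthHeader(cookieHeader: string): string | undefined {
  const token = parseCookies(cookieHeader)[TOKEN_COOKIE];
  return token ? `Bearer ${token}` : undefined;
}
```

Since the browser attaches the cookie automatically and scripts cannot read it, the frontend can make authenticated requests without ever holding the token in JavaScript.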

Usage & Quota Management

Per-user token limits (input + output tracking), image generation caps with monthly resets, rate limiting with 429 responses, and date-range analytics for reporting.
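A quota check along these lines could run before each request is forwarded to a provider. This is a sketch under assumptions: the field names, the `YYYY-MM` period key, and mapping an exhausted quota to a 429 status are illustrative choices, since the source only ties 429 responses to rate limiting.

```typescript
// Hypothetical per-user usage record.
interface UserUsage {
  inputTokens: number;
  outputTokens: number;
  tokenLimit: number;
  imagesUsed: number;
  imageLimit: number;
  imagePeriod: string; // "YYYY-MM" month that imagesUsed refers to
}

type QuotaResult = { status: 200 } | { status: 429; reason: string };

// UTC month key such as "2025-02".
function monthKey(date: Date): string {
  return date.toISOString().slice(0, 7);
}

// Zero the image counter when the stored period is not the current month.
function applyMonthlyReset(usage: UserUsage, now: Date): UserUsage {
  const period = monthKey(now);
  if (usage.imagePeriod === period) return usage;
  return { ...usage, imagesUsed: 0, imagePeriod: period };
}

// Decide whether a request may proceed. Input and output tokens share one budget.
function checkQuota(usage: UserUsage, now: Date, wantsImage: boolean): QuotaResult {
  const current = applyMonthlyReset(usage, now);
  if (current.inputTokens + current.outputTokens >= current.tokenLimit) {
    return { status: 429, reason: "token limit reached" };
  }
  if (wantsImage && current.imagesUsed >= current.imageLimit) {
    return { status: 429, reason: "monthly image cap reached" };
  }
  return { status: 200 };
}
```

Passing `now` in explicitly, rather than reading the clock inside the function, makes the monthly reset easy to exercise in tests.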

Challenges & Solutions

Multi-turn tool execution in streaming

AI models request tool calls mid-stream, and results need to be fed back for follow-up reasoning — all while maintaining the SSE connection. Built a tool follow-up mechanism that accumulates results, reconstructs conversation context with provider-specific formatting, and re-enters the streaming loop.
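The re-entry loop described above can be sketched as an async generator that keeps streaming until a round finishes without tool calls. This is an illustration under assumptions, not the project's implementation: the `Model` and `ToolRunner` signatures, the message shape, and the `maxRounds` guard are all hypothetical, and provider-specific message formatting is reduced to a single string.

```typescript
type Message = { role: "user" | "assistant" | "tool"; content: string };

type ModelEvent =
  | { type: "text"; delta: string }
  | { type: "tool_call"; name: string; args: Record<string, unknown> };

// Hypothetical model: streams events for a given conversation history.
type Model = (history: Message[]) => AsyncGenerator<ModelEvent>;
// Hypothetical tool executor.
type ToolRunner = (name: string, args: Record<string, unknown>) => Promise<string>;

// Stream text to the caller; when the model asks for tools, run them,
// append the results to the history, and start another streaming round.
async function* streamWithTools(
  model: Model,
  runTool: ToolRunner,
  history: Message[],
  maxRounds: number = 5 // guard against endless tool loops
): AsyncGenerator<string> {
  const messages = [...history];
  for (let round = 0; round < maxRounds; round++) {
    const toolResults: Message[] = [];
    let assistantText = "";

    for await (const event of model(messages)) {
      if (event.type === "text") {
        assistantText += event.delta;
        yield event.delta; // forwarded immediately, so the client stream never pauses
      } else {
        const result = await runTool(event.name, event.args);
        toolResults.push({ role: "tool", content: `${event.name}: ${result}` });
      }
    }

    if (assistantText) messages.push({ role: "assistant", content: assistantText });
    if (toolResults.length === 0) return; // no tool calls: the answer is complete
    messages.push(...toolResults); // feed results back for follow-up reasoning
  }
}
```

Because the generator is consumed by whatever writes to the open SSE response, the client sees one continuous stream even though several model rounds may run behind it.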

RAG without a vector database

Adding Pinecone or Weaviate would have increased infrastructure complexity for self-hosted deployments. Stored embeddings as MySQL JSON columns and implemented cosine similarity search in PHP. The architecture allows dropping in a dedicated vector store later without changing the search interface.

Consistent authentication across three services

Three services with different auth mechanisms needed seamless SSO. Laravel Passport handles OAuth2 token issuance, the gateway exchanges authorization codes for tokens stored in httpOnly cookies, and the frontend never sees the Bearer token.

Architecture Decisions

Decision	Rationale
Stateless gateway	No database at the gateway — all data ops proxy through Laravel. Simplifies the gateway to a streaming orchestrator.
MySQL JSON for embeddings	Avoided adding a vector DB dependency. Cosine similarity computed in PHP. Sufficient for current scale; can migrate to pgvector/Pinecone later.
MCP over custom tooling	Adopted the open Model Context Protocol standard for tool interop. Future-proofs integrations and allows third-party MCP servers.
Provider abstraction	Unified AiProvider interface decouples business logic from any single AI vendor. Switching or adding providers requires no changes to streaming or tool execution code.
Feature-based frontend structure	Each feature (agents, auth, chat, conversations) is self-contained with its own api/, components/, stores/. Scales better than layer-based organization.
Laravel Modules	nwidart/laravel-modules for modular backend architecture — keeps domain logic separated as the codebase grows.

My Role

Full-Stack Developer & System Architect

Designed the three-tier service architecture and inter-service communication patterns
Built the real-time AI streaming pipeline with multi-provider support
Implemented the MCP server and client for extensible tool execution
Developed the RAG pipeline for knowledge-base-powered AI responses
Created the OAuth2 SSO authentication flow across all services
Built the admin dashboard with Livewire for agent/tool/user management
Configured Docker Compose orchestration with Nginx SSL termination
Wrote OpenAPI documentation with Swagger UI and ReDoc

Outcome & Impact

Multi-agent deployment — businesses can spin up specialized AI assistants without code changes
Real-time streaming with tool execution provides a responsive, interactive chat experience
Self-hosted architecture gives organizations full control over data and costs
Extensible tool system via MCP allows integration with any external API or data source
Admin dashboard empowers non-technical users to manage agents, tools, and usage limits
