PicoAI Assistant

Full-Stack Developer & System Architect · 2025

AI/ML · Full-Stack · System Architecture

An enterprise-grade, multi-agent AI chat platform with real-time streaming, tool execution, and RAG.

4 AI Providers · 11+ MCP Tools · 3 Services · 28 API Endpoints

Overview

PicoAI Assistant is a three-tier platform (Next.js 15 frontend, Express.js gateway, Laravel 12 backend) that enables businesses to deploy multiple specialized AI assistants from a single self-hosted system. The gateway orchestrates real-time SSE streaming across multiple AI providers while the backend manages data, authentication, and an MCP server for extensible tool execution.

The Problem

  • Deploy multiple specialized AI assistants (e.g., e-commerce support, internal knowledge bot) from a single platform
  • Give AI agents access to live business data and external APIs — not just static training data
  • Maintain full control over agent behavior, tool access, and usage limits per user
  • Stream responses in real-time with rich content rendering (charts, product carousels, images)
  • Keep a self-hosted, privacy-first architecture with no dependency on third-party SaaS chat platforms

Solution Architecture

The gateway is stateless and has no database of its own; all persistence routes through Laravel. AI streaming (SSE) happens at the gateway, which keeps the backend focused on data operations. Requests flow Frontend → Gateway (authenticated by session cookie) → Laravel (authenticated by Bearer token) → MySQL.

Client Browser → HTTPS → Nginx Reverse Proxy (SSL/TLS termination)

  • ai-chat-app (Next.js 15): React 19 UI, Zustand, React Query, Tailwind CSS, App Router
  • ai-gateway (Express.js): SSE stream, AI providers, MCP client, session management, tool execution
  • ai-core-service (Laravel 12): MySQL 8.0, Passport OAuth, MCP server, RAG pipeline, Livewire admin
  • AI Providers: OpenAI, Claude, Ollama, Gemini
  • MySQL 8.0: persistent storage
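
The gateway-level SSE streaming described in this section comes down to writing text frames in the text/event-stream format. The following is a minimal sketch of a frame formatter. The event names (`token`, `done`) and payload shapes are illustrative assumptions and are not taken from the project.

```typescript
// Hypothetical event shape; the project's real event names are not documented here.
interface SseEvent {
  event: string;                    // value of the SSE "event:" field
  data: Record<string, unknown>;    // serialized as JSON into the "data:" field
}

// Formats one Server-Sent Events frame. A blank line terminates the frame.
// JSON.stringify never emits raw newlines, so a single "data:" line is enough.
function formatSseFrame(e: SseEvent): string {
  return `event: ${e.event}\ndata: ${JSON.stringify(e.data)}\n\n`;
}

// Two frames as they would appear on the wire, one after the other.
const exampleFrames = [
  formatSseFrame({ event: "token", data: { text: "Hello" } }),
  formatSseFrame({ event: "done", data: {} }),
].join("");
```

In an Express handler, each frame would typically be passed to `res.write` after the response's `Content-Type` header has been set to `text/event-stream`.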

Tech Stack

  • Frontend: Next.js 15, React 19, TypeScript, Zustand, React Query, Tailwind CSS, Recharts
  • Gateway: Express.js 5, TypeScript, OpenAI SDK, Anthropic SDK
  • Backend: Laravel 12, MySQL 8.0, Laravel Passport, Livewire 3
  • Infrastructure: Docker Compose, Nginx, MCP (JSON-RPC 2.0)

Key Features

1. Multi-Agent System

Each AI agent is independently configurable with its own model, instructions, temperature, max tokens, and tool assignments. Agents are linked to tools through a many-to-many relationship, so each agent-tool pairing can carry its own priority and settings.
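
To make the agent-tool relationship concrete, here is a small sketch of how the configuration could be modeled on the TypeScript side. All field names, the "lower priority number comes first" convention, and the `resolveTools` helper are assumptions made for illustration; the project's actual schema lives in Laravel and is not shown in this write-up.

```typescript
// Hypothetical shapes based on the description above, not the project's schema.
interface Tool { id: number; name: string; }

interface AgentTool {                  // one pivot row of the many-to-many relationship
  toolId: number;
  priority: number;                    // assumed convention: lower number comes first
  enabled: boolean;
  settings: Record<string, unknown>;   // per-tool overrides for this agent
}

interface Agent {
  name: string;
  model: string;
  instructions: string;
  temperature: number;
  maxTokens: number;
  tools: AgentTool[];
}

// Returns the agent's enabled tools, ordered by pivot priority.
// Links that point at a tool missing from the catalog are skipped.
function resolveTools(agent: Agent, catalog: Tool[]): Tool[] {
  const byId: Record<number, Tool> = {};
  catalog.forEach((t) => { byId[t.id] = t; });
  return agent.tools
    .filter((link) => link.enabled && byId[link.toolId] !== undefined)
    .sort((a, b) => a.priority - b.priority)
    .map((link) => byId[link.toolId]);
}

const exampleCatalog: Tool[] = [
  { id: 1, name: "search_products" },
  { id: 2, name: "get_order_status" },
  { id: 3, name: "generate_image" },
];

const supportAgent: Agent = {
  name: "E-commerce Support",
  model: "example-model",              // placeholder, not a real model identifier
  instructions: "Help customers with orders.",
  temperature: 0.2,
  maxTokens: 1024,
  tools: [
    { toolId: 1, priority: 2, enabled: true, settings: {} },
    { toolId: 2, priority: 1, enabled: true, settings: { timeoutMs: 5000 } },
    { toolId: 3, priority: 3, enabled: false, settings: {} },
  ],
};
```

Calling `resolveTools(supportAgent, exampleCatalog)` yields the order-status tool first and the product search second, while the disabled image tool is left out.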

2. Multi-Provider AI Streaming

A provider abstraction layer normalizes responses across OpenAI, Anthropic Claude, Ollama, and Gemini into a unified event stream. Adding a new provider requires implementing a single interface.
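
As an illustration of what normalizing into a unified event stream can look like, the sketch below maps simplified OpenAI and Anthropic streaming chunks onto one event type. The `UnifiedEvent` shape and the function names are assumptions. The chunk types mirror only the text-related fields of the public streaming formats and deliberately ignore tool calls, usage data, and errors.

```typescript
// Hypothetical unified event consumed by the rest of the gateway.
type UnifiedEvent =
  | { type: "text"; text: string }
  | { type: "done" };

// Simplified view of an OpenAI chat completion stream chunk.
interface OpenAiChunk {
  choices: { delta: { content?: string | null }; finish_reason: string | null }[];
}

// Simplified view of two Anthropic Messages stream events.
type AnthropicEvent =
  | { type: "content_block_delta"; delta: { type: "text_delta"; text: string } }
  | { type: "message_stop" };

function fromOpenAi(chunk: OpenAiChunk): UnifiedEvent[] {
  const events: UnifiedEvent[] = [];
  for (const choice of chunk.choices) {
    if (choice.delta.content) {
      events.push({ type: "text", text: choice.delta.content });
    }
    if (choice.finish_reason !== null) {
      events.push({ type: "done" });
    }
  }
  return events;
}

function fromAnthropic(event: AnthropicEvent): UnifiedEvent[] {
  if (event.type === "content_block_delta") {
    return [{ type: "text", text: event.delta.text }];
  }
  return [{ type: "done" }];
}
```

With adapters like these, downstream code only ever sees `UnifiedEvent` values, so a new provider would add one more adapter function rather than touch the streaming loop.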

3. MCP Integration

MCP is implemented on both sides. The Laravel MCP server exposes 11+ auto-discovered tools over JSON-RPC 2.0, and the gateway's MCP client caches sessions for 60 seconds and converts tool definitions into the format the AI providers expect.
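
The client-side pieces mentioned above can be sketched in a few lines: a JSON-RPC 2.0 envelope for the MCP `tools/list` method, a conversion from an MCP tool definition to OpenAI's function-tool format, and a 60-second cache. The helper names are hypothetical, and the loader is synchronous only to keep the sketch short; a real client would fetch the tool list over HTTP asynchronously.

```typescript
// Tool definition as returned by an MCP server's "tools/list" method.
interface McpTool {
  name: string;
  description?: string;
  inputSchema: Record<string, unknown>;   // a JSON Schema object
}

// Function-tool format accepted by the OpenAI chat completions API.
interface OpenAiTool {
  type: "function";
  function: { name: string; description: string; parameters: Record<string, unknown> };
}

// JSON-RPC 2.0 request envelope for listing tools.
function toolsListRequest(id: number) {
  return { jsonrpc: "2.0" as const, id, method: "tools/list" };
}

function toOpenAiTool(tool: McpTool): OpenAiTool {
  return {
    type: "function",
    function: {
      name: tool.name,
      description: tool.description || "",
      parameters: tool.inputSchema,
    },
  };
}

// Returns a getter that reloads the value once ttlMs has elapsed.
// `now` is injectable so expiry can be exercised without waiting.
function createTtlCache<T>(ttlMs: number, now: () => number = () => Date.now()) {
  let entry: { value: T; expiresAt: number } | null = null;
  return (load: () => T): T => {
    if (entry === null || now() >= entry.expiresAt) {
      entry = { value: load(), expiresAt: now() + ttlMs };
    }
    return entry.value;
  };
}

// 60 seconds, matching the cache lifetime stated above.
const cachedTools = createTtlCache<McpTool[]>(60 * 1000);
```

A caller would write something like `cachedTools(() => fetchToolsSomehow())`, where `fetchToolsSomehow` stands in for whatever transport the gateway actually uses.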

4. RAG Pipeline

Type-aware document processing with strategy-based chunking, OpenAI text-embedding-3-large embeddings, MySQL JSON column storage, and cosine similarity search with configurable relevance thresholds.
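
The retrieval step reduces to scoring every stored chunk against the query embedding and keeping the ones above a threshold. The sketch below shows that logic with toy three-dimensional vectors. The project computes this in PHP; TypeScript is used here only to match the other examples, and the `Chunk` shape, the `search` signature, and the threshold value are assumptions.

```typescript
// Hypothetical chunk shape; real embeddings have thousands of dimensions.
interface Chunk { id: number; text: string; embedding: number[]; }

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for zero-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Scores all chunks, drops those below the threshold, returns the best topK.
function search(query: number[], chunks: Chunk[], threshold: number, topK: number) {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .filter((hit) => hit.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

const exampleChunks: Chunk[] = [
  { id: 1, text: "Return policy", embedding: [1, 0, 0] },
  { id: 2, text: "Refund timeline", embedding: [0.6, 0.8, 0] },
  { id: 3, text: "Office hours", embedding: [0, 1, 0] },
];

// With a threshold of 0.5, chunk 3 (orthogonal to the query) is filtered out.
const exampleHits = search([1, 0, 0], exampleChunks, 0.5, 5);
```

This is a linear scan over every chunk, which is the trade-off of skipping a vector index: simple to host, with cost growing in proportion to the number of stored chunks.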

5. External API Integration

A first-class Service → Tool → Endpoint model connects any HTTP API, with endpoint discovery, keyword filtering, authentication header injection, and streamed results.
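
One way to picture the Service → Tool → Endpoint layering is shown below: a service owns the base URL and credentials, a tool narrows the service's endpoints by keyword, and a request is assembled with the auth header injected. Every name, field, and the matching rule (case-insensitive substring on path and description) is an assumption for illustration, and result streaming is left out.

```typescript
// Hypothetical shapes; the project's real data model is not shown in this write-up.
interface Endpoint { method: "GET" | "POST"; path: string; description: string; }

interface ApiService {
  baseUrl: string;
  authHeader: { name: string; value: string };
  endpoints: Endpoint[];
}

interface ApiTool { name: string; service: ApiService; keywords: string[]; }

// Keeps only endpoints whose path or description mentions one of the tool's keywords.
function discoverEndpoints(tool: ApiTool): Endpoint[] {
  return tool.service.endpoints.filter((ep) => {
    const haystack = (ep.path + " " + ep.description).toLowerCase();
    return tool.keywords.some((kw) => haystack.indexOf(kw.toLowerCase()) !== -1);
  });
}

// Assembles the outgoing request, injecting the service's auth header.
function buildRequest(tool: ApiTool, endpoint: Endpoint) {
  const headers: Record<string, string> = { Accept: "application/json" };
  headers[tool.service.authHeader.name] = tool.service.authHeader.value;
  return { method: endpoint.method, url: tool.service.baseUrl + endpoint.path, headers };
}

const shopService: ApiService = {
  baseUrl: "https://api.example.com",
  authHeader: { name: "Authorization", value: "Bearer <api-key>" },   // placeholder value
  endpoints: [
    { method: "GET", path: "/orders", description: "List orders" },
    { method: "POST", path: "/orders", description: "Create an order" },
    { method: "GET", path: "/products", description: "List products" },
  ],
};

const orderTool: ApiTool = { name: "orders", service: shopService, keywords: ["order"] };
```

Here `discoverEndpoints(orderTool)` returns the two `/orders` endpoints, and `buildRequest` produces a plain description of the call that any HTTP client could execute.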

6. Rich Content Rendering

The chat interface renders product carousels (Embla), interactive charts (Recharts), AI-generated images (gpt-image-1), syntax-highlighted code blocks, and full Markdown.

7. OAuth2 SSO

Laravel Passport handles OAuth2 token issuance. The gateway exchanges authorization codes for tokens stored in httpOnly cookies. The frontend never sees the Bearer token.
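
The cookie-to-Bearer hand-off can be sketched with three pure functions. The cookie name, the `Secure` and `SameSite=Lax` attributes, and the function names are assumptions; only the `HttpOnly` flag and the Bearer header come from the description above. The authorization-code exchange itself is omitted.

```typescript
const COOKIE_NAME = "pico_session";   // hypothetical cookie name

// Builds the Set-Cookie value the gateway would send after the code exchange.
// HttpOnly keeps the token out of reach of frontend JavaScript.
function tokenCookie(accessToken: string, maxAgeSeconds: number): string {
  return (
    `${COOKIE_NAME}=${encodeURIComponent(accessToken)}; ` +
    `Max-Age=${maxAgeSeconds}; Path=/; HttpOnly; Secure; SameSite=Lax`
  );
}

// Extracts the token from an incoming Cookie header, or null when absent.
function readToken(cookieHeader: string | undefined): string | null {
  if (!cookieHeader) return null;
  for (const part of cookieHeader.split(";")) {
    const trimmed = part.trim();
    const eq = trimmed.indexOf("=");
    if (eq !== -1 && trimmed.slice(0, eq) === COOKIE_NAME) {
      return decodeURIComponent(trimmed.slice(eq + 1));
    }
  }
  return null;
}

// Headers for the gateway's call to the Laravel API. Returns null when the
// request carries no session, so the caller can answer 401 instead.
function upstreamHeaders(cookieHeader: string | undefined): Record<string, string> | null {
  const token = readToken(cookieHeader);
  if (token === null) return null;
  return { Authorization: `Bearer ${token}`, Accept: "application/json" };
}
```

Because the translation happens entirely inside the gateway, the browser only ever holds an opaque cookie it cannot read from script.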

8. Usage & Quota Management

Per-user token limits (input + output tracking), image generation caps with monthly resets, rate limiting with 429 responses, and date-range analytics for reporting.
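
A quota check of this kind can be expressed as a pure function over a user's counters. The sketch below is one possible shape: the field names, the UTC month key, and the decision to treat the image counter as zero once the month changes are all assumptions. Rate limiting and the mapping of a denial to an HTTP status are left to the caller.

```typescript
// Hypothetical shapes; field names are assumptions, not the project's schema.
interface Usage {
  inputTokens: number;
  outputTokens: number;
  imagesUsed: number;
  imageMonth: string;      // month the image counter belongs to, e.g. "2025-01"
}

interface Limits { maxTokens: number; maxImagesPerMonth: number; }

interface Decision { allowed: boolean; reason?: string; }

// "YYYY-MM" in UTC, used to detect that a new month has started.
function monthKey(d: Date): string {
  const m = d.getUTCMonth() + 1;
  return d.getUTCFullYear() + "-" + (m < 10 ? "0" + m : String(m));
}

function checkQuota(usage: Usage, limits: Limits, wantsImage: boolean, now: Date): Decision {
  // Input and output tokens are tracked separately but count against one limit.
  if (usage.inputTokens + usage.outputTokens >= limits.maxTokens) {
    return { allowed: false, reason: "token limit reached" };
  }
  // The image counter only applies if it was recorded in the current month.
  const imagesThisMonth = usage.imageMonth === monthKey(now) ? usage.imagesUsed : 0;
  if (wantsImage && imagesThisMonth >= limits.maxImagesPerMonth) {
    return { allowed: false, reason: "monthly image cap reached" };
  }
  return { allowed: true };
}
```

Comparing month keys instead of running a scheduled reset job means the cap resets implicitly the first time a user makes a request in a new month.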

Challenges & Solutions

Multi-turn tool execution in streaming

AI models request tool calls mid-stream, and results need to be fed back for follow-up reasoning — all while maintaining the SSE connection. Built a tool follow-up mechanism that accumulates results, reconstructs conversation context with provider-specific formatting, and re-enters the streaming loop.
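
The control flow of such a follow-up loop is sketched below with a fake model and a fake tool. It is synchronous and provider-agnostic on purpose: in the real gateway each model turn would be an async stream, `emit` would write SSE frames, and the assistant and tool messages would need provider-specific formatting. All type and function names are hypothetical, and the `maxRounds` guard is an added assumption to keep the loop bounded.

```typescript
type Message =
  | { role: "user" | "assistant"; content: string }
  | { role: "tool"; toolCallId: string; content: string };

interface ToolCall { id: string; name: string; args: Record<string, unknown>; }
interface ModelTurn { text: string; toolCalls: ToolCall[]; }

type Model = (messages: Message[]) => ModelTurn;
type ToolRunner = (call: ToolCall) => string;

// Runs model turns until the model stops requesting tools or maxRounds is hit.
function runConversation(
  model: Model,
  runTool: ToolRunner,
  messages: Message[],
  emit: (text: string) => void,
  maxRounds = 5,
): Message[] {
  const history = messages.slice();            // never mutate the caller's array
  for (let round = 0; round < maxRounds; round++) {
    const turn = model(history);
    if (turn.text) emit(turn.text);            // forwarded to the client as it arrives
    history.push({ role: "assistant", content: turn.text });
    if (turn.toolCalls.length === 0) break;    // final answer, stop looping
    for (const call of turn.toolCalls) {
      // Accumulate each result so the next turn can reason over it.
      history.push({ role: "tool", toolCallId: call.id, content: runTool(call) });
    }
  }
  return history;
}

// Fake model: asks for a tool first, then answers using the tool result.
const fakeModel: Model = (history) => {
  const results = history.filter((m) => m.role === "tool");
  if (results.length === 0) {
    return { text: "", toolCalls: [{ id: "call_1", name: "get_order_status", args: { orderId: 42 } }] };
  }
  return { text: "Order status: " + results[0].content, toolCalls: [] };
};

const fakeTool: ToolRunner = (call) =>
  call.name === "get_order_status" ? "shipped" : "unknown tool";
```

The key property is that the client connection stays open across rounds: `emit` is the same callback on the first turn and on every follow-up turn.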

RAG without a vector database

Adding Pinecone or Weaviate would have increased infrastructure complexity for self-hosted deployments. Stored embeddings as MySQL JSON columns and implemented cosine similarity search in PHP. The architecture allows dropping in a dedicated vector store later without changing the search interface.
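
The "same search interface, different storage" idea might look like the following. The `VectorSearch` interface, the `ChunkRow` shape, and the factory name are assumptions, and TypeScript stands in for the project's PHP. The point of the sketch is only that callers depend on `search`, not on where embeddings live.

```typescript
interface SearchHit { chunkId: number; score: number; }

// The only surface callers depend on. An adapter for a dedicated vector
// store could implement the same method later.
interface VectorSearch {
  search(queryEmbedding: number[], limit: number): SearchHit[];
}

// Row shape as it might come back from MySQL, with the embedding as a JSON string.
interface ChunkRow { id: number; embeddingJson: string; }

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Implementation backed by JSON-encoded embeddings, scored in application code.
function jsonColumnSearch(rows: ChunkRow[]): VectorSearch {
  return {
    search(queryEmbedding, limit) {
      return rows
        .map((row) => ({
          chunkId: row.id,
          score: cosine(queryEmbedding, JSON.parse(row.embeddingJson) as number[]),
        }))
        .sort((x, y) => y.score - x.score)
        .slice(0, limit);
    },
  };
}
```

Swapping storage then means writing another function that returns a `VectorSearch`, while the code that builds prompts from the hits stays untouched.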

Consistent authentication across three services

Three services with different auth mechanisms needed seamless SSO. Laravel Passport handles OAuth2 token issuance, the gateway exchanges authorization codes for tokens stored in httpOnly cookies, and the frontend never sees the Bearer token.

Architecture Decisions

Stateless gateway

No database at the gateway — all data ops proxy through Laravel. Simplifies the gateway to a streaming orchestrator.

MySQL JSON for embeddings

Avoided adding a vector DB dependency. Cosine similarity computed in PHP. Sufficient for current scale; can migrate to pgvector/Pinecone later.

MCP over custom tooling

Adopted the open Model Context Protocol standard for tool interop. Future-proofs integrations and allows third-party MCP servers.

Provider abstraction

Unified AiProvider interface decouples business logic from any single AI vendor. Switching or adding providers requires no changes to streaming or tool execution code.

Feature-based frontend structure

Each feature (agents, auth, chat, conversations) is self-contained with its own api/, components/, stores/. Scales better than layer-based organization.

Laravel Modules

nwidart/laravel-modules for modular backend architecture — keeps domain logic separated as the codebase grows.

My Role

Full-Stack Developer & System Architect

  • Designed the three-tier service architecture and inter-service communication patterns
  • Built the real-time AI streaming pipeline with multi-provider support
  • Implemented the MCP server and client for extensible tool execution
  • Developed the RAG pipeline for knowledge-base-powered AI responses
  • Created the OAuth2 SSO authentication flow across all services
  • Built the admin dashboard with Livewire for agent/tool/user management
  • Configured Docker Compose orchestration with Nginx SSL termination
  • Wrote OpenAPI documentation with Swagger UI and ReDoc

Outcome & Impact

  • Multi-agent deployment — businesses can spin up specialized AI assistants without code changes
  • Real-time streaming with tool execution provides a responsive, interactive chat experience
  • Self-hosted architecture gives organizations full control over data and costs
  • Extensible tool system via MCP allows integration with any external API or data source
  • Admin dashboard empowers non-technical users to manage agents, tools, and usage limits