MongoDB Atlas Setup

This guide provides detailed instructions for setting up MongoDB Atlas as the vector database for the HRMS AI service.

Overview

Setting	Value
Purpose	Vector storage for AI/RAG
Database	`hrms_ai`
Collections	`documents`, `document_chunks`
Vector Index	1536 dimensions (OpenAI)

Why MongoDB Atlas?

Built-in vector search: Native $vectorSearch aggregation stage
Managed service: No infrastructure to maintain
Free tier available: M0 for development/staging
GCP integration: Deploy in same region as Cloud Run

Step 1: Create Atlas Account and Project

Go to MongoDB Atlas
Sign up or log in
Create new project: Bluewoo HRMS

Step 2: Create Cluster

Staging Cluster (Free)

Click Build a Database
Select M0 (Free)
Provider: Google Cloud
Region: Belgium (europe-west1)
Cluster name: hrms-ai-staging
Click Create

Production Cluster

Click Build a Database
Select M10 (or higher)
Provider: Google Cloud
Region: Belgium (europe-west1)
Cluster name: hrms-ai-prod
Configure:
- Storage: 10GB (auto-scale)
- Backup: Enable continuous backup
Click Create

Step 3: Create Database User

Go to Database Access (left sidebar)
Click Add New Database User
Authentication: Password
Username: hrms_ai_app
Password: Generate secure password (save it!)
Database User Privileges:
- Select Built-in Role: readWrite
- Database: hrms_ai
Click Add User

Step 4: Configure Network Access

For Development (Allow All)

Go to Network Access (left sidebar)
Click Add IP Address
Click Allow Access from Anywhere (0.0.0.0/0)
Click Confirm

⚠️ Production: Restrict to GCP Cloud Run egress IPs or use VPC Peering.

For Production (Recommended)

Option A: IP Access List with GCP NAT

Route Cloud Run egress through a VPC and create a Cloud NAT gateway with a reserved static IP
Add the NAT IP to the Atlas IP access list

Option B: VPC Peering (Most secure)

Go to Network Access → Peering (requires a dedicated cluster, M10 or higher)
Follow Atlas VPC Peering setup for GCP

Step 5: Get Connection String

Go to Database → Click Connect on your cluster
Select Drivers
Driver: Node.js / Version: 5.5 or later
Copy connection string:

mongodb+srv://hrms_ai_app:<password>@hrms-ai-staging.xxxxx.mongodb.net/?retryWrites=true&w=majority

Replace <password> with the actual password (URL-encode any special characters)
Add database name:

mongodb+srv://hrms_ai_app:<password>@hrms-ai-staging.xxxxx.mongodb.net/hrms_ai?retryWrites=true&w=majority
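
A password containing characters such as @, : or / has to be percent-encoded before it goes into the connection string. The following is a minimal sketch of assembling the string in code; `buildMongoUri` is a hypothetical helper (not part of the HRMS codebase) and the password in the example is made up.

```typescript
// Hypothetical helper: assembles an Atlas SRV connection string.
// The query options mirror the connection string shown above.
function buildMongoUri(
  username: string,
  password: string,
  host: string,
  database: string
): string {
  // encodeURIComponent escapes characters such as @ : / ? # that would
  // otherwise be read as URI delimiters
  const user = encodeURIComponent(username);
  const pass = encodeURIComponent(password);
  return `mongodb+srv://${user}:${pass}@${host}/${database}?retryWrites=true&w=majority`;
}

// Example with a made-up password that contains reserved characters
const exampleUri = buildMongoUri(
  'hrms_ai_app',
  'p@ss:w/rd',
  'hrms-ai-staging.xxxxx.mongodb.net',
  'hrms_ai'
);
console.log(exampleUri);
```
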

Step 6: Create Database and Collections

Using mongosh

# Connect to the cluster from your shell (mongosh prompts for the password)
mongosh "mongodb+srv://hrms-ai-staging.xxxxx.mongodb.net/" \
  --username hrms_ai_app

// Inside mongosh: create the database and collections
use hrms_ai

db.createCollection("documents")
db.createCollection("document_chunks")

// Verify
show collections
// Should show: documents, document_chunks

Using Atlas UI

Go to Database → Browse Collections
Click Add My Own Data
Database: hrms_ai
Collection: documents
Click Create
Repeat for document_chunks

Step 7: Create Vector Search Index

This is the critical step for RAG functionality.

Using Atlas UI (Recommended)

Go to Database → Browse Collections → Select hrms_ai.document_chunks
Click Search Indexes tab
Click Create Search Index
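Under Atlas Vector Search, select JSON Editor
Index name: vector_index
Paste this configuration (tenantId and metadata.sourceType are indexed as filter fields, matching the document_chunks schema, so that $vectorSearch can pre-filter on them):

{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 1536,
      "similarity": "cosine"
    },
    {
      "type": "filter",
      "path": "tenantId"
    },
    {
      "type": "filter",
      "path": "metadata.sourceType"
    }
  ]
}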

Click Create Search Index
Wait for index status to become Active (may take a few minutes)
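
Waiting for the index can also be automated, for example in a deployment script. The sketch below polls until the index reports itself as queryable. It assumes that index descriptions expose `name` and `queryable` fields; `SearchIndexInfo`, `isIndexQueryable`, and `waitForSearchIndex` are hypothetical names, and the lister function is injected so the logic can be exercised without a cluster.

```typescript
// Fields this sketch reads from a search index description.
// Assumption: the index listing returns at least `name` and `queryable`.
interface SearchIndexInfo {
  name: string;
  status?: string;
  queryable?: boolean;
}

// Pure check: is the named index present and queryable?
function isIndexQueryable(indexes: SearchIndexInfo[], name: string): boolean {
  return indexes.some((idx) => idx.name === name && idx.queryable === true);
}

// Polls until the index is queryable or the timeout elapses.
// `listIndexes` is supplied by the caller (hypothetical wiring, see note below).
async function waitForSearchIndex(
  listIndexes: () => Promise<SearchIndexInfo[]>,
  name: string,
  timeoutMs: number = 300000,
  intervalMs: number = 5000
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (isIndexQueryable(await listIndexes(), name)) return;
    // Sleep before the next poll
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Search index "${name}" not queryable after ${timeoutMs} ms`);
}
```

With the Node.js driver, `listIndexes` could be something like `() => collection.listSearchIndexes().toArray()`, assuming a driver version that provides `listSearchIndexes`.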

Using mongosh

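// Requires a mongosh version that supports the vectorSearch index type
db.document_chunks.createSearchIndex(
  "vector_index",
  "vectorSearch",
  {
    fields: [
      {
        type: "vector",
        path: "embedding",
        numDimensions: 1536,
        similarity: "cosine"
      },
      // Filter fields allow $vectorSearch to pre-filter on these paths
      { type: "filter", path: "tenantId" },
      { type: "filter", path: "metadata.sourceType" }
    ]
  }
);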

Step 8: Create Standard Indexes

Add indexes for common queries:

// Tenant isolation
db.documents.createIndex({ tenantId: 1 });
db.document_chunks.createIndex({ tenantId: 1 });

// Document lookup
db.documents.createIndex({ tenantId: 1, sourceType: 1, sourceId: 1 }, { unique: true });
db.document_chunks.createIndex({ tenantId: 1, documentId: 1 });

// Verify indexes
db.documents.getIndexes();
db.document_chunks.getIndexes();
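
The unique index on (tenantId, sourceType, sourceId) makes it natural to write documents as upserts keyed on those three fields, so re-ingesting the same source updates the existing record instead of failing with a duplicate key error. A minimal sketch of building such an operation follows; `SourceDocument` and `buildDocumentUpsert` are hypothetical names, not part of the service.

```typescript
// Hypothetical input shape, mirroring the documents collection fields
interface SourceDocument {
  tenantId: string;
  sourceType: 'document' | 'employee' | 'policy';
  sourceId: string;
  title: string;
  content: string;
  metadata: Record<string, unknown>;
}

// Builds the three arguments for an updateOne call with upsert enabled
function buildDocumentUpsert(doc: SourceDocument, now: Date = new Date()) {
  return {
    // The filter uses exactly the keys of the unique index
    filter: {
      tenantId: doc.tenantId,
      sourceType: doc.sourceType,
      sourceId: doc.sourceId,
    },
    update: {
      // Refreshed on every write
      $set: {
        title: doc.title,
        content: doc.content,
        metadata: doc.metadata,
        updatedAt: now,
      },
      // Written only when the document is first created
      $setOnInsert: { createdAt: now },
    },
    options: { upsert: true },
  };
}
```

The result could then be passed to the driver as `collection.updateOne(op.filter, op.update, op.options)`.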

Document Schema

documents Collection

interface Document {
  _id: ObjectId;
  tenantId: string;
  sourceType: 'document' | 'employee' | 'policy';
  sourceId: string;
  title: string;
  content: string;
  metadata: Record<string, unknown>;
  createdAt: Date;
  updatedAt: Date;
}

document_chunks Collection

interface DocumentChunk {
  _id: ObjectId;
  tenantId: string;
  documentId: string;
  content: string;
  embedding: number[]; // 1536 dimensions
  chunkIndex: number;
  metadata: {
    sourceType: string;
    sourceId: string;
    title: string;
  };
  createdAt: Date;
}
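
Because the vector index is defined for 1536 dimensions, an embedding of any other length will not match the index definition. It can help to check vectors before writing chunks. This is a minimal sketch under that assumption; `assertValidEmbedding` is a hypothetical helper.

```typescript
// Must match the dimension count configured in the vector index
const EMBEDDING_DIMENSIONS = 1536;

// Throws if the vector has the wrong length or contains non-finite values
function assertValidEmbedding(
  embedding: number[],
  expected: number = EMBEDDING_DIMENSIONS
): void {
  if (embedding.length !== expected) {
    throw new Error(
      `Embedding has ${embedding.length} dimensions, expected ${expected}`
    );
  }
  if (!embedding.every((value) => Number.isFinite(value))) {
    throw new Error('Embedding contains NaN or infinite values');
  }
}
```

Calling this right before inserting into document_chunks surfaces a model or configuration mismatch at write time rather than as missing search results later.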

Vector Search Query

Example RAG query in the AI service:

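// Assumes this file sits next to services/mongodb.ts (see Service Code below);
// adjust the import path to your layout.
import { getChunksCollection } from './mongodb';

async function searchSimilarDocuments(
  tenantId: string,
  queryEmbedding: number[],
  limit: number = 5
) {
  const collection = await getChunksCollection();

  const results = await collection.aggregate([
    {
      $vectorSearch: {
        index: 'vector_index', // must match the index name from Step 7
        path: 'embedding',
        queryVector: queryEmbedding,
        // Candidates considered before the top `limit` results are returned
        numCandidates: limit * 10,
        limit: limit,
        // Pre-filter for tenant isolation; tenantId must be indexed for filtering
        filter: {
          tenantId: tenantId
        }
      }
    },
    {
      $project: {
        content: 1,
        metadata: 1,
        score: { $meta: 'vectorSearchScore' }
      }
    }
  ]).toArray();

  return results;
}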
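
The numCandidates value trades recall against latency: a larger candidate pool is more accurate but slower. The sketch below separates pipeline construction from execution so the bounds can be checked without a database. It assumes Atlas requires numCandidates to be at least `limit` and at most 10000; `pickNumCandidates` and `buildVectorSearchPipeline` are hypothetical helpers.

```typescript
// Assumption: Atlas caps numCandidates at 10000 and requires it to be >= limit
const MAX_NUM_CANDIDATES = 10000;

// Scales the candidate pool with the limit and clamps it to the allowed range
function pickNumCandidates(limit: number, multiplier: number = 10): number {
  if (!Number.isInteger(limit) || limit < 1 || limit > MAX_NUM_CANDIDATES) {
    throw new Error(`limit must be an integer between 1 and ${MAX_NUM_CANDIDATES}`);
  }
  return Math.min(Math.max(limit * multiplier, limit), MAX_NUM_CANDIDATES);
}

// Builds the same two-stage pipeline as the query above
function buildVectorSearchPipeline(
  tenantId: string,
  queryEmbedding: number[],
  limit: number = 5
): Record<string, unknown>[] {
  return [
    {
      $vectorSearch: {
        index: 'vector_index',
        path: 'embedding',
        queryVector: queryEmbedding,
        numCandidates: pickNumCandidates(limit),
        limit,
        filter: { tenantId },
      },
    },
    {
      $project: {
        content: 1,
        metadata: 1,
        score: { $meta: 'vectorSearchScore' },
      },
    },
  ];
}
```

The returned array can be passed directly to `collection.aggregate(...)`.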

Connection in AI Service

Environment Variable

MONGODB_URI="mongodb+srv://hrms_ai_app:<password>@hrms-ai-prod.xxxxx.mongodb.net/hrms_ai?retryWrites=true&w=majority"
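
Since this variable embeds the database password, it should never appear verbatim in logs or error reports. A minimal sketch of masking the password before logging follows; `redactMongoUri` is a hypothetical helper, and it assumes the password is URL-encoded (so it contains no raw @).

```typescript
// Replaces the password portion of a MongoDB URI with ***
// URIs without credentials are returned unchanged.
function redactMongoUri(uri: string): string {
  return uri.replace(
    /^(mongodb(?:\+srv)?:\/\/[^:@\/]+):[^@]*@/,
    '$1:***@'
  );
}

// Example with a made-up password
console.log(
  redactMongoUri(
    'mongodb+srv://hrms_ai_app:s3cret@hrms-ai-prod.xxxxx.mongodb.net/hrms_ai?retryWrites=true&w=majority'
  )
);
```
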

Service Code

// apps/ai/src/services/mongodb.ts
import { MongoClient, Db } from 'mongodb';

let client: MongoClient | null = null;
let db: Db | null = null;

export async function connectMongo(): Promise<Db> {
  if (db) return db;
  
  const uri = process.env.MONGODB_URI;
  if (!uri) {
    throw new Error('MONGODB_URI not configured');
  }
  
  client = new MongoClient(uri);
  await client.connect();
  
  db = client.db('hrms_ai');
  
  // Verify connection
  await db.command({ ping: 1 });
  console.log('Connected to MongoDB Atlas');
  
  return db;
}

export async function getDocumentsCollection() {
  const database = await connectMongo();
  return database.collection('documents');
}

export async function getChunksCollection() {
  const database = await connectMongo();
  return database.collection('document_chunks');
}

export async function disconnectMongo() {
  if (client) {
    await client.close();
    client = null;
    db = null;
  }
}

Cluster Tiers and Pricing

Tier	RAM	Storage	Use Case	Monthly Cost
M0	Shared	512 MB	Development	Free
M2	Shared	2 GB	Small staging	~$9
M5	Shared	5 GB	Large staging	~$25
M10	2 GB	10 GB	Small production	~$57
M20	4 GB	20 GB	Production	~$140
M30	8 GB	40 GB	High traffic	~$280

Monitoring

Atlas Metrics

Go to Database → Metrics
Key metrics:
- Operations/second
- Document reads/writes
- Index size
- Query targeting (ratio of documents scanned to documents returned; should stay close to 1)

Vector Search Metrics

Go to Search → Your index
Monitor:
- Index size
- Search latency
- Query volume

Troubleshooting

Connection Timeout

MongoNetworkError: connection timed out

Fix: Check the IP access list in Network Access. Add your current IP address, or 0.0.0.0/0 for development only.

Authentication Failed

MongoServerError: Authentication failed

Fix:

Verify username/password
Check user has access to hrms_ai database
URL-encode special characters in password

Vector Search Not Working

MongoServerError: $vectorSearch is not allowed

Fix:

Ensure you're using Atlas (not self-hosted MongoDB)
Verify vector index exists and is Active
Check index name matches query

Index Not Ready

Index is not ready for queries

Fix: Wait for index status to become "Active" (can take several minutes for large collections).
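
The fixes above can be surfaced in service logs by mapping an error message to a hint. This is only a sketch: `diagnoseMongoError` is a hypothetical helper, and it matches on the message fragments quoted in this section, which may differ from what a given driver version actually emits.

```typescript
// Maps an error message to the matching troubleshooting hint from this guide
function diagnoseMongoError(message: string): string {
  const text = message.toLowerCase();
  if (text.includes('timed out')) {
    return 'Check the IP access list in Network Access';
  }
  if (text.includes('authentication failed')) {
    return 'Verify username/password, database access, and URL-encoding of the password';
  }
  if (text.includes('$vectorsearch')) {
    return 'Confirm the cluster runs on Atlas and the vector index exists, is Active, and matches the query';
  }
  if (text.includes('not ready')) {
    return 'Wait for the index status to become Active';
  }
  return 'No known hint; see the full error for details';
}
```
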

Security Best Practices

Use dedicated user: Don't use Atlas admin account
Restrict network: Use VPC Peering for production
Enable audit logs: For compliance tracking
Rotate credentials: Update password periodically
Use Secret Manager: Store connection string securely
