Mitigate AI Platform
App maintenance

Monthly Checks

Monthly maintenance checklist for long-term application health

This guide lists the maintenance tasks to perform each month to keep the application healthy, secure, and cost-efficient over the long term.

Overview

Frequency: Monthly

Priority: High - Prevents long-term issues and ensures security compliance

Pre-Check Requirements

  • Super admin access to the application
  • Access to LLM provider account settings
  • Access to OpenID Connect provider (if enabled)
  • Access to backup storage

1. LLM Provider Status Review

Model Deprecation Monitoring

LLM providers regularly deprecate older models. Failing to update before deprecation can cause service outages.

Steps to Monitor Model Deprecations

  1. Check Deprecation Schedule

    • Visit your LLM provider's documentation/deprecation page
    • Review upcoming deprecations (6-12 months ahead)
    • Check current model against deprecation list
  2. Current Model Configuration

    Check at least these environment variables:

    LLM_DEFAULT_MODEL                # e.g., "gpt-4.1"
    LLM_DEFAULT_EMBEDDING_MODEL      # e.g., "text-embedding-3-large"
  3. Action Plan

    If deprecation is scheduled:

    • Note deprecation date
    • Test replacement model
    • Plan migration timeline (2-3 months before deprecation)
    • Update configuration
    • Test thoroughly before deprecation date
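The monitoring steps above can be partly scripted. The following is a minimal Python sketch, assuming you maintain a small deprecation table by hand from your provider's deprecation page. The table contents (`example-model-v1` and its date), the 90-day window, and the function names are illustrative assumptions; only the two environment variable names come from this guide.

```python
import os
from datetime import date

# Hypothetical deprecation table, copied by hand from the provider's
# deprecation page. The model name and date are placeholders, not real
# announcements.
DEPRECATIONS = {
    "example-model-v1": date(2030, 6, 30),
}

# Assumed migration window: roughly 3 months before the shutdown date.
MIGRATION_WINDOW_DAYS = 90


def check_model(model, today, deprecations=DEPRECATIONS,
                window_days=MIGRATION_WINDOW_DAYS):
    """Return 'ok', 'plan-migration', 'migrate-now', or 'deprecated'."""
    shutdown = deprecations.get(model)
    if shutdown is None:
        return "ok"  # not on the list we maintain
    days_left = (shutdown - today).days
    if days_left < 0:
        return "deprecated"
    if days_left <= window_days:
        return "migrate-now"
    return "plan-migration"


def check_configured_models(today=None, env=os.environ):
    """Check the models named in the two environment variables above."""
    today = today or date.today()
    report = {}
    for var in ("LLM_DEFAULT_MODEL", "LLM_DEFAULT_EMBEDDING_MODEL"):
        model = env.get(var)
        status = check_model(model, today) if model else "unset"
        report[var] = (model, status)
    return report
```

Calling `check_configured_models()` during the monthly check returns one status per variable. A model missing from the table is reported as `ok`, so the table still has to be refreshed from the provider's page each month.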

Example with OpenAI:

API Credit Balance & Usage Review

Steps to Review API Usage

  1. Access Billing Dashboard

    • Navigate to your LLM provider's billing/account section
  2. Review Metrics

    • Current credit balance
    • Usage over last 30 days
    • Daily average cost
    • Projected monthly cost
    • Compare vs. budget
  3. Actions

    • Add credits if the balance is low (less than one month of runway)
    • Adjust budget if usage increased
    • Set up billing alerts (if not configured)
    • Investigate unusual spikes
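The review metrics above are simple arithmetic once daily costs have been copied out of the billing dashboard. The following is a minimal Python sketch under that assumption. The function name, the 30-day projection, and the spike factor of 2x the daily average are illustrative choices, not platform or provider features.

```python
def review_usage(daily_costs, balance, monthly_budget, spike_factor=2.0):
    """Summarize spend from a list of daily costs (one number per day)."""
    if not daily_costs:
        raise ValueError("daily_costs must not be empty")

    daily_average = sum(daily_costs) / len(daily_costs)
    projected_monthly = daily_average * 30

    # Runway: days the current balance lasts at the average daily spend.
    runway_days = balance / daily_average if daily_average > 0 else float("inf")

    # Indices of days whose cost exceeds spike_factor times the average.
    spike_days = [
        day for day, cost in enumerate(daily_costs)
        if daily_average > 0 and cost > spike_factor * daily_average
    ]

    return {
        "daily_average": daily_average,
        "projected_monthly": projected_monthly,
        "runway_days": runway_days,
        "needs_top_up": runway_days < 30,  # less than one month of runway
        "over_budget": projected_monthly > monthly_budget,
        "spike_days": spike_days,
    }
```

For example, 29 days at 10.0 followed by one day at 50.0, with a balance of 200.0, gives a runway of under 18 days and flags the last day as a spike to investigate.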

Example with OpenAI:

Rate Limit & Performance Review

  1. Check Rate Limit Issues

    • Review Sentry for rate limit errors
    • Check provider dashboard for throttling events
    • Assess whether an upgrade is needed
  2. Performance Metrics

    • Review Langfuse for average latencies
    • Check whether performance has degraded
    • Compare month-over-month trends
  3. Optimization Opportunities

    • Review prompt efficiency
    • Check if caching can be improved
    • Assess token usage optimization
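The month-over-month comparison above can be made concrete with a small calculation. The following Python sketch assumes you read the average latency for each month from Langfuse by hand. The function name and the 20% degradation threshold are illustrative assumptions.

```python
def month_over_month(previous_ms, current_ms, threshold_pct=20.0):
    """Compare two average latencies (in milliseconds) and flag degradation."""
    if previous_ms <= 0:
        raise ValueError("previous_ms must be positive")
    # Positive change means the current month is slower.
    change_pct = (current_ms - previous_ms) * 100 / previous_ms
    return {"change_pct": change_pct, "degraded": change_pct > threshold_pct}
```

A result with `degraded` set to true is a prompt to look at the optimization items above, not proof of a problem: traffic mix and model changes also shift average latency.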

2. Authentication & Access Control Review

User Access Audit

  1. Access User List

    • Navigate to: [APP_HOST]/admin/users
    • Review all active users
  2. Verify User Status

    • All users should have legitimate access
    • Remove inactive users
    • Verify that users are still with the organization
    • Check for unknown or suspicious accounts
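If the user list and a current staff roster are available as data, the audit above reduces to two comparisons. The following Python sketch assumes each user record is a dictionary with an `email` and an optional `last_login` date. That record shape, the roster input, and the 90-day inactivity cutoff are hypothetical; this guide does not describe an export format for the admin page.

```python
from datetime import date


def audit_users(app_users, roster_emails, today, inactive_days=90):
    """Flag accounts missing from the roster and accounts with no recent login."""
    roster = {email.lower() for email in roster_emails}
    unknown, inactive = [], []
    for user in app_users:
        email = user["email"].lower()
        if email not in roster:
            unknown.append(email)  # not a known member of the organization
        last_login = user.get("last_login")  # a date, or None if never logged in
        if last_login is None or (today - last_login).days > inactive_days:
            inactive.append(email)
    return {"unknown": sorted(unknown), "inactive": sorted(inactive)}
```

Both lists are candidates for manual review rather than automatic removal, since a flagged account may belong to a contractor or a user on leave.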

OpenID Connect Provider Review

Note: Only applicable if OPENID_CONNECT_ROLES_ENABLED=true

Steps to Review OpenID Connect Users

  1. Access App Registration

    • Navigate to your identity provider's admin console
    • Find your application registration
  2. Review User Assignments

    • Check users and groups assigned to the application
    • Verify assigned users are current employees
    • Remove access for departed users
  3. Review Role Assignments

    If roles are configured:

    • Check app roles section
    • Verify role mappings (admin, super_admin)
    • Ensure only appropriate users have elevated roles
  4. Review Token Configuration

    • Verify token lifetime settings
    • Check optional claims configuration
    • Ensure roles are included in token claims
  5. Review Group Memberships (if using groups for role mapping)

    • Review admin group members
    • Verify super admin group members
    • Update group memberships as needed
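To confirm that roles are included in token claims and to see the effective token lifetime, it can help to inspect a token issued to a test user. The following Python sketch decodes the payload of a JWT for inspection only, without verifying the signature, so it must not be used for authentication decisions. The claim name `roles` is an assumption; your identity provider may use a different claim.

```python
import base64
import json


def decode_jwt_payload(token):
    """Decode a JWT payload WITHOUT verifying the signature (inspection only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated parts")
    payload_b64 = parts[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def review_token(claims, roles_claim="roles"):
    """Report the roles claim and the token lifetime in seconds, if present."""
    roles = claims.get(roles_claim) or []
    if isinstance(roles, str):
        roles = [roles]  # some providers emit a single role as a string
    lifetime = None
    if "exp" in claims and "iat" in claims:
        lifetime = claims["exp"] - claims["iat"]
    return {"roles": roles, "has_roles": bool(roles), "lifetime_seconds": lifetime}
```

An empty `roles` list for a user who should be an admin suggests that the role claim is not being emitted or that the role assignment is missing.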

Example with Microsoft Entra ID (Azure AD):

  • Navigate to Azure Portal → Enterprise Applications
  • Find your application registration
  • Check "Users and groups" section
  • Verify "App roles" configuration
  • Ensure role claims are included in token

Session Security Review

  1. Check Session Timeout

    # Current setting (in minutes)
    SESSION_TIMEOUT_MINUTES=1440  # 24 hours default
  2. Recommendations

    • Standard users: 24 hours (1440 minutes)
    • Sensitive environments: 8 hours (480 minutes)
    • High-security: 1 hour (60 minutes)
  3. Action

    • Verify timeout is appropriate for security posture
    • Adjust if needed
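The session timeout check above can be expressed as a comparison between the configured value and the recommendation for your environment. The following Python sketch reads `SESSION_TIMEOUT_MINUTES` and falls back to the 1440-minute default stated above. The posture labels and the function name are illustrative assumptions.

```python
import os

# Recommended maximum timeouts from this guide, keyed by hypothetical labels.
RECOMMENDED_MAX_MINUTES = {
    "standard": 1440,      # 24 hours
    "sensitive": 480,      # 8 hours
    "high-security": 60,   # 1 hour
}


def check_session_timeout(posture, env=os.environ):
    """Return (configured_minutes, within_recommendation) for a posture."""
    minutes = int(env.get("SESSION_TIMEOUT_MINUTES", "1440"))
    return minutes, minutes <= RECOMMENDED_MAX_MINUTES[posture]
```

For instance, a sensitive environment left at the default returns `(1440, False)`, which indicates the timeout should be lowered.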

3. Backup Verification

  1. Locate Latest Backup

    • Check the backup storage location
    • Verify that backups were created this week
    • Check backup file sizes (should be non-zero and reasonable)
  2. Backup Retention Check

    • Verify weekly backups are retained
    • Check monthly archives exist
    • Confirm old backups are cleaned up per policy
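When backups are stored as files in a directory you can reach, the "latest backup" checks above can be automated. The following Python sketch assumes that layout. The function name, the 7-day age limit, and the minimum size are illustrative assumptions, and the sketch does not cover the retention checks or object storage.

```python
import time
from pathlib import Path


def check_backups(backup_dir, max_age_days=7, min_size_bytes=1, now=None):
    """Check that the newest file in backup_dir is recent and non-empty."""
    now = time.time() if now is None else now
    files = [p for p in Path(backup_dir).iterdir() if p.is_file()]
    if not files:
        return {"ok": False, "latest": None, "problems": ["no backup files found"]}

    # Newest file by modification time.
    latest = max(files, key=lambda p: p.stat().st_mtime)
    stat = latest.stat()
    age_days = (now - stat.st_mtime) / 86400

    problems = []
    if age_days > max_age_days:
        problems.append(f"latest backup is older than {max_age_days} days")
    if stat.st_size < min_size_bytes:
        problems.append("latest backup is empty or too small")
    return {"ok": not problems, "latest": latest.name, "problems": problems}
```

A passing result only shows that a recent, non-empty file exists. It does not prove that the backup can be restored.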
