GlyphMemory: Cross-Session AI Memory System

March 1, 2026

Serverless episodic memory API enabling AI assistants to maintain context across sessions, devices, and platforms

The Problem

Every AI conversation starts from zero. Close a session with Claude or ChatGPT, open a new one, and everything is gone. There is no continuity between sessions, no memory across devices, no way to pick up where you left off. The AI has no idea who you are or what you were working on five minutes ago.

GlyphMemory fixes this. It is a serverless episodic memory API that any AI assistant can read from and write to, giving persistent context across sessions, devices, and platforms.

Architecture

The entire system runs on AWS serverless infrastructure:

  • API Gateway + Lambda (Python) handles all read/write operations
  • DynamoDB stores memories with a by-epoch GSI for efficient timestamp-sorted retrieval
  • CloudFront provides CDN caching for read-heavy access patterns
  • TOTP-based authentication using HMAC tokens with time windows, with basic auth as a secondary layer
  • gzip compression in the Lambda proxy to keep encoded URLs under browser and platform length limits

Access methods cover the contexts where an assistant session typically runs:

  • Python CLI installed via pipx for terminal-based access
  • Claude Code skill integration for seamless session bootstrap
  • Browser-compatible URLs for claude.ai and other web-based AI sessions
  • Works across Claude Code (desktop), claude.ai (web/mobile), and the Claude Desktop app

Technical Challenges

DynamoDB Query Design. The initial implementation used DynamoDB Scan with a limit to fetch recent memories. This returned unsorted results: Scan with a limit returns the first N items it encounters in storage order, not the N most recent by timestamp. The fix was a Global Secondary Index with epoch as its sort key, which allows a proper Query with ScanIndexForward=False to retrieve the most recent memories first.
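As a rough illustration of the Query-based approach, the sketch below builds the parameters for a newest-first read against the by-epoch index. The partition attribute name ("kind") and its constant value ("memory") are hypothetical, since the post does not describe the index's partition key. The table argument is assumed to be a boto3 DynamoDB Table resource, or anything with a compatible query method.

```python
from typing import Any, Dict, List


def recent_memories_query(
    limit: int = 10,
    index_name: str = "by-epoch",      # index name taken from the post
    partition_attr: str = "kind",      # hypothetical partition key attribute
    partition_value: str = "memory",   # hypothetical constant partition value
) -> Dict[str, Any]:
    """Build Query parameters that return the newest items first."""
    return {
        "IndexName": index_name,
        # Query needs an equality condition on the index's partition key.
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": partition_attr},
        "ExpressionAttributeValues": {":pk": partition_value},
        # False = descending order on the sort key (epoch), so newest first.
        "ScanIndexForward": False,
        "Limit": limit,
    }


def fetch_recent(table: Any, limit: int = 10) -> List[Dict[str, Any]]:
    """Run the query against a Table-like object and return its items."""
    response = table.query(**recent_memories_query(limit))
    return response.get("Items", [])
```

With boto3 installed, a call might look like fetch_recent(boto3.resource("dynamodb").Table("memories"), limit=5), where the table name is again a placeholder. Unlike Scan, the Limit here applies to items already ordered by the sort key, so the first N results are the N most recent.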

Cross-Platform Auth. Different AI platforms have different capabilities. Claude Code can execute arbitrary code, so it can compute TOTP tokens directly. Browser-based sessions on claude.ai can only fetch URLs. The auth system needed to work for both: TOTP tokens embedded in query parameters for URL-based access, and programmatic token generation for CLI and skill-based access.
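The post does not spell out the exact token scheme, so the following is only a sketch of one common way to build HMAC tokens over time windows: the RFC 6238 TOTP construction (HMAC-SHA1, 30-second step). The function names, the step size, the digit count, and the one-window drift allowance are all assumptions made for illustration.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238-style code for the time window containing `at`."""
    now = time.time() if at is None else at
    counter = int(now // step)  # index of the current time window
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify(secret: bytes, token: str, at: Optional[float] = None,
           step: int = 30, digits: int = 6, window: int = 1) -> bool:
    """Accept a token from the current window or `window` steps either side."""
    now = time.time() if at is None else at
    return any(
        hmac.compare_digest(totp(secret, now + drift * step, step, digits), token)
        for drift in range(-window, window + 1)
    )
```

Under this scheme, a CLI or skill would call totp() locally and send the result, while a URL-only session would need a token that was generated elsewhere and placed in a query parameter. The drift allowance trades a slightly longer validity period for tolerance of clock skew and the delay between generating and fetching a URL.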

URL Length Limits. Writing memories from browser-based AI sessions means encoding the memory content into a URL. Longer memories hit platform URL length limits. gzip compression in the Lambda proxy reduced payload size enough to keep write URLs functional for real-world memory content.
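To make the size argument concrete, here is a minimal sketch of packing memory text into a URL-safe token with gzip and decoding it again. The function names and the choice of unpadded URL-safe base64 are assumptions; the post only states that gzip compression is applied in the Lambda proxy.

```python
import base64
import gzip


def encode_memory(text: str) -> str:
    """Compress text and return an unpadded URL-safe base64 token."""
    # mtime=0 keeps the output deterministic for identical input.
    compressed = gzip.compress(text.encode("utf-8"), mtime=0)
    return base64.urlsafe_b64encode(compressed).decode("ascii").rstrip("=")


def decode_memory(token: str) -> str:
    """Reverse encode_memory: restore padding, decode, and decompress."""
    padded = token + "=" * (-len(token) % 4)
    return gzip.decompress(base64.urlsafe_b64decode(padded)).decode("utf-8")
```

The saving depends on the content. Long or repetitive text shrinks substantially, while very short strings can come out larger, because the gzip header and the 4/3 base64 expansion outweigh the compression.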

Cross-Device Session Continuity. The entire initial build happened from a phone in a single overnight session. Memories written from that mobile session were immediately available when continuing work on a desktop the next day. The system proved its own use case during its own construction.

Safety Filter Interference. Opus sessions discussing AI memory and consciousness occasionally triggered platform safety filters. This required adjusting how memory operations were described and invoked to avoid false positives on routine API calls.

Results

  • 43+ memories stored and actively used across sessions
  • Seamless continuity across desktop, mobile, and web – start a conversation on a phone, pick it up on a laptop
  • $0.42/month total AWS cost for the entire infrastructure
  • Built in a single overnight session from a phone, proving the system’s own cross-device value
  • Three access methods (CLI, web URLs, and Claude Code skill) covering terminal, browser, and Claude Code sessions

What This Demonstrates

GlyphMemory is a compact system, but it touches a lot of surface area: API Gateway and Lambda for serverless compute, DynamoDB data modeling and GSI design, CloudFront CDN configuration, TOTP authentication, gzip encoding in Lambda proxy responses, Python packaging and distribution via pipx, and Claude Code skill development.

The $0.42/month cost shows that serverless architecture at low scale is effectively free. The single-session phone build shows that the right tools and architecture knowledge can compress implementation time dramatically.

Stack: Python, AWS Lambda, API Gateway, DynamoDB, CloudFront, TOTP/HMAC Auth, pipx, Claude Code Skills