message-extractor (0.1.0)
Installation
[registry]
default = "gitea"
[registries.gitea]
index = "sparse+ " # Sparse index
# index = " " # Git
[net]
git-fetch-with-cli = true
cargo add message-extractor@0.1.0
About this package
Message Extractor
A Rust-based system for extracting, monitoring, and visualizing conversations from AI coding assistants in real-time.
Overview
Message Extractor provides a complete solution for working with AI assistant conversations:
- Core Library (message-extractor): Extract messages from 7+ AI assistants
- Watcher Service (message-watcher): Monitor files and stream updates via SSE
- Web UI (frontend): Real-time visualization with search and filtering
Why this exists: If you use multiple AI coding assistants, you might want to aggregate their conversations, search across them, or analyze usage patterns. This project makes that easy.
┌──────────┐ ┌─────────────────┐ ┌──────────────┐
│ Session │───▶│ message-watcher │───▶│ Browser │
│ Files │ │ (Rust + SSE) │ │ (Yew + WASM) │
└──────────┘ └─────────────────┘ └──────────────┘
.jsonl/.json File watching Real-time UI
Quick Start (Full System)
Get the complete system running in 2 minutes:
# 1. Clone and build
git clone https://github.com/yourusername/message-extractor.git
cd message-extractor
cargo build --release
# 2. Start backend (Terminal 1)
mise run watcher:run
# 3. Start frontend (Terminal 2)
cd frontend && mise exec -- trunk serve
# 4. Open browser
open http://localhost:8080
You'll see messages from all your AI assistants streaming in real-time! 🎉
Features
Core Library
- Trait-based design for extensible provider support
- Async I/O for efficient file processing
- Type-safe error handling with thiserror
- 7 provider implementations:
- Claude
- Codex
- Copilot
- Gemini
- OpenCode
- Droid
- Cursor
Watcher Service
- Real-time monitoring with the notify file watcher
- Incremental reading: only new messages are read, not entire files
- SSE streaming to multiple clients
- Bounded history prevents unbounded memory growth
- Concurrency control prevents resource exhaustion
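Incremental reading can be illustrated with a small standard-library sketch. This is not the watcher's actual code: `read_new_lines` is a hypothetical helper, and it assumes line-delimited files (such as .jsonl) that are only ever appended to. The caller remembers a byte offset per file and passes it back in on the next change event.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader, Seek, SeekFrom};
use std::path::Path;

/// Reads the complete lines appended to `path` since byte `offset`.
/// Returns the new lines plus the offset to resume from next time.
/// Hypothetical helper for illustration; not part of message-watcher.
pub fn read_new_lines(path: &Path, offset: u64) -> io::Result<(Vec<String>, u64)> {
    let mut reader = BufReader::new(File::open(path)?);
    // Skip everything that was already processed on a previous call.
    reader.seek(SeekFrom::Start(offset))?;

    let mut lines = Vec::new();
    let mut next_offset = offset;
    let mut buf = String::new();
    loop {
        buf.clear();
        let n = reader.read_line(&mut buf)?;
        // Stop at end of file, or at a partial line that is still being written.
        if n == 0 || !buf.ends_with('\n') {
            break;
        }
        next_offset += n as u64;
        lines.push(buf.trim_end().to_string());
    }
    Ok((lines, next_offset))
}
```

A partial trailing line is left unread on purpose, so it is picked up in full once the writer finishes it.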
Web Frontend
- Live message feed with auto-scroll
- Filter by provider (Claude, Cursor, etc.)
- Search messages with highlighting
- Dark/Light mode with localStorage persistence
- Export to JSON for filtered results
- Session browser: explore all past conversations
Use Cases
- Cross-assistant search: Find that snippet across all your AI tools
- Usage analytics: Track which assistants you use most
- Conversation backup: Keep all your AI interactions
- Research: Analyze conversation patterns
- Debugging: Monitor what messages are being sent
System Components
📚 message-extractor (Library)
The core library that knows how to parse each AI assistant's session format.
Location: Root directory
Documentation: See ARCHITECTURE.md
Quick example:
let registry = ExtractorRegistry::new();
let messages = registry
    .get(Provider::Claude)?
    .extract_messages(Path::new("session.jsonl"))
    .await?;
👁️ message-watcher (Service)
Real-time file watcher that monitors your AI assistant directories and broadcasts updates.
Location: message-watcher/
Documentation: message-watcher/README.md
Features:
- Type-state pattern for lifecycle safety
- Bounded message history (10k limit)
- Semaphore-based concurrency control
- Path sanitization for PII protection
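For readers unfamiliar with the type-state pattern, here is a minimal sketch of the idea. The names `Watcher`, `Stopped`, and `Running` are hypothetical and are not taken from message-watcher; the point is only that lifecycle states become types, so calling a method in the wrong state fails at compile time.

```rust
use std::marker::PhantomData;

/// Marker types for the two lifecycle states (hypothetical names).
pub struct Stopped;
pub struct Running;

/// A watcher whose lifecycle state is tracked in the type parameter.
pub struct Watcher<State> {
    dir: String,
    _state: PhantomData<State>,
}

impl Watcher<Stopped> {
    /// A new watcher always begins in the stopped state.
    pub fn new(dir: &str) -> Self {
        Watcher { dir: dir.to_string(), _state: PhantomData }
    }

    /// Consumes the stopped watcher and returns a running one.
    pub fn start(self) -> Watcher<Running> {
        Watcher { dir: self.dir, _state: PhantomData }
    }
}

impl Watcher<Running> {
    /// Only defined for running watchers; on a stopped one this does not compile.
    pub fn watched_dir(&self) -> &str {
        &self.dir
    }

    /// Consumes the running watcher and returns a stopped one.
    pub fn stop(self) -> Watcher<Stopped> {
        Watcher { dir: self.dir, _state: PhantomData }
    }
}
```

Because `start` and `stop` take `self` by value, the old state cannot be used after a transition.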
🖥️ frontend (Web UI)
Yew-based WebAssembly frontend for visualizing message streams.
Location: frontend/
Documentation: frontend/README.md
Features:
- Real-time SSE connection
- Provider and text filtering
- Dark mode support
- Message export
Installation
Using the Library
Add this to your Cargo.toml:
[dependencies]
message-extractor = "0.1.0"
Usage
use message_extractor::{ExtractorRegistry, Provider};
use std::path::Path;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create registry with all providers
    let registry = ExtractorRegistry::new();

    // Get extractor for a specific provider
    let extractor = registry
        .get(Provider::Claude)
        .expect("Claude extractor should be registered");

    // Extract messages from a session file.
    // Rust does not expand `~`, so build the path from $HOME instead.
    let home = std::env::var("HOME")?;
    let session = Path::new(&home).join(".claude/projects/session-123.jsonl");
    let messages = extractor.extract_messages(&session).await?;

    // Process messages
    for msg in messages {
        println!("{:?}: {}", msg.role, msg.content);
        if let Some(ts) = msg.timestamp {
            println!(" at {}", ts);
        }
    }
    Ok(())
}
Architecture
Core Types
- SimpleMessage: Contains role, content, and optional timestamp
- MessageRole: Enum for User, Assistant, and System roles
- Provider: Enum for all supported providers
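As a rough sketch, the three core types could be shaped as follows. The field types and derives are assumptions made for illustration; in particular, the timestamp is shown as Unix milliseconds in an `i64`, whereas the real crate depends on chrono and may use one of its date-time types.

```rust
/// Who sent a message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MessageRole {
    User,
    Assistant,
    System,
}

/// The seven providers listed in this README.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Provider {
    Claude,
    Codex,
    Copilot,
    Gemini,
    OpenCode,
    Droid,
    Cursor,
}

/// One extracted message.
#[derive(Debug, Clone, PartialEq)]
pub struct SimpleMessage {
    pub role: MessageRole,
    pub content: String,
    /// Assumption: Unix milliseconds stand in for the crate's real timestamp type.
    pub timestamp: Option<i64>,
}
```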
Trait Design
The MessageExtractor trait defines the interface all providers implement:
#[async_trait]
pub trait MessageExtractor: Send + Sync {
    async fn extract_messages(&self, session_path: &Path) -> Result<Vec<SimpleMessage>>;
    fn provider(&self) -> Provider;
}
Registry Pattern
The ExtractorRegistry provides a centralized way to access all provider extractors:
let registry = ExtractorRegistry::new();
let extractor = registry.get(Provider::Claude)?;
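To make the registry pattern concrete, here is a simplified, standard-library-only sketch. It is not the crate's implementation: `Extractor` is a synchronous stand-in for `MessageExtractor`, `Registry` stands in for `ExtractorRegistry`, and only one extractor is registered. The idea it shows is a map from provider to a boxed trait object, populated once in `new()`.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Provider {
    Claude,
    Cursor,
}

/// Simplified, synchronous stand-in for the `MessageExtractor` trait.
pub trait Extractor: Send + Sync {
    fn provider(&self) -> Provider;
}

pub struct ClaudeExtractor;

impl Extractor for ClaudeExtractor {
    fn provider(&self) -> Provider {
        Provider::Claude
    }
}

/// Hypothetical registry: maps each provider to its extractor.
pub struct Registry {
    extractors: HashMap<Provider, Box<dyn Extractor>>,
}

impl Registry {
    /// Builds the registry with the built-in extractors already registered.
    pub fn new() -> Self {
        let mut registry = Registry { extractors: HashMap::new() };
        registry.register(Box::new(ClaudeExtractor));
        registry
    }

    /// Stores the extractor under the provider it reports.
    pub fn register(&mut self, extractor: Box<dyn Extractor>) {
        self.extractors.insert(extractor.provider(), extractor);
    }

    /// Looks up the extractor for a provider, if one is registered.
    pub fn get(&self, provider: Provider) -> Option<&dyn Extractor> {
        self.extractors.get(&provider).map(|e| &**e)
    }
}
```

Keying the map on the value returned by `provider()` means an extractor cannot be registered under the wrong provider.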
Provider Details
Claude
- Format: JSONL (line-delimited JSON)
- Location: ~/.claude/projects/*.jsonl
- Content: Supports both string and array content blocks
Codex
- Format: JSONL
- Fields: role, content, timestamp
Copilot
- Format: JSON
- Structure: { "history": [...] }
- Timestamp: Unix timestamp (seconds)
Gemini
- Format: JSON
- Structure: { "contents": [...] }
- Content: Parts array with text blocks
OpenCode
- Format: JSONL
- Fields: role, content, created_at
Droid
- Format: JSON
- Structure: { "messages": [...] }
Cursor
- Format: JSON
- Structure: { "conversation": [...] }
- Timestamp: Unix timestamp (milliseconds)
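Copilot stores timestamps in seconds and Cursor in milliseconds, so an extractor has to normalize units before messages from different providers can be compared. The sketch below shows one way to do that with the standard library. `TimestampUnit` and `to_system_time` are hypothetical names, and the real crate may convert to a chrono type instead.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Unit of a raw Unix timestamp as stored in a session file.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TimestampUnit {
    Seconds,
    Milliseconds,
}

/// Converts a raw Unix timestamp to a `SystemTime`.
/// Returns `None` for negative or out-of-range values.
pub fn to_system_time(raw: i64, unit: TimestampUnit) -> Option<SystemTime> {
    if raw < 0 {
        return None;
    }
    let raw = raw as u64;
    let since_epoch = match unit {
        TimestampUnit::Seconds => Duration::from_secs(raw),
        TimestampUnit::Milliseconds => Duration::from_millis(raw),
    };
    UNIX_EPOCH.checked_add(since_epoch)
}
```

With this, a Copilot value of 1_700_000_000 (seconds) and a Cursor value of 1_700_000_000_000 (milliseconds) map to the same instant.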
Helper Functions
Filter Conversation
Remove system messages from a message list:
use message_extractor::filter_conversation;
let filtered = filter_conversation(messages);
// Only user and assistant messages remain
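The behavior described above amounts to a single filter over the message list. The following self-contained sketch uses simplified stand-in types (no timestamp field) and is an assumption about how `filter_conversation` behaves, not a copy of the crate's source.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MessageRole {
    User,
    Assistant,
    System,
}

/// Simplified stand-in for the crate's `SimpleMessage`.
#[derive(Debug, Clone, PartialEq)]
pub struct SimpleMessage {
    pub role: MessageRole,
    pub content: String,
}

/// Keeps user and assistant messages and drops system messages,
/// preserving the original order.
pub fn filter_conversation(messages: Vec<SimpleMessage>) -> Vec<SimpleMessage> {
    messages
        .into_iter()
        .filter(|m| m.role != MessageRole::System)
        .collect()
}
```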
Running the Example
cargo run --example extract tests/fixtures/claude_session.jsonl
Running Tests
# Run all tests
cargo test
# Run with verbose output
cargo test -- --nocapture
# Run specific test
cargo test test_claude_extractor
Development
Building
cargo build
Linting
cargo clippy
Formatting
cargo fmt
Extending
To add a new provider:
- Create src/providers/new_provider.rs
- Implement the MessageExtractor trait
- Add to src/providers/mod.rs
- Register in ExtractorRegistry::new()
- Add test fixture in tests/fixtures/
- Add integration test
Example:
use std::path::Path;

use async_trait::async_trait;

use crate::error::Result;
use crate::extractor::MessageExtractor;
use crate::types::{Provider, SimpleMessage};

pub struct NewProviderExtractor;

#[async_trait]
impl MessageExtractor for NewProviderExtractor {
    async fn extract_messages(&self, session_path: &Path) -> Result<Vec<SimpleMessage>> {
        // Parse the file at `session_path` and map each entry to a SimpleMessage
        todo!("implement parsing for the new provider")
    }

    fn provider(&self) -> Provider {
        // Requires adding a `NewProvider` variant to the `Provider` enum
        Provider::NewProvider
    }
}
Troubleshooting
No Messages Appearing
Problem: Frontend shows "Waiting for messages..." but files exist.
Solutions:
- Check if watcher is running:
curl http://localhost:3030/health
# Should return: OK
- Check if your AI assistant files are in the expected locations:
ls ~/.claude/projects/*.jsonl
ls ~/.cursor/sessions/*.json
# etc.
- Check watcher logs for errors:
# Look for "Failed to extract messages" or "Unknown provider"
- Verify file extensions:
  - The watcher only processes .json and .jsonl files
  - Other extensions are silently skipped
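The extension rule can be pictured as a small predicate. This is a sketch under assumptions: `is_session_file` is a hypothetical name, and the case-sensitive match is a guess about the watcher's behavior.

```rust
use std::path::Path;

/// Returns true for paths ending in `.json` or `.jsonl`.
/// Assumption: the comparison is case-sensitive, so `.JSON` is rejected.
pub fn is_session_file(path: &Path) -> bool {
    match path.extension().and_then(|ext| ext.to_str()) {
        Some("json") | Some("jsonl") => true,
        _ => false,
    }
}
```

If a session file is being skipped, checking its name against a predicate like this is a quick way to rule out an extension mismatch.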
Connection Failed
Problem: Frontend shows "Connection error".
Solutions:
- Verify backend is running on port 3030
- Check for port conflicts:
lsof -i :3030
- Try restarting both backend and frontend
High Memory Usage
Problem: Watcher using too much memory.
Solutions:
- The message history is bounded to 10,000 messages by default
- Reduce it in message-watcher/src/config.rs:
max_history_size: 5_000 // Lower limit
- Check for excessive file changes triggering many extractions
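To see why a bounded history keeps memory in check, consider this minimal sketch of a fixed-capacity buffer that evicts the oldest entry when full. `BoundedHistory` is a hypothetical type written for illustration; only the `max_history_size` setting comes from the watcher's configuration.

```rust
use std::collections::VecDeque;

/// Keeps at most `max_size` items, dropping the oldest first.
pub struct BoundedHistory<T> {
    items: VecDeque<T>,
    max_size: usize,
}

impl<T> BoundedHistory<T> {
    pub fn new(max_size: usize) -> Self {
        BoundedHistory { items: VecDeque::new(), max_size }
    }

    /// Appends an item, evicting the oldest one if the buffer is full.
    pub fn push(&mut self, item: T) {
        if self.max_size == 0 {
            return; // A zero-sized history stores nothing.
        }
        if self.items.len() == self.max_size {
            self.items.pop_front();
        }
        self.items.push_back(item);
    }

    pub fn len(&self) -> usize {
        self.items.len()
    }

    /// The oldest item still retained, if any.
    pub fn oldest(&self) -> Option<&T> {
        self.items.front()
    }
}
```

Memory use is then proportional to `max_size` times the average message size, regardless of how long the watcher runs, which is why lowering the limit reduces the footprint.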
Missing Providers
Problem: Some providers don't show up.
Solutions:
- Verify the provider's directory exists
- Check that you have session files in that directory
- Ensure session files are valid JSON/JSONL
Build Failures
Problem: cargo build fails.
Solutions:
- Update Rust toolchain:
rustup update stable
- Clean and rebuild:
cargo clean
cargo build
- Check that all dependencies are available
Frontend Not Loading
Problem: Browser shows blank page.
Solutions:
- Check browser console for errors
- Verify WebAssembly is enabled
- Clear browser cache
- Rebuild frontend:
cd frontend
trunk clean
trunk serve
FAQ
Q: Does this work with all AI assistants?
A: It currently supports 7 assistants. See the Contributing Guide to add more.
Q: Is my data sent anywhere?
A: No. Everything runs locally. The server binds to localhost only.
Q: Can I filter messages before they're stored?
A: Not yet, but this is a planned feature. Currently, filtering happens in the UI.
Q: What's the performance impact?
A: Minimal. The watcher uses incremental reading and bounded concurrency.
Q: Can I run this on a server?
A: Yes, but you must add authentication first. See the Security Documentation.
Q: How do I export all my conversations?
A: Use the "EXPORT" button in the UI, or use the library directly:
let messages = registry.get(provider)?.extract_messages(path).await?;
let json = serde_json::to_string_pretty(&messages)?;
std::fs::write("export.json", json)?;
Further Reading
- Architecture Overview - Deep dive into system design
- Contributing Guide - How to contribute
- Watcher Documentation - Service details
- Frontend Documentation - UI features
License
MIT
Contributing
Contributions are welcome! Please read CONTRIBUTING.md for guidelines.
Dependencies
| ID | Version |
|---|---|
| async-trait | ^0.1 |
| chrono | ^0.4 |
| serde | ^1.0 |
| serde_json | ^1.0 |
| thiserror | ^1.0 |
| tokio | ^1.0 |
| criterion | ^0.5 |
| tempfile | ^3.8 |
| tokio-test | ^0.4 |