matrix-profile-rs



Concept

Matrix Profile algorithms (STOMP, SCRIMP++, SCAMP) in native Rust for motif discovery and anomaly detection in time series data. Achieves 2.5x speedup via SIMD (AVX2/NEON), handles datasets exceeding RAM through memory-budgeted tiling, and integrates with Polars as a native DataFrame operation.

8,700 lines of Rust. 58 tests.


Technical Reports


Architecture


Features


Current Status

v1.0 MVP: shipped (2026-02-22)

Current Milestone: v1.1 Streaming (defining requirements)

Progress: v1.0 complete, v1.1 in planning phase


Quick Facts

   
Status: Recently Updated
Stack: Rust, Polars

The Insight

Time series analysis typically requires either slow Python libraries or complex manual implementation. matrix-profile-rs provides Matrix Profile algorithms (STOMP, SCRIMP++, SCAMP) in native Rust with ergonomic APIs for motif discovery and anomaly detection, achieving C-level performance with Python-level usability through Polars integration.


What This Is

A high-performance Rust implementation of Matrix Profile algorithms for time series analysis with SIMD acceleration, out-of-memory tiling support, and Polars ecosystem integration. Matrix Profiles enable pattern discovery, anomaly detection, and similarity search in univariate time series without domain knowledge or parameter tuning.
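To make the idea concrete, here is a minimal brute-force sketch of what a Matrix Profile contains: for every length-m window, the z-normalized Euclidean distance to its nearest non-overlapping neighbor, plus that neighbor's index. This is not the crate's implementation (STOMP, SCRIMP++ and SCAMP produce the same result far more efficiently). The function names, the ceil(m/4) exclusion zone, and the handling of constant windows are assumptions made for illustration.

```rust
/// Mean and population standard deviation of a window.
fn mean_std(x: &[f64]) -> (f64, f64) {
    let n = x.len() as f64;
    let mu = x.iter().sum::<f64>() / n;
    let var = x.iter().map(|v| (v - mu).powi(2)).sum::<f64>() / n;
    (mu, var.sqrt())
}

/// Z-normalized Euclidean distance between two equal-length windows.
fn znorm_dist(a: &[f64], b: &[f64]) -> f64 {
    let (mu_a, sd_a) = mean_std(a);
    let (mu_b, sd_b) = mean_std(b);
    // Assumed convention for constant windows (z-normalization is undefined):
    // two constant windows match exactly, otherwise use sqrt(m).
    if sd_a == 0.0 || sd_b == 0.0 {
        return if sd_a == 0.0 && sd_b == 0.0 {
            0.0
        } else {
            (a.len() as f64).sqrt()
        };
    }
    a.iter()
        .zip(b)
        .map(|(x, y)| {
            let d = (x - mu_a) / sd_a - (y - mu_b) / sd_b;
            d * d
        })
        .sum::<f64>()
        .sqrt()
}

/// Brute-force O(n^2 * m) matrix profile.
/// Returns (distances, nearest-neighbor indices); usize::MAX marks "no neighbor".
fn naive_matrix_profile(ts: &[f64], m: usize) -> (Vec<f64>, Vec<usize>) {
    assert!(m >= 2 && m <= ts.len());
    let n = ts.len() - m + 1; // number of windows
    let excl = (m + 3) / 4; // ceil(m / 4): skip trivial matches near i (assumed)
    let mut dist = vec![f64::INFINITY; n];
    let mut idx = vec![usize::MAX; n];
    for i in 0..n {
        for j in 0..n {
            if i.abs_diff(j) <= excl {
                continue;
            }
            let d = znorm_dist(&ts[i..i + m], &ts[j..j + m]);
            if d < dist[i] {
                dist[i] = d;
                idx[i] = j;
            }
        }
    }
    (dist, idx)
}
```

Low values in the returned distances point at motifs (a window with a close match elsewhere in the series); high values point at discords, i.e. windows unlike anything else.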

Think of it as “find repeating patterns and anomalies in any time series data” with a simple API: df.select(pl.col("ts").mp().stomp(m=20)) for Polars users, or direct Rust APIs for maximum performance and scale.


Core Value

Performance at scale with ergonomic APIs: a 2.5x speedup via SIMD and support for datasets larger than RAM via tiling, behind simple .motifs(k) / .discords(k) interfaces.
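The "transparent" part of the SIMD claim can be pictured as a contiguity check that selects a fast path without changing the caller's API. The sketch below is only an illustration of that dispatch pattern: `StridedView`, `dot_fast` and `dot` are hypothetical names, and the fast path uses four independent accumulators that a compiler can auto-vectorize rather than the explicit AVX2/NEON kernels the project describes.

```rust
/// Hypothetical strided view over a buffer (stand-in for an array view type).
struct StridedView<'a> {
    data: &'a [f64],
    stride: usize,
    len: usize,
}

impl<'a> StridedView<'a> {
    /// Returns the elements as a plain slice only when they are contiguous.
    fn as_contiguous(&self) -> Option<&'a [f64]> {
        if self.stride == 1 {
            Some(&self.data[..self.len])
        } else {
            None
        }
    }

    fn get(&self, i: usize) -> f64 {
        self.data[i * self.stride]
    }
}

/// Fast path: four independent accumulators over contiguous slices.
fn dot_fast(a: &[f64], b: &[f64]) -> f64 {
    let head = a.len() / 4 * 4;
    let mut acc = [0.0f64; 4];
    for (ca, cb) in a[..head].chunks_exact(4).zip(b[..head].chunks_exact(4)) {
        for lane in 0..4 {
            acc[lane] += ca[lane] * cb[lane];
        }
    }
    let mut sum: f64 = acc.iter().sum();
    // Scalar tail for the last len % 4 elements.
    for i in head..a.len() {
        sum += a[i] * b[i];
    }
    sum
}

/// Same signature for every caller; the fast path is chosen automatically.
fn dot(a: &StridedView, b: &StridedView) -> f64 {
    assert_eq!(a.len, b.len);
    match (a.as_contiguous(), b.as_contiguous()) {
        (Some(x), Some(y)) => dot_fast(x, y),
        _ => (0..a.len).map(|i| a.get(i) * b.get(i)).sum(),
    }
}
```

Callers never choose a path themselves, which is what lets acceleration be added without API changes; non-contiguous input simply falls back to the scalar loop.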


Current Milestone: v1.1 Streaming

Goal: Enable real-time Matrix Profile computation for live time series data.

Target features:


Problem It Solves

Time series analysis requires identifying:

Existing solutions:

matrix-profile-rs provides production-quality implementations with:


Requirements


# Validated

Core Algorithms:

Ergonomics & API:

Performance:

Ecosystem:


# Active

v1.1 Streaming (current milestone):

Future Enhancements (v1.2+):


# Out of Scope


# Algorithm Stack


# Data Flow

Time Series Data (Array1<f64> or Polars Series)
    ↓
Matrix Profile Calculation (STOMP/SCAMP/SCRIMP++)
    ↓ (SIMD acceleration transparent on contiguous data)
MatrixProfile struct (distances, indices, metadata)
    ↓
Discovery APIs → .top_k_motifs() / .top_k_discords()
    ↓
Polars DataFrame (via to_dataframe()) or Rust types
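As a rough sketch of the middle of this flow, the following shows one way a Vec-backed MatrixProfile with a sentinel index could expose top-k discovery. Everything here is an assumption for illustration: the field names, the `usize::MAX` sentinel, the exclusion zone of half a window, and the return type (start positions) are not taken from the crate.

```rust
/// Assumed sentinel for "no neighbor found" (instead of Option<usize>).
const NO_NEIGHBOR: usize = usize::MAX;

struct MatrixProfile {
    distances: Vec<f64>, // distance to nearest neighbor, per window
    indices: Vec<usize>, // position of that neighbor, or NO_NEIGHBOR
    window: usize,       // metadata: subsequence length m
}

impl MatrixProfile {
    /// Greedy selection: walk positions in distance order and skip any
    /// candidate within the exclusion zone of an earlier pick.
    fn pick(&self, k: usize, largest: bool) -> Vec<usize> {
        let mut order: Vec<usize> = (0..self.distances.len())
            .filter(|&i| self.indices[i] != NO_NEIGHBOR && self.distances[i].is_finite())
            .collect();
        order.sort_by(|&a, &b| {
            let c = self.distances[a].total_cmp(&self.distances[b]);
            if largest { c.reverse() } else { c }
        });

        let excl = self.window / 2; // assumed exclusion zone
        let mut centers: Vec<usize> = Vec::new();
        let mut out = Vec::new();
        for i in order {
            if out.len() == k {
                break;
            }
            if centers.iter().any(|&c| c.abs_diff(i) <= excl) {
                continue;
            }
            out.push(i);
            centers.push(i);
            if !largest {
                // For motifs, also exclude the matched neighbor so the
                // same pair is not reported twice.
                centers.push(self.indices[i]);
            }
        }
        out
    }

    /// Start positions of the k best-matching windows (lowest distance).
    fn top_k_motifs(&self, k: usize) -> Vec<usize> {
        self.pick(k, false)
    }

    /// Start positions of the k most anomalous windows (highest distance).
    fn top_k_discords(&self, k: usize) -> Vec<usize> {
        self.pick(k, true)
    }
}
```

The sentinel keeps both vectors as flat primitive buffers, which matches the binding-friendly rationale given for the Vec-backed design.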

# Key Components


Context


# Codebase State

Shipped v1.0 (2026-02-22): 8,705 LOC Rust across 7 phases, 26 plans

Tech Stack:

Test Coverage:


# Known Issues & Tech Debt

Low-Priority (8 items documented in TECH-DEBT.md):

Resolved:


# User Feedback

None yet. v1.0 is the initial release. Expecting feedback on:


Key Decisions

| Decision | Rationale | Outcome | Status |
|---|---|---|---|
| Separate matrix-profile-rs and stump-rs crates | Port stump-rs for reference, build matrix-profile-rs fresh for API design freedom | Clean APIs, reference validation working | ✓ Good |
| Vec-backed MatrixProfile with sentinels | Binding-friendly, avoids Option overhead | Clean FFI surface, efficient | ✓ Good |
| SIMD transparent dispatch via contiguity check | Zero API changes, automatic acceleration | 2.5x speedup, cases accelerated | ✓ Good |
| Metadata columns (mp_*) for Polars DataFrame | Polars schema API unstable, columns self-describing | DataFrames fully self-describing | ✓ Good |
| Tiling with memory budget | Enables N>10^6 datasets, user-controlled memory | Validated at N=10^6 under 64MB | ✓ Good |
| Ignore scale tests for CI | N=10k test takes 98s, larger tests minutes | Green CI, manual validation for releases | ✓ Good |
| Feature-gate Polars integration | Keep default build dependency-light | Default build green, Polars optional | ✓ Good |
| SCRIMP++ budget-based anytime | User-controlled trade-off: speed vs accuracy | budget finds motifs of time | ✓ Good |
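The memory-budget tiling decision can be illustrated with a small planning helper: given a byte budget and an estimate of the working memory needed per subsequence, split the index range into tiles that each fit the budget. This is a sketch under stated assumptions; `plan_tiles` and the `bytes_per_row` cost model are hypothetical and not the crate's API.

```rust
use std::ops::Range;

/// Splits 0..n into consecutive tiles whose estimated working set
/// (rows * bytes_per_row) stays within budget_bytes.
/// If the budget is smaller than a single row, tiles of one row are used,
/// which exceeds the budget but keeps the computation possible.
fn plan_tiles(n: usize, budget_bytes: usize, bytes_per_row: usize) -> Vec<Range<usize>> {
    assert!(bytes_per_row > 0, "per-row cost must be positive");
    let rows = (budget_bytes / bytes_per_row).max(1);
    (0..n)
        .step_by(rows)
        .map(|start| start..start.saturating_add(rows).min(n))
        .collect()
}
```

Each tile can then be processed independently and its partial result merged into the global profile by keeping the element-wise minimum distance, so peak memory is governed by the budget rather than by N.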

Use Cases

Predictive Maintenance:

Healthcare:

Finance:

Operations:


Why Rust + Polars?

Performance: Native compiled code achieving 2.5x speedup via SIMD, no JIT warmup, efficient memory usage

Ergonomics: Polars integration makes Matrix Profiles a DataFrame operation (.mp().stomp(m))

Distribution: Single binary, no runtime dependencies, easy deployment

Correctness: Strong typing catches errors at compile time

Scalability: Tiling strategy handles datasets larger than RAM


Technical Details

Matrix Profile Basics:

Algorithm Complexity:

Performance Achieved (v1.0):

Last updated: 2026-02-22 after v1.0 milestone, v1.1 milestone started


Current Status

2026-03-09 — Completed 15-01-PLAN.md: MultiStreamingState core + Join distance kernels + Batch Join STOMP