Pattern Classifier

This classifier predicts which patterns a subject model was trained on, using the subject model's neuron activation signatures as input.

Dataset

Training Dataset: maximuspowers/muat-fourier-10
Input Mode: signature
Number of Patterns: 14

Patterns

The model predicts which of the following 14 patterns the subject model was trained on:

palindrome
sorted_ascending
sorted_descending
alternating
contains_abc
starts_with
ends_with
no_repeats
has_majority
increasing_pairs
decreasing_pairs
vowel_consonant
first_last_match
mountain_pattern
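
The loss and metrics reported below (BCE with logits, Hamming accuracy, exact match) suggest a multi-label setup in which each subject model can be trained on several patterns at once. Under that assumption, a target could be encoded as a 14-element multi-hot vector. The sketch below is illustrative only: the index order used by the real classifier head is assumed to follow the list above, and encode_patterns is a hypothetical helper, not part of the released code.

```python
# Pattern names in the order listed in this card. Whether the released
# classifier uses this exact index order is an assumption.
PATTERNS = [
    "palindrome", "sorted_ascending", "sorted_descending", "alternating",
    "contains_abc", "starts_with", "ends_with", "no_repeats", "has_majority",
    "increasing_pairs", "decreasing_pairs", "vowel_consonant",
    "first_last_match", "mountain_pattern",
]


def encode_patterns(active):
    """Return a multi-hot list with 1.0 at the index of each active pattern."""
    unknown = set(active) - set(PATTERNS)
    if unknown:
        raise ValueError(f"unknown patterns: {sorted(unknown)}")
    return [1.0 if name in active else 0.0 for name in PATTERNS]


# A subject model trained on two patterns gets two 1.0 entries.
target = encode_patterns({"palindrome", "no_repeats"})
```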

Model Architecture

Signature Encoder: [512, 256, 256, 128]
Activation: relu
Dropout: 0.2
Batch Normalization: True
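
As a rough sketch of what these settings could correspond to in PyTorch, the snippet below stacks four hidden layers of widths 512, 256, 256 and 128 and adds a 14-logit output layer. Several details are assumptions not stated in this card: the input dimension (64 is a placeholder), the ordering Linear, BatchNorm, ReLU, Dropout inside each block, and the final linear head. build_signature_encoder is a hypothetical name, and the actual checkpoint may be laid out differently.

```python
import torch
from torch import nn


def build_signature_encoder(input_dim, hidden_dims=(512, 256, 256, 128),
                            num_patterns=14, dropout=0.2):
    """MLP sketch: (Linear -> BatchNorm -> ReLU -> Dropout) per hidden layer."""
    layers = []
    prev = input_dim
    for width in hidden_dims:
        layers += [
            nn.Linear(prev, width),
            nn.BatchNorm1d(width),
            nn.ReLU(),
            nn.Dropout(dropout),
        ]
        prev = width
    # One raw logit per pattern; the sigmoid is applied by the loss or at inference.
    layers.append(nn.Linear(prev, num_patterns))
    return nn.Sequential(*layers)


model = build_signature_encoder(input_dim=64)  # 64 is a placeholder size
model.eval()  # disables dropout and uses running batch-norm statistics
with torch.no_grad():
    logits = model(torch.randn(4, 64))  # batch of 4 dummy signatures
```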

Training Configuration

Optimizer: adam
Learning Rate: 0.001
Batch Size: 16
Loss Function: BCE with Logits (with pos_weight for training, unweighted for validation)
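
To make the role of pos_weight concrete, here is a plain-Python sketch of the per-element formula used by binary cross-entropy with logits: the positive term is scaled by pos_weight and the negative term is left unchanged. The weight values used for this model are not given in this card, so the numbers below are placeholders. Passing pos_weight=None gives the unweighted loss that the card says is used for validation.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def bce_with_logits(logits, targets, pos_weight=None):
    """Mean of -[w * y * log s(x) + (1 - y) * log(1 - s(x))] over all elements."""
    total = 0.0
    for i, (x, y) in enumerate(zip(logits, targets)):
        w = 1.0 if pos_weight is None else pos_weight[i]
        # log(1 - sigmoid(x)) equals log(sigmoid(-x))
        total += -(w * y * log_sigmoid(x) + (1.0 - y) * log_sigmoid(-x))
    return total / len(logits)


logits = [0.0, 0.0]
targets = [1.0, 0.0]
unweighted = bce_with_logits(logits, targets)
weighted = bce_with_logits(logits, targets, pos_weight=[3.0, 3.0])  # placeholder weights
```

With a weight above 1, a missed positive costs more than a false alarm, which tends to push the classifier toward higher recall at the expense of precision.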

Test Set Performance

F1 Macro: 0.3612
F1 Micro: 0.3491
Hamming Accuracy: 0.7797
Exact Match Accuracy: 0.0400
BCE Loss: 0.4108
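
For readers unfamiliar with these multi-label metrics, the sketch below computes them from binary label matrices using their standard definitions. It is not the evaluation script used for this model, and the toy labels are made up for illustration.

```python
def f1_from_counts(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def multilabel_metrics(y_true, y_pred):
    n_samples, n_labels = len(y_true), len(y_true[0])
    tp, fp, fn = [0] * n_labels, [0] * n_labels, [0] * n_labels
    matches = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(true_row, pred_row)):
            matches += int(t == p)
            if t == 1 and p == 1:
                tp[j] += 1
            elif t == 0 and p == 1:
                fp[j] += 1
            elif t == 1 and p == 0:
                fn[j] += 1
    return {
        # Fraction of individual label decisions that are correct.
        "hamming_accuracy": matches / (n_samples * n_labels),
        # Fraction of samples where every label is correct.
        "exact_match": sum(t == p for t, p in zip(y_true, y_pred)) / n_samples,
        # Average of per-label F1 scores.
        "f1_macro": sum(f1_from_counts(tp[j], fp[j], fn[j])
                        for j in range(n_labels)) / n_labels,
        # F1 computed from counts pooled over all labels.
        "f1_micro": f1_from_counts(sum(tp), sum(fp), sum(fn)),
    }


y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1]]
metrics = multilabel_metrics(y_true, y_pred)
```

Exact match requires all 14 labels of a sample to be right at once, so it can sit far below Hamming accuracy, which scores each label decision separately.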

Per-Pattern Recall (Test Set)

For each pattern, the percentage of subject models trained on that pattern for which the classifier detects it:

Pattern	Recall (Detection Rate)
palindrome	89.7%
sorted_ascending	79.8%
sorted_descending	85.4%
alternating	82.4%
contains_abc	84.0%
starts_with	85.7%
ends_with	83.8%
no_repeats	83.1%
has_majority	68.2%
increasing_pairs	83.8%
decreasing_pairs	83.3%
vowel_consonant	59.3%
first_last_match	86.2%
mountain_pattern	82.3%
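
Recall for a pattern is TP / (TP + FN), so it counts only the subject models that really were trained on that pattern and ignores false positives. The sketch below shows the computation on made-up labels; per_pattern_recall is a hypothetical helper, not the author's evaluation code.

```python
def per_pattern_recall(y_true, y_pred, names):
    """Return {pattern: TP / (TP + FN)}, or None when a pattern has no positives."""
    recalls = {}
    for j, name in enumerate(names):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        recalls[name] = tp / (tp + fn) if (tp + fn) else None
    return recalls


names = ["palindrome", "alternating"]
y_true = [[1, 0], [1, 1], [0, 1], [1, 0]]
# This toy predictor always flags "alternating", including two false alarms.
y_pred = [[1, 1], [0, 1], [1, 1], [1, 1]]
recalls = per_pattern_recall(y_true, y_pred, names)
```

In the toy data, "alternating" reaches 100% recall despite two false alarms. High per-pattern recall can therefore coexist with a low F1 score when many predictions are false positives.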

Usage

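import torch
from huggingface_hub import hf_hub_download

# Download the checkpoint from the Hugging Face Hub (cached locally after the first call)
checkpoint_path = hf_hub_download(repo_id='maximuspowers/muat-fourier-10-classifier', filename='best_model.pt')

# Load onto the CPU so the checkpoint also opens on machines without a GPU
checkpoint = torch.load(checkpoint_path, map_location='cpu')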
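
This card does not document the checkpoint's contents or the model's forward pass, so the sketch below covers only the last step of inference: turning 14 output logits into pattern names. It assumes the logit order matches the pattern list in this card and uses a 0.5 probability threshold. Both are assumptions, and predict_patterns is a hypothetical helper.

```python
import math

# Assumed to match the output order of the classifier head.
PATTERNS = [
    "palindrome", "sorted_ascending", "sorted_descending", "alternating",
    "contains_abc", "starts_with", "ends_with", "no_repeats", "has_majority",
    "increasing_pairs", "decreasing_pairs", "vowel_consonant",
    "first_last_match", "mountain_pattern",
]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def predict_patterns(logits, threshold=0.5):
    """Return the names of patterns whose predicted probability meets the threshold."""
    if len(logits) != len(PATTERNS):
        raise ValueError(f"expected {len(PATTERNS)} logits, got {len(logits)}")
    return [name for name, x in zip(PATTERNS, logits) if sigmoid(x) >= threshold]


# Dummy logits standing in for real model output.
example_logits = [2.0] + [-3.0] * 12 + [0.5]
predicted = predict_patterns(example_logits)
```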
