40+ detection methods. Every one cited.
Search, filter and inspect every detector in the IPLYR forensic stack. Each method links to its paper and to a /methods slug page with implementation, test count, and adversarial-robustness notes.
Calibrated membership-inference attack (MIA) against pretraining data. Uses per-token log-likelihood vs. neighborhood entropy to achieve state-of-the-art AUROC across the LLaMA, Pythia, and GPT-NeoX families. Decouples token informativeness from the membership signal, producing a clean p-value under a permutation null.
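As a rough illustration of how a p-value under a permutation null can be obtained, the sketch below compares membership scores for a candidate document set against scores for known non-members. The function name, the mean-difference statistic, and the assumption that scores are already computed per document are all illustrative; the card does not state which statistic the detector uses.

```python
import random

def permutation_p_value(candidate_scores, reference_scores, n_perm=10_000, seed=0):
    """One-sided permutation p-value for mean(candidate) > mean(reference).

    Hypothetical helper: inputs are precomputed membership scores, higher
    meaning "more member-like".
    """
    rng = random.Random(seed)
    n = len(candidate_scores)
    m = len(reference_scores)
    observed = sum(candidate_scores) / n - sum(reference_scores) / m
    pooled = list(candidate_scores) + list(reference_scores)
    hits = 0
    for _ in range(n_perm):
        # Under the null, group labels are exchangeable, so reshuffle them.
        rng.shuffle(pooled)
        diff = sum(pooled[:n]) / n - sum(pooled[n:]) / m
        if diff >= observed:
            hits += 1
    # The +1 correction keeps the estimate strictly positive and valid.
    return (hits + 1) / (n_perm + 1)
```

The smallest reportable p-value is 1 / (n_perm + 1), so the permutation budget bounds how strong a claim the test can support.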
Detection of copyrighted content via paraphrase-distinguishing probes. Forces the model to choose between original passages and high-quality paraphrases; preference for the original is statistically attributable to memorization.
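One way to read "statistically attributable" is as an exact one-sided binomial test on how often the model picks the original passage. The sketch below assumes a multiple-choice probe with `n_options` answers per trial (one original plus paraphrases) and a uniform chance rate; both the function name and the four-option default are assumptions, not details taken from the method.

```python
import math

def original_preference_p_value(n_original_chosen, n_trials, n_options=4):
    """P(X >= n_original_chosen) for X ~ Binomial(n_trials, 1 / n_options).

    Hypothetical helper: a small value means the model prefers the
    original passage more often than guessing would explain.
    """
    chance = 1.0 / n_options
    return sum(
        math.comb(n_trials, k) * chance**k * (1 - chance) ** (n_trials - k)
        for k in range(n_original_chosen, n_trials + 1)
    )
```

In practice the answer order would need to be shuffled per trial so that position bias is not mistaken for memorization.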
Non-verbatim recall measurement. Replaces exact-match with semantic-equivalence scoring under a calibrated neighborhood, recovering memorization signal that survives stylistic rewriting and translation.
Probabilistic discoverable extraction with confidence bounds. Estimates the probability that a target string is recoverable from a model under a budgeted prompt distribution, with formal lower-bound guarantees.
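A minimal sketch of a lower confidence bound on extraction probability, assuming each prompt drawn from the budgeted distribution is an independent trial that either recovers the target string or does not. It uses a one-sided Clopper-Pearson bound solved by bisection; the card does not say which bound the method relies on, and the function names here are hypothetical.

```python
import math

def binom_tail_ge(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def extraction_lower_bound(successes, trials, alpha=0.05):
    """One-sided (1 - alpha) Clopper-Pearson lower bound on the success rate.

    Suitable for modest trial counts; very large `trials` would need a
    log-space tail computation instead.
    """
    if successes == 0:
        return 0.0
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        # The tail probability grows with p, so move lo up while it is below alpha.
        if binom_tail_ge(successes, trials, mid) < alpha:
            lo = mid
        else:
            hi = mid
    return lo
```

For example, `extraction_lower_bound(3, 100)` gives a rate that, with 95% confidence, the true extraction probability exceeds under the stated independence assumption.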
Training-data attribution at scale via random-projection of gradients. Returns a per-training-example influence score on a target query.
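The core idea can be sketched in a few lines of NumPy, assuming per-example gradients have already been flattened into rows of a matrix. This is a simplified illustration, not the full method: it omits the output-function weighting and the ensembling over checkpoints, and the function name and ridge term are assumptions.

```python
import numpy as np

def projected_influence(train_grads, query_grad, proj_dim=64, seed=0, ridge=1e-6):
    """Score each training example against one query using projected gradients.

    train_grads: (n_examples, n_params) array of per-example gradients.
    query_grad:  (n_params,) gradient of the target query.
    Returns an (n_examples,) array; larger means more influential.
    """
    rng = np.random.default_rng(seed)
    n_params = train_grads.shape[1]
    # Gaussian random projection approximately preserves inner products.
    proj = rng.standard_normal((n_params, proj_dim)) / np.sqrt(proj_dim)
    phi_train = train_grads @ proj      # (n_examples, proj_dim)
    phi_query = query_grad @ proj       # (proj_dim,)
    # Precondition by the projected-gradient covariance (small ridge for stability).
    kernel = phi_train.T @ phi_train + ridge * np.eye(proj_dim)
    return phi_train @ np.linalg.solve(kernel, phi_query)
```

Projecting first keeps the linear solve at `proj_dim` by `proj_dim`, which is what makes the approach tractable when the parameter count is large.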
Scalable TRAK variant for billion-parameter diffusion models. Uses distilled gradient surrogates and locality-sensitive hashing for tractable attribution across catalog-scale training indices.
Low-rank adapter probes that fingerprint memorization-prone parameter subspaces. Surfaces circuits where training-data residue concentrates.
Speech and music MIA combining acoustic loss-curvature, mel-spectrogram divergence, and waveform-level NN-distance.
Contrastive style descriptors capturing artist-specific visual fingerprints invariant to subject matter. Validated for Getty / artist-style cases.
Melodic similarity across pitch-class profiles and rhythmic envelopes for music memorization claims.
Spatiotemporal action-level latent localization for short-form video memorization detection.
Forensic detection of model merges and lineage via centered kernel alignment between candidate parent models and a target.
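For reference, linear centered kernel alignment between two activation matrices can be sketched as follows. It assumes both models were run on the same probe inputs so that rows align; the function name and the choice of the linear (rather than kernel) variant are assumptions for illustration.

```python
import numpy as np

def linear_cka(acts_a, acts_b):
    """Linear CKA between two (n_inputs, n_features) activation matrices.

    Rows must correspond to the same inputs; feature widths may differ.
    Returns a value in [0, 1], where 1 means identical representations up to
    rotation and isotropic scaling.
    """
    a = acts_a - acts_a.mean(axis=0, keepdims=True)
    b = acts_b - acts_b.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(b.T @ a, "fro") ** 2
    norm_a = np.linalg.norm(a.T @ a, "fro")
    norm_b = np.linalg.norm(b.T @ b, "fro")
    return cross / (norm_a * norm_b)
```

A lineage check would typically compute this layer by layer between each candidate parent and the target, then compare the resulting similarity profiles against unrelated baseline models.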
Fisher, Stouffer, Simes, harmonic-mean, and Cauchy p-value combiners over independent MIA detectors. Patentable ensemble, top novelty rating.
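Four of the named combiners have short closed forms, sketched below with the standard library only. The function names are hypothetical, and the harmonic-mean combiner is omitted because its calibration needs more than a few lines. Fisher and Stouffer assume the input p-values are independent; Simes and Cauchy are commonly used when some dependence between detectors is expected.

```python
import math
from statistics import NormalDist

def _clip(p):
    # Keep p strictly inside (0, 1) so logs and quantiles stay finite.
    return min(max(p, 1e-300), 1 - 1e-16)

def fisher_combine(pvals):
    """Fisher: -2 * sum(log p) ~ chi-square with 2k degrees of freedom."""
    half = -sum(math.log(_clip(p)) for p in pvals)
    term, total = 1.0, 1.0
    for i in range(1, len(pvals)):
        term *= half / i
        total += term
    # Closed-form chi-square survival function for even degrees of freedom.
    return math.exp(-half) * total

def stouffer_combine(pvals):
    """Stouffer: average z-scores, rescaled by sqrt(k)."""
    nd = NormalDist()
    z = sum(-nd.inv_cdf(_clip(p)) for p in pvals) / math.sqrt(len(pvals))
    return nd.cdf(-z)

def simes_combine(pvals):
    """Simes: min over sorted p-values of p_(i) * k / i."""
    k = len(pvals)
    return min(p * k / i for i, p in enumerate(sorted(pvals), start=1))

def cauchy_combine(pvals):
    """Cauchy combination: average tan((0.5 - p) * pi), map back to a p-value."""
    t = sum(math.tan((0.5 - _clip(p)) * math.pi) for p in pvals) / len(pvals)
    return 0.5 - math.atan(t) / math.pi
```

With a single input, every combiner returns that p-value unchanged, which is a quick sanity check when wiring detectors together.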
Collective dataset inference at the concept level for image distributions.
Two-stage detector that disentangles RAG-retrieval recall from parametric memorization.
Two-sample tests (MMD, Kolmogorov–Smirnov, energy distance) for concept-level recall in generative models.
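Of the three tests named, the Kolmogorov-Smirnov test is the simplest to sketch. The version below assumes each generated sample has been reduced to a scalar score (for example, similarity to a concept prototype), compares two sets of such scores, and uses the asymptotic Kolmogorov series for the p-value. The function names and the scalar-score setup are assumptions for illustration.

```python
import bisect
import math

def ks_statistic(x, y):
    """Largest gap between the empirical CDFs of two scalar samples."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for v in xs + ys:
        fx = bisect.bisect_right(xs, v) / len(xs)
        fy = bisect.bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d

def ks_asymptotic_p(d, n, m, terms=100):
    """Asymptotic two-sided p-value; an approximation for small samples."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        # The series converges slowly here and the p-value is effectively 1.
        return 1.0
    s = 2 * sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, s))
```

MMD and energy distance follow the same two-sample pattern but operate directly on embedding vectors, usually with a permutation null instead of an asymptotic formula.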
Diffusion-model attribution via inverted embedding probes. Recovers a pseudo-token that best activates the target style; the statistical significance of its activation vs. distractors yields the attribution score. Patentable invention, top novelty rating.
Self-supervised copy detection embeddings combined with DINOv2 features for image-level provenance and near-duplicate detection.
Six-axis verifier for unlearning and deletion claims: verbatim, knowledge, privacy, bias, utility, and scalability.
Tight empirical lower bound on the differential-privacy epsilon of a deployed model via a membership-inference adversary.
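The bound rests on a standard consequence of (ε, δ)-differential privacy: any membership adversary must satisfy TPR ≤ e^ε · FPR + δ, and symmetrically 1 − FPR ≤ e^ε · FNR + δ. The sketch below inverts both inequalities using point estimates of the adversary's rates. The function name is hypothetical, and a statistically valid audit would substitute confidence limits for the raw rates.

```python
import math

def _log_ratio(numerator, denominator):
    if numerator <= 0:
        return 0.0          # no evidence against any epsilon
    if denominator == 0:
        return math.inf     # point estimate is unbounded
    return math.log(numerator / denominator)

def empirical_epsilon_lower_bound(tpr, fpr, delta=0.0):
    """Largest epsilon ruled out by an attack with the given TPR and FPR."""
    fnr = 1.0 - tpr
    return max(
        0.0,
        _log_ratio(tpr - delta, fpr),
        _log_ratio(1.0 - delta - fpr, fnr),
    )
```

An attack with 90% TPR at 10% FPR, for instance, implies ε is at least ln 9 when δ = 0; an attack no better than chance implies nothing.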
- 8th Circuit case-law coverage is currently 0 cases; flagged as a gap for the next dataset expansion.
- Audio and video modules (MelodySim, STALL) are Tier 2–3; full Daubert-tier hardening is in progress.
- Reports are evidentiary inputs, not legal conclusions. Daubert-tier classification reflects the current SaTML 2025 framework; ultimate admissibility is the court's determination.