Peter Szemraj PRO

pszemraj

https://pszemraj.carrd.co/

AI & ML interests

metallic intuition

Recent Activity

published a model 7 days ago

pszemraj/0xProto-368K-GGUF

updated a model 7 days ago

pszemraj/0xProto-368K-GGUF

upvoted a paper 5 days ago

Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention

Paper • 2605.22791 • Published 9 days ago • 30

upvoted an article 8 days ago

Introducing the Ettin Reranker Family

Article • tomaarsen • 11 days ago • 48

upvoted a paper 9 days ago

Hierarchical Reasoning Model

Paper • 2506.21734 • Published Jun 26, 2025 • 54

upvoted 3 papers 16 days ago

upvoted an article 17 days ago

Adding Benchmaxxer Repellant to the Open ASR Leaderboard

Article • bezzam, Steveeeeeeen, eustlb, SBruccoleriAppen, jmss-appen, c-e-ford-appen, wgb14, YukaiHuang, like2026, logicbean, ally-lxl • 24 days ago • 17

upvoted a paper 17 days ago

Investigating Efficiently Extending Transformers for Long Input Summarization

Paper • 2208.04347 • Published Aug 8, 2022 • 1

upvoted an article 18 days ago

EMO: Pretraining mixture of experts for emergent modularity

Article • allenai • 21 days ago • 38

upvoted an article 25 days ago

Multimodal Embedding & Reranker Models with Sentence Transformers

Article • tomaarsen • Apr 9 • 60

upvoted a collection 26 days ago

OlmPool

Collection

Collection of models from the paper "Cracks in the Foundation: Seemingly Minor Architectural Choices Impact Long Context Extension". • 26 items • Updated 29 days ago • 5

upvoted a paper 27 days ago

A Survey on LLM-based Conversational User Simulation

Paper • 2604.24977 • Published Apr 27 • 8

upvoted a paper 28 days ago

Efficient Training on Multiple Consumer GPUs with RoundPipe

Paper • 2604.27085 • Published about 1 month ago • 40

upvoted a paper 30 days ago

Why Fine-Tuning Encourages Hallucinations and How to Fix It

Paper • 2604.15574 • Published Apr 16 • 25

upvoted a collection 30 days ago

Olmo 3.1

Collection

The latest members of the Olmo 3 family: another 3 weeks of RL for 32B Think, the 32B Instruct model, large post-training research datasets... • 9 items • Updated Dec 23, 2025 • 52

upvoted an article 30 days ago

Granite 4.1 LLMs: How They’re Built

Article • ibm-granite • about 1 month ago • 76

upvoted a paper 30 days ago

Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora

Paper • 2604.24819 • Published Apr 27 • 89

upvoted 2 collections about 1 month ago

Laguna XS.2

Collection

Designed for agentic coding and long-horizon work on a local machine. Apache 2.0. • 5 items • Updated 22 days ago • 23

Parakeet ASR

Collection

NeMo Parakeet ASR Models attain strong speech recognition accuracy while being efficient for inference. Available in CTC and RNN-Transducer variants. • 16 items • Updated 10 days ago • 75

upvoted a paper about 1 month ago

Multi-User Large Language Model Agents

Paper • 2604.08567 • Published Mar 19 • 27
