# Attention-Seeker-V1
Attention-Seeker-V1 is a from-scratch Transformer implementation designed as a high-fidelity recreation of the "Attention Is All You Need" architecture. The model serves as the backend for an interactive React-based visualization tool that aims to peel back the "black box" of Neural Machine Translation (NMT).
## Model Details

### Model Description
- Developed by: Kinjal Chakraborty
- Model type: Standard Encoder-Decoder Transformer
- Language(s): English to French
- License: MIT
- Task: Sequence-to-Sequence Translation
### Technical Specifications
- Layers: 6 Encoder layers, 6 Decoder layers
- Attention Heads: 8
- Embedding Dimension ($d_{model}$): 512
- Max Sequence Length: 5000 tokens
- Dropout: 0.1
- Vocabulary Size: 30,000 tokens
- Tokenizer: Word-level tokenizer with whitespace preprocessing. (Note: Experimental BPE and Regex tokenizers were also developed as part of this project's research phase).
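The word-level tokenization described above can be sketched as follows. This is an illustrative assumption, not the project's actual implementation: the class name, special-token names, and vocabulary-building details are invented for the example.

```python
from collections import Counter

class WordTokenizer:
    """Minimal word-level tokenizer with whitespace preprocessing (illustrative sketch)."""

    def __init__(self, texts, vocab_size=30_000):
        # Reserve ids for special tokens (names assumed, not taken from the repo).
        self.specials = ["[PAD]", "[UNK]", "[SOS]", "[EOS]"]
        counts = Counter(word for t in texts for word in t.lower().split())
        most_common = [w for w, _ in counts.most_common(vocab_size - len(self.specials))]
        self.vocab = {tok: i for i, tok in enumerate(self.specials + most_common)}
        self.inv_vocab = {i: tok for tok, i in self.vocab.items()}

    def encode(self, text):
        # Unknown words fall back to the [UNK] id.
        unk = self.vocab["[UNK]"]
        ids = [self.vocab.get(w, unk) for w in text.lower().split()]
        return [self.vocab["[SOS]"]] + ids + [self.vocab["[EOS]"]]

    def decode(self, ids):
        return " ".join(self.inv_vocab[i] for i in ids)
```

A real 30,000-token vocabulary would be fit on the full training corpus; the BPE and Regex tokenizers mentioned above would replace the whitespace split with subword merges or pattern-based splitting.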
## Uses

### Intended Use
This model is primarily an educational tool. It is optimized for use with the Attention-Seeker Frontend to visualize:
- Multi-Head Attention weights
- Encoder-Decoder cross-attention
- Positional Encodings and Layer Normalization effects
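The attention heatmaps the frontend displays are the row-wise softmax of $QK^T / \sqrt{d_k}$, computed per head. A dependency-free sketch of one head's weights (no learned projections, illustrative only):

```python
import math

def attention_weights(Q, K):
    """Scaled dot-product attention weights: softmax(Q·K^T / sqrt(d_k)) per query row.

    Q and K are lists of equal-length float vectors. The returned matrix is what
    an attention heatmap visualizes, before the weights are applied to V.
    """
    d_k = len(K[0])
    weights = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        m = max(scores)                       # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights.append([e / z for e in exps])
    return weights
```

Each row sums to 1, so a bright cell in the visualization means that query token concentrates its attention on the corresponding key token.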
### Out-of-Scope Use
Due to compute-constrained training (1 epoch), this model is a proof-of-concept. It should not be used for production-grade translation or sensitive localization tasks.
## Training Details

### Training Data
- Dataset: `Helsinki-NLP/opus_books` (English-French subset)
- Size: ~127,000 sentence pairs
### Training Procedure
- Hardware: NVIDIA GeForce RTX 4060 Laptop GPU
- Training Time: ~2.5 Hours
- Optimizer: Adam
- Learning Rate: 1e-4 (Fixed)
- Batch Size: 4
- Epochs: 1 (Proof of Concept)
- Loss Function: Cross-Entropy Loss
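The per-token cross-entropy above reduces to the negative log-softmax probability of the reference token. A small dependency-free sketch of that computation (in the actual project this would presumably be `torch.nn.CrossEntropyLoss`, with padding positions masked out — an assumption about the setup):

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy for one position: -log softmax(logits)[target]."""
    m = max(logits)                                 # stabilize the exponentials
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

def sequence_loss(logit_rows, targets):
    """Mean cross-entropy over the target tokens of one sequence."""
    return sum(cross_entropy(l, t) for l, t in zip(logit_rows, targets)) / len(targets)
```

In training, `logit_rows` would be the decoder's output over the 30,000-token vocabulary at each position, and `targets` the reference translation shifted by one token.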
## Evaluation

### Sample Translation
| Input (English) | Output (French) |
|---|---|
| hello how are you? | comment vous êtes - vous ? [EOS] |
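The trailing `[EOS]` in the output comes from the autoregressive decoding loop, which feeds generated tokens back into the decoder until the end-of-sequence token is produced. A minimal greedy-decoding sketch; the `step_logits` callback is a stand-in assumption for a forward pass through the real Transformer decoder:

```python
def greedy_decode(step_logits, sos_id, eos_id, max_len=50):
    """Greedy decoding: pick the argmax token each step, stop at [EOS].

    step_logits(prefix) -> list of next-token logits over the vocabulary;
    in the real inference engine this would run the encoder-decoder model.
    """
    tokens = [sos_id]
    for _ in range(max_len):
        logits = step_logits(tokens)
        next_id = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_id)
        if next_id == eos_id:
            break                 # emit [EOS] and terminate, as in the sample above
    return tokens
```

Beam search would replace the single argmax with several candidate prefixes, but greedy decoding is enough to reproduce outputs like the one in the table.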
## How to Get Started

To load this model into the Attention-Seeker Python inference engine and interact with the front-end, see the repository on GitHub:

https://github.com/nullPointer0x43/Attention-Seeker

Alternatively, pull and run the Docker images found at: