Projects
Project Dashboard
2025
Models on Hugging Face
- LoRA Model (Phi‑4‑based English Chatbot) – Lightweight chatbot fine‑tuned with LoRA and Unsloth on FineTome‑100k and 4‑bit quantized.
- LLaMA 3.2‑3B Chat Model – Instruction‑tuned 3B conversational model with a 2048‑token context, trained with LoRA.
- Vision‑Language Medical Captioning Model – Fine‑tuned on radiology images via QLoRA and LoRA adapters for image‑to‑text tasks.
- Kashur GPT (Kashmiri Model) – Transformer-based decoder trained for Kashmiri text generation, with a demo available as a Space.
- English SpeechT5 TTS – SpeechT5 fine‑tuned on LJSpeech plus custom data, optimized for technical English pronunciation.
- Turkish SpeechT5 TTS – Turkish TTS fine‑tuned on a native dataset and evaluated with MOS, implemented as part of an internship project.
- Kashmiri LLM BERT‑Base – Experimental Kashmiri BERT‑base model trained from scratch (~45K corpus), mainly for research.
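Several of the models above are described as LoRA-trained. As a minimal sketch of what that means, the snippet below applies the standard LoRA formulation, W' = W + (alpha / r) · B · A, to a tiny weight matrix in plain Python. It is not the training code behind these models; the helper names (matmul, lora_effective_weight, trainable_params) and all matrix sizes are assumptions chosen purely for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank.

    w: frozen base weight, shape (out_features, in_features)
    a: trainable adapter matrix A, shape (r, in_features)
    b: trainable adapter matrix B, shape (out_features, r)
    """
    r = len(a)                # rank = number of rows in A
    scale = alpha / r
    delta = matmul(b, a)      # (out, r) @ (r, in) -> (out, in)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(out_features, in_features, r):
    """Return (LoRA parameter count, full fine-tuning parameter count) for one matrix."""
    return r * (in_features + out_features), out_features * in_features


if __name__ == "__main__":
    w = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical 2x2 base weight
    a = [[1.0, 2.0]]               # rank-1 adapter, shape (1, 2)
    b = [[1.0], [0.0]]             # shape (2, 1)
    print(lora_effective_weight(w, a, b, alpha=2.0))
    print(trainable_params(3072, 3072, 16))
```

The point of the low-rank factorization is the second helper: only A and B are trained, so the number of trainable parameters grows with r rather than with the full matrix size.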
Datasets on Hugging Face
- vision_language_pairs_data – Multimodal image‑text dataset for vision-language modeling
- environmental_sound_data_hnm – Environmental sound recordings dataset for audio classification
- human_genome_variants_hnm – Annotated genomic variants for bioinformatics research
- BITCOIN_TRANSACTION_DATASET_hnm – Blockchain transaction records for financial analysis
- kashmiri_multilingual_dictionary_dataset – Parallel dictionary dataset for translation and NLP tasks
- fairness_benchmark_data_HNM – Bias/fairness benchmarking dataset for ethical ML evaluation
- 17k‑high‑quality‑stack‑overflow‑QA‑pairs – Cleaned QA dataset for programming Q&A use cases
- Kashmiri_Text_Corpus_Cleaned_2025_HNM – Clean Kashmiri corpus curated in 2025 for NLP research
- kashmiri_sharegpt – ShareGPT-format conversational dataset for Kashmiri chatbot fine-tuning
- 200_Kashmiri_food_dishes – Cultural dataset of Kashmiri dish names/details for NLP tasks
- Kashmiri Text Corpus Dataset – Core Kashmiri corpus for academic NLP research
- Urdu Text Corpus (Sentence wise) Dataset – Sentence-level Urdu corpus for NLP study
- Persian Text Corpus Dataset – Persian language corpus for academic NLP tasks
- 40K Kashmiri text recognition dataset – OCR dataset with 40K image-text pairs for text recognition
- 30K Kashmiri text recognition dataset – OCR dataset (~30K samples) for Kashmiri script detection
- Urdu News Text Corpus Dataset – Urdu news articles corpus for research and NLP
- English TTS Dataset – Spoken English dataset curated for TTS training
Spaces by HNM (Omarrran)
- PyTorch‑8‑Bit‑Weight‑Quantizer – Web app that quantizes neural network weights to 8-bit precision.
- Video_Rackground_Removal – In-browser, real-time background removal using Transformers.js and MODNet.
- DSPy Summarization HNM – Space that accepts an uploaded PDF, splits it into chunks, and summarizes the document.
- II Eleven Labs Speech – Speech synthesis demo that calls the Eleven Labs API with a user-supplied API key.
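As a rough illustration of the idea behind the PyTorch‑8‑Bit‑Weight‑Quantizer Space listed above, here is a minimal sketch of symmetric 8-bit weight quantization in plain Python. It is not the Space's code: the function names quantize_int8 and dequantize_int8 and the per-tensor absolute-maximum scaling scheme are assumptions made for illustration only.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127].

    Returns the quantized integers and the scale needed to map them back.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Avoid division by zero when every weight is 0 (or the list is empty).
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q, scale):
    """Map quantized integers back to approximate float weights."""
    return [v * scale for v in q]


if __name__ == "__main__":
    weights = [0.31, -0.72, 0.05, 0.9]   # hypothetical layer weights
    q, scale = quantize_int8(weights)
    restored = dequantize_int8(q, scale)
    # Each restored weight differs from the original by at most scale / 2.
    for w, r in zip(weights, restored):
        print(f"{w:+.4f} -> {r:+.4f}")
```

Storing one signed byte per weight plus a single float scale is what gives the roughly 4x size reduction relative to 32-bit floats, at the cost of the rounding error shown by the round trip.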
Note: For the most up‑to‑date list of models, datasets, and Spaces, please visit the Omarrran profile on Hugging Face.
2024 – Best Projects (TTS, Fine-Tuning, BERT, Deep Learning)
- Fine-Tuning Microsoft SpeechT5 TTS for Turkish:
This project focuses on customizing Microsoft’s SpeechT5 TTS model for Turkish language synthesis, showcasing advanced multilingual speech applications.
View Project: Turkish TTS Fine-Tuning
- Training BERT for Sentiment Analysis:
Builds on the BERT transformer architecture to classify sentiment (positive, negative, neutral) with high accuracy.
View Project: BERT IMDB Reviews
- Deep Learning-based Inpainting & Object Removal:
Uses CNNs to remove objects from images or fill in missing regions seamlessly.
View Project: Inpainting Using Flux
2023
- Music Genre Classifier via CNN:
Classifies audio clips into different genres using convolutional neural networks and MFCC feature extraction.
View Project: Music Genre Classifier
- Human Activity Recognition:
Detects user activities (walking, standing, etc.) using smartphone sensor data and machine learning models.
View Project: Human Activity Recognition
2022
- Credit Card Fraud Detection:
Utilizes machine learning techniques to identify fraudulent transactions within real-world financial datasets.
View Project: Credit Card Fraud Detection
- Detecting Fake News:
Builds a text classification model to distinguish real news articles from fake ones.
View Project: Fake News Detection
- Personal Movie Recommendation with LGBM Ranker:
Provides tailored movie suggestions based on user preferences using LightGBM for ranking.
View Project: Movie Recommendator
2021
- Spam Email Detection with Deep Learning:
Utilizes neural networks and NLP preprocessing to classify emails as spam or ham.
View Project: Spam Detection
- Face Recognition using LBPH:
Implements OpenCV’s LBPH for recognizing faces in real-time camera feeds.
View Project: Face Recognition
- Handwritten Digit Classifier (MNIST):
Employs a simple neural network for digit classification on the MNIST dataset.
View Project: Digit Classifier
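The face recognition project above relies on LBPH (Local Binary Patterns Histograms). The sketch below shows the basic local binary pattern operator and the histogram built from it, in plain Python on a grayscale image stored as a list of rows. It is an illustration of the general technique, not the project's code or OpenCV's implementation; the function names, the clockwise neighbor ordering, and the ">= center" convention are assumptions.

```python
def lbp_code(img, r, c):
    """8-bit local binary pattern code for the interior pixel at (r, c)."""
    center = img[r][c]
    # Neighbors visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # Set the bit when the neighbor is at least as bright as the center.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


if __name__ == "__main__":
    patch = [
        [9, 9, 9],
        [0, 5, 0],
        [0, 0, 0],
    ]  # hypothetical 3x3 grayscale patch
    print(lbp_code(patch, 1, 1))
```

A recognizer in this style typically compares such histograms (often computed per image region and concatenated) between a probe face and the stored faces, which is why the method is fast enough for live camera feeds.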
Some Fun and Learning Projects
- AI LiveKit Voice Assistant
A voice assistant using LiveKit for real-time interaction with Deepgram (speech-to-text), OpenAI (LLM + TTS), and Silero (VAD).
View Project: AI LiveKit Voice Assistant
- Object Eraser
Automatically remove objects from images by naming them or drawing bounding boxes, seamlessly handling shadows/reflections.
View Project: Object Eraser
- Math Solver with Gradio
Web app that uses OCR (Qwen2-VL) & NLP (Qwen2-Math) to solve math queries from images or text, all via a Gradio interface.
View Project: Math Solver with Gradio
- Python Discord Bot
A simple Discord bot built with discord.py to demonstrate basic command handling and user interaction.
View Project: Python Discord Bot
- Brain Tumor Detection
Applies image processing and ML models to MRI scans for brain tumor segmentation and classification.
View Project: Brain Tumor Detection
- OCR with Multiple Languages
Extracts text (in various languages) from images and offers text search within extracted content.
View Project: Multi-Language OCR
- Healthcare Chatbot
Machine learning–based symptom diagnosis system using decision tree and SVM, offering precaution recommendations.
View Project: Healthcare Chatbot
- Speech Emotion Recognition
Analyzes audio features to classify emotional states (e.g., happy, sad, angry) using deep learning.
View Project: Speech Emotion Recognition
- Stock Price Prediction
Predicts next-day stock behavior (up/down) from historical market data using various ML models.
View Project: Stock Price Predictor
- Hate Speech Detection
Classifies tweets into hate speech, offensive language, or neither using text preprocessing and Keras-based models.
View Project: Hate Speech Detection