Research Assistant
DFKI Germany & RPTU Kaiserslautern-Landau
B.S. Computer Science, LUMS (2025)
I am a Computer Science graduate from LUMS specializing in model compression, NLP for low-resource languages, and LLM bias & jailbreak evaluation. I currently work as a Research Assistant at DFKI Germany, in collaboration with RPTU Kaiserslautern-Landau, in the domain of bioinformatics, exploring the use of LLMs and ML models for molecular property prediction.
My research interests include model compression, AI fairness and interpretability, reasoning, LLM jailbreaking, and AI solutions for speech processing and analytics.
For my final-year thesis, I created PakBBQ, a bias-benchmarking QA dataset with 17,180 Urdu/English examples across 8 sociocultural categories. I evaluated 6 multilingual LLMs, revealing cross-linguistic bias gaps and effective bias mitigation strategies. This work culminated in a paper accepted at the EMNLP 2025 main conference (Core: A*).
November 2025
Paper accepted at EMNLP 2025 Main Conference (Core: A*)
June 2025
Started as Research Assistant at DFKI Germany
May 2025
Graduated from LUMS with B.S. in Computer Science
January 2025
Appointed as Teaching Assistant for CS5302 Foundations of Generative AI
January 2025
Appointed as Teaching Assistant for AI600 Machine Learning (MS in AI)
September 2024
Appointed as Teaching Assistant for CS535 Machine Learning
September 2023
Joined AI in Healthcare Initiative (AIHI) as a Research Assistant
September 2023
Joined Center for Speech and Language Technologies (CSaLT) as a Research Assistant
September 2023
Appointed as Teaching Assistant for CS100 Computational Problem Solving
EMNLP 2025 (Main Conference, Core: A*)
With the widespread adoption of Large Language Models (LLMs), ensuring fairness across all user communities is crucial. Most LLMs are trained on Western-centric data, neglecting low-resource languages and regional contexts. To address this, we introduce PakBBQ — an extension of the Bias Benchmark for Question Answering (BBQ) dataset — featuring 214 templates and 17,180 QA pairs in English and Urdu across 8 bias dimensions: age, disability, appearance, gender, socio-economic status, religion, regional affiliation, and language formality.
Our experiments show: (i) a 12% accuracy gain with disambiguation, (ii) stronger counter-bias behaviors in Urdu, and (iii) framing effects, with negatively framed questions eliciting fewer stereotyped responses.
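The evaluation described above can be sketched in code. This is a minimal illustration, not the actual PakBBQ harness: it assumes a BBQ-style record (context, question, answer options, gold label, and an ambiguous/disambiguated condition tag) and computes accuracy per condition, which is how a gain from disambiguation would surface. All names here (`QAExample`, `accuracy_by_condition`, the `predict` callable) are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class QAExample:
    # Hypothetical BBQ-style record; field names are illustrative only.
    context: str       # ambiguous or disambiguated context passage
    question: str
    options: list      # e.g. [target group, non-target group, "Unknown"]
    label: int         # index of the correct option
    condition: str     # "ambiguous" or "disambiguated"

def accuracy_by_condition(examples, predict):
    """Return {condition: accuracy}; predict(example) -> chosen option index."""
    totals, correct = {}, {}
    for ex in examples:
        totals[ex.condition] = totals.get(ex.condition, 0) + 1
        if predict(ex) == ex.label:
            correct[ex.condition] = correct.get(ex.condition, 0) + 1
    return {cond: correct.get(cond, 0) / n for cond, n in totals.items()}
```

In practice `predict` would wrap an LLM call that maps a prompt to one of the answer options; comparing the two accuracy figures (and how often "Unknown" is chosen under ambiguity) is the standard BBQ-style bias readout.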
Culturally adapted QA benchmark for multilingual LLM bias evaluation. 17,180 examples in Urdu/English across 8 bias categories.
View Project →
Confidence-aware knowledge distillation achieving balanced shape/texture bias. Improved OOD generalization on vision models.
View Project →
Evaluating LLMs in Sindhi, Pashto, and Urdu. Bias and jailbreak testing on lightweight transformers.
View Project →
AWS-based platform for stock analysis. S3, Lambda, ECS, Postgres with LLM-powered chatbot.
View Project →
End-to-end AI chatbot for EDA and data cleaning. Fine-tuned GPT-3.5, deployed on Hugging Face.
View Project →
2D platformer in Unity with 10+ levels. Hand-drawn pixel art, dynamic lighting, puzzle mechanics.
View Project →