vCard - Personal Portfolio

Email
aderakh1@uci.edu
Location

Irvine, California, USA

Portfolio

About me

I am a Ph.D. candidate in Computer Science at UC Irvine, specializing in trustworthy AI and machine learning. My research focuses on enhancing the security and robustness of large language models (LLMs), detecting social engineering threats, and advancing multimodal AI systems. I have developed methods to defend AI models against adversarial "jailbreak" attacks and techniques to identify phone-based scams. My work has been published in top conferences and journals and has accumulated over 200 citations.

Beyond academia, I have industry experience as a data science intern, applying machine learning to solve practical challenges. Passionate about teaching and mentorship, I have guided graduate students through machine learning and deep learning courses at UCI. Additionally, I actively contribute to the research community as a reviewer for respected journals, championing the ethical deployment of AI. I aim to leverage this blend of research excellence, industry insight, and teaching experience to drive innovation and ethical practices in AI.

Areas of Expertise

Programming

Professional programming in Python, SQL, MATLAB, Java, and C/C++, with expertise in AI/ML frameworks.

Data Analytics

Advanced pipeline development: web scraping, feature engineering, statistical analysis, and visualization.

Machine Learning & AI

Advanced model development with PyTorch/TensorFlow: deep learning, adversarial defenses, and multimodal system optimization.

Cybersecurity & Safety in AI

Focused on AI safety, including adversarial defenses for LLMs and detection of social engineering attacks.

Credentials

Education

University of California, Irvine
2019 — Present
Ph.D. Candidate in Computer Science
Research Focus: ML and AI Trustworthiness in NLP, LLMs, VLMs, and Multi-Modal Models, emphasizing alignment, safety, and reliability.
University of California, Irvine
2019 — 2023
M.Sc. in Computer Science, GPA: 3.98/4.0
Completed course-based Master's degree during Ph.D.
Sharif University of Technology
2014 — 2017
M.Sc. in Computer Engineering, GPA: 3.9/4.0
Specialization: Artificial Intelligence and Robotics
Thesis: "Analyzing Purchase Satisfaction Using Opinion Mining"
K.N. Toosi University of Technology
2009 — 2014
B.Sc. in Computer Engineering - Hardware, GPA: 3.52/4.0
Thesis: "Text Summarization Using LSA and NMF"

Experience

Research Assistant - Secure Systems and Software Laboratory
2019 — Present
University of California, Irvine (Prof. Ian Harris)
• Developed Adversarial Prompt Shield (APS) to defend against jailbreaking attacks on LLMs
• Created novel machine learning approaches for detecting telephone-based social engineering attacks
• Led groundbreaking human studies on telephone scams with 186 participants
• Conducted NSF-funded research on detecting social engineering attacks
Research Assistant
2017 — 2019
Sharif University of Technology (Prof. Hamid Beigy)
• Conducted innovative research on opinion mining techniques to analyze customer purchase satisfaction
• Leveraged advanced machine learning for large-scale social media analysis
• Completed thesis on analyzing purchase satisfaction using opinion mining

Publications

Robust Safety Classifier Against Jailbreaking Attacks: Adversarial Prompt Shield
Jinhwa Kim, Ali Derakhshan, Ian G. Harris. Proceedings of the 8th Workshop on Online Abuse and Harms (WOAH) at NAACL, 2024.
Robust Safety Classifier for Large Language Models: Adversarial Prompt Shield
Jinhwa Kim, Ali Derakhshan, Ian G. Harris. arXiv preprint, 2023.
Mitra Behzadi at SemEval-2022 Task 5: Multimedia Automatic Misogyny Identification Method Based on CLIP
Mitra Behzadi, Ali Derakhshan, Ian G. Harris. Proceedings of the 16th International Workshop on Semantic Evaluation, 2022.
Detecting Telephone-Based Social Engineering Attacks Using Scam Signatures
Ali Derakhshan, Ian G. Harris, Mitra Behzadi. Proceedings of the ACM Workshop on Security and Privacy Analytics, 2021.
Rapid Cyber-Bullying Detection Method Using Compact BERT Models
Mitra Behzadi, Ian G. Harris, Ali Derakhshan. IEEE 15th International Conference on Semantic Computing (ICSC), 2021.
A Study of Targeted Telephone Scams Involving Live Attackers
Ian G. Harris, Ali Derakhshan, Marcel Carlsson. International Workshop on Socio-Technical Aspects in Security and Trust, 2020.
Sentiment Analysis on Stock Social Media for Stock Price Movement Prediction
Ali Derakhshan, Hamid Beigy. Engineering Applications of Artificial Intelligence, 2019.

Peer Reviews:

TDSC Reviewer – Transactions on Dependable and Secure Computing (TDSC), 2023
Complexity Reviewer – Complexity Journal (COMPLEXiTY), 2022
SFI Reviewer – Springer Open Financial Innovation (SFI), 2021
IJAMCS Reviewer – International Journal of Applied Mathematics and Computer Science (IJAMCS), 2021
