Krishna Teja Chitty-Venkata

Senior Machine Learning Research Engineer

LinkedIn
Google Scholar
ResearchGate
GitHub
ORCID
HuggingFace

I am a Senior Machine Learning Research Engineer at Red Hat Inc. I work in the Machine Learning Research team of Red Hat AI on quantization and sparsity of Large Language Models. My goal is to make LLM inference cheap, energy-efficient and production-ready at Red Hat.

Prior to joining Red Hat, I was a Postdoctoral Researcher in the AI/ML team of Argonne Leadership Computing Facility division at the Argonne National Laboratory (US Department of Energy, Office of Science). I worked on optimizing Large Language Model training, finetuning and inference on GPUs and AI accelerators, AI for science applications and HPC for AI. I finished my PhD in Computer Engineering at Iowa State University under the guidance of Professor Arun Somani. My dissertation research was primarily focused on developing pruning, quantization and neural architecture search algorithms for optimizing neural network inference on hardware accelerators, CPUs and GPUs. During the course of my PhD, I interned at AMD, Intel and Argonne National Laboratory, working on several deep learning optimization research problems. I received my Bachelor’s degree from the University College of Engineering, Osmania University, Hyderabad, India.

My research interests are as follows:

Improving Efficiency of Neural Networks on AI Accelerators
Optimizations for Large Language Models (LLMs) and Vision Language Models (VLMs)
Performance improvement of Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs)
Efficient Finetuning and Inference methods
Hardware-aware Neural Architecture Search
Pruning, Quantization, KV Cache Optimization/Reduction
Performance Modeling and Benchmarking for ML workloads
AI for science applications and HPC for AI