Krishna Teja Chitty-Venkata

I am a Postdoctoral Researcher in the AI/ML team (previously Data Science) of the Argonne Leadership Computing Facility division at Argonne National Laboratory (US Department of Energy, Office of Science). I primarily work at the intersection of Systems and Machine Learning. Broadly, I am interested in optimizing neural network training, finetuning and inference on general-purpose hardware and AI accelerators, as well as AI for science applications and HPC for AI.

I finished my PhD in computer engineering at Iowa State University under the guidance of Professor Arun Somani. My dissertation research focused primarily on developing pruning, quantization and neural architecture search algorithms for optimizing neural network inference on hardware accelerators, CPUs and GPUs. During my PhD, I interned at AMD, Intel and Argonne National Laboratory, working on several deep learning optimization research problems. Before my PhD, I received my Bachelor’s degree from the University College of Engineering, Osmania University, Hyderabad, India.

My research interests are as follows:

  • Improving Efficiency of Neural Networks on AI Accelerators
  • Optimizations for Large Language Models (LLMs) and Vision Language Models (VLMs)
  • Performance improvement of Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs)
  • Efficient Finetuning and Inference methods
  • Hardware-aware Neural Architecture Search
  • Pruning, Quantization, KV Cache Optimization/Reduction
  • Performance Modeling and Benchmarking for ML workloads
  • AI for science applications and HPC for AI


Education

Doctor of Philosophy (PhD)

  • Iowa State University, Ames, IA, USA
  • Dissertation: Hardware-aware Design, Search and Optimization of Deep Neural Networks
  • Computer Engineering
  • Supervisor: Dr. Arun K. Somani
  • 2017-2023

Bachelor of Engineering

  • University College of Engineering, Osmania University, Hyderabad, India
  • Electronics and Communication Engineering
  • 2013-2017

Publications

  • [PMBS at SC 2024] LLM-Inference-Bench: Inference Benchmarking of Large Language Models on AI Accelerators
    Krishna Teja Chitty-Venkata, Siddhisanket Raskar, Bharat Kale, Farah Ferdaus, Aditya Tanikanti, Ken Raffenetti, Valerie Taylor, Murali Emani, Venkatram Vishwanath
    2024 IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High-Performance Computer Systems (PMBS), Atlanta, GA, USA, 2024
    [Paper] [Code]

  • [MDPI AI 2024] ConVision Benchmark: A Contemporary Framework to Benchmark CNN and ViT Models
    Shreyas Bangalore Vijayakumar, Krishna Teja Chitty-Venkata, Kanishk Arya, Arun K Somani
    MDPI AI Journal
    [Paper] [Code]

  • [EuroPar 2024] WActiGrad: Structured Pruning for Efficient Finetuning and Inference of Large Language Models on AI Accelerators
    Krishna Teja Chitty-Venkata, Varuni Katti Sastry, Murali Emani, Venkatram Vishwanath, Sanjif Shanmugavelu, Sylvia Howland
    European Conference on Parallel Processing
    [Paper]

  • [IEEE Access 2023] Differentiable Neural Architecture, Mixed Precision and Accelerator Co-search
    Krishna Teja Chitty-Venkata, Yiming Bian, Murali Emani, Venkatram Vishwanath, Arun K Somani
    IEEE Access Journal
    [Paper]

  • [JSA 2023] A Survey of Techniques for Optimizing Transformer Inference
    Krishna Teja Chitty-Venkata, Yiming Bian, Murali Emani, Venkatram Vishwanath, Arun K Somani
    Journal of Systems Architecture
    [Paper]

  • [IEEE Access 2023] Neural Architecture Search Benchmarks: Insights and Survey
    Krishna Teja Chitty-Venkata, Yiming Bian, Murali Emani, Venkatram Vishwanath, Arun K Somani
    IEEE Access Journal
    [Paper] [GitHub]

  • [ACM CSUR 2022] Neural Architecture Search Survey: A Hardware Perspective
    Krishna Teja Chitty-Venkata, Arun K Somani
    ACM Computing Surveys
    [Paper]

  • [IEEE Access 2022] Neural Architecture Search for Transformers: A Survey
    Krishna Teja Chitty-Venkata, Arun K Somani
    IEEE Access Journal
    [Paper]

  • [ACM HPDC 2022] Efficient Design Space Exploration for Sparse Mixed Precision Neural Architectures
    Krishna Teja Chitty-Venkata, Arun K Somani
    31st International Symposium on High-Performance Parallel and Distributed Computing (HPDC)
    [Paper]

  • [IEEE ICIP 2021] Searching Architecture and Precision for U-net based Image Restoration Tasks
    Krishna Teja Chitty-Venkata, Arun K Somani
    2021 IEEE International Conference on Image Processing (ICIP)
    [Paper]

  • [IEEE ASAP 2021] Array-aware Neural Architecture Search
    Krishna Teja Chitty-Venkata, Arun K Somani
    2021 IEEE 32nd International Conference on Application-specific Systems, Architectures and Processors (ASAP)
    [Paper]

  • [IEEE HPCC/DSS 2021] Calibration Data-based CNN Filter Pruning for Efficient Layer Fusion
    Krishna Teja Chitty-Venkata, Arun K Somani
    2021 IEEE International Conference on High Performance Computing and Communications and International Conference on Data Science and Systems (HPCC/DSS)
    [Paper]

  • [IEEE PRDC 2020] Model Compression on Faulty Array-based Neural Network Accelerator
    Krishna Teja Chitty-Venkata, Arun K Somani
    2020 IEEE 25th Pacific Rim International Symposium on Dependable Computing (PRDC)
    [Paper]

  • [IEEE ASAP 2020] Array Aware Training/Pruning: Methods for Efficient Forward Propagation on Array-based Neural Network Accelerators
    Krishna Teja Chitty-Venkata, Arun K Somani
    2020 IEEE 31st International Conference on Application-specific Systems, Architectures and Processors (ASAP)
    [Paper]

  • [IEEE ASAP 2019] Impact of Structural Faults on Neural Network Performance
    Krishna Teja Chitty-Venkata, Arun K Somani
    2019 IEEE 30th International Conference on Application-specific Systems, Architectures and Processors (ASAP)
    [Paper]