I am an Applied Scientist in the VCV-Science group at Apple, where I work on vision foundation models. Before that, I was a CS Ph.D. student and graduate researcher at Oregon State University, advised by Prof. Fuxin Li.

My research interests include foundation models, generative models, 2D/3D computer vision, interpretability, the long-tail problem, and self-supervised learning.


Publications


    Cycle-Consistent Counterfactuals by Latent Transformations (CVPR 22)

    Saeed Khorram, Li Fuxin
    [PDF] [Code] [Poster]

    Counterfactual (CF) visual explanations try to find images, similar to the query image, that change the decision of a vision system to a specified outcome. Existing methods either require inference-time optimization or joint training with a generative adversarial model, which makes them time-consuming and difficult to use in practice. We propose a novel approach, Cycle-Consistent Counterfactuals by Latent Transformations (C3LT), which learns a latent transformation that automatically generates visual CFs by steering in the latent space of generative models. Our method uses cycle consistency between the query and CF codes in the latent space, which helps training find better solutions. C3LT can be easily plugged into any state-of-the-art pretrained generative network. This enables our method to generate high-quality, interpretable CF examples for high-resolution images such as those in ImageNet. In addition to several established metrics for evaluating CF explanations, we introduce a novel metric tailored to assess the quality of the generated CF examples, and we validate the effectiveness of our method through an extensive set of experiments.
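
    A minimal sketch of the cycle-consistency idea: two learned latent mappings steer between the query and CF codes, and an L1 cycle term ties them together. The generator `G`, `classifier`, and mapper names below are illustrative placeholders under these assumptions, not the released C3LT implementation.

```python
# Illustrative sketch only: `G` (frozen generator), `classifier`, and the
# latent mappers are placeholders, not the released C3LT code.
import torch.nn as nn
import torch.nn.functional as F

class LatentMapper(nn.Module):
    """Small residual MLP that steers a latent code toward the counterfactual class."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, z):
        return z + self.net(z)

def c3lt_style_losses(z_query, f, g, G, classifier, target_class):
    z_cf = f(z_query)                        # query code -> counterfactual code
    z_back = g(z_cf)                         # map the CF code back toward the query
    cycle_loss = F.l1_loss(z_back, z_query)  # cycle consistency in latent space
    x_cf = G(z_cf)                           # decode the CF image with the frozen generator
    cf_loss = F.cross_entropy(classifier(x_cf), target_class)  # steer the decision
    return cf_loss, cycle_loss
```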

    From Heatmaps to Structured Explanations of Image Classifiers (Applied AI Letters 21)

    Li Fuxin, Zhongang Qi, Saeed Khorram, Vivswan Shitole, Prasad Tadepalli, Minsuk Kahng, Alan Fern
    [PDF]

    This paper summarizes our endeavors in the past few years in explaining image classifiers, with the aim of including negative results and insights we have gained. The paper starts by describing the explainable neural network (XNN), which attempts to extract and visualize several high-level concepts purely from the deep network, without relying on human linguistic concepts. This helps users understand network classifications that are less intuitive and substantially improves user performance on a difficult fine-grained classification task of discriminating among different species of seagulls. Realizing that an important missing piece is a reliable heatmap visualization tool, we developed I-GOS and iGOS++, which utilize integrated gradients to avoid local optima in heatmap generation and improved performance across all resolutions. During the development of those visualizations, we realized that for a significant number of images, the classifier has multiple different paths to reach a confident prediction. This has led to our recent development of structured attention graphs (SAGs), an approach that utilizes beam search to locate multiple coarse heatmaps for a single image, and compactly visualizes a set of heatmaps by capturing how different combinations of image regions impact the confidence of a classifier. Through the research process, we have gained many insights into building deep network explanations, the existence and frequency of multiple explanations, and various tricks of the trade that make explanations work. In this paper, we attempt to share those insights and opinions with the readers in the hope that some of them will be informative for future researchers on explainable deep learning.
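
    To make the SAG idea concrete, a toy beam search over image-patch subsets might look like the sketch below; `score_fn`, which would mask all but the chosen patches and return the class probability, is a hypothetical placeholder, not the published implementation.

```python
# Toy beam search over patch subsets in the spirit of SAGs: grow region
# combinations that best preserve the classifier's confidence.
def beam_search_regions(num_patches, score_fn, beam_width=3, max_size=4):
    beams = [frozenset()]                     # start with no patches revealed
    per_size_best = []
    for _ in range(max_size):
        candidates = {b | {p} for b in beams for p in range(num_patches) if p not in b}
        ranked = sorted(candidates, key=score_fn, reverse=True)
        beams = ranked[:beam_width]           # keep the most confident combinations
        per_size_best.append(beams[0])
    return per_size_best                      # one candidate explanation per subset size
```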

    Re-Understanding Finite-State Representations of Recurrent Policy Networks (ICML 21)

    Mohamad H. Danesh, Anurag Koul, Alan Fern, Saeed Khorram
    [PDF] [Code] [Presentation]

    We introduce an approach for understanding control policies represented as recurrent neural networks. Recent work has approached this problem by transforming such recurrent policy networks into finite-state machines (FSM) and then analyzing the equivalent minimized FSM. While this led to interesting insights, the minimization process can obscure a deeper understanding of a machine’s operation by merging states that are semantically distinct. To address this issue, we introduce an analysis approach that starts with an unminimized FSM and applies more-interpretable reductions that preserve the key decision points of the policy. We also contribute an attention tool to attain a deeper understanding of the role of observations in the decisions. Our case studies on 7 Atari games and 3 control benchmarks demonstrate that the approach can reveal insights that have not been previously noticed.
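
    As a rough illustration of the FSM view, hidden states gathered from rollouts can be discretized into a transition table; here a simple k-means quantization stands in for the quantized bottleneck used in prior work, and the interpretable reductions proposed in the paper are not reproduced.

```python
# Rough illustration: discretize a recurrent policy's hidden states into
# FSM states and record (state, action) -> next-state transitions.
from sklearn.cluster import KMeans

def build_fsm(hidden_states, actions, n_states=16):
    """hidden_states: (T, d) RNN states along rollouts; actions: (T,) actions taken."""
    km = KMeans(n_clusters=n_states, n_init=10).fit(hidden_states)
    labels = km.labels_
    transitions = {}                          # (state, action) -> next state
    for t in range(len(labels) - 1):
        transitions[(labels[t], actions[t])] = labels[t + 1]
    return km, transitions
```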

    Stochastic Block-ADMM for Training Deep Networks (pre-print)

    Saeed Khorram, Xiao Fu, Mohammad H. Danesh, Zhongang Qi, Li Fuxin
    [PDF]

    In this paper, we propose Stochastic Block-ADMM as an approach to train deep neural networks in batch and online settings. Our method works by splitting neural networks into an arbitrary number of blocks and utilizes auxiliary variables to connect these blocks while optimizing with stochastic gradient descent. This allows training deep networks with non-differentiable constraints where conventional backpropagation is not applicable. An application of this is supervised feature disentangling, where our proposed DeepFacto inserts a non-negative matrix factorization (NMF) layer into the network. Since backpropagation only needs to be performed within each block, our approach alleviates vanishing gradients and provides the potential for parallelization. We prove the convergence of our proposed method and demonstrate its capabilities through experiments in supervised and weakly-supervised settings.
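
    A minimal two-block sketch of the splitting idea, written as a penalty-only variant without the dual (multiplier) updates or convergence conditions from the paper; layer sizes and hyperparameters are arbitrary.

```python
# Two-block illustration: an auxiliary variable `a` decouples the blocks and a
# quadratic penalty couples them again (penalty-only sketch, not the exact updates).
import torch
import torch.nn as nn
import torch.nn.functional as F

block1 = nn.Sequential(nn.Linear(784, 256), nn.ReLU())        # x -> features
block2 = nn.Linear(256, 10)                                    # features -> logits
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
a = block1(x).detach().requires_grad_(True)                    # auxiliary variable
rho = 1.0
opt = torch.optim.SGD(list(block1.parameters()) + list(block2.parameters()) + [a], lr=0.1)

for _ in range(100):
    opt.zero_grad()
    loss = F.cross_entropy(block2(a), y)                       # block-2 objective
    loss = loss + 0.5 * rho * (block1(x) - a).pow(2).mean()    # coupling penalty
    loss.backward()
    opt.step()
```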

    iGOS++: Integrated Gradient Optimized Saliency by Bilateral Perturbations (ACM-CHIL 21)

    Saeed Khorram*, Tyler Lawson*, Li Fuxin
    [PDF] [Code] [Poster] [Presentation]

    The black-box nature of deep networks makes explaining "why" they make certain predictions extremely challenging. Saliency maps are one of the most widely used local explanation tools to alleviate this problem. One of the primary approaches for generating saliency maps is to optimize a mask over the input dimensions so that the output of the network is influenced the most by the masking. However, prior work only studies such influence by removing evidence from the input. In this paper, we present iGOS++, a framework to generate saliency maps that are optimized for altering the output of the black-box system by either removing or preserving only a small fraction of the input. Additionally, we propose to add a bilateral total variation term to the optimization that improves the continuity of the saliency map, especially at high resolution and for thin object parts. The evaluation results from comparing iGOS++ against state-of-the-art saliency map methods show significant improvement in locating salient regions that are directly interpretable by humans. We applied iGOS++ to the task of classifying COVID-19 cases from x-ray images and discovered that the CNN sometimes overfits to the characters printed on the x-ray images when performing classification. Fixing this issue through data cleansing significantly improved the precision and recall of the classifier.
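
    A sketch of the kind of objective described above, combining a deletion term, a preservation (insertion) term, and a bilateral total-variation regularizer that smooths the mask only where the image itself is smooth. `model`, `baseline`, and the loss weights are placeholders, and the integrated-gradient optimization from the paper is not shown here.

```python
# Illustrative loss only, not the released iGOS++ code.
import torch
import torch.nn.functional as F

def bilateral_tv(mask, image, sigma=0.1):
    # Penalize mask gradients less across strong image edges.
    dmx = (mask[..., :, 1:] - mask[..., :, :-1]).abs()
    dmy = (mask[..., 1:, :] - mask[..., :-1, :]).abs()
    dix = (image[..., :, 1:] - image[..., :, :-1]).abs().mean(1, keepdim=True)
    diy = (image[..., 1:, :] - image[..., :-1, :]).abs().mean(1, keepdim=True)
    return (dmx * torch.exp(-dix / sigma)).mean() + (dmy * torch.exp(-diy / sigma)).mean()

def igos_pp_style_loss(model, image, baseline, mask, target, lam_tv=1.0, lam_l1=0.1):
    deleted   = image * (1 - mask) + baseline * mask        # remove evidence
    preserved = image * mask + baseline * (1 - mask)        # keep only the masked part
    del_term  = model(deleted)[:, target].mean()            # score should drop
    ins_term  = -model(preserved)[:, target].mean()         # score should stay high
    return del_term + ins_term + lam_tv * bilateral_tv(mask, image) + lam_l1 * mask.abs().mean()
```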

    Visualizing Deep Networks by Optimizing with Integrated Gradients (CVPRW 19/AAAI 20)

    Zhongang Qi, Saeed Khorram, Li Fuxin
    [PDF] [Demo] [Code] [Medium]

    Understanding and interpreting the decisions made by deep learning models is valuable in many domains. In computer vision, computing heatmaps from a deep network is a popular approach for visualizing and understanding deep networks. However, heatmaps that do not correlate with the network may mislead humans, so it is crucial that heatmaps provide a faithful explanation of the underlying deep network. In this paper, we propose I-GOS, which optimizes for a heatmap so that the classification scores on the masked image decrease maximally. The main novelty of the approach is to compute descent directions based on integrated gradients instead of the normal gradient, which avoids local optima and speeds up convergence. Compared with previous approaches, our method can flexibly compute heatmaps at any resolution for different user needs. Extensive experiments on several benchmark datasets show that the heatmaps produced by our approach are more correlated with the decision of the underlying deep network, in comparison with other state-of-the-art approaches.
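
    A sketch of the integrated-gradient descent direction, assuming a mask that blends the image with a baseline; the line search and the full objective from the paper are omitted, and `model`/`baseline` are placeholders.

```python
# Average the mask gradient along the straight-line path from the current mask
# to the fully perturbed image, instead of using the plain gradient.
import torch

def integrated_gradient_direction(model, image, baseline, mask, target, steps=20):
    grads = torch.zeros_like(mask)
    for k in range(1, steps + 1):
        alpha = k / steps
        m = (mask + alpha * (1 - mask)).detach().requires_grad_(True)
        perturbed = image * (1 - m) + baseline * m      # remove more evidence as alpha grows
        score = model(perturbed)[:, target].sum()
        score.backward()
        grads += m.grad
    return grads / steps                                # descent direction for the mask
```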

    Embedding Deep Networks into Visual Explanations (Artificial Intelligence Journal 2020)

    Zhongang Qi, Saeed Khorram, Li Fuxin
    [PDF] [Code]

    We propose a novel Explanation Neural Network (XNN) to explain the predictions made by a deep network. The XNN works by embedding a high-dimensional activation vector of a deep network layer non-linearly into a low-dimensional explanation space while retaining faithfulness, i.e., the original deep-network predictions can be reconstructed from the few concepts extracted by our explanation network. We then visualize these concepts so that humans can learn about the high-level concepts the deep network uses to make decisions. We propose an algorithm called Sparse Reconstruction Autoencoder (SRAE) for learning the embedding to the explanation space. SRAE aims to reconstruct part of the original feature space while retaining faithfulness. A pull-away term is applied to SRAE to make the explanation space more orthogonal. A visualization system is then introduced for human understanding of the features in the explanation space. The proposed method is applied to explain CNN models in image classification tasks. We conducted a human study, which shows that the proposed approach outperforms a saliency map baseline and improves human performance on a difficult classification task. Also, several novel metrics are introduced to evaluate the performance of explanations quantitatively without human involvement.
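
    A sketch of an SRAE-style objective along the lines described above: a small embedding network maps activations to a few concepts, with a faithfulness term, a partial-reconstruction term, and a pull-away term that decorrelates the concepts. Layer shapes and names are illustrative assumptions, not the released XNN code.

```python
# Illustrative SRAE-style losses: faithfulness, partial reconstruction, pull-away.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SRAESketch(nn.Module):
    def __init__(self, feat_dim, n_concepts, n_classes):
        super().__init__()
        self.embed = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                                   nn.Linear(64, n_concepts))
        self.decode = nn.Linear(n_concepts, feat_dim)      # partial reconstruction
        self.predict = nn.Linear(n_concepts, n_classes)    # faithfulness head

    def forward(self, feats):
        concepts = self.embed(feats)
        return concepts, self.decode(concepts), self.predict(concepts)

def srae_losses(model, feats, orig_logits):
    concepts, recon, logits = model(feats)
    faithfulness = F.mse_loss(logits, orig_logits)          # match the original prediction
    reconstruction = F.mse_loss(recon, feats)               # reconstruct the feature space
    c = F.normalize(concepts, dim=0)                        # normalize each concept over the batch
    gram = c.t() @ c
    pull_away = (gram - torch.eye(gram.size(0), device=gram.device)).pow(2).mean()
    return faithfulness, reconstruction, pull_away
```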

  • Mehran Soltani, Mohammad Hasan Shammakhi, Saeed Khorram, and Hamid Sheikhzadeh. "Combined mRMR filter and sparse Bayesian classifier for analysis of gene expression data." International Conference of Signal Processing and Intelligent Systems (ICSPIS), IEEE, 2016. [PDF]

  • Mohamadreza Jafaryani, Saeed Khorram, Vahid Pourahmadi, and Minoo Shahbazi. "Sleep Stage Scoring Using Joint Frequency-Temporal and Unsupervised Features." International Conference on New Research Achievements in Electrical and Computer Engineering (ICNRAECE), IEEE, 2016. [PDF]