TL;DR Recent works have demonstrated that it is possible to obtain sparse networks that are competitive with dense networks. The performance of sparse networks has been investigated primarily in terms of test set accuracy. In this post, I will attempt to summarize the paper Can You Win Everything with A...
Introduction Prior work has demonstrated that increasing network width, and with it the parameter count, improves performance. Since these studies did not control for the increased parameter count, it was hard to establish the exact cause of the improved performance. A recent paper Are wider nets better given the same...
It has been nearly 5 years since the lottery ticket hypothesis (LTH) was proposed. Countless theoretical papers have been written on it. Yet even though LTH can achieve sparsity as high as 95% without a significant drop in performance, its application in industry remains relatively limited. One of the main factors that...
Introduction Recently I came across an interesting paper: Condensing CNNs with Partial Differential Equations by Kag et al. The paper proposes a way to enforce constraints on the intermediate activations in a CNN model. What is the advantage of enforcing these constraints? Through extensive experiments on different datasets [ImageNet...
TL;DR I have written a more formal summary of my findings here. Github: https://github.com/sidml/interpret-fc-layer You can replicate my experiment results by running the notebooks in the git repo, or by running this Kaggle Notebook for fitting ALD and this one for visualizing the internal split. The results of fitting ALD to...
Context: It is often hard to find a large collection of correctly labelled data. This scarcity of labelled real data exists for gravitational waves as well. One way to combat this shortage is to train on simulated data and hope the trained model is useful for the real dataset as...
Recently I participated in Kaggle’s Rainforest Connection Species Audio Detection challenge with my friend Ee Kin. In this post I share the pseudo-label generation procedure and the performance of different model architectures and loss functions in more detail. Ee Kin has also posted on the Kaggle Forum. I have created a notebook...
Recently I came across a very nice paper Objects as Points by Zhou et al. I found the approach pretty interesting and novel. It doesn’t use anchor boxes and requires minimal post-processing. The essential idea of the paper is to treat objects as points denoted by their centers rather than...
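To make the center-point idea concrete, below is a minimal sketch (my own illustration, not the authors' released code) of how object centers could be decoded from a predicted class heatmap by keeping the top-scoring local maxima; `decode_centers` and its inputs are hypothetical.

```python
import torch
import torch.nn.functional as F

def decode_centers(heatmap, k=10):
    """Pick the top-k local maxima of a (C, H, W) center heatmap.

    Each detected object is represented only by the peak location of its
    class heatmap, in the spirit of the "objects as points" idea.
    """
    # Keep a location only if it is the maximum of its 3x3 neighbourhood
    # (a cheap substitute for non-maximum suppression).
    pooled = F.max_pool2d(heatmap.unsqueeze(0), kernel_size=3,
                          stride=1, padding=1).squeeze(0)
    peaks = heatmap * (heatmap == pooled).float()

    c, h, w = peaks.shape
    scores, idx = peaks.view(-1).topk(k)
    classes = idx // (h * w)
    ys = (idx % (h * w)) // w
    xs = idx % w
    return xs, ys, classes, scores

# Toy usage: a random 80-class heatmap at a quarter of the input resolution.
xs, ys, classes, scores = decode_centers(torch.rand(80, 128, 128).sigmoid())
```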
Recently Google AI Research published a paper titled “EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks”. In this paper, the authors propose a new architecture which achieves state-of-the-art classification accuracy on ImageNet while being 8.4x smaller and 6.1x faster at inference than the best existing CNN. It...
I got curious about KL Divergence after reading the Variational Autoencoder paper. So, I decided to investigate it to get a better intuition. KL Divergence is a measure of how one probability distribution $P$ differs from a second probability distribution $Q$. If two distributions are identical, their KL...
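For reference, the quantity discussed in the post is, in its discrete form, $D_{\mathrm{KL}}(P \,\|\, Q) = \sum_x P(x) \log \frac{P(x)}{Q(x)}$. The small sketch below (a hypothetical helper, not code from the post) computes it with NumPy and shows that the divergence is not symmetric in $P$ and $Q$:

```python
import numpy as np

def kl_divergence(p, q):
    """KL divergence D_KL(P || Q) for discrete distributions given as arrays."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    # Terms with p(x) = 0 contribute nothing, so mask them out.
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

print(kl_divergence([0.5, 0.5], [0.5, 0.5]))  # identical distributions -> 0.0
print(kl_divergence([0.9, 0.1], [0.5, 0.5]))  # differs from kl_divergence(q, p)
```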
Spatially Regularized Correlation Filters In this post I discuss the basics of the SRDCF framework. In most standard DCF approaches like MOSSE, the size of the target search region used for training the classifier is limited. A naive increase in the search region may lead to more emphasis on the background instead of...
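As a quick refresher on the DCF baseline being referenced, here is a minimal single-channel MOSSE-style sketch (my own simplification, assuming grayscale training patches centred on the target; `train_mosse` and `respond` are hypothetical names). The filter is solved element-wise in the Fourier domain against a Gaussian-shaped desired response:

```python
import numpy as np

def train_mosse(patches, sigma=2.0, lam=1e-2):
    """Solve for a single-channel MOSSE filter in the Fourier domain.

    `patches` is a list of equally sized grayscale training patches
    centred on the target (hypothetical inputs for illustration).
    """
    h, w = patches[0].shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Desired correlation output: a Gaussian peaked at the patch centre.
    g = np.exp(-((xs - w // 2) ** 2 + (ys - h // 2) ** 2) / (2 * sigma ** 2))
    G = np.fft.fft2(g)
    num = np.zeros((h, w), dtype=complex)
    den = np.zeros((h, w), dtype=complex)
    for f in patches:
        F = np.fft.fft2(f)
        num += G * np.conj(F)
        den += F * np.conj(F)
    # lam regularises the element-wise division (ridge term).
    return num / (den + lam)

def respond(H_conj, search_patch):
    """Correlation response of the learned filter on a new search patch."""
    return np.real(np.fft.ifft2(np.fft.fft2(search_patch) * H_conj))

# Toy usage: train on random patches and evaluate on one of them.
patches = [np.random.rand(64, 64) for _ in range(8)]
H_conj = train_mosse(patches)
response = respond(H_conj, patches[0])  # peak location ~ estimated target centre
```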
Introduction The aim of visual tracking is to track an arbitrary object in any environment, given the initial position and size of the target object. It is an important area of research with applications in a wide variety of tasks like automated surveillance, vehicle tracking, traffic accident detection and robotics. Many...