Akshit Tyagi

Summary

I am an ML Engineer at Google Research working on ML models for healthcare applications. I graduated with an MS in CS from the College of Information and Computer Sciences (CICS) at UMass Amherst. Before that, I was an undergraduate at IIT Delhi. You can find my resume here.

I am broadly interested in theoretical machine learning and NLP, and have worked on projects involving computer vision, natural language processing, and reinforcement learning.

Major Projects and Internships

  1. Faster Bayesian Parameter estimation for Neural nets

    Currently working on speeding up MCMC sampling for Bayesian parameter estimation. Starting from the work on Bayesian dark knowledge, we want to develop estimators that are more robust to dataset uncertainty while keeping the inference time for the estimation process the same or better.
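    The sampler underlying Bayesian dark knowledge is Stochastic Gradient Langevin Dynamics (SGLD). As an illustration only (not the project's actual code), a single SGLD update under an assumed Gaussian prior looks like this; all names and defaults here are mine:

```python
import numpy as np

def sgld_step(theta, grad_log_lik, batch_scale, step_size, prior_var=1.0, rng=None):
    """One Stochastic Gradient Langevin Dynamics (SGLD) update.

    theta        : current parameter vector
    grad_log_lik : minibatch gradient of the log-likelihood at theta
    batch_scale  : N / batch_size, rescaling the stochastic gradient
    """
    rng = rng or np.random.default_rng()
    grad_log_prior = -theta / prior_var                 # Gaussian N(0, prior_var) prior
    drift = 0.5 * step_size * (grad_log_prior + batch_scale * grad_log_lik)
    noise = rng.normal(0.0, np.sqrt(step_size), size=theta.shape)
    return theta + drift + noise
```

    Iterating this step produces (approximate) posterior samples of the network weights, which is where the cost that the project tries to reduce comes from.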

  2. Causal Inference models for Multi-modal Medical Diagnosis data

    Currently working on building hierarchical Bayesian diagnostic models for the MIMIC dataset. We are formalizing the problem as one of causal inference and investigating the advantages such a formulation can offer.

  3. Building models for Noisy Conversational Question Answering

    Worked with Alexa's Implicit Memory team on the best way to build noise robustness into text-based Q&A models using transformers and ELMo embeddings. We add noise to the CoQA dataset, creating a noisy version of CoQA. Evaluating models like FlowQA on noisy-CoQA exposes how vulnerable modern text-based NLP models are to structured noise from speech recognition and ambient noise sources. We use stabilization of layers to obtain noise-robust embeddings and are continuing the work to build robustness into such models.
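    One simple way to build a noisy variant of a text dataset is character-level corruption that loosely mimics ASR-style errors. This is only an illustrative sketch of the idea, not the noise model we actually used:

```python
import random

def add_char_noise(text, p=0.1, rng=None):
    """Corrupt text with random character deletions, substitutions,
    and duplications, loosely mimicking speech-recognition noise."""
    rng = rng or random.Random()
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    out = []
    for ch in text:
        r = rng.random()
        if r < p / 3:                    # delete the character
            continue
        elif r < 2 * p / 3:              # substitute a random letter
            out.append(rng.choice(alphabet))
        elif r < p:                      # duplicate the character
            out.append(ch)
            out.append(ch)
        else:                            # keep the character as-is
            out.append(ch)
    return "".join(out)
```

    Applying such a corruption to every question and passage yields a noisy copy of the dataset on which clean-trained models can be stress-tested.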

  4. Natural Language Understanding through Intent Classification

    Worked with Alexa's Spoken Language Understanding team on building a fast intent classifier, to be integrated with the voice assistant as a tool for identifying the intent of a user's utterance in a dialogue. We built an early-exit strategy that can be integrated with the LSTMs and affine neural networks traditionally used for such tasks, and showed that much shallower networks perform reasonably well on the task while being significantly faster and having far fewer parameters. Accepted for conference proceedings at ICASSP 2020. Link to paper.
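    The core idea of early exiting can be sketched in a few lines: attach a classifier head after each layer and stop as soon as a head is confident. This is a generic, framework-free sketch under my own assumed interfaces, not the paper's implementation:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def early_exit_predict(x, layers, heads, threshold=0.9):
    """Run x through the layers; after each layer an attached classifier
    head produces a distribution, and we exit as soon as it is confident.

    layers : list of callables, hidden -> hidden
    heads  : list of callables, hidden -> class logits
    Returns (predicted class, index of the exit layer).
    """
    h = x
    for i, (layer, head) in enumerate(zip(layers, heads)):
        h = layer(h)
        probs = softmax(head(h))
        if probs.max() >= threshold or i == len(layers) - 1:
            return int(probs.argmax()), i
```

    Easy utterances exit after one or two layers, which is exactly where the latency savings come from.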

  5. Getting better at Game Playing by transfer of skill

    Worked on transfer learning in the context of game playing. We learn policies from 10 of the 11 Atari games and use them as policy initializers for the last game. A generative model is fit to simulations of the first ten games and then fine-tuned by joint training and feature extraction for the eleventh. Results show promising transfer of policy in the context of Atari games.

  6. WordVector Based Moderation Pipeline for Amazon Marketplace

    We developed an end-to-end distributed pipeline for scoring the risk of a specific advertisement campaign on the Amazon Marketplace, using the text description of the ad and its associated photograph. From these two inputs we built a feature extractor to encode the risk of the advertisement, then used an ensemble of regressors to produce a majority score. The pipeline was used alongside the manual moderation team, so moderators only needed to score the ads on which the pipeline was not confident enough.
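    The aggregation-plus-routing step can be sketched as follows. The threshold, the median aggregation, and the spread-based confidence test are all illustrative choices of mine, not the pipeline's actual logic:

```python
import statistics

def moderate(scores, disagreement_threshold=0.5):
    """Aggregate per-regressor risk scores for one ad; route the ad to
    manual review when the ensemble disagrees too much to be trusted."""
    risk = statistics.median(scores)                  # consensus risk score
    spread = max(scores) - min(scores)                # ensemble disagreement
    needs_human = spread > disagreement_threshold
    return risk, needs_human
```

    Ads where `needs_human` is true go to the moderation team; the rest are scored automatically.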

  7. Minimax Bot for playing the game of Tak

    The bot uses minimax to search over the set of possible game states, selecting the best moves with a utility function. The utility function was trained by observing which features affect the quality of gameplay the most. The code and a description of the strategy used can be found here.
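    For reference, plain depth-limited minimax looks like this; the game-specific callables (`moves`, `apply_move`, `utility`, `is_terminal`) are placeholders for the Tak rules and the trained utility function:

```python
def minimax(state, depth, maximizing, moves, apply_move, utility, is_terminal):
    """Depth-limited minimax over game states.

    moves(state)          -> iterable of legal moves
    apply_move(state, m)  -> successor state
    utility(state)        -> heuristic value of the state
    is_terminal(state)    -> True if the game is over
    Returns (best value, best move).
    """
    if depth == 0 or is_terminal(state):
        return utility(state), None
    best_move = None
    if maximizing:
        best = float("-inf")
        for m in moves(state):
            val, _ = minimax(apply_move(state, m), depth - 1, False,
                             moves, apply_move, utility, is_terminal)
            if val > best:
                best, best_move = val, m
    else:
        best = float("inf")
        for m in moves(state):
            val, _ = minimax(apply_move(state, m), depth - 1, True,
                             moves, apply_move, utility, is_terminal)
            if val < best:
                best, best_move = val, m
    return best, best_move
```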

  8. Using Genetic Algorithms for approximately solving the Traveling Salesman Problem

    A specific tour of the travelling salesman is modelled as a gene for the genetic algorithm (GA). Genes are mutated using two different crossover techniques, CX and PMX, and both types of offspring are included in the population for the next generation. The fitness of a gene is the Laplace-smoothed reciprocal of the tour's total distance. The crossover process was parallelised, allowing very large starting populations and letting the GA approach the maximum in a small number of iterations. For more on the algorithm and the code, go here.
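    As a sketch, here is textbook PMX crossover on permutation-encoded tours, plus a fitness of the Laplace-smoothed-reciprocal form described above (the `+ 1.0` smoothing constant is an illustrative choice):

```python
import random

def pmx(parent_a, parent_b, rng=None):
    """Partially Mapped Crossover (PMX) for permutation-encoded tours.
    Copies a random slice from parent_a, then fills the remaining cities
    from parent_b, resolving conflicts via the mapping the slice defines."""
    rng = rng or random.Random()
    n = len(parent_a)
    i, j = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[i:j + 1] = parent_a[i:j + 1]
    for k in range(i, j + 1):
        gene = parent_b[k]
        if gene in child[i:j + 1]:
            continue
        pos = k
        while i <= pos <= j:                  # follow the mapping out of the slice
            pos = parent_b.index(parent_a[pos])
        child[pos] = gene
    for k in range(n):
        if child[k] is None:                  # fill untouched slots from parent_b
            child[k] = parent_b[k]
    return child

def fitness(tour, dist):
    """Laplace-smoothed reciprocal of the total tour distance."""
    total = sum(dist[tour[k]][tour[(k + 1) % len(tour)]] for k in range(len(tour)))
    return 1.0 / (1.0 + total)
```

    Because each crossover only reads its two parents, calls to `pmx` over a population are embarrassingly parallel, which is what made the large starting populations practical.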

  9. Local Search Algorithms with Parallelization

    Implemented a set of local search algorithms: HillClimbingWithRandomRestarts, HillClimbingWithTabu, BeamSearch, BeamSearchWithTabu and BeamSearchWithRandomRestarts, used to search for the global maximum of the cost of a resource allocation problem. Three threads update a shared maximum value in parallel, making the search more time-efficient. The code can be found here.
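    The simplest of these, hill climbing with random restarts, can be sketched generically; the problem-specific callables below are placeholders for the resource allocation domain, and the threading is omitted:

```python
import random

def hill_climb_random_restarts(neighbors, score, random_state, restarts=10, rng=None):
    """Hill climbing with random restarts, maximizing `score`.

    neighbors(s)       -> iterable of neighboring states of s
    score(s)           -> objective value to maximize
    random_state(rng)  -> a fresh random starting state
    Returns (best state found, its score).
    """
    rng = rng or random.Random()
    best_state, best_score = None, float("-inf")
    for _ in range(restarts):
        state = random_state(rng)
        while True:                               # climb until no neighbor improves
            improved = False
            for nb in neighbors(state):
                if score(nb) > score(state):
                    state, improved = nb, True
                    break
            if not improved:
                break
        if score(state) > best_score:             # keep the best local maximum seen
            best_state, best_score = state, score(state)
    return best_state, best_score
```

    In the parallel version, each thread runs restarts independently and the threads race to update the shared best value.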

  10. Echo State Networks for Stock Prediction

    We use an Echo State Network (ESN), a reservoir RNN, to capture the movement of a specific stock. The network takes as input the previous day's stock price together with its learned reservoir state, and outputs the predicted value of the index on the current day. The code for the ESN can be found here.
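    The defining trick of an ESN is that the reservoir weights stay fixed and random (rescaled to a chosen spectral radius); only a linear readout on the reservoir states is trained. A minimal sketch of building the reservoir and driving it with a price series, with hyperparameters chosen for illustration:

```python
import numpy as np

def make_esn(n_reservoir=100, spectral_radius=0.9, rng=None):
    """Random input and reservoir weights for a 1-input echo state
    network, rescaling W to the desired spectral radius."""
    rng = rng or np.random.default_rng(0)
    w_in = rng.uniform(-0.5, 0.5, size=(n_reservoir, 1))
    w = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_reservoir))
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def esn_states(prices, w_in, w):
    """Drive the fixed reservoir with the price series, collecting one
    state per day; only a linear readout on these states is trained."""
    h = np.zeros(w.shape[0])
    states = []
    for p in prices:
        h = np.tanh(w_in[:, 0] * p + w @ h)
        states.append(h.copy())
    return np.array(states)
```

    The next-day prediction is then a linear map (e.g. fit by ridge regression) from each day's reservoir state to the next day's price.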

Other Projects and Apps

Teaching Experience