On Bias and Variance Decomposition of Offline Policy Evaluation Estimators

Note: This blog post is an unpublished partial draft of a larger work, jointly written with Aishwarya Mandyam.

Introduction

Evaluation is a critical component of learning contextual bandit policies that can be deployed in high-risk settings. One way to perform this evaluation is to directly deploy the policy in...

Grading Complex Interactive Coding Programs with Reinforcement Learning

[Summary] tl;dr: A tremendous amount of effort has been poured into training AI algorithms to competitively play games that computers have traditionally had trouble with, such as the retro games published by Atari, Go, DotA, and StarCraft II. The practical machine learning knowledge accumulated in developing these algorithms has paved...

Rational Speech Act and Controllable Image Caption Generation

[Paper]

In the past two years, controlling generative models (such as GANs and VAEs) has been widely studied in the Computer Vision literature. The idea is that once these large-capacity neural network models learn the data manifold of millions of images, they have internalized some knowledge about the...

Reading Group -- Semantically Equivalent Adversarial Rules for Debugging NLP Models

(Originally written for the Stanford NLP Blog)

Robustness is a central concern in engineering. Our suspension bridges need to stand against strong wind so that they won’t collapse like the Tacoma Narrows Bridge [video]. Our nuclear reactors need to be highly fault tolerant so that the Fukushima Daiichi incident won’t...

A Generative Model of Discourse

Short Preface

There has been a lot of interest in sentence representation learning, similar to the explosion of word embeddings.

Based on my limited education in linguistics, I do not believe linguists have an agreed-upon and exhaustive set of “properties” that a sentence representation¹...

A Tutorial on Torchtext

About 2-3 months ago, I encountered this library: Torchtext. I nonchalantly scanned through the README file and realized I had no idea how to use it or what kind of problem it was solving. I moved on.

Last week, there was a paper deadline, and I was tasked to...

A Geometric Analysis of Lagrangian, Dual Problem, and KKT Conditions

Preface

This article provides another perspective on understanding the Lagrangian and the dual problem. These two topics are essential to convex and non-convex optimization. Since it is a blog post, the background required to understand this article is kept rather low. If you need to brush up on convex...

A Collection of Numpy Tricks

Numpy arrays override many operators, so deciphering them can be difficult. Here is a collection of what I would consider tricky/handy moments from Numpy.

Trick 1: Collection1 == Collection2

The == in Numpy, when applied to two collections, means element-wise comparison, and the returned result is an...

Understand Numpy Reshape, Transpose, and Theano Dimshuffle

When you work with Numpy, you work with multidimensional arrays (or tensors). I have to admit this concept was not easy for me to grasp in the beginning, but after some deliberation, it became relatively easy. This post uses the terms tensor and multidimensional array interchangeably.

For a tensor of shape...