
Paroma Varma, Bryan He, Payal Bajaj, Nishith Khandwala, Imon Banerjee, Daniel Rubin, Christopher Ré
In Advances in Neural Information Processing Systems (NIPS), December 2017.
Obtaining enough labeled data to robustly train complex discriminative models is a major bottleneck in the machine learning pipeline. A popular solution is combining multiple sources of weak supervision using generative models. The structure of these models affects the quality of the training labels, but is difficult to learn without any ground truth labels. We instead rely on weak supervision sources having some structure by virtue of being encoded programmatically. We present Coral, a paradigm that infers generative model structure by statically analyzing the code for these heuristics, thus significantly reducing the amount of data required to learn structure. We prove that Coral's sample complexity scales quasilinearly with the number of heuristics and number of relations identified, improving over the standard sample complexity, which is exponential in n for learning nth degree relations. Empirically, Coral matches or outperforms traditional structure learning approaches by up to 3.81 F1 points. Using Coral to model dependencies instead of assuming independence results in better performance than a fully supervised model by 3.07 accuracy points when heuristics are used to label radiology data without ground truth labels.

Stephen H. Bach, Bryan He, Alexander Ratner, Christopher Ré
In Proceedings of the 34th International Conference on Machine Learning, August 2017.
Curating labeled training data has become the primary bottleneck in machine learning. Recent frameworks address this bottleneck with generative models to synthesize labels at scale from weak supervision sources. The generative model’s dependency structure directly affects the quality of the estimated labels, but selecting a structure automatically without any labeled data is a distinct challenge. We propose a structure estimation method that maximizes the l_{1}regularized marginal pseudolikelihood of the observed data. Our analysis shows that the amount of unlabeled data required to identify the true structure scales sublinearly in the number of possible dependencies for a broad class of models. Simulations show that our method is 100x faster than a maximum likelihood approach and selects 1/4 as many extraneous dependencies. We also show that our method provides an average of 1.5 F1 points of improvement over existing, userdeveloped information extraction applications on realworld data such as PubMed journal abstracts.

Bryan He, Christopher De Sa, Ioannis Mitliagkas, Christopher Ré
In Advances in Neural Information Processing Systems (NIPS), December 2016.
Gibbs sampling is a Markov Chain Monte Carlo sampling technique that iteratively samples variables from their conditional distributions. There are two common scan orders for the variables: random scan and systematic scan. Due to the benefits of locality in hardware, systematic scan is commonly used, even though most statistical guarantees are only for random scan. While it has been conjectured that the mixing times of random scan and systematic scan do not differ by more than a logarithmic factor, we show by counterexample that this is not the case, and we prove that that the mixing times do not differ by more than a polynomial factor under mild conditions. To prove these relative bounds, we introduce a method of augmenting the state space to study systematic scan using conductance.

Bryan He, Mosalam Ebrahimi, Leon Palafox, Lakshminarayan Srinivasan
In Journal of Neural Engineering, January 2016.
Objective, Approach. A growing number of prototypes for diagnosing and treating neurological and psychiatric diseases are predicated on access to highquality brain signals, which typically requires surgically opening the skull. Where endovascular navigation previously transformed the treatment of cerebral vascular malformations, we now show that it can provide access to brain signals with substantially higher signal quality than scalp recordings. Main results. While endovascular signals were known to be larger in amplitude than scalp signals, our analysis in rabbits borrows a standard technique from communication theory to show endovascular signals also have up to 100× better signaltonoise ratio. Significance. With a viable minimallyinvasive path to highquality brain signals, patients with brain diseases could one day receive potent electroceuticals through the bloodstream, in the course of a brief outpatient procedure.

Bryan He, Yisong Yue
In Advances in Neural Information Processing Systems (NIPS), December 2015.
Interactive submodular set cover is an interactive variant of submodular set cover over a hypothesis class of submodular functions, where the goal is to satisfy all sufficiently plausible submodular functions to a target threshold using as few (costweighted) actions as possible. It models settings where there is uncertainty regarding which submodular function to optimize. In this paper, we propose a new extension, which we call smooth interactive submodular set cover, that allows the target threshold to vary depending on the plausibility of each hypothesis. We present the first algorithm for this more general setting with theoretical guarantees on optimality. We further show how to extend our approach to deal with realvalued functions, which yields new theoretical results for realvalued submodular set cover for both the interactive and noninteractive settings.

Bryan He, Alexander Wein, Lav Varshney, Julius Kusuma, Andrew Richardson, Lakshminarayan Srinivasan
In Journal of Neurophysiology, July 2015.
Efficient spike acquisition techniques are needed to bridge the divide from creating large multielectrode arrays (MEA) to achieving wholecortex electrophysiology. In this paper, we introduce generalized analog thresholding (gAT), which achieves millisecond temporal resolution with sampling rates as low as 10 Hz. Consider the torrent of data from a single 1,000channel MEA, which would generate more than 3 GB/min using standard 30kHz Nyquist sampling. Recent neural signal processing methods based on compressive sensing still require Nyquist sampling as a first step and use iterative methods to reconstruct spikes. Analog thresholding (AT) remains the best existing alternative, where spike waveforms are passed through an analog comparator and sampled at 1 kHz, with instant spike reconstruction. By generalizing AT, the new method reduces sampling rates another order of magnitude, detects more than one spike per interval, and reconstructs spike width. Unlike compressive sensing, the new method reveals a simple closedform solution to achieve instant (noniterative) spike reconstruction. The base method is already robust to hardware nonidealities, including realistic quantization error and integration noise. Because it achieves these considerable specifications using hardwarefriendly components like integrators and comparators, generalized AT could translate largescale MEAs into implantable devices for scientific investigation and medical technology.

Bryan He, Alexander Wein, Lakshminarayan Srinivasan
In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), April 2015.
Conventional Nyquist sampling and reconstruction of square waves at a finite rate will always result in aliasing because square waves are not band limited. Based on methods for signals with finite rate of innovation (FRI), generalized Analog Thresholding (gATn) is able to sample square waves at a much lower rate under ideal conditions. The target application is efficient, realtime, implantable neurotechnology that extracts spiking neural signals from the brain. This paper studies the effect of integrator noise and quantization error on the accuracy of reconstructed square waves. We explore realistic values for integrator noise and input signal amplitude, using specifications from the Texas Instruments IVC102 integrator chip as a firstpass example because of its readilyavailable data sheet. ADC resolution is varied from 1 to 16 bits. This analysis indicates that gAT1 is robust against these hardware nonidealities where gAT2 degrades less gracefully, which makes gAT1 a prime target for hardware implementation in a custom integrated circuit.

Bryan He
In Theoretical Computer Science, May 2014.
Mosaic floorplans are rectangular structures subdivided into smaller rectangular sections and are widely used in VLSI circuit design. Baxter permutations are a set of permutations that have been shown to have a onetoone correspondence to objects in the Baxter combinatorial family, which includes mosaic floorplans. An important problem in this area is to find short binary string representations of the set of nblock mosaic floorplans and Baxter permutations of length n. The best known representation is the QuarterState Sequence which uses 4n bits. This paper introduces a simple binary representation of nblock mosaic floorplan using 3n−3 bits. It has been shown that any binary representation of nblock mosaic floorplans must use at least (3n−o(n)) bits. Therefore, the representation presented in this paper is optimal (up to an additive lower order term).

Bryan He, Lakshminarayan Srinivasan
In Information Theory and Applications Workshop (ITA), February 2014.
The prototypical braincomputer interface (BCI) algorithm translates brain activity into changes in the states of a computer program, for typing or cursor movement. Most approaches use neural decoding which learns how the user has encoded their intent in their noisy neural signals. Recent adaptive decoders for cursor movement improved BCI performance by modeling the user as a feedback controller; when this model accounts for adaptive control, the neural decoder is termed coadaptive. This recent collection of controlinspired neural decoding strategies disregards a major antecedent conceptual realization, whereby the user could be induced to adopt an encoding strategy (control policy) such that the encoderdecoder pair (or equivalently, controllerplant pair) is optimal under a desired cost function. We call this alternate conceptual approach neural shaping, in contradistinction to neural decoding. Previous work illuminated the general form of optimal controllerplant pair under a cost representing information gain. For BCI applications requiring the user to issue discretevalued commands, the informationgainoptimal pair, based on the posterior matching scheme, can be userfriendly. In this paper, we discuss the application of neural shaping to cursor control with continuousvalued states based on continuousvalued user commands. We examine the problem of jointly optimizing controller and plant under quadratic expected cost and restricted linear plant dynamics. This simplification reduces joint controllerplant selection to a static optimization problem, similar to approaches in structural engineering and other areas. This perspective suggests that recent BCI approaches that alternate between adaptive neural decoders and static neural decoders could be local Paretooptimal, representing a suboptimal iterativetype solution to the optimal joint controllerplant problem.

Kevin Kowalski, Bryan He, Lakshminarayan Srinivasan
In Neural Computation, September 2013.
The closedloop operation of brainmachine interfaces (BMI) provides a context to discover foundational principles behind humancomputer interaction, with emerging clinical applications to stroke, neuromuscular diseases, and trauma. In the canonical BMI, a user controls a prosthetic limb through neural signals that are recorded by electrodes and processed by a decoder into limb movements. In laboratory demonstrations with ablebodied test subjects, parameters of the decoder are commonly tuned using training data that include neural signals and corresponding overt arm movements. In the application of BMI to paralysis or amputation, arm movements are not feasible, and imagined movements create weaker, partially unrelated patterns of neural activity. BMI training must begin naive, without access to these prototypical methods for parameter initialization used in most laboratory BMI demonstrations.
Naive adaptive BMI refer to a class of methods recently introduced to address this problem. We first identify the basic elements of existing approaches based on adaptive filtering and define a decoder, ReFITPPF to represent these existing approaches. We then present Joint RSE, a novel approach that logically extends prior approaches. Using recently developed human and syntheticsubjects closedloop BMI simulation platforms, we show that Joint RSE significantly outperforms ReFITPPF and nonadaptive (static) decoders. Control experiments demonstrate the critical role of jointly estimating neural parameters and user intent. In addition, we show that nonzero sensorimotor delay in the user significantly degrades ReFITPPF but not Joint RSE, owing to differences in the prior on intended velocity. Paradoxically, substantial differences in the nature of sensory feedback between these methods do not contribute to differences in performance between Joint RSE and ReFITPPF. Instead, BMI performance improvement is driven by machine learning, which outpaces rates of human learning in the humansubjects simulation platform. In this regime, nuances of errorrelated feedback to the human user are less relevant to rapid BMI mastery.

Bryan He
On Frontiers in Algorithmics and Algorithmic Aspects in Information and Management, May 2012.
A floorplan is a rectangle subdivided into smaller rectangular blocks by horizontal and vertical line segments. Two floorplans are considered equivalent if and only if there is a bijection between the blocks in the two floorplans such that the corresponding blocks have the same horizontal and vertical boundaries. Mosaic floorplans use the same objects as floorplans but use an alternative definition of equivalence. Two mosaic floorplans are considered equivalent if and only if they can be converted into equivalent floorplans by sliding the line segments that divide the blocks. The QuarterState Sequence method of representing mosaic floorplans uses 4n bits, where n is the number of blocks. This paper introduces a method of representing an nblock mosaic floorplan with a (3n − 3)bit binary string. It has been proven that the shortest possible binary string representation of a mosaic floorplan has a length of (3n − o(n)) bits. Therefore, the representation presented in this paper is asymptotically optimal. Baxter permutations are a set of permutations defined by prohibited subsequences. There exists a bijection between mosaic floorplans and Baxter permutations. As a result, the methods introduced in this paper also create an optimal binary string representation of Baxter permutations.