
Bryan He, Ludvig Bergenstråhle, Linnea Stenbeck, Abubakar Abid, Alma Andersson, Åke Borg, Jonas Maaskola, Joakim Lundeberg, and James Zou
Nature Biomedical Engineering, 2020.
Spatial transcriptomics allows for the measurement of RNA abundance at a high spatial resolution, making it possible to systematically link the morphology of cellular neighbourhoods and spatially localized gene expression. Here, we report the development of a deep learning algorithm for the prediction of local gene expression from haematoxylin-and-eosin-stained histopathology images using a new dataset of 30,612 spatially resolved gene expression data matched to histopathology images from 23 patients with breast cancer. We identified over 100 genes, including known breast cancer biomarkers of intratumoral heterogeneity and the colocalization of tumour growth and immune activation, the expression of which can be predicted from the histopathology images at a resolution of 100 µm. We also show that the algorithm generalizes well to The Cancer Genome Atlas and to other breast cancer gene expression datasets without the need for retraining. Predicting the spatially resolved transcriptome of a tissue directly from tissue images may enable image-based screening for molecular biomarkers with spatial variation.

Bas Hofstra, Vivek V. Kulkarni, Sebastian Munoz-Najar Galvez, Bryan He, Dan Jurafsky, and Daniel A. McFarland
Proceedings of the National Academy of Sciences of the United States of America, 2020.
Prior work finds a diversity paradox: Diversity breeds innovation, yet underrepresented groups that diversify organizations have less successful careers within them. Does the diversity paradox hold for scientists as well? We study this by utilizing a near-complete population of ∼1.2 million US doctoral recipients from 1977 to 2015 and following their careers into publishing and faculty positions. We use text analysis and machine learning to answer a series of questions: How do we detect scientific innovations? Are underrepresented groups more likely to generate scientific innovations? And are the innovations of underrepresented groups adopted and rewarded? Our analyses show that underrepresented groups produce higher rates of scientific novelty. However, their novel contributions are devalued and discounted: For example, novel contributions by gender and racial minorities are taken up by other scholars at lower rates than novel contributions by gender and racial majorities, and equally impactful contributions of gender and racial minorities are less likely to result in successful scientific careers than for majority groups. These results suggest there may be unwarranted reproduction of stratification in academic careers that discounts diversity’s role in innovation and partly explains the underrepresentation of some groups in academia.

David Ouyang, Bryan He, Amirata Ghorbani, Neal Yuan, Joseph Ebinger, Curt P. Langlotz, Paul A. Heidenreich, Robert A. Harrington, David H. Liang, Euan A. Ashley, and James Y. Zou
Nature, 2020.
Accurate assessment of cardiac function is crucial for diagnosing cardiovascular disease, screening for cardiotoxicity and deciding clinical management in patients with critical illness. However, human assessment of cardiac function focuses on a limited sampling of cardiac cycles and has significant interobserver variability despite years of training. To overcome this challenge, we present the first beat-to-beat deep learning algorithm that surpasses human expert performance in the critical tasks of segmenting the left ventricle, estimating ejection fraction, and assessing cardiomyopathy. Trained on echocardiogram videos, our model accurately segments the left ventricle with a Dice Similarity Coefficient of 0.92, predicts ejection fraction with a mean absolute error of 4.1%, and reliably classifies heart failure with reduced ejection fraction (AUC of 0.97). Prospective evaluation with repeated human measurements confirms that our model has less variance than experts. By leveraging information across multiple cardiac cycles, our model can identify subtle changes in ejection fraction, is more reproducible than human evaluation, and lays the foundation for precise diagnosis of cardiovascular disease. As a new resource to promote further innovation, we also make publicly available one of the largest medical video datasets, with over 10,000 annotated echocardiograms.
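
The Dice Similarity Coefficient used to score the segmentation above is a simple overlap metric, 2|A∩B| / (|A| + |B|). A minimal NumPy sketch on toy binary masks (illustrative only, not the paper's model or data):

```python
import numpy as np

def dice_coefficient(pred, truth):
    """Dice Similarity Coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, truth).sum() / denom

# Toy example: two overlapping "left ventricle" masks on a 4x4 grid
pred  = np.array([[0,1,1,0], [0,1,1,0], [0,1,1,0], [0,0,0,0]])
truth = np.array([[0,1,1,0], [0,1,1,0], [0,0,1,0], [0,0,0,0]])
print(dice_coefficient(pred, truth))  # 2*5 / (6+5) ≈ 0.909
```

A Dice score of 0.92, as reported, thus corresponds to roughly 92% overlap between predicted and expert left-ventricle contours.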

Amirata Ghorbani, David Ouyang, Abubakar Abid, Bryan He, Jonathan H. Chen, Robert A. Harrington, David H. Liang, Euan A. Ashley, James Y. Zou
npj Digital Medicine, 2020.
Echocardiography uses ultrasound technology to capture high temporal and spatial resolution images of the heart and surrounding structures, and is the most common imaging modality in cardiovascular medicine. Using convolutional neural networks on a large new dataset, we show that deep learning applied to echocardiography can identify local cardiac structures, estimate cardiac function, and predict systemic phenotypes that modify cardiovascular risk but are not readily identifiable by human interpretation. Our deep learning model, EchoNet, accurately identified the presence of pacemaker leads (AUC = 0.89), enlarged left atrium (AUC = 0.86), left ventricular hypertrophy (AUC = 0.75), left ventricular end-systolic and end-diastolic volumes (R2 = 0.74 and R2 = 0.70), and ejection fraction (R2 = 0.50), as well as predicted systemic phenotypes of age (R2 = 0.46), sex (AUC = 0.88), weight (R2 = 0.56), and height (R2 = 0.33). Interpretation analysis validates that EchoNet shows appropriate attention to key cardiac structures when performing human-explainable tasks and highlights hypothesis-generating regions of interest when predicting systemic phenotypes difficult for human interpretation. Machine learning on echocardiography images can streamline repetitive tasks in the clinical workflow, provide preliminary interpretation in areas with insufficient qualified cardiologists, and predict phenotypes challenging for human evaluation.
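
The AUC values quoted above are areas under ROC curves, which equal the probability that a randomly chosen positive case outscores a randomly chosen negative one (the Mann-Whitney interpretation). A minimal sketch with made-up scores, not EchoNet's outputs:

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic:
    the probability that a randomly chosen positive outscores a
    randomly chosen negative (ties count half)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    # All pairwise comparisons; fine for small n (O(n_pos * n_neg))
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# Hypothetical model scores for 3 positive and 3 negative studies
print(auc([0.9, 0.8, 0.4, 0.7, 0.3, 0.2], [1, 1, 0, 1, 0, 0]))  # 1.0
```

Under this reading, AUC = 0.89 for pacemaker leads means a positive study outranks a negative one about 89% of the time.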

Peng Xu, Bryan He, Christopher De Sa, Ioannis Mitliagkas, Chris Ré
Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, 2018.
Principal component analysis (PCA) is one of the most powerful tools for analyzing matrices in machine learning. In this paper, we study methods to accelerate power iteration in the stochastic setting by adding a momentum term. While in the deterministic setting power iteration with momentum has optimal iteration complexity, we show that naively adding momentum to a stochastic method does not always result in acceleration. We perform a novel, tight variance analysis that reveals a "breaking-point variance" beyond which this acceleration does not occur. Combining this insight with modern variance reduction techniques yields a simple version of power iteration with momentum that achieves the optimal iteration complexities in both the online and offline settings. Our methods are embarrassingly parallel and can produce wall-clock-time speedups. Our approach is very general and applies to many non-convex optimization problems that can now be accelerated using the same technique.
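
In the deterministic setting, the momentum idea amounts to a heavy-ball recurrence on the power method, x_{t+1} = A x_t − β x_{t−1}. A minimal sketch (the matrix and β value are illustrative choices, not the paper's stochastic algorithm):

```python
import numpy as np

def power_iteration_momentum(A, beta, iters=200, seed=0):
    """Deterministic power iteration with a momentum term:
    x_{t+1} = A x_t - beta * x_{t-1}, renormalized each step so the
    two-term recurrence stays numerically bounded."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    x_prev = np.zeros(n)
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)
    for _ in range(iters):
        x_next = A @ x - beta * x_prev
        norm = np.linalg.norm(x_next)
        x_prev = x / norm   # rescale the lagged iterate by the same factor
        x = x_next / norm
    return x

# Toy symmetric matrix: the iterate should align with the top eigenvector e1.
# beta = (lambda_2 / 2)^2 is the classic momentum choice for eigengap lambda_1 - lambda_2.
A = np.diag([3.0, 1.0, 0.5])
v = power_iteration_momentum(A, beta=(1.0 / 2) ** 2)
print(np.abs(v))  # ≈ [1, 0, 0]
```

Per eigencomponent the recurrence grows like the larger root of r² = λr − β, so the dominant eigendirection pulls ahead faster than in plain power iteration.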

Paroma Varma, Bryan He, Payal Bajaj, Nishith Khandwala, Imon Banerjee, Daniel Rubin, Christopher Ré
Advances in Neural Information Processing Systems (NeurIPS), 2017.
Obtaining enough labeled data to robustly train complex discriminative models is a major bottleneck in the machine learning pipeline. A popular solution is combining multiple sources of weak supervision using generative models. The structure of these models affects the quality of the training labels, but is difficult to learn without any ground truth labels. We instead rely on weak supervision sources having some structure by virtue of being encoded programmatically. We present Coral, a paradigm that infers generative model structure by statically analyzing the code for these heuristics, thus significantly reducing the amount of data required to learn structure. We prove that Coral's sample complexity scales quasilinearly with the number of heuristics and the number of relations identified, improving over the standard sample complexity, which is exponential in n for learning nth-degree relations. Empirically, Coral matches or outperforms traditional structure learning approaches by up to 3.81 F1 points. Using Coral to model dependencies instead of assuming independence results in better performance than a fully supervised model by 3.07 accuracy points when heuristics are used to label radiology data without ground truth labels.

Stephen H. Bach, Bryan He, Alexander Ratner, Christopher Ré
Proceedings of the 34th International Conference on Machine Learning, 2017.
Curating labeled training data has become the primary bottleneck in machine learning. Recent frameworks address this bottleneck with generative models to synthesize labels at scale from weak supervision sources. The generative model’s dependency structure directly affects the quality of the estimated labels, but selecting a structure automatically without any labeled data is a distinct challenge. We propose a structure estimation method that maximizes the l1-regularized marginal pseudolikelihood of the observed data. Our analysis shows that the amount of unlabeled data required to identify the true structure scales sublinearly in the number of possible dependencies for a broad class of models. Simulations show that our method is 100x faster than a maximum likelihood approach and selects 1/4 as many extraneous dependencies. We also show that our method provides an average of 1.5 F1 points of improvement over existing, user-developed information extraction applications on real-world data such as PubMed journal abstracts.

Bryan He, Christopher De Sa, Ioannis Mitliagkas, Christopher Ré
Advances in Neural Information Processing Systems (NeurIPS), 2016.
Gibbs sampling is a Markov Chain Monte Carlo sampling technique that iteratively samples variables from their conditional distributions. There are two common scan orders for the variables: random scan and systematic scan. Due to the benefits of locality in hardware, systematic scan is commonly used, even though most statistical guarantees are only for random scan. While it has been conjectured that the mixing times of random scan and systematic scan do not differ by more than a logarithmic factor, we show by counterexample that this is not the case, and we prove that the mixing times do not differ by more than a polynomial factor under mild conditions. To prove these relative bounds, we introduce a method of augmenting the state space to study systematic scan using conductance.
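
The two scan orders are easy to see side by side on a toy model. A minimal sketch of Gibbs sampling on a small ±1 Ising chain (the model, size, and coupling are illustrative, not taken from the paper's counterexamples):

```python
import numpy as np

def gibbs_ising_chain(n_vars=5, coupling=0.5, sweeps=2000, scan="systematic", seed=0):
    """Gibbs sampling on a ±1 Ising chain with nearest-neighbour coupling.
    scan="systematic" updates variables in the fixed order 0..n-1 each sweep;
    scan="random" draws a uniformly random variable for each update."""
    rng = np.random.default_rng(seed)
    x = rng.choice([-1, 1], size=n_vars)
    mags = []
    for _ in range(sweeps):
        order = (range(n_vars) if scan == "systematic"
                 else rng.integers(0, n_vars, size=n_vars))
        for i in order:
            # local field from the (at most two) neighbours of site i
            h = coupling * (x[i - 1] if i > 0 else 0) + \
                coupling * (x[i + 1] if i < n_vars - 1 else 0)
            p_up = 1.0 / (1.0 + np.exp(-2.0 * h))   # P(x_i = +1 | rest)
            x[i] = 1 if rng.random() < p_up else -1
        mags.append(x.mean())
    return np.array(mags)

# Both scan orders target the same stationary distribution; by symmetry the
# long-run mean magnetization is near zero for either.
for scan in ("systematic", "random"):
    print(scan, gibbs_ising_chain(scan=scan)[500:].mean())
```

Both chains share the stationary distribution; the paper's question is how differently fast they reach it.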

Bryan He, Mosalam Ebrahimi, Leon Palafox, Lakshminarayan Srinivasan
Journal of Neural Engineering, 2016.
Objective/Approach. A growing number of prototypes for diagnosing and treating neurological and psychiatric diseases are predicated on access to high-quality brain signals, which typically requires surgically opening the skull. Where endovascular navigation previously transformed the treatment of cerebral vascular malformations, we now show that it can provide access to brain signals with substantially higher signal quality than scalp recordings. Main results. While endovascular signals were known to be larger in amplitude than scalp signals, our analysis in rabbits borrows a standard technique from communication theory to show that endovascular signals also have up to 100× better signal-to-noise ratio. Significance. With a viable minimally invasive path to high-quality brain signals, patients with brain diseases could one day receive potent electroceuticals through the bloodstream, in the course of a brief outpatient procedure.

Bryan He, Yisong Yue
Advances in Neural Information Processing Systems (NeurIPS), 2015.
Interactive submodular set cover is an interactive variant of submodular set cover over a hypothesis class of submodular functions, where the goal is to satisfy all sufficiently plausible submodular functions to a target threshold using as few (cost-weighted) actions as possible. It models settings where there is uncertainty regarding which submodular function to optimize. In this paper, we propose a new extension, which we call smooth interactive submodular set cover, that allows the target threshold to vary depending on the plausibility of each hypothesis. We present the first algorithm for this more general setting with theoretical guarantees on optimality. We further show how to extend our approach to deal with real-valued functions, which yields new theoretical results for real-valued submodular set cover in both the interactive and non-interactive settings.
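
For orientation, the non-interactive base problem has a classic greedy solution: keep taking the action with the best marginal gain per unit cost until the function reaches the threshold. A minimal sketch with a coverage objective on a toy instance (the instance and helper names are made up for illustration):

```python
def greedy_submodular_cover(action_items, costs, threshold):
    """Greedy algorithm for (non-interactive) submodular set cover with a
    coverage objective f(S) = |union of covered items|: repeatedly pick the
    action with the best marginal-gain-per-cost ratio until f(S) >= threshold.
    This yields the standard logarithmic approximation to the cheapest cover."""
    chosen, covered = [], set()
    while len(covered) < threshold:
        best, best_ratio = None, 0.0
        for a, items in action_items.items():
            if a in chosen:
                continue
            gain = len(items - covered)          # marginal coverage gain
            if gain > 0 and gain / costs[a] > best_ratio:
                best, best_ratio = a, gain / costs[a]
        if best is None:                         # threshold unreachable
            return None
        chosen.append(best)
        covered |= action_items[best]
    return chosen, covered

# Toy instance: actions cover subsets of items 0..5
actions = {"a": {0, 1, 2}, "b": {2, 3}, "c": {3, 4, 5}, "d": {0, 5}}
costs = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 2.0}
print(greedy_submodular_cover(actions, costs, threshold=6))  # (['a', 'c'], {0, 1, 2, 3, 4, 5})
```

The interactive and smooth variants in the paper generalize this loop to a whole hypothesis class of submodular functions with hypothesis-dependent thresholds.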

Bryan He, Alexander Wein, Lav Varshney, Julius Kusuma, Andrew Richardson, Lakshminarayan Srinivasan
Journal of Neurophysiology, 2015.
Efficient spike acquisition techniques are needed to bridge the divide from creating large multielectrode arrays (MEAs) to achieving whole-cortex electrophysiology. In this paper, we introduce generalized analog thresholding (gAT), which achieves millisecond temporal resolution with sampling rates as low as 10 Hz. Consider the torrent of data from a single 1,000-channel MEA, which would generate more than 3 GB/min using standard 30-kHz Nyquist sampling. Recent neural signal processing methods based on compressive sensing still require Nyquist sampling as a first step and use iterative methods to reconstruct spikes. Analog thresholding (AT) remains the best existing alternative, where spike waveforms are passed through an analog comparator and sampled at 1 kHz, with instant spike reconstruction. By generalizing AT, the new method reduces sampling rates by another order of magnitude, detects more than one spike per interval, and reconstructs spike width. Unlike compressive sensing, the new method admits a simple closed-form solution that achieves instant (non-iterative) spike reconstruction. The base method is already robust to hardware nonidealities, including realistic quantization error and integration noise. Because it achieves these considerable specifications using hardware-friendly components such as integrators and comparators, generalized AT could translate large-scale MEAs into implantable devices for scientific investigation and medical technology.
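
The comparator-plus-integrator idea admits closed-form readout. A toy illustration (not the published gAT method; the signal, threshold, and single-spike assumption are invented for this sketch): integrate the binary comparator output and its time-weighted moment over one frame, then recover spike width and center without iteration.

```python
import numpy as np

def reconstruct_spike(t, signal, threshold):
    """Toy comparator-plus-integrator spike readout (illustrative only):
    for a single above-threshold event per frame,
        width  = ∫ c(t) dt            (integrator of comparator output c)
        center = ∫ t c(t) dt / width  (time-weighted moment integrator)
    Both come from two low-rate samples, in closed form, with no iteration."""
    dt = t[1] - t[0]
    c = (signal > threshold).astype(float)   # comparator output, 0 or 1
    area = c.sum() * dt                      # integrator 1: time above threshold
    moment = (t * c).sum() * dt              # integrator 2: time-weighted moment
    if area == 0:
        return None                          # no spike in this frame
    return area, moment / area               # (spike width, spike center)

# Synthetic 10 ms frame with one rectangular "spike" from 3 ms to 5 ms
t = np.arange(0.0, 10.0, 0.001)
sig = np.where((t >= 3.0) & (t < 5.0), 1.0, 0.0)
width, center = reconstruct_spike(t, sig, threshold=0.5)
print(width, center)  # ≈ 2.0 ms wide, centered near 4.0 ms
```

The actual gAT scheme is more general (multiple spikes per interval, hardware nonidealities), but this conveys why a comparator front end lets one sample far below the Nyquist rate of the raw waveform.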

Bryan He, Alexander Wein, Lakshminarayan Srinivasan
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015.
Conventional Nyquist sampling and reconstruction of square waves at a finite rate will always result in aliasing because square waves are not band-limited. Based on methods for signals with a finite rate of innovation (FRI), generalized Analog Thresholding (gATn) is able to sample square waves at a much lower rate under ideal conditions. The target application is efficient, real-time, implantable neurotechnology that extracts spiking neural signals from the brain. This paper studies the effect of integrator noise and quantization error on the accuracy of reconstructed square waves. We explore realistic values for integrator noise and input signal amplitude, using specifications from the Texas Instruments IVC102 integrator chip as a first-pass example because of its readily available data sheet. ADC resolution is varied from 1 to 16 bits. This analysis indicates that gAT1 is robust against these hardware nonidealities whereas gAT2 degrades less gracefully, which makes gAT1 a prime target for hardware implementation in a custom integrated circuit.

Bryan He
Theoretical Computer Science, 2014.
Mosaic floorplans are rectangular structures subdivided into smaller rectangular sections and are widely used in VLSI circuit design. Baxter permutations are a set of permutations that have been shown to have a one-to-one correspondence with objects in the Baxter combinatorial family, which includes mosaic floorplans. An important problem in this area is to find short binary string representations of the set of n-block mosaic floorplans and Baxter permutations of length n. The best known representation is the Quarter-State Sequence, which uses 4n bits. This paper introduces a simple binary representation of n-block mosaic floorplans using 3n − 3 bits. It has been shown that any binary representation of n-block mosaic floorplans must use at least 3n − o(n) bits. Therefore, the representation presented in this paper is optimal (up to an additive lower-order term).
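
The counting behind the 3n − o(n) lower bound is concrete: by the bijection, n-block mosaic floorplans are counted by the Baxter numbers, and representing one of B(n) objects needs at least log2 B(n) bits, which approaches 3n. A short sketch using the standard closed-form sum for Baxter numbers:

```python
from math import comb, log2

def baxter(n):
    """n-th Baxter number: the count of Baxter permutations of length n,
    and (by bijection) of n-block mosaic floorplans."""
    denom = comb(n + 1, 1) * comb(n + 1, 2)
    return sum(comb(n + 1, k - 1) * comb(n + 1, k) * comb(n + 1, k + 1)
               for k in range(1, n + 1)) // denom

print([baxter(n) for n in range(1, 7)])  # [1, 2, 6, 22, 92, 422]

# Information-theoretic floor: log2(B(n)) / n bits per block, tending to 3
print(log2(baxter(40)) / 40)
```

Since log2 B(n)/n → 3, a 3n − 3 bit encoding is optimal up to the additive lower-order term stated above.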

Bryan He, Lakshminarayan Srinivasan
Information Theory and Applications Workshop (ITA), 2014.
The prototypical brain-computer interface (BCI) algorithm translates brain activity into changes in the states of a computer program, for typing or cursor movement. Most approaches use neural decoding, which learns how the user has encoded their intent in their noisy neural signals. Recent adaptive decoders for cursor movement improved BCI performance by modeling the user as a feedback controller; when this model accounts for adaptive control, the neural decoder is termed co-adaptive. This recent collection of control-inspired neural decoding strategies disregards a major antecedent conceptual realization, whereby the user could be induced to adopt an encoding strategy (control policy) such that the encoder-decoder pair (or equivalently, controller-plant pair) is optimal under a desired cost function. We call this alternate conceptual approach neural shaping, in contradistinction to neural decoding. Previous work illuminated the general form of the optimal controller-plant pair under a cost representing information gain. For BCI applications requiring the user to issue discrete-valued commands, the information-gain-optimal pair, based on the posterior matching scheme, can be user-friendly. In this paper, we discuss the application of neural shaping to cursor control with continuous-valued states based on continuous-valued user commands. We examine the problem of jointly optimizing the controller and plant under quadratic expected cost and restricted linear plant dynamics. This simplification reduces joint controller-plant selection to a static optimization problem, similar to approaches in structural engineering and other areas. This perspective suggests that recent BCI approaches that alternate between adaptive neural decoders and static neural decoders could be locally Pareto-optimal, representing a suboptimal iterative-type solution to the optimal joint controller-plant problem.

Kevin Kowalski, Bryan He, Lakshminarayan Srinivasan
Neural Computation, 2013.
The closed-loop operation of brain-machine interfaces (BMIs) provides a context to discover foundational principles behind human-computer interaction, with emerging clinical applications to stroke, neuromuscular diseases, and trauma. In the canonical BMI, a user controls a prosthetic limb through neural signals that are recorded by electrodes and processed by a decoder into limb movements. In laboratory demonstrations with able-bodied test subjects, parameters of the decoder are commonly tuned using training data that include neural signals and corresponding overt arm movements. In the application of BMIs to paralysis or amputation, arm movements are not feasible, and imagined movements create weaker, partially unrelated patterns of neural activity. BMI training must therefore begin naive, without access to these prototypical methods for parameter initialization used in most laboratory BMI demonstrations.
Naive adaptive BMIs refer to a class of methods recently introduced to address this problem. We first identify the basic elements of existing approaches based on adaptive filtering and define a decoder, ReFIT-PPF, to represent these existing approaches. We then present Joint RSE, a novel approach that logically extends prior approaches. Using recently developed human- and synthetic-subjects closed-loop BMI simulation platforms, we show that Joint RSE significantly outperforms ReFIT-PPF and non-adaptive (static) decoders. Control experiments demonstrate the critical role of jointly estimating neural parameters and user intent. In addition, we show that nonzero sensorimotor delay in the user significantly degrades ReFIT-PPF but not Joint RSE, owing to differences in the prior on intended velocity. Paradoxically, substantial differences in the nature of sensory feedback between these methods do not contribute to differences in performance between Joint RSE and ReFIT-PPF. Instead, BMI performance improvement is driven by machine learning, which outpaces the rate of human learning in the human-subjects simulation platform. In this regime, nuances of error-related feedback to the human user are less relevant to rapid BMI mastery.

Bryan He
Frontiers in Algorithmics and Algorithmic Aspects in Information and Management, May 2012.
A floorplan is a rectangle subdivided into smaller rectangular blocks by horizontal and vertical line segments. Two floorplans are considered equivalent if and only if there is a bijection between the blocks in the two floorplans such that the corresponding blocks have the same horizontal and vertical boundaries. Mosaic floorplans use the same objects as floorplans but use an alternative definition of equivalence. Two mosaic floorplans are considered equivalent if and only if they can be converted into equivalent floorplans by sliding the line segments that divide the blocks. The QuarterState Sequence method of representing mosaic floorplans uses 4n bits, where n is the number of blocks. This paper introduces a method of representing an nblock mosaic floorplan with a (3n − 3)bit binary string. It has been proven that the shortest possible binary string representation of a mosaic floorplan has a length of (3n − o(n)) bits. Therefore, the representation presented in this paper is asymptotically optimal. Baxter permutations are a set of permutations defined by prohibited subsequences. There exists a bijection between mosaic floorplans and Baxter permutations. As a result, the methods introduced in this paper also create an optimal binary string representation of Baxter permutations.