When editing or revising we often write in a non-linear manner.
Writing an email
An existing system might suggest something like “great to me” because it only considers the preceding text but not the subsequent text.
A better suggestion in this case would be something like “good with one exception” since the writer is not completely satisfied and suggesting a further revision.
Writing a novel
When you don’t have a concrete idea on how to connect two scenes, the system can suggest a way to connect the fragmented ideas.
Fill in the blanks?
Consider the following sentence with blanks:
She ate ____ for ____
To fill in the blanks, one needs to consider both preceding and subsequent text (in this case, “She ate” and “for”). There can be many reasonable ways to fill in the blanks:
She ate leftover pasta for lunch
She ate chocolate ice cream for dessert
She ate toast for breakfast before leaving for school
She ate rather quickly for she was in a hurry that evening
The task of filling in the blanks is known as text infilling in the field of Natural Language Processing (NLP). It is the task of predicting blanks (or missing spans) of text at any position in text.
The general definition of text infilling considers text with an arbitrary number of blanks where each blank can represent one of more missing words.
Language modeling is a special case of text infilling where only the preceding text is present and there is only one blank at the end.
She ate leftover pasta for ____
In recent few years, a number of large-scale language models are introduced and shown to achieve human-like performance. These models are often pre-trained on massive amount of unlabeled data, requiring huge amount of computation and resource.
Our goal is to take these existing language models and make them perform the more general task of filling in the blanks.
How can we make a language model fill in the blanks?
Our approach is infilling by language modeling. With this approach, one can simply (1) download an existing pre-trained language model and (2) enable it to fill in any number and length of blanks in text by fine-tuning it on artificially generated examples.
Main advantages of our framework are as follows:
- Conceptual simplicity: Minimal change to standard language model training
- Model-agnostic framework: Leverage massively pre-trained language models
Now, let’s see what happens at training and test time!
1. Manufacture infilling examples
Suppose we have plain text as our data:
Data: She ate leftover pasta for lunch.
To produce an infilling example for given data, first generate input by randomly replacing some tokens in the data with [blank] tokens.
Input: She ate [blank] for [blank].
Then, generate a target by concatenating the replaced tokens, separated by the [answer] token.
Target: leftover pasta [answer] lunch [answer]
Finally, construct the complete infilling example by concatenating input, a special separator token [sep], and target.
Infilling example: She ate [blank] for [blank]. [sep] leftover pasta [answer] lunch [answer]
Looking for a script to automate this step? It is available here!
2. Download your favorite language model
For instance, OpenAI GPT-2.
3. Fine-tune the model on infilling examples
Now, let’s fine-tune the model on the infilling examples using standard language model training methodology.
Once trained, we can use the language model to infill at test time.
As input, the model takes incomplete text with [blank] and generates a target.
Input: He drinks [blank] after [blank].
Target: water [answer] running [answer]
You can then construct the complete text by simply replacing [blank] tokens in the input with predicted answers in the target in a deterministic fashion.
Output: He drinks water after running.
Our framework incurs almost no computational overhead compared to language modeling. This is particularly good when considering models like GPT-2 whose memory usage grows quadratically with sequence length.
Our framework requires minimal change to the vocabulary of existing language models. Specifically, you need three additional tokens: [blank], [answer], and [sep].
Our framework offers the ability to attend to the entire context on both sides of a blank with the simplicity of decoding from language models.
The following is a short story consisting of five sentences. One of the sentences is swapped with a sentence generated by our model. Can you find it?
Q. Identify one of the five sentences generated by machine.
 Patty was excited about having her friends over.  She had been working hard preparing the food.  Patty knew her friends wanted pizza.  All of her friends arrived and were seated at the table.  Patty had a great time with her friends.
Q. Identify one of the five sentences generated by machine.
 Yesterday was Kelly’s first concert.  She was nervous to get on stage.  When she got on stage the band was amazing.  Kelly was then happy.  She couldn’t wait to do it again.
(Answers are at the end of the post.)
In our experiments, we sampled a short story from ROCstories (Mostafazadeh et al., 2016), randomly replaced one of the sentences with a [blank] token, and infilled with a sentence generated by a model. Then, we asked 100 people to identify which of the sentences in a story was machine-generated.
|System||How many people were fooled?||Generated sentence|
|BERT (Devlin et al., 2019)||20%||favoritea “, Mary brightly said.|
|Self-Attention Model (Zhu et al., 2019)||29%||She wasn’t sure she had to go to the store.|
|Standard Language Model (Radford et al., 2019)||41%||She went to check the tv.|
|Infilling by Language Model (Ours)||45%||Patty knew her friends wanted pizza.|
|Human||78%||She also had the place looking spotless.|
The results show that people have difficulty identifying sentences infilled by our model as machine-generated 45% of the time. Generated sentences in the table are the system outputs for sentence  in the first story of the Turing test.
More experiments and analysis can be found in our paper.
Try it out!
We have a demo where you can explore the infilling functionality for multiple variable-length spans and different granularities (e.g. words, phrases, and sentences) on the domains of short stories, scientific abstracts, and song lyrics!
- Chris Donahue firstname.lastname@example.org
- Mina Lee email@example.com
- Percy Liang firstname.lastname@example.org
Answers:  and