A dictionary-style definition of texture segmentation might read: ``the partitioning of an image into regions, each of which contains a single texture distinct from its neighbors.'' However, this definition does little to explain the inherent difficulties and practical limitations of the problem. To start with, we should consider the two terms ``texture'' and ``segmentation'' separately.
Mathematically, image segmentation is well defined. An image consists of an array of pixels, and we want to assign each pixel a label. A region is a connected group of pixels that share the same label. But what constitutes a ``proper'' region? Ideally, we would want each region to represent a different ``object'' in the image. But what is an object? If a bookshelf is filled with books, do we want to consider each book as a separate object, or do we want the bookshelf and everything in it to be a single object? If we see a computer monitor sitting on a console with a keyboard attached to it, is there one object or three? It is clear, then, that no single segmentation of an image can be considered ``right.'' The ``right'' segmentation exists only in the mind of the observer, and it can differ not only between observers but also within the same observer at different times.
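The formal half of this definition, a label per pixel and regions as connected groups of equal labels, can be made concrete with a short sketch. The function name `extract_regions`, the list-of-lists image representation, and the choice of 4-connectivity are assumptions made here for illustration; the text does not fix a connectivity rule.

```python
from collections import deque

def extract_regions(labels):
    """Split a 2-D label map into regions of 4-connected, equally labeled pixels.

    Returns (region_id, regions): region_id[r][c] is the index of the region
    containing pixel (r, c); regions[k] is (label, list of (row, col) pixels).
    """
    rows, cols = len(labels), len(labels[0])
    region_id = [[-1] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if region_id[r][c] != -1:
                continue  # pixel already belongs to a region
            k = len(regions)
            label = labels[r][c]
            pixels = []
            queue = deque([(r, c)])
            region_id[r][c] = k
            # Breadth-first flood fill over 4-connected neighbors (assumed rule).
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and region_id[ny][nx] == -1
                            and labels[ny][nx] == label):
                        region_id[ny][nx] = k
                        queue.append((ny, nx))
            regions.append((label, pixels))
    return region_id, regions

# Hypothetical 3x3 label map: two labels, but four connected regions.
labels = [[0, 0, 1],
          [1, 0, 1],
          [1, 1, 0]]
region_id, regions = extract_regions(labels)
print(len(regions))
```

The sketch only covers the mechanical part of the definition. Nothing in it decides whether a given labeling is ``proper''; that judgment stays with the observer.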
Even more slippery is the notion of texture. Whereas segmentation at least has a formal definition, texture has nothing comparably precise. A typical definition in the literature is ``one or more basic local patterns that are repeated in a periodic manner.'' However, it is not clear exactly what the pattern is or how it is repeated. It is not even clear whether texture is an inherent property of all things, or whether some objects or regions lack texture altogether.
There are two approaches to defining texture, which may be thought of as ``top-down'' and ``bottom-up.'' Top-down models posit a basic element, called a texel or a texton, together with a placement rule that defines how and where the elements are placed. This definition works well for textures such as bricks or piles of pennies. The bottom-up approach holds that texture is a property that can be derived from the statistics of small groups of pixels, such as the mean and variance. This works better for textures such as quartz and grass, in which it is difficult to see individual elements. The dividing line between the two approaches is by no means clear.
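As a minimal sketch of the bottom-up view, the following computes the mean and variance over a small square neighborhood of each pixel. The function name `local_mean_variance`, the odd window size, and the reflect padding at the image border are assumptions made for this illustration, not part of any particular model described in the text.

```python
import numpy as np

def local_mean_variance(image, size):
    """Mean and variance over a size x size window centered on each pixel.

    `size` is assumed to be odd; the border is handled by reflect padding.
    """
    image = np.asarray(image, dtype=float)
    pad = size // 2
    padded = np.pad(image, pad, mode="reflect")
    # One size x size window per pixel of the original image.
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    return windows.mean(axis=(-2, -1)), windows.var(axis=(-2, -1))

# Hypothetical test image: a flat gray patch next to a fine checkerboard.
flat = np.full((8, 4), 5.0)
checker = (np.indices((8, 4)).sum(axis=0) % 2) * 10.0
image = np.hstack([flat, checker])

mean, var = local_mean_variance(image, 3)
print(var[4, 1], var[4, 6])  # variance inside the flat patch vs. inside the checkerboard
```

In this toy image the two halves have similar average brightness, so the local variance is the statistic that separates them. Real textures generally need richer statistics than these two.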
Putting the two terms back together, we see the difficulties of the task. We wish to partition an image into regions of homogeneous texture, but we cannot always agree on when two texture samples are similar to each other. Furthermore, two different objects with a common boundary may have the same texture and be lumped together, which may or may not be desired. A texture segmentation will invariably break up certain objects that contain multiple textures, and it will group pieces of different objects into a single region.
In addition, there are the practical limitations that come from using a computer to perform a visual task. The most important (but not the only) question is that of modeling. How do we formulate a precise, mathematical definition of a concept that we cannot even define in words? The question of operator size is also important. Smaller operators are susceptible to image noise and variation, and they will inevitably create numerous small regions. Larger operators do a worse job of localizing the boundary between two textures, and they may produce confusing results when they straddle the boundaries between two or three different textures. If operators of different sizes are used on the same image, two further questions arise: how to integrate information at different scales into one output, and how to account for the fact that the scale of the textures in the image may not correspond to any of the scales chosen for the operators.
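The operator-size trade-off can be illustrated with a deliberately simplified one-dimensional sketch. Here two synthetic ``textures'' are noise signals of different variance that meet at a known boundary, and the operator is a sliding variance estimate. The signal, the window sizes 3 and 31, and the helper names are all assumptions chosen for illustration.

```python
import numpy as np

def sliding_variance(signal, size):
    """Variance of every length-`size` window that fits inside the signal."""
    signal = np.asarray(signal, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(signal, size)
    return windows.var(axis=-1)

def straddling_windows(n, boundary, size):
    """Count windows that contain samples from both sides of `boundary`."""
    starts = np.arange(n - size + 1)
    return int(np.sum((starts < boundary) & (starts + size > boundary)))

rng = np.random.default_rng(0)
boundary = 500
# Low-variance noise on the left, high-variance noise on the right.
signal = np.concatenate([rng.normal(0, 1, boundary), rng.normal(0, 3, 500)])

for size in (3, 31):
    var = sliding_variance(signal, size)
    # Spread of the estimate over windows lying entirely in the left texture.
    spread = var[: boundary - size + 1].std()
    mixed = straddling_windows(len(signal), boundary, size)
    print(f"size={size:2d}  spread within one texture={spread:.2f}  "
          f"windows straddling the boundary={mixed}")
```

The small window gives an erratic estimate inside a single texture, while the large window gives a steadier estimate but mixes the two textures over a wider band around the boundary (a window of length `size` straddles it at `size - 1` positions).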
The remainder of this paper is divided into two parts. The first describes the different models and ideas that have been applied to texture, while the second describes techniques used to perform segmentation.