Neural style transfer is the use of neural networks to transfer the style of an input image – e.g. a famous painting – to another input image – e.g. a backyard photograph.
Researchers have proposed a variety of techniques for performing style transfer, but which one works best? There is no right answer to that question; viewers’ opinions differ. In the results reported in previous style transfer articles, the most preferred methods rarely receive more than two-thirds of reviewers’ votes, while the least preferred methods rarely receive less than 5%.
In a paper we presented at this year’s meeting of the Association for the Advancement of Artificial Intelligence (AAAI), my colleagues and I describe a new style transfer model that can output multiple options, controlled by a model parameter that the user chooses.
We show that most previous approaches to style transfer can be rewritten in a standardized form, which we call the assign-and-mix model. The “assignment” step of the model involves an assignment matrix, which maps features in one input image to features in the other. In the paper, we show that the differences between style transfer techniques generally come down to the entropy of the assignment matrix, or the diversity of the matrix’s values.
Finally, we show that, given a user-specified setting of the input parameter, an algorithm called Sinkhorn-Knopp can efficiently compute the associated assignment matrix, enabling a diversity of outputs from the same style transfer model.
In a series of experiments, we compared our approach with its predecessors. We found that, by standard metrics, our method did a better job of preserving the content of the content input and the style of the style input, and it produced more diverse outputs. We also conducted a study with 10 human evaluators and found that, at a certain setting of our diversity parameter, images generated by our method were preferred to those produced by other methods.
Assign and mix
In style transfer, the first step is to pass both the content example and the style example to the same visual encoder, which is typically pretrained on a broad object recognition task. The encoder produces a representation of each image, where each image region has an associated feature vector.
The feature vectors will typically encode visual information – about, e.g., colors and orientations of gradients – but also semantic information – indicating, for example, that a certain image region depicts part of an eye.
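As a concrete illustration of per-region features, the toy sketch below splits an image into patches and describes each patch with simple statistics (mean color and average gradient magnitudes). This is our own stand-in, not the paper's pipeline: a real style transfer system would instead use the activations of a pretrained CNN encoder, such as VGG, which also capture semantic information.

```python
import numpy as np

def toy_region_features(image, patch=4):
    """Stand-in for a pretrained encoder: split an H x W x 3 image into
    non-overlapping patches and describe each with simple statistics.
    A real pipeline would use pretrained CNN activations instead."""
    H, W, _ = image.shape
    feats = []
    for i in range(0, H - patch + 1, patch):
        for j in range(0, W - patch + 1, patch):
            p = image[i:i + patch, j:j + patch]
            mean_color = p.mean(axis=(0, 1))        # 3 values: mean R, G, B
            gy = np.abs(np.diff(p, axis=0)).mean()  # average vertical gradient
            gx = np.abs(np.diff(p, axis=1)).mean()  # average horizontal gradient
            feats.append(np.concatenate([mean_color, [gy, gx]]))
    return np.stack(feats)  # one 5-dimensional feature vector per region

rng = np.random.default_rng(0)
features = toy_region_features(rng.random((8, 8, 3)))  # 4 regions of an 8x8 image
```

Each row of the result plays the role of the region feature vector described above.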
Style transfer typically involves (1) reshuffling elements of the style image to reflect the contents of the content image, (2) warping the content image so that its overall statistics resemble those of the style image, or (3) a combination of the two. We assimilate all such approaches to the assign-and-mix model.
The “assign” step in assign-and-mix corresponds to approach (1). It involves the assignment matrix, which assigns feature vectors from the style representation to regions of a new image, controlled by the content representation. Although previous style transfer approaches use a variety of techniques to find correspondences between style and content features, we analyze several of them in the paper and show that they can often be assimilated to the assign-and-mix model.
The assignment for a particular point in the new image can be a single vector from the style encoding, or it can be a weighted combination of vectors. In the first case, the assignment matrix is binary: each matrix entry is either 0 or 1. This is a minimum-entropy assignment.
In contrast, if each point in the new image consists of a weighted combination of all the vectors in the style encoding, the assignment matrix has higher entropy. There are existing style transfer methods with binary assignment matrices, and there are existing approaches with high-entropy matrices; our method can approximate both.
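One simple way to see how a single parameter can sweep between these two regimes is a temperature-scaled softmax over feature similarities. This is an illustrative sketch of the idea, not the paper's exact construction: a low temperature drives each row of the assignment matrix toward one-hot (low entropy), while a high temperature spreads weight across all style vectors (high entropy).

```python
import numpy as np

def assignment_matrix(content_feats, style_feats, temperature):
    """Soft assignment of style feature vectors to content positions.

    Rows index content positions, columns index style positions; each row
    sums to 1. Low temperature -> nearly binary rows (low entropy);
    high temperature -> nearly uniform rows (high entropy)."""
    sim = content_feats @ style_feats.T          # similarity scores
    logits = sim / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

def mean_row_entropy(A, eps=1e-12):
    """Average entropy of the rows of a row-stochastic matrix."""
    return float(-(A * np.log(A + eps)).sum(axis=1).mean())

rng = np.random.default_rng(0)
content = rng.standard_normal((5, 8))  # 5 content regions, 8-dim features
style = rng.standard_normal((7, 8))    # 7 style regions

sharp = assignment_matrix(content, style, temperature=0.01)  # near-binary
soft = assignment_matrix(content, style, temperature=10.0)   # high-entropy
```

At the low-temperature extreme each content region copies a single style vector; at the high-temperature extreme every region mixes all of them.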
After the assignment step, we proceed to the mixing step, which corresponds to approach (2), above. In this step, we review the encoding of the new, synthetic image, and for each image region, we measure the distance between its encoding and the corresponding encoding of the original content example. Then we mix in feature vectors from the original content encoding according to the degree of divergence. This ensures that the new image preserves the content of the original.
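The mixing logic can be sketched numerically as follows. This is our own illustration under a simple assumption (an exponential weighting of the divergence), not the paper's exact formulation: regions whose synthesized features have drifted far from the content encoding are pulled back toward it.

```python
import numpy as np

def mix_with_content(synth_feats, content_feats, scale=1.0):
    """Blend synthesized features back toward the content encoding.

    The farther a region's synthesized vector is from the content vector
    at the same position, the more content is mixed back in.
    (Illustrative weighting; the exact scheme is a design choice.)"""
    dist = np.linalg.norm(synth_feats - content_feats, axis=1, keepdims=True)
    alpha = 1.0 - np.exp(-scale * dist)  # in [0, 1): weight on content
    return alpha * content_feats + (1.0 - alpha) * synth_feats
```

Regions that already match the content pass through unchanged; regions that diverge strongly are dominated by the content features, preserving the original content.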
The computational bottleneck in this process is the creation of multiple assignment matrices with different degrees of entropy. However, we show in our paper that the Sinkhorn-Knopp algorithm, which rescales matrices into a standardized form that enables efficient solution, can be applied to the problem of constructing assignment matrices.
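The core of Sinkhorn-Knopp is simple to state: alternately normalize the rows and columns of a positive matrix until it is approximately doubly stochastic (all row sums and column sums equal 1). A minimal sketch of the algorithm itself, separate from how the paper applies it:

```python
import numpy as np

def sinkhorn_knopp(M, iters=200):
    """Rescale a positive square matrix to be approximately doubly
    stochastic by alternately normalizing its rows and columns."""
    A = np.asarray(M, dtype=float).copy()
    for _ in range(iters):
        A /= A.sum(axis=1, keepdims=True)  # make rows sum to 1
        A /= A.sum(axis=0, keepdims=True)  # make columns sum to 1
    return A

rng = np.random.default_rng(1)
A = sinkhorn_knopp(rng.random((4, 4)) + 0.1)  # strictly positive input
```

Because each iteration is just two normalizations, the procedure is cheap enough to run repeatedly, once per requested entropy setting.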
In the paper, we rewrite three previous style transfer methods in the assign-and-mix format. We chose these methods because their assignment matrices cover the full range of entropies. Our method should thus be able to approximate the output of any style transfer model whose assignment matrix entropy falls within that range.