Documentation¶

While this documentation aims to go beyond a simple listing of parameters and instead attempts to explain some of the principles behind the functions, please see the section “Usage” for more details and usage examples including code and flow field visualisations.

“Pure PyTorch” Setting¶

PURE_PYTORCH is a toolbox-wide boolean variable that determines which interpolation method oflibpytorch uses: a more accurate, but much slower and non-differentiable SciPy-based function, or a less precise, but significantly faster and fully differentiable PyTorch-only function. By default, PURE_PYTORCH is set to True.

oflibpytorch.get_pure_pytorch()¶: Returns the state of the toolbox-wide boolean variable PURE_PYTORCH. If True, a PyTorch-only method replaces scipy.interpolate.griddata(). The latter, while significantly slower and not differentiable, provides a more accurate result.

oflibpytorch.set_pure_pytorch(warn: Optional[bool] = None)¶

Set the state of the toolbox-wide boolean variable PURE_PYTORCH to True. This means a faster PyTorch-only method is used instead of scipy.interpolate.griddata(), affording significant speed increases (an order of magnitude). It also means all main methods that output a tensor are differentiable in the PyTorch context. However, the results are less accurate.

Parameters: warn – Boolean determining whether a warning is printed in console. Useful for debugging, defaults to False

oflibpytorch.unset_pure_pytorch(warn: Optional[bool] = None)¶

Set the state of the toolbox-wide boolean variable PURE_PYTORCH to False. This means scipy.interpolate.griddata() is used instead of a faster PyTorch-only method. The results will be more accurate, but the computation will be significantly slower (an order of magnitude). Most importantly, not all methods will be differentiable anymore.

Parameters: warn – Boolean determining whether a warning is printed in console. Useful for debugging, defaults to False

Using the Flow Class¶

This section documents the custom flow class and all its class methods. It is the recommended way of using oflibpytorch and makes the full range of functionality available to the user.

Flow Constructors and Operators¶

class oflibpytorch.Flow(flow_vectors: Union[numpy.ndarray, torch.Tensor], ref: Optional[str] = None, mask: Optional[Union[numpy.ndarray, torch.Tensor]] = None, device: Optional[Union[torch.device, int, str]] = None)¶

__init__(flow_vectors: Union[numpy.ndarray, torch.Tensor], ref: Optional[str] = None, mask: Optional[Union[numpy.ndarray, torch.Tensor]] = None, device: Optional[Union[torch.device, int, str]] = None)¶

Flow object constructor. For a more detailed explanation of the arguments, see the class attributes vecs, ref, mask, and device.

Parameters

flow_vectors – Numpy array or pytorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\), throwing a ValueError if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention: flow_vectors[..., 0] are the horizontal, flow_vectors[..., 1] are the vertical vector components, defined as positive when pointing to the right / down.
ref – Flow reference, either t for “target”, or s for “source”. Defaults to t
mask – Numpy array or pytorch tensor of shape \((H, W)\) containing a boolean mask indicating where the flow vectors are valid. Defaults to True everywhere
device – Tensor device, either a torch.device or a valid input to torch.device(), such as a string (cpu or cuda). For a device of type cuda, the device index defaults to torch.cuda.current_device(). If the input is None, it defaults to the device of the given flow_vectors, or torch.device('cpu') if the flow_vectors are a numpy array
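The shape interpretation rule for flow_vectors can be illustrated with a minimal NumPy sketch. The helper name to_n2hw is hypothetical and this is not the toolbox's own code; it only mirrors the order of checks described above (channel-first is tried before channel-last).

```python
import numpy as np

def to_n2hw(flow_vectors):
    """Hypothetical helper: bring an array into shape (N, 2, H, W)."""
    arr = np.asarray(flow_vectors, dtype=np.float32)
    if arr.ndim not in (3, 4):
        raise ValueError("flow vectors need 3 or 4 dimensions")
    if arr.ndim == 3:
        arr = arr[np.newaxis]          # add a batch dimension of size 1
    if arr.shape[1] == 2:              # (N, 2, H, W): preferred reading
        return arr
    if arr.shape[3] == 2:              # (N, H, W, 2): fallback reading
        return np.moveaxis(arr, 3, 1)
    raise ValueError("no channel dimension of size 2 found")

channel_first = to_n2hw(np.zeros((2, 4, 6)))
channel_last = to_n2hw(np.zeros((3, 4, 6, 2)))
ambiguous = to_n2hw(np.zeros((2, 5, 2)))   # read as (2, H, W) with W = 2
```

Note that an ambiguous input such as shape (2, 5, 2) is read as channel-first under this rule, so flow fields that are only 2 pixels high or wide deserve extra care.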

property vecs¶

Flow vectors, a torch tensor of shape \((N, 2, H, W)\). The first dimension contains the batch size, the second the flow vectors. These are in the order horizontal component first, vertical component second (OpenCV convention). They are defined as positive towards the right and the bottom, meaning the origin is located in the left upper corner of the \(H \times W\) flow field area.

Returns: Flow vectors as torch tensor of shape \((N, 2, H, W)\), dtype float, device self.device

property vecs_numpy¶

Convenience function to get the flow vectors as a numpy array of shape \((N, H, W, 2)\). Otherwise same as vecs: The last dimension contains the flow vectors, in the order of horizontal component first, vertical component second (OpenCV convention). They are defined as positive towards the right and the bottom, meaning the origin is located in the left upper corner of the \(H \times W\) flow field area.

Returns: Flow vectors as a numpy array of shape \((N, H, W, 2)\), dtype float32

property ref¶

Flow reference, a string: either s for “source” or t for “target”. This determines whether the regular grid of shape \((H, W)\) associated with the flow vectors should be understood as the source of the vectors (which then point to any other position), or the target of the vectors (whose start point can then be any other position). The flow reference t is the default, meaning the regular grid refers to the coordinates at which the pixels whose motion is being recorded by the vectors end up.

Applying a flow with reference s is known as “forward” warping, while reference t corresponds to what is termed “backward” or “reverse” warping.
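The difference between the two warping directions can be sketched in one dimension with integer flow values. This is an illustrative NumPy toy, not the toolbox's implementation; the function names are hypothetical. Forward warping scatters each source pixel to its end point, while backward warping gathers each target pixel from its start point.

```python
import numpy as np

def warp_forward(img, flow_s):
    """Reference 's': the vector stored at i starts at i, so img[i] is scattered to i + flow_s[i]."""
    out = np.zeros_like(img)
    hit = np.zeros(img.shape, dtype=bool)   # which output pixels received a value
    for i in range(img.size):
        dst = i + int(flow_s[i])
        if 0 <= dst < img.size:
            out[dst] = img[i]
            hit[dst] = True
    return out, hit

def warp_backward(img, flow_t):
    """Reference 't': the vector stored at i ends at i, so out[i] is gathered from i - flow_t[i]."""
    out = np.zeros_like(img)
    hit = np.zeros(img.shape, dtype=bool)
    for i in range(img.size):
        src = i - int(flow_t[i])
        if 0 <= src < img.size:
            out[i] = img[src]
            hit[i] = True
    return out, hit

img = np.array([10, 20, 30, 40, 50])
shift = np.ones(5, dtype=int)               # uniform translation by one pixel
fwd, fwd_hit = warp_forward(img, shift)
bwd, bwd_hit = warp_backward(img, shift)

stretch = np.arange(5)                      # vectors grow with position
holes, holes_hit = warp_forward(img, stretch)
```

For the uniform shift both directions agree. For the stretching flow, forward warping leaves unfilled output pixels between the scattered values, which is why applying a flow with reference s requires an interpolation step over scattered points.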

Caution

If PURE_PYTORCH is set to False, calling apply() on a flow field with reference ref s (“source”) requires a call to scipy.interpolate.griddata(), which is quite slow. Using a flow field with reference ref t avoids this and will therefore be significantly faster. Similarly, calling track() on a flow field with reference ref t (“target”) also requires a call to scipy.interpolate.griddata(), in which case using a flow field with reference ref s instead is faster.

If PURE_PYTORCH is True, the call to scipy.interpolate.griddata() is replaced with a PyTorch-only interpolation function which will yield slightly less accurate results, but avoids any speed penalty and, most notably, is differentiable.

Tip

If some algorithm get_flow() is set up to calculate a flow field with reference t (or s) as in flow_one_ref = get_flow(img1, img2), it is very simple to obtain the flow in reference s (or t) instead: simply call the algorithm with the images in the reversed order, and multiply the resulting flow vectors by -1: flow_other_ref = -1 * get_flow(img2, img1)

Returns: Flow reference, as string of value t or s

property mask¶

Flow mask as a torch tensor of shape \((N, H, W)\) and type bool. This array indicates, for each flow vector, whether it is considered “valid”. As an example, this allows for masking of the flow based on object segmentations. It is also necessary to keep track of which flow vectors are valid when different flow fields are combined, as those operations often lead to undefined (partially or fully unknown) points in the given \(H \times W\) area where the flow vectors are either completely unknown, or will not have valid values.

Returns: Flow mask as a torch tensor of shape \((N, H, W)\) and type bool

property mask_numpy¶

Convenience function to get the mask as a numpy array of shape \((N, H, W)\). Otherwise same as mask: this array indicates, for each flow vector, whether it is considered “valid”. As an example, this allows for masking of the flow based on object segmentations. It is also necessary to keep track of which flow vectors are valid when different flow fields are combined, as those operations often lead to undefined (partially or fully unknown) points in the given \(H \times W\) area where the flow vectors are either completely unknown, or will not have valid values.

Returns: mask as a numpy array of shape \((N, H, W)\) and type bool

property device¶

The device of all flow object tensors, as a torch.device

Returns: Tensor device as a torch.device

property shape¶

Shape (resolution) \((N, H, W)\) of the flow, corresponding to the batch size (can be 1) and the flow field shape \((H, W)\)

Returns: Tuple of the shape (resolution) \((N, H, W)\) of the flow object

classmethod zero(shape: Union[list, tuple], ref: Optional[str] = None, mask: Optional[Union[numpy.ndarray, torch.Tensor]] = None, device: Optional[Union[torch.device, int, str]] = None) → Flow ¶

Flow object constructor, zero everywhere

Parameters

shape – List or tuple of the shape \((H, W)\) or \((N, H, W)\) of the flow field
ref – Flow reference, string of value t (“target”) or s (“source”). Defaults to t
mask – Numpy array or torch tensor of shape \((H, W)\) or \((N, H, W)\) and type bool indicating where the flow vectors are valid. Defaults to True everywhere
device – Tensor device, either a torch.device or a valid input to torch.device(), such as a string (cpu or cuda). For a device of type cuda, the device index defaults to torch.cuda.current_device(). If the input is None, it defaults to torch.device('cpu')

Returns

Flow object, zero everywhere

classmethod from_matrix(matrix: Union[numpy.ndarray, torch.Tensor], shape: Union[list, tuple], ref: Optional[str] = None, mask: Optional[Union[numpy.ndarray, torch.Tensor]] = None, device: Optional[Union[torch.device, int, str]] = None, matrix_is_inverse: Optional[bool] = None) → Flow ¶

Flow object constructor, based on transformation matrix input

The output flow vectors are differentiable with respect to the input matrix, if given as a tensor.

Parameters

matrix – Transformation matrix to be turned into a flow field, as numpy array or torch tensor of shape \((3, 3)\) or \((N, 3, 3)\)
shape – List or tuple of the shape \((H, W)\) of the flow field
ref – Flow reference, string of value t (“target”) or s (“source”). Defaults to t
mask – Numpy array or torch tensor of shape \((H, W)\) and type bool indicating where the flow vectors are valid. Defaults to True everywhere
device – Tensor device, either a torch.device or a valid input to torch.device(), such as a string (cpu or cuda). For a device of type cuda, the device index defaults to torch.cuda.current_device(). If the input is None, it defaults to torch.device('cpu')
matrix_is_inverse – Boolean determining whether the given matrix is already the inverse of the desired transformation. Is useful for flow with reference t to avoid calculation of the pseudo-inverse, but will throw a ValueError if used for flow with reference s to avoid accidental usage. Defaults to False
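As a rough illustration of how a transformation matrix relates to flow vectors in the two references, the following NumPy sketch computes a single (H, W, 2) field. The function name flow_from_matrix is hypothetical, and the formulas are the standard ones under the assumption that the matrix acts on homogeneous pixel coordinates (x, y, 1); the toolbox's own implementation may differ in detail.

```python
import numpy as np

def flow_from_matrix(matrix, shape, ref="t"):
    """Hypothetical sketch: flow field of shape (H, W, 2) for a 3x3 matrix."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    grid = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])   # (3, H*W)
    if ref == "s":
        # Grid points are start points: vector = M p - p
        moved = matrix @ grid
        vecs = moved[:2] / moved[2] - grid[:2]
    else:
        # Grid points are end points: vector = p - M^-1 p, which needs the inverse
        moved = np.linalg.inv(matrix) @ grid
        vecs = grid[:2] - moved[:2] / moved[2]
    return vecs.T.reshape(h, w, 2)

translation = np.array([[1.0, 0.0, 3.0],
                        [0.0, 1.0, 2.0],
                        [0.0, 0.0, 1.0]])
flow_s = flow_from_matrix(translation, (4, 5), ref="s")
flow_t = flow_from_matrix(translation, (4, 5), ref="t")
```

The reference t branch is the one that needs the inverse matrix, which is consistent with matrix_is_inverse only being accepted for flows with reference t.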

Returns

Flow object

classmethod from_transforms(transform_list: list, shape: Union[list, tuple], ref: Optional[str] = None, mask: Optional[Union[numpy.ndarray, torch.Tensor]] = None, device: Optional[Union[torch.device, int, str]] = None, padding: Optional[list] = None) → Flow ¶

Flow object constructor, based on a list of transforms. If padding values are given, the given shape is padded accordingly. The transform values are also adjusted, e.g. by shifting scaling and rotation centres.

Parameters

transform_list –
List of transforms to be turned into a flow field, where each transform is expressed as a list of [transform name, transform value 1, … , transform value n]. Supported options:
- Transform translation, with values horizontal shift in px, vertical shift in px
- Transform rotation, with values horizontal centre in px, vertical centre in px, angle in degrees, counter-clockwise
- Transform scaling, with values horizontal centre in px, vertical centre in px, scaling fraction
shape – List or tuple of the shape \((H, W)\) of the flow field
ref – Flow reference, string of value t (“target”) or s (“source”). Defaults to t
mask – Numpy array or torch tensor of shape \((H, W)\) and type bool indicating where the flow vectors are valid. Defaults to True everywhere
device – Tensor device, either a torch.device or a valid input to torch.device(), such as a string (cpu or cuda). For a device of type cuda, the device index defaults to torch.cuda.current_device(). If the input is None, it defaults to torch.device('cpu')
padding – List or tuple of shape \((4)\) with padding values [top, bot, left, right]

Returns

Flow object

classmethod from_kitti(path: str, load_valid: Optional[bool] = None, device: Optional[Union[torch.device, int, str]] = None) → Flow ¶

Loads the flow field contained in KITTI uint16 png image files, optionally including the valid pixels. Follows the official instructions on how to read the provided .png files on the KITTI optical flow dataset website.

Parameters

path – String containing the path to the KITTI flow data (uint16, .png file)
load_valid – Boolean determining whether the valid pixels are loaded as the flow mask. Defaults to True
device – Tensor device, either a torch.device or a valid input to torch.device(), such as a string (cpu or cuda). For a device of type cuda, the device index defaults to torch.cuda.current_device(). If the input is None, it defaults to torch.device('cpu')

Returns

A flow object corresponding to the KITTI flow data, with flow reference ref s.
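For orientation, the KITTI encoding itself is simple to decode once the png has been read into an array. The sketch below assumes the image is already available as a uint16 array with channels ordered [u, v, valid] (some image readers return the channels in reverse order) and applies the conversion from the KITTI development kit: subtract 2^15, then divide by 64. The function name decode_kitti is hypothetical.

```python
import numpy as np

def decode_kitti(raw):
    """Hypothetical sketch: raw is a uint16 array (H, W, 3), channels [u, v, valid]."""
    raw = np.asarray(raw)
    vecs = (raw[..., :2].astype(np.float32) - 2 ** 15) / 64.0
    valid = raw[..., 2] > 0
    return vecs, valid

# Round trip on a tiny hand-made 1 x 2 flow field
flow = np.array([[[1.5, -2.0], [0.0, 3.25]]], dtype=np.float32)
raw = np.zeros((1, 2, 3), dtype=np.uint16)
raw[..., :2] = np.round(flow * 64 + 2 ** 15).astype(np.uint16)
raw[..., 2] = [[1, 0]]                      # second pixel marked invalid
vecs, valid = decode_kitti(raw)
```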

classmethod from_sintel(path: str, inv_path: Optional[str] = None, device: Optional[Union[torch.device, int, str]] = None) → Flow ¶

Loads the flow field contained in Sintel .flo byte files, including the invalid pixels if required. Follows the official instructions provided alongside the .flo data on the Sintel optical flow dataset website.

Parameters

path – String containing the path to the Sintel flow data (.flo byte file, little Endian)
inv_path – String containing the path to the Sintel invalid pixel data (.png, black and white)
device – Tensor device, either a torch.device or a valid input to torch.device(), such as a string (cpu or cuda). For a device of type cuda, the device index defaults to torch.cuda.current_device(). If the input is None, it defaults to torch.device('cpu')

Returns

A flow object corresponding to the Sintel flow data, with flow reference ref s
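The .flo layout can be sketched as follows: a little-endian float32 magic number 202021.25, the width and height as int32, then the interleaved horizontal and vertical components as float32 in row-major order. The parser below works on raw bytes and is a hypothetical illustration (parse_flo is not a toolbox function), not the toolbox's loader.

```python
import struct
import numpy as np

FLO_MAGIC = 202021.25   # sanity-check value at the start of every .flo file

def parse_flo(data):
    """Hypothetical sketch: parse .flo bytes into an (H, W, 2) float32 array."""
    magic, width, height = struct.unpack("<fii", data[:12])
    if magic != FLO_MAGIC:
        raise ValueError("invalid .flo magic number")
    vecs = np.frombuffer(data, dtype="<f4", offset=12, count=2 * width * height)
    return vecs.reshape(height, width, 2)

# Build a tiny 2 x 3 example in memory instead of reading a file
data = struct.pack("<fii", FLO_MAGIC, 3, 2) + np.arange(12, dtype="<f4").tobytes()
flo = parse_flo(data)
```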

copy() → Flow ¶

Copy a flow object by constructing a new one with the same vectors vecs, reference ref, mask mask, and device device

The output flow vectors are differentiable with respect to the input flow vectors.

Returns: Copy of the flow object

to_device(device) → Flow ¶

Returns a new flow object on the desired torch device

The output flow vectors are differentiable with respect to the input flow vectors.

Parameters: device – Tensor device, either a torch.device or a valid input to torch.device(), such as a string (cpu or cuda). For a device of type cuda, the device index defaults to torch.cuda.current_device(). If the input is None, it defaults to torch.device('cpu')
Returns: New flow object on the desired torch device

__str__() → str¶

Enhanced string representation of the flow object, containing the flow reference ref, shape shape, and device device

Returns: String representation

select(item: Optional[int] = None) → Flow ¶

Returns a single-item flow object from a batched flow object, e.g. for iterating through or visualising

The output flow vectors are differentiable with respect to the input flow vectors.

Parameters: item – Element in batch to be selected, as an integer. Defaults to None, which returns the whole flow object
Returns: Same flow object if input is None, else new flow object with batch size \(N\) of 1

__getitem__(item: Union[int, list, slice]) → Flow ¶

Mimics __getitem__ of a torch tensor, returning a new flow object cut accordingly

The output flow vectors are differentiable with respect to the input flow vectors.

Will throw an error if mask.__getitem__(item) or vecs.__getitem__(item) (corresponding to mask[item] and vecs[item]) throw an error. Also throws an error if sliced vecs or mask are not suitable to construct a new flow object with, e.g. if the number of dimensions is too low.

Parameters: item – Slice used to select a part of the flow
Returns: New flow object cut as a corresponding torch tensor would be cut

__add__(other: Union[numpy.ndarray, torch.Tensor, Flow]) → Flow ¶

Adds a flow object, a numpy array, or a torch tensor to a flow object

The output flow vectors are differentiable with respect to the input flow vectors.

Caution

This is not equal to applying the two flows sequentially. For that, use combine_flows() with mode set to 3.

Caution

If this method is used to add two flow objects, there is no check on whether they have the same reference ref.

Parameters: other – Flow object, numpy array, or torch tensor corresponding to the addend. Adding a flow object will adjust the mask of the resulting flow object to correspond to the logical union of the augend / addend masks. If a batch dimension is given, it has to match the batch dimension of the flow object, or one of them needs to be 1 in order to be broadcast correctly
Returns: New flow object corresponding to the sum

__sub__(other: Union[numpy.ndarray, torch.Tensor, Flow]) → Flow ¶

Subtracts a flow object, a numpy array, or a torch tensor from a flow object

The output flow vectors are differentiable with respect to the input flow vectors.

Caution

This is not equal to subtracting the effects of applying flow fields to an image. For that, use combine_flows() with mode set to 1 or 2.

Caution

If this method is used to subtract two flow objects, there is no check on whether they have the same reference ref.

Parameters: other – Flow object, numpy array, or torch tensor corresponding to the subtrahend. Subtracting a flow object will adjust the mask of the resulting flow object to correspond to the logical union of the minuend / subtrahend masks. If a batch dimension is given, it has to match the batch dimension of the flow object, or one of them needs to be 1 in order to be broadcast correctly
Returns: New flow object corresponding to the difference

__mul__(other: Union[float, int, bool, list, numpy.ndarray, torch.Tensor]) → Flow ¶

Multiplies a flow object with a single number, a list, a numpy array, or a torch tensor

The output flow vectors are differentiable with respect to the input flow vectors.

Parameters

other –

Multiplier, options:

- a value that can be converted to a float
- a list of shape \((2)\)
- a numpy array or torch tensor of the same shape \((H, W)\) as the flow object
- a numpy array or torch tensor of the same shape \((H, W, 2)\) or \((2, H, W)\) as the flow object
- a numpy array or torch tensor of the same shape \((N, 2, H, W)\) as the flow object

Returns

New flow object corresponding to the product
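How the different multiplier shapes act on vectors stored as \((N, 2, H, W)\) can be pictured with plain NumPy broadcasting. This sketch assumes that the two entries of a list of shape \((2)\) scale the first and second channel respectively, and that an \((H, W)\) array scales both components at each pixel; it is an illustration of the idea rather than the toolbox's code.

```python
import numpy as np

vecs = np.ones((1, 2, 2, 3))                                # (N, 2, H, W)

# List of shape (2): one factor per channel
per_channel = np.array([2.0, 3.0]).reshape(1, 2, 1, 1)
scaled_channels = vecs * per_channel

# Array of shape (H, W): one factor per pixel, shared by both channels
per_pixel = np.arange(6.0).reshape(1, 1, 2, 3)
scaled_pixels = vecs * per_pixel
```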

__truediv__(other: Union[float, int, bool, list, numpy.ndarray, torch.Tensor]) → Flow ¶

Divides a flow object by a single number, a list, a numpy array, or a torch tensor

The output flow vectors are differentiable with respect to the input flow vectors.

Parameters

other –

Divisor, options:

- a value that can be converted to a float
- a list of shape \((2)\)
- a numpy array or torch tensor of the same shape \((H, W)\) as the flow object
- a numpy array or torch tensor of the same shape \((H, W, 2)\) or \((2, H, W)\) as the flow object
- a numpy array or torch tensor of the same shape \((N, 2, H, W)\) as the flow object

Returns

New flow object corresponding to the quotient

__pow__(other: Union[float, int, bool, list, numpy.ndarray, torch.Tensor]) → Flow ¶

Exponentiates a flow object by a single number, a list, a numpy array, or a torch tensor

The output flow vectors are differentiable with respect to the input flow vectors.

Parameters

other –

Exponent, options:

- a value that can be converted to a float
- a list of shape \((2)\)
- a numpy array or torch tensor of the same shape \((H, W)\) as the flow object
- a numpy array or torch tensor of the same shape \((H, W, 2)\) or \((2, H, W)\) as the flow object
- a numpy array or torch tensor of the same shape \((N, 2, H, W)\) as the flow object

Returns

New flow object corresponding to the power

__neg__() → Flow ¶

Returns a new flow object with all the flow vectors inverted

The output flow vectors are differentiable with respect to the input flow vectors.

Caution

This is not equal to inverting the transformation a flow field corresponds to! For that, use invert().

Returns: New flow object with inverted flow vectors

Manipulating the Flow¶

Flow.resize(scale: Union[float, int, list, tuple]) → Flow ¶

Resize a flow field, scaling the flow vectors values vecs accordingly.

The output flow vectors are differentiable with respect to the input flow vectors.

Parameters

scale –

Scale used for resizing, options:

- Integer or float of value scaling applied both vertically and horizontally
- List or tuple of shape \((2)\) with values [vertical scaling, horizontal scaling]

Returns

New flow object scaled as desired
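The reason the vector values have to change on resizing is that they are measured in pixels: if the field becomes three times wider, a horizontal displacement of one pixel becomes three pixels. A minimal NumPy sketch for integer upscaling follows; it uses nearest-neighbour repetition purely for simplicity, which is an assumption and not necessarily the interpolation the toolbox uses, and resize_nearest is a hypothetical name.

```python
import numpy as np

def resize_nearest(vecs, scale):
    """Hypothetical sketch: vecs is (2, H, W), scale an int or [vertical, horizontal] ints."""
    if isinstance(scale, (int, np.integer)):
        scale = [scale, scale]
    sv, sh = scale
    # Enlarge the field itself by repeating rows and columns
    out = np.repeat(np.repeat(vecs, sv, axis=1), sh, axis=2).astype(np.float64)
    out[0] *= sh    # horizontal components grow with the horizontal scale
    out[1] *= sv    # vertical components grow with the vertical scale
    return out

vecs = np.ones((2, 2, 3))
big = resize_nearest(vecs, [2, 3])
```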

Flow.pad(padding: Optional[list] = None, mode: Optional[str] = None) → Flow ¶

Pad the flow with the given padding. Padded flow vecs values are either constant (set to 0), reflect the existing flow values along the edges, or replicate those edge values. Padded mask values are set to False.

The output flow vectors are differentiable with respect to the input flow vectors.

Parameters

padding – List or tuple of shape \((4)\) with padding values [top, bot, left, right]
mode – String of the padding mode for the flow vectors, with options constant (fill value 0), reflect, replicate (see documentation for torch.nn.functional.pad()). Defaults to constant

Returns

New flow object with the padded flow field

Flow.unpad(padding: Optional[list] = None) → Flow ¶

Cuts the flow according to the padding values, effectively undoing the effect of pad()

The output flow vectors are differentiable with respect to the input flow vectors.

Parameters: padding – List or tuple of shape \((4)\) with padding values [top, bot, left, right]
Returns: New flow object, cut according to the padding values
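The relationship between pad() and unpad() with the [top, bot, left, right] convention can be sketched in NumPy for the constant mode. The helper names pad_flow and unpad_flow are hypothetical, and the arrays stand in for the vectors \((2, H, W)\) and mask \((H, W)\) of a single flow field.

```python
import numpy as np

def pad_flow(vecs, mask, padding):
    """Hypothetical sketch: zero-pad the vectors, pad the mask with False."""
    top, bot, left, right = padding
    vecs_p = np.pad(vecs, ((0, 0), (top, bot), (left, right)), mode="constant")
    mask_p = np.pad(mask, ((top, bot), (left, right)), mode="constant",
                    constant_values=False)
    return vecs_p, mask_p

def unpad_flow(vecs, mask, padding):
    """Hypothetical sketch: cut the padded border away again."""
    top, bot, left, right = padding
    h, w = mask.shape
    return vecs[:, top:h - bot, left:w - right], mask[top:h - bot, left:w - right]

vecs = np.ones((2, 3, 4))
mask = np.ones((3, 4), dtype=bool)
padding = [1, 2, 3, 4]
vecs_p, mask_p = pad_flow(vecs, mask, padding)
vecs_u, mask_u = unpad_flow(vecs_p, mask_p, padding)
```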

Flow.invert(ref: Optional[str] = None) → Flow ¶

Inverting a flow: img₁ – f –> img₂ becomes img₁ <– f – img₂. The smaller the input flow, the closer the inverse is to simply multiplying the flow by -1.

If the toolbox-wide variable PURE_PYTORCH is set to True (default, see also set_pure_pytorch()), the output flow field vectors are differentiable with respect to the input flow field vectors.

Parameters: ref – Desired reference of the output field, defaults to the reference of original flow field
Returns: New flow object, inverse of the original
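Why the inverse is generally not the negated flow, and why the two converge for small flows, can be seen in a one-dimensional NumPy sketch for reference s. The helper invert_1d is hypothetical: each vector is moved to its end point and negated, and the result is interpolated back onto the regular grid. It assumes the end points are strictly increasing and cover the grid.

```python
import numpy as np

def invert_1d(flow_s, x):
    """Hypothetical sketch: invert a 1D flow with reference 's' on grid x."""
    end = x + flow_s                   # where each grid point ends up
    # At each end point the inverse vector is -flow; resample onto the grid
    return np.interp(x, end, -flow_s)

x = np.arange(9.0)
large = 0.5 * x                        # strong stretching: x -> 1.5 x
small = 0.01 * x                       # weak stretching:   x -> 1.01 x

inv_large = invert_1d(large, x)
inv_small = invert_1d(small, x)

gap_large = np.abs(inv_large + large).max()   # distance from plain negation
gap_small = np.abs(inv_small + small).max()
```

For the strong stretching the true inverse is -x/3 rather than -x/2, while for the weak stretching the inverse and the negated flow are nearly indistinguishable.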

Flow.switch_ref(mode: Optional[str] = None) → Flow ¶

Switch the reference ref between s (“source”) and t (“target”)

If the toolbox-wide variable PURE_PYTORCH is set to True (default, see also set_pure_pytorch()), the output flow field vectors are differentiable with respect to the input flow field vectors.

Caution

Do not use mode=invalid if avoidable: it does not actually change any flow values, and the resulting flow object, when applied to an image, will no longer yield the correct result.

Parameters

mode –

Mode used for switching, available options:

- invalid: just the flow reference attribute is switched without any flow values being changed. This is functionally equivalent to simply assigning flow.ref = 't' for a “source” flow or flow.ref = 's' for a “target” flow
- valid: the flow field is switched to the other coordinate reference, with flow vectors recalculated accordingly

Returns

New flow object with switched coordinate reference

Flow.combine(other: Flow, mode: int, ref: Optional[str] = None) → Flow ¶

Function that returns the result of the combination of two flow objects of the same shape shape in whichever reference ref required.

If the toolbox-wide variable PURE_PYTORCH is set to True (default, see also set_pure_pytorch()), the output flow field vectors are differentiable with respect to the input flow fields.

Tip

All of the flow field combinations in this function rely on some combination of the apply(), invert(), and switch_ref() methods. If PURE_PYTORCH is set to False, some of these methods will call scipy.interpolate.griddata(), which can be very slow (several seconds) - but the result will be more accurate compared to using the PyTorch-only setting.

All formulas used in this function have been derived from first principles. The base formula is \(flow_1 ⊕ flow_2 = flow_3\), where \(⊕\) is a non-commutative flow composition operation. This can be visualised with the start / end points of the flows as follows:

S = Start point    S1 = S3 ─────── f3 ────────────┐
E = End point       │                             │
f = flow           f1                             v
                    └───> E1 = S2 ── f2 ──> E2 = E3

The main difficulty in combining flow fields is that it would be incorrect to simply add up or subtract flow vectors at one location in the flow field area \(H \times W\). This appears to work given e.g. a translation to the right, and a translation downwards: the result will be the linear combination of the two vectors, or a translation towards the bottom right. However, looking more closely, it becomes evident that this approach isn’t actually correct: A pixel that has been moved from S1 to E1 by the first flow field f1 is then moved from that location by the flow vector of the flow field f2 that corresponds to the new pixel location E1, not the original location S1. If the flow vectors are the same everywhere in the field, the difference will not be noticeable. However, if the flow vectors of f2 vary throughout the field, such as with a rotation around some point, it will!

In this case (corresponding to calling f1.combine(f2, mode=3)), and if the flow reference ref is s (“source”), the solution is to first apply the inverse of f1 to f2, essentially linking up each location E1 back to S1, and then to add up the flow vectors. Analogous observations apply for the other permutations of flow combinations and reference ref values.

Note

This is consistent with the observation that two translations are commutative in their application - the order does not matter, and the vectors can simply be added up at every pixel location -, while a translation followed by a rotation is not the same as a rotation followed by a translation: adding up vectors at each pixel cannot be the correct solution as there wouldn’t be a difference based on the order of vector addition.
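The argument above can be checked numerically with a one-dimensional NumPy sketch for reference s. Instead of literally applying the inverse of f1 to f2, the hypothetical helper compose looks up the second flow at the end point of the first, which expresses the same idea of linking E1 back to S1; it is an illustration under these assumptions, not the toolbox's algorithm.

```python
import numpy as np

def compose(fa, fb, x):
    """Hypothetical sketch: flow of 'fa followed by fb' on grid x, reference 's'."""
    end = x + fa                                   # E1 for every start point S1
    inside = (end >= x[0]) & (end <= x[-1])        # E1 must lie inside the field
    fb_at_end = np.interp(end, x, fb)              # fb vector at E1, not at S1
    return fa + fb_at_end, inside

x = np.arange(8.0)
f1 = np.full_like(x, 2.0)     # translation by +2
f2 = x.copy()                 # scaling about the origin: x -> 2x

f3, valid3 = compose(f1, f2, x)            # translate, then scale: x -> 2x + 4
f3_rev, valid_rev = compose(f2, f1, x)     # scale, then translate: x -> 2x + 2
naive = f1 + f2                            # per-pixel vector sum
```

Here the per-pixel sum happens to match the reverse order (scaling first, then the constant translation), but not the intended order, which shows that vector addition cannot distinguish the two.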

Parameters

other – Flow object to combine with, shape (including batch size) needs to match
mode –
Integer determining how the input flows are combined, where the number corresponds to the position in the formula \(flow_1 ⊕ flow_2 = flow_3\):
- Mode 1: self corresponds to \(flow_2\), other corresponds to \(flow_3\), the result will be \(flow_1\)
- Mode 2: self corresponds to \(flow_1\), other corresponds to \(flow_3\), the result will be \(flow_2\)
- Mode 3: self corresponds to \(flow_1\), other corresponds to \(flow_2\), the result will be \(flow_3\)
ref – Desired output flow reference, defaults to the reference of self

Returns

New flow object

Flow.combine_with(flow: Flow, mode: int, thresholded: Optional[bool] = None) → Flow ¶

Function that returns the result of the combination of two flow objects of the same shape shape and reference ref. If the toolbox-wide variable PURE_PYTORCH is set to True (default, see also set_pure_pytorch()), the output flow field vectors are differentiable with respect to the input flow fields.

Caution

This method will in future be deprecated in favour of combine(), using a more general algorithm that can both combine and output flow objects in any reference frame.

Tip

All of the flow field combinations in this function rely on some combination of the apply() and invert() methods. If PURE_PYTORCH is set to False, and mode is 1 or 2, these methods will call scipy.interpolate.griddata(), which can be very slow (several seconds) - but the result will be more accurate compared to using the PyTorch-only setting.

All formulas used in this function have been derived from first principles. The base formula is \(flow_1 ⊕ flow_2 = flow_3\), where \(⊕\) is a non-commutative flow composition operation. This can be visualised with the start / end points of the flows as follows:

S = Start point    S1 = S3 ─────── f3 ────────────┐
E = End point       │                             │
f = flow           f1                             v
                    └───> E1 = S2 ── f2 ──> E2 = E3

The main difficulty in combining flow fields is that it would be incorrect to simply add up or subtract flow vectors at one location in the flow field area \(H \times W\). This appears to work given e.g. a translation to the right, and a translation downwards: the result will be the linear combination of the two vectors, or a translation towards the bottom right. However, looking more closely, it becomes evident that this approach isn’t actually correct: A pixel that has been moved from S1 to E1 by the first flow field f1 is then moved from that location by the flow vector of the flow field f2 that corresponds to the new pixel location E1, not the original location S1. If the flow vectors are the same everywhere in the field, the difference will not be noticeable. However, if the flow vectors of f2 vary throughout the field, such as with a rotation around some point, it will!

In this case (corresponding to calling f1.combine_with(f2, mode=3)), and if the flow reference ref is s (“source”), the solution is to first apply the inverse of f1 to f2, essentially linking up each location E1 back to S1, and then to add up the flow vectors. Analogous observations apply for the other permutations of flow combinations and reference ref values.

Note

This is consistent with the observation that two translations are commutative in their application - the order does not matter, and the vectors can simply be added up at every pixel location -, while a translation followed by a rotation is not the same as a rotation followed by a translation: adding up vectors at each pixel cannot be the correct solution as there wouldn’t be a difference based on the order of vector addition.

Parameters

flow – Flow object to combine with, shape (including batch size) needs to match
mode –
Integer determining how the input flows are combined, where the number corresponds to the position in the formula \(flow_1 ⊕ flow_2 = flow_3\):
- Mode 1: self corresponds to \(flow_2\), flow corresponds to \(flow_3\), the result will be \(flow_1\)
- Mode 2: self corresponds to \(flow_1\), flow corresponds to \(flow_3\), the result will be \(flow_2\)
- Mode 3: self corresponds to \(flow_1\), flow corresponds to \(flow_2\), the result will be \(flow_3\)
thresholded – Boolean determining whether flows are thresholded during an internal call to is_zero(), defaults to False

Returns

New flow object

oflibpytorch.batch_flows(flows: Union[list, tuple]) → Flow ¶

Returns a batched flow object from a list of input flows of the same size and flow reference ref.

The output flow vectors are differentiable with respect to the input flow vectors.

Parameters: flows – Tuple or list of flow objects. All flow objects need to have the same flow reference ref, as well as the same flow field heights and widths \((H, W)\). They can have any batch size.
Returns: Single batched flow object, with a batch size equal to the sum of all individual input batch sizes
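In terms of the underlying arrays, batching amounts to a concatenation along the batch dimension after checking that the field sizes agree. A minimal NumPy sketch on vector arrays of shape \((N, 2, H, W)\) follows; the helper name batch_arrays is hypothetical, and the reference check is omitted because plain arrays carry no reference.

```python
import numpy as np

def batch_arrays(vecs_list):
    """Hypothetical sketch: stack (N_i, 2, H, W) arrays into (sum N_i, 2, H, W)."""
    sizes = {v.shape[2:] for v in vecs_list}
    if len(sizes) != 1:
        raise ValueError("all flow fields need the same height and width")
    return np.concatenate(vecs_list, axis=0)

a = np.zeros((1, 2, 4, 5))
b = np.ones((3, 2, 4, 5))
batched = batch_arrays([a, b])
```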

Applying the Flow¶

Flow.apply(target: Union[torch.Tensor, Flow], target_mask: Optional[torch.Tensor] = None, return_valid_area: Optional[bool] = None, consider_mask: Optional[bool] = None, padding: Optional[list] = None, cut: Optional[bool] = None) → Union[torch.Tensor, Flow, Tuple[Union[torch.Tensor, Flow], torch.Tensor]]¶

Apply the flow to a target, which can be a torch tensor or a Flow object itself

If PURE_PYTORCH is set to True (default, see also set_pure_pytorch()), the output is differentiable with respect to the flow vectors and the input target, if given as a tensor.

Tip

If PURE_PYTORCH is set to False, calling apply() on a flow field with reference ref s (“source”) requires a call to scipy.interpolate.griddata(), which is quite slow. Using a flow field with reference ref t avoids this and will therefore be significantly faster. If PURE_PYTORCH is True, a flow field with reference ref s will yield less accurate results, but there is no speed penalty - and the output is differentiable.

If the flow shape \((H_{flow}, W_{flow})\) is smaller than the target shape \((H_{target}, W_{target})\), a list of padding values needs to be passed to localise the flow in the larger \(H_{target} \times W_{target}\) area.

The valid image area that can optionally be returned is True where the image values in the function output:

- have been affected by flow vectors. If the flow has a reference ref value of t (“target”), this is always True as the target image by default has a corresponding flow vector at each pixel location in \(H \times W\). If the flow has a reference ref value of s (“source”), this is only True for some parts of the image: some target image pixel locations in \(H \times W\) would only be reachable by flow vectors originating outside of the source image area, which is impossible by definition
- have been affected by flow vectors that were themselves valid, as determined by the flow mask

Caution

The parameter consider_mask relates to whether the invalid flow vectors in a flow field with reference s are removed before application (default behaviour) or not. Doing so results in a smoother flow field, but can cause artefacts to arise where the outline of the area returned by valid_target() is not a convex hull. For a more detailed explanation with an illustrative example, see the section “Applying a Flow” in the usage documentation.

Parameters

target – Torch tensor of shape \((H, W)\), \((C, H, W)\), or \((N, C, H, W)\), or a flow object of shape \((N, H, W)\) to which the flow should be applied, where \(H\) and \(W\) are equal to or larger than the corresponding dimensions of the flow itself
target_mask – Optional torch tensor of shape \((H, W)\) or \((N, H, W)\) and type bool that indicates which part of the target is valid (only relevant if target is not a flow object). Defaults to True everywhere
return_valid_area – Boolean determining whether the valid image area is returned (only if the target is a torch tensor), defaults to False. The valid image area is returned as a boolean torch tensor of shape \((N, H, W)\).
consider_mask – Boolean determining whether the flow vectors are masked before application (only relevant for flows with reference ref = 's'). Results in smoother outputs, but more artefacts. Defaults to True
padding – List or tuple of shape \((4)\) with padding values [top, bottom, left, right]. Required if the flow and the target don’t have the same shape. Defaults to None, which means no padding needed
cut – Boolean determining whether the warped target is returned cut from \((H_{target}, W_{target})\) to \((H_{flow}, W_{flow})\), in the case that the shapes are not the same. Defaults to True

Returns

The warped target of the same shape \((C, H, W)\) or \((N, C, H, W)\) and type as the input (rounded if necessary), except when the input is an integer type and PURE_PYTORCH is True. In that case, outputs should be differentiable and are therefore kept as floats (but are still rounded). Optionally also returns the valid area of the flow as a boolean torch tensor of shape \((N, H, W)\).
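As a rough illustration of how a padding list relates the flow shape to the larger target shape, the plain-Python sketch below checks that a [top, bottom, left, right] list is consistent with both shapes and returns the region of the target that the flow covers (the region that would remain after cutting). The helper name localise_flow is hypothetical and not part of oflibpytorch, and the exact-match check is an assumption made for the sketch.

```python
from typing import Sequence, Tuple


def localise_flow(flow_shape: Sequence[int], target_shape: Sequence[int],
                  padding: Sequence[int]) -> Tuple[slice, slice]:
    """Hypothetical helper: return the (row, column) slices that the flow
    occupies inside the larger target area."""
    top, bottom, left, right = padding
    if min(padding) < 0:
        raise ValueError("padding values must be non-negative")
    h_flow, w_flow = flow_shape
    h_target, w_target = target_shape
    # Assumption: the padded flow area has to match the target area exactly
    if top + h_flow + bottom != h_target or left + w_flow + right != w_target:
        raise ValueError("padding does not match the flow and target shapes")
    return slice(top, top + h_flow), slice(left, left + w_flow)


# A flow of shape (100, 200) located inside a target of shape (120, 230)
rows, cols = localise_flow((100, 200), (120, 230), [5, 15, 10, 20])
print(rows, cols)
```

Indexing a target with these two slices yields an area of the same height and width as the flow field.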

Flow.track(pts: torch.Tensor, int_out: Optional[bool] = None, get_valid_status: Optional[bool] = None) → Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]¶

Warp input points with the flow field, returning the warped point coordinates as integers if required

If PURE_PYTORCH is set to True (default, see also set_pure_pytorch()), the output is differentiable with respect to the flow vectors and the input point coordinates.

Tip

If PURE_PYTORCH is set to False, calling track() on a flow field with reference ref t (“target”) requires a call to scipy.interpolate.griddata(), which is quite slow. Using a flow field with reference ref s avoids this and will therefore be significantly faster. If PURE_PYTORCH is True, a flow field with reference ref t will yield less accurate results (by fractions of pixels), but there is no speed penalty - and the output is differentiable.

Parameters

pts – Torch tensor of shape \((M, 2)\) or \((N, M, 2)\) containing the point coordinates. If a batch dimension is given, it must correspond to the flow batch dimension. If the flow is batched but the points are not, the same points are warped by each flow field individually. pts[:, 0] corresponds to the vertical coordinate, pts[:, 1] to the horizontal coordinate
int_out – Boolean determining whether output points are returned as rounded integers, defaults to False
get_valid_status – Boolean determining whether a tensor of shape \((M)\) or \((N, M)\) is returned, which contains the status of each point. This corresponds to applying valid_source() to the point positions, and returns True for the points that 1) were tracked by valid flow vectors, and 2) end up inside the flow area of \(H \times W\). Defaults to False

Returns

Torch tensor of warped (‘tracked’) points of the same shape as the input, and optionally a torch tensor of the tracking status per point. The tensor device is the same as the tensor device of the flow field
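The coordinate conventions above can be sketched in plain Python for the simplest case, a flow with reference s. Points are given as (vertical, horizontal), while each flow vector is stored as (horizontal, vertical). The function track_points_sketch is hypothetical, uses a nearest-pixel lookup instead of interpolation, and ignores the flow mask, so it only illustrates the principle.

```python
from typing import List, Tuple


def track_points_sketch(flow: List[List[Tuple[float, float]]],
                        pts: List[Tuple[float, float]]):
    """Hypothetical sketch for reference 's': flow[y][x] holds the
    (horizontal, vertical) vector at row y, column x."""
    height, width = len(flow), len(flow[0])
    tracked, valid = [], []
    for y, x in pts:  # points are given as (vertical, horizontal)
        # Nearest-pixel lookup of the flow vector, no interpolation
        u, v = flow[int(round(y))][int(round(x))]
        new_y, new_x = y + v, x + u
        tracked.append((new_y, new_x))
        # Status is True only if the point ends up inside the H x W area
        valid.append(0 <= new_y <= height - 1 and 0 <= new_x <= width - 1)
    return tracked, valid


# Constant flow on a 4 x 5 field: 2 px to the right, 1 px down
flow = [[(2.0, 1.0)] * 5 for _ in range(4)]
tracked, valid = track_points_sketch(flow, [(0, 0), (3, 4)])
print(tracked, valid)
```

The second point is moved outside of the flow area, which is the kind of case the status output is meant to flag.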

Evaluating the Flow¶

Flow.is_zero(thresholded: Optional[bool] = None, masked: Optional[bool] = None) → bool¶

Check whether all flow vectors (where mask is True) are zero. Optionally, a threshold flow magnitude value of 1e-3 is used. This can be useful to filter out motions that are equal to very small fractions of a pixel, which might just be a computational artefact to begin with.

Parameters

thresholded – Boolean determining whether the flow is thresholded, defaults to True
masked – Boolean determining whether the flow is masked with mask, defaults to True

Returns

Tensor matching the batch dimension, containing True for each flow field that is zero everywhere, otherwise False
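A minimal plain-Python sketch of the thresholding and masking logic described above, for a single flow field given as a flat list of vectors. The function name is_zero_sketch and the strict "greater than" comparison against the threshold are assumptions, not taken from the library.

```python
import math
from typing import List, Tuple

THRESHOLD = 1e-3  # threshold flow magnitude named in the description above


def is_zero_sketch(vectors: List[Tuple[float, float]], mask: List[bool],
                   thresholded: bool = True, masked: bool = True) -> bool:
    """Hypothetical sketch: True if every considered vector counts as zero."""
    limit = THRESHOLD if thresholded else 0.0
    for (u, v), is_valid in zip(vectors, mask):
        if masked and not is_valid:
            continue  # vectors outside the mask are ignored
        if math.hypot(u, v) > limit:
            return False
    return True


vectors = [(1e-4, 0.0), (5.0, 0.0)]
mask = [True, False]
print(is_zero_sketch(vectors, mask))                     # large vector is masked out
print(is_zero_sketch(vectors, mask, masked=False))       # large vector is considered
print(is_zero_sketch(vectors, mask, thresholded=False))  # tiny vector is no longer ignored
```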

Flow.matrix(dof: Optional[int] = None, method: Optional[str] = None, masked: Optional[bool] = None) → numpy.ndarray¶

Fit a transformation matrix to the flow field using OpenCV functions

Parameters

dof –
Integer describing the degrees of freedom in the transformation matrix to be fitted, defaults to 8. Options are:
- 4: Partial affine transform with rotation, translation, scaling
- 6: Affine transform with rotation, translation, scaling, shearing
- 8: Projective transform, i.e. estimation of a homography
method –
String describing the method used by OpenCV to fit the transformation matrix, defaults to ransac. Options are:
- lms: Least mean squares
- ransac: RANSAC-based robust method
- lmeds: Least-Median robust method
masked – Boolean determining whether the flow mask is used to ignore flow locations where the mask mask is False. Defaults to True

Returns

Torch tensor of shape \((N, 3, 3)\) and the same device as the flow object, containing the transformation matrix

Flow.valid_target(consider_mask: Optional[bool] = None) → torch.Tensor¶

Find the valid area in the target domain

Given a source image and a flow, both of shape \((H, W)\), the target image is created by warping the source with the flow. The valid area is then a boolean torch tensor of shape \((H, W)\) that is True wherever the value in the target image stems from warping a value from the source, and False where no valid information is known.

Pixels that are False will often be black (or ‘empty’) in the warped target image - but not necessarily, due to warping artefacts etc. The valid area also allows a distinction between pixels that are black due to no actual information being available at this position (validity False), and pixels that are black due to black pixel values having been warped to that (valid) location by the flow.

Parameters: consider_mask – Boolean determining whether the flow vectors are masked before application (only relevant for flows with reference ref = 's', analogous to apply()). Results in smoother outputs, but more artefacts. Defaults to True
Returns: Boolean torch tensor of the same shape \((N, H, W)\) as the flow

Flow.valid_source(consider_mask: Optional[bool] = None) → torch.Tensor¶

Finds the area in the source domain that will end up being valid in the target domain (see valid_target()) after warping

Given a source image and a flow, both of shape \((H, W)\), the target image is created by warping the source with the flow. The source area is then a boolean torch tensor of shape \((H, W)\) that is True wherever the value in the source will end up somewhere inside the valid target area, and False where the value in the source will either be warped outside of the target image, or not be warped at all due to a lack of valid flow vectors connecting to this position.

Parameters: consider_mask – Boolean determining whether the flow vectors are masked before application (only relevant for flows with reference ref = 't' as their inverse flow will be applied, using the reference s; analogous to apply()). Results in smoother outputs, but more artefacts. Defaults to True
Returns: Boolean torch tensor of the same shape \((N, H, W)\) as the flow

Flow.get_padding(item: Optional[int] = None) → list¶

Determine necessary padding from the flow field:

- When the flow reference ref has the value t (“target”), this corresponds to the padding needed in a source image which ensures that every flow vector in vecs marked as valid by the mask mask will find a value in the source domain to warp towards the target domain. I.e. any invalid locations in the area \(H \times W\) of the target domain (see valid_target()) are purely due to no valid flow vector being available to pull a source value to this target location, rather than no source value being available in the first place.
- When the flow reference ref has the value s (“source”), this corresponds to the padding needed for the flow itself, so that applying it to a source image will result in no input image information being lost in the warped output, i.e. each input image pixel will come to lie inside the padded area.

Parameters: item – Element in batch to be selected, as an integer. Defaults to None, in which case the padding for the whole batch is returned
Returns: If no item is selected from the batch, this function returns a list of shape \((N, 4)\), where N is the batch size. If an item is selected, it returns a list of shape \((4)\). Padding values themselves are given in the following order: [top, bottom, left, right]
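For a flow with reference s, the idea of "each input image pixel will come to lie inside the padded area" can be sketched in plain Python as follows. The function padding_for_source_flow is hypothetical, ignores the flow mask, and rounds up with math.ceil as an assumption; the library's own rounding may differ.

```python
import math
from typing import List, Tuple


def padding_for_source_flow(flow: List[List[Tuple[float, float]]]) -> List[int]:
    """Hypothetical sketch: flow[y][x] = (horizontal, vertical) vector.
    Returns the padding as [top, bottom, left, right]."""
    height, width = len(flow), len(flow[0])
    top = bottom = left = right = 0
    for y in range(height):
        for x in range(width):
            u, v = flow[y][x]
            end_y, end_x = y + v, x + u  # where this pixel ends up
            top = max(top, math.ceil(-end_y))
            bottom = max(bottom, math.ceil(end_y - (height - 1)))
            left = max(left, math.ceil(-end_x))
            right = max(right, math.ceil(end_x - (width - 1)))
    return [top, bottom, left, right]


# Constant flow on a 3 x 4 field: 2.5 px to the right, 1 px up
flow = [[(2.5, -1.0)] * 4 for _ in range(3)]
padding = padding_for_source_flow(flow)
print(padding)
```

Here the top row is moved one pixel above the image and the rightmost column up to 2.5 pixels past the right edge, so padding is only needed at the top and on the right.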

Visualising the Flow¶

Flow.visualise(mode: str, show_mask: Optional[bool] = None, show_mask_borders: Optional[bool] = None, range_max: Optional[Union[float, int, list, tuple]] = None, return_tensor: Optional[bool] = None) → Union[numpy.ndarray, torch.Tensor]¶

Visualises the flow as an rgb / bgr / hsv image, optionally showing the outline of the flow mask mask as a black line, and the invalid areas greyed out.

Note

This currently runs internally based on NumPy & OpenCV, due to a lack of easily accessible equivalent functions for mask border detection and drawing. Therefore, even if the output is a tensor, it will not be differentiable with respect to the flow vector tensor.

Parameters

mode – Output mode, options: rgb, bgr, hsv
show_mask – Boolean determining whether the flow mask is visualised, defaults to False
show_mask_borders – Boolean determining whether the flow mask border is visualised, defaults to False
range_max – Maximum vector magnitude expected, corresponding to the HSV maximum Value of 255 when scaling the flow magnitudes. Can be a list or tuple of the same length as the flow batch size. Defaults to the 99th percentile of the flow field magnitudes, per batch element
return_tensor – Boolean determining whether the result is returned as a tensor. Note that the result is originally a numpy array. Defaults to True

Returns

Numpy array of shape \((N, H, W, 3)\) or torch tensor of shape \((N, 3, H, W)\) containing the flow visualisation, where N is the batch size

Flow.visualise_arrows(grid_dist: Optional[int] = None, img: Optional[Union[numpy.ndarray, torch.Tensor]] = None, scaling: Optional[Union[float, int]] = None, show_mask: Optional[bool] = None, show_mask_borders: Optional[bool] = None, colour: Optional[tuple] = None, thickness: Optional[int] = None, return_tensor: Optional[bool] = None) → Union[numpy.ndarray, torch.Tensor]¶

Visualises the flow as arrowed lines, optionally showing the outline of the flow mask mask as a black line, and the invalid areas greyed out.

Note

This currently runs internally based on NumPy & OpenCV, due to a lack of easily accessible equivalent functions for arrow drawing and mask border detection. Therefore, even if the output is a tensor, it will not be differentiable with respect to the flow vector tensor.

Parameters

grid_dist – Integer distance between the flow points used for the visualisation, defaults to 20
img – Torch tensor of shape \((N, 3, H, W)\) or \((3, H, W)\) or numpy array of shape \((N, H, W, 3)\) or \((H, W, 3)\) containing the background image to use (in BGR mode), defaults to white
scaling – Float or int of the flow line scaling, defaults to scaling the 99th percentile of arrowed line lengths to be equal to twice the grid distance (empirical value)
show_mask – Boolean determining whether the flow mask is visualised, defaults to False
show_mask_borders – Boolean determining whether the flow mask border is visualised, defaults to False
colour – Tuple of the flow arrow colour, defaults to hue based on flow direction as in visualise()
thickness – Integer of the flow arrow thickness, larger than zero. Defaults to 1
return_tensor – Boolean determining whether the result is returned as a tensor. Note that the result is originally a numpy array. Defaults to True

Returns

Numpy array of shape \((H, W, 3)\) or torch tensor of shape \((3, H, W)\) containing the flow visualisation, in bgr colour space

Flow.show(elem: Optional[int] = None, wait: Optional[int] = None, show_mask: Optional[bool] = None, show_mask_borders: Optional[bool] = None)¶

Shows the flow in an OpenCV window using visualise()

Parameters

elem – Integer determining which batch element is visualised. Defaults to 0, so for flows with only one element it automatically selects the one available flow
wait – Integer determining how long to show the flow for, in milliseconds. Defaults to 0, which means it will be shown until the window is closed, or the process is terminated
show_mask – Boolean determining whether the flow mask is visualised, defaults to False
show_mask_borders – Boolean determining whether the flow mask border is visualised, defaults to False

Flow.show_arrows(elem: Optional[int] = None, wait: Optional[int] = None, grid_dist: Optional[int] = None, img: Optional[numpy.ndarray] = None, scaling: Optional[Union[float, int]] = None, show_mask: Optional[bool] = None, show_mask_borders: Optional[bool] = None, colour: Optional[tuple] = None)¶

Shows the flow in an OpenCV window using visualise_arrows()

Parameters

elem – Integer determining which batch element is visualised. Defaults to 0, so for flows with only one element it automatically selects the one available flow
wait – Integer determining how long to show the flow for, in milliseconds. Defaults to 0, which means it will be shown until the window is closed, or the process is terminated
grid_dist – Integer distance between the flow points used for the visualisation, defaults to 20
img – Numpy array with the background image to use (in BGR colour space), defaults to black
scaling – Float or int of the flow line scaling, defaults to scaling the 99th percentile of arrowed line lengths to be equal to twice the grid distance (empirical value)
show_mask – Boolean determining whether the flow mask is visualised, defaults to False
show_mask_borders – Boolean determining whether the flow mask border is visualised, defaults to False
colour – Tuple of the flow arrow colour, defaults to hue based on flow direction as in visualise()

oflibpytorch.visualise_definition(mode: str, shape: Optional[Union[list, tuple]] = None, insert_text: Optional[bool] = None, return_tensor: Optional[bool] = None) → Union[numpy.ndarray, torch.Tensor]¶

Return an image that shows the definition of the flow visualisation.

Parameters

mode – Desired output colour space: rgb, bgr, or hsv
shape – List or tuple of shape \((2)\) containing the desired image shape as values (H, W). Defaults to (601, 601) - do not change if you leave insert_text as True as otherwise the text will appear in the wrong location
insert_text – Boolean determining whether explanatory text is put on the image (using cv2.putText()), defaults to True
return_tensor – Boolean determining whether the result is returned as a tensor. Note that the result is originally a numpy array. Defaults to True

Returns

Numpy array of shape \((H, W, 3)\) or torch tensor of shape \((3, H, W)\), type uint8, showing the colour definition of the flow visualisation

Using Torch tensors & NumPy arrays¶

This section contains functions that take Torch tensors as well as NumPy arrays as inputs, instead of making use of the custom flow class. On the one hand, this avoids having to define flow objects. On the other hand, it requires keeping track of flow attributes manually, and it does not avail itself of the full scope of functionality oflibpytorch has to offer: most importantly, flow masks are not considered or tracked.

Flow Creation & Loading¶

oflibpytorch.from_matrix(matrix: Union[numpy.ndarray, torch.Tensor], shape: Union[list, tuple], ref: Optional[str] = None, matrix_is_inverse: Optional[bool] = None) → torch.Tensor¶

Flow vectors calculated from a transformation matrix input.

The output flow vectors are differentiable with respect to the input matrix, if given as a torch tensor.

Parameters

matrix – Transformation matrix to be turned into a flow field, as numpy array or torch tensor of shape \((3, 3)\) or \((N, 3, 3)\)
shape – List or tuple of the shape \((H, W)\) of the flow field
ref – Flow reference, string of value t (“target”) or s (“source”). Defaults to t
matrix_is_inverse – Boolean determining whether the given matrix is already the inverse of the desired transformation. Is useful for flow with reference t to avoid calculation of the pseudo-inverse, but will throw a ValueError if used for flow with reference s to avoid accidental usage. Defaults to False

Returns

Flow vectors of shape \((N, 2, H, W)\)
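The relationship between a transformation matrix and the two flow references can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the matrix is assumed to act on homogeneous pixel positions (x, y, 1) with x horizontal, the function flow_from_matrix_sketch is hypothetical, and for reference t it expects the inverse matrix to be supplied so that no matrix inversion is needed.

```python
from typing import List, Sequence, Tuple


def flow_from_matrix_sketch(matrix: Sequence[Sequence[float]], shape: Tuple[int, int],
                            ref: str = "t", matrix_is_inverse: bool = False
                            ) -> List[List[Tuple[float, float]]]:
    """Hypothetical sketch; returns flow[y][x] = (horizontal, vertical)."""
    if ref == "s" and matrix_is_inverse:
        raise ValueError("matrix_is_inverse is only meaningful for reference 't'")
    if ref == "t" and not matrix_is_inverse:
        raise NotImplementedError("this sketch expects the inverse matrix for reference 't'")
    height, width = shape
    flow = []
    for y in range(height):
        row = []
        for x in range(width):
            # Homogeneous transform of the pixel position (x, y, 1)
            xh = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
            yh = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
            w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
            mapped_x, mapped_y = xh / w, yh / w
            if ref == "s":
                # Vector from the source position to where the matrix sends it
                row.append((mapped_x - x, mapped_y - y))
            else:
                # The inverse maps the target position back to its source,
                # and the vector points from that source to the target
                row.append((x - mapped_x, y - mapped_y))
        flow.append(row)
    return flow


shift = [[1, 0, 3], [0, 1, -2], [0, 0, 1]]          # 3 px right, 2 px up
shift_inverse = [[1, 0, -3], [0, 1, 2], [0, 0, 1]]  # inverse of the above
flow_s = flow_from_matrix_sketch(shift, (2, 2), ref="s")
flow_t = flow_from_matrix_sketch(shift_inverse, (2, 2), ref="t", matrix_is_inverse=True)
print(flow_s[0][0], flow_t[1][1])
```

For a pure translation both references give the same vectors everywhere; for other transforms the two flow fields generally differ.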

oflibpytorch.from_transforms(transform_list: list, shape: Union[list, tuple], ref: Optional[str] = None, padding: Optional[list] = None) → torch.Tensor¶

Flow vectors calculated from a list of transforms. If padding values are given, the given shape is padded accordingly. The transform values are also adjusted, e.g. by shifting scaling and rotation centres.

Parameters

transform_list –
List of transforms to be turned into a flow field, where each transform is expressed as a list of [transform name, transform value 1, … , transform value n]. Supported options:
- Transform translation, with values horizontal shift in px, vertical shift in px
- Transform rotation, with values horizontal centre in px, vertical centre in px, angle in degrees, counter-clockwise
- Transform scaling, with values horizontal centre in px, vertical centre in px, scaling fraction
shape – List or tuple of the shape \((H, W)\) of the flow field
ref – Flow reference, string of value t (“target”) or s (“source”). Defaults to t
padding – List or tuple of shape \((4)\) with padding values [top, bottom, left, right]

Returns

Flow vectors of shape \((N, 2, H, W)\)
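A transform list in the format described above might look like the following. The small checker check_transform_list is a hypothetical helper written for this sketch (it is not part of oflibpytorch) and only verifies that each entry has a supported name and the documented number of values.

```python
# Number of values expected after each transform name, as listed above
EXPECTED_VALUES = {"translation": 2, "rotation": 3, "scaling": 3}


def check_transform_list(transform_list: list) -> bool:
    """Hypothetical helper: validate the structure of a transform list."""
    for transform in transform_list:
        name, values = transform[0], transform[1:]
        if name not in EXPECTED_VALUES:
            raise ValueError(f"unsupported transform: {name}")
        if len(values) != EXPECTED_VALUES[name]:
            raise ValueError(f"{name} expects {EXPECTED_VALUES[name]} values")
    return True


transforms = [
    ["translation", 10, -5],    # horizontal shift, vertical shift (px)
    ["rotation", 100, 50, 30],  # horizontal centre, vertical centre (px), angle (degrees)
    ["scaling", 100, 50, 1.2],  # horizontal centre, vertical centre (px), scaling fraction
]
print(check_transform_list(transforms))
```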

oflibpytorch.load_kitti(path: str) → Union[List[torch.Tensor], torch.Tensor]¶

Loads the flow field contained in KITTI uint16 png image files, including the valid pixels. Follows the official instructions on how to read the provided .png files on the KITTI optical flow dataset website.

Parameters: path – String containing the path to the KITTI flow data (uint16, .png file)
Returns: A torch tensor of shape \((3, H, W)\) with the KITTI flow data (with valid pixels in the 3rd channel)
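For orientation, the KITTI development kit describes the encoding as follows: each flow component is stored as a uint16 value, and is recovered by subtracting 2^15 and dividing by 64, while the third value marks the pixel as valid. The plain-Python sketch below decodes a single pixel; the function name decode_kitti_pixel is hypothetical, and reading the .png file itself (including its channel order) is deliberately left out.

```python
from typing import Tuple


def decode_kitti_pixel(raw_u: int, raw_v: int, raw_valid: int) -> Tuple[float, float, bool]:
    """Hypothetical sketch: decode one KITTI flow pixel from its uint16 values."""
    u = (raw_u - 2 ** 15) / 64.0  # horizontal flow component in px
    v = (raw_v - 2 ** 15) / 64.0  # vertical flow component in px
    return u, v, bool(raw_valid)


print(decode_kitti_pixel(2 ** 15 + 64, 2 ** 15 - 32, 1))
```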

oflibpytorch.load_sintel(path: str) → torch.Tensor¶

Loads the flow field contained in Sintel .flo byte files. Follows the official instructions provided alongside the .flo data on the Sintel optical flow dataset website.

Parameters: path – String containing the path to the Sintel flow data (.flo byte file, little Endian)
Returns: A torch tensor of shape \((2, H, W)\) containing the Sintel flow data
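The .flo layout is simple enough to sketch with the standard library: a little-endian float32 sanity-check value (202021.25), the width and height as int32, then the interleaved horizontal and vertical components as float32, row by row. The function read_flo_sketch is hypothetical and returns nested lists indexed as [row][column], not the \((2, H, W)\) tensor that load_sintel() produces.

```python
import io
import struct
from typing import BinaryIO, List, Tuple

FLO_MAGIC = 202021.25  # sanity-check value at the start of every .flo file


def read_flo_sketch(stream: BinaryIO) -> List[List[Tuple[float, float]]]:
    """Hypothetical sketch: read .flo data into rows of (horizontal, vertical) vectors."""
    magic, = struct.unpack("<f", stream.read(4))
    if magic != FLO_MAGIC:
        raise ValueError("invalid .flo data: wrong magic number")
    width, height = struct.unpack("<ii", stream.read(8))
    count = 2 * width * height
    values = struct.unpack(f"<{count}f", stream.read(4 * count))
    return [[(values[2 * (y * width + x)], values[2 * (y * width + x) + 1])
             for x in range(width)] for y in range(height)]


# Build a tiny flow field (width 2, height 1) in memory and read it back
payload = (struct.pack("<f", FLO_MAGIC) + struct.pack("<ii", 2, 1)
           + struct.pack("<4f", 1.0, -2.0, 0.5, 0.25))
flow = read_flo_sketch(io.BytesIO(payload))
print(flow)
```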

oflibpytorch.load_sintel_mask(path: str) → torch.Tensor¶

Loads the invalid pixels contained in Sintel .png mask files, as a boolean mask marking valid pixels with True. Follows the official instructions provided alongside the .flo data on the Sintel optical flow dataset website.

Parameters: path – String containing the path to the Sintel invalid pixel data (.png, black and white)
Returns: A torch tensor containing the Sintel valid pixels (mask) data

Flow Manipulation¶

oflibpytorch.resize_flow(flow: Union[numpy.ndarray, torch.Tensor], scale: Union[float, int, list, tuple]) → torch.Tensor¶

Resize a flow field numpy array or torch tensor, scaling the flow vectors values accordingly.

The output flow field is differentiable with respect to the input flow field, if given as a torch tensor.

Parameters

flow – Flow field as a numpy array or torch tensor, shape \((2, H, W)\), \((H, W, 2)\), \((N, 2, H, W)\), or \((N, H, W, 2)\)
scale –
Scale used for resizing, options:
- Integer or float of value scaling applied both vertically and horizontally
- List or tuple of shape \((2)\) with values [vertical scaling, horizontal scaling]

Returns

Scaled flow field as a torch tensor, shape \((2, H, W)\) or \((N, 2, H, W)\), depending on input
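Scaling "the flow vector values accordingly" can be sketched for a single vector: the horizontal component is multiplied by the horizontal scale and the vertical component by the vertical scale. The function scale_vector is a hypothetical helper for illustration only; resampling the flow field to its new size is not shown.

```python
from typing import Sequence, Tuple, Union


def scale_vector(vector: Tuple[float, float],
                 scale: Union[float, int, Sequence[float]]) -> Tuple[float, float]:
    """Hypothetical sketch: scale one (horizontal, vertical) flow vector."""
    if isinstance(scale, (int, float)):
        vertical_scale = horizontal_scale = scale
    else:
        vertical_scale, horizontal_scale = scale  # [vertical scaling, horizontal scaling]
    u, v = vector
    return u * horizontal_scale, v * vertical_scale


print(scale_vector((4.0, 2.0), 0.5))       # same scale in both directions
print(scale_vector((4.0, 2.0), [2, 0.5]))  # vertical doubled, horizontal halved
```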

oflibpytorch.invert_flow(flow: Union[numpy.ndarray, torch.Tensor], input_ref: str, output_ref: Optional[str] = None) → torch.Tensor¶

Inverting a flow: img₁ – f –> img₂ becomes img₁ <– f – img₂. The smaller the input flow, the closer the inverse is to simply multiplying the flow by -1.

The output flow field tensor is differentiable with respect to the input flow field, if given as a tensor.

Parameters

flow – Numpy array or pytorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\), throwing a ValueError if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention: flow_vectors[..., 0] are the horizontal, flow_vectors[..., 1] are the vertical vector components, defined as positive when pointing to the right / down.
input_ref – Reference of the input flow field, either s or t
output_ref – Desired reference of the output field, either s or t. Defaults to input_ref

Returns

Flow field as a torch tensor of shape \((2, H, W)\) or \((N, 2, H, W)\)

oflibpytorch.switch_flow_ref(flow: Union[numpy.ndarray, torch.Tensor], input_ref: str) → torch.Tensor¶

Recalculate flow vectors to correspond to a switched flow reference (see Flow reference ref)

The output flow field tensor is differentiable with respect to the input flow field, if given as a tensor.

Parameters

flow – Numpy array or pytorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\), throwing a ValueError if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention: flow_vectors[..., 0] are the horizontal, flow_vectors[..., 1] are the vertical vector components, defined as positive when pointing to the right / down.
input_ref – The reference of the input flow field, either s or t

Returns

Flow field as a torch tensor of shape \((2, H, W)\) or \((N, 2, H, W)\)

oflibpytorch.combine_flows(input_1: Union[Flow, torch.Tensor, numpy.ndarray], input_2: Union[Flow, torch.Tensor, numpy.ndarray], mode: int, ref: Optional[str] = None, thresholded: Optional[bool] = None) → Union[Flow, torch.Tensor]¶

Returns the result of the combination of two flow fields of the same shape and reference

If the toolbox-wide variable PURE_PYTORCH is set to True (default, see also set_pure_pytorch()), the output flow field tensor is differentiable with respect to the input flow fields, if given as tensors.

Tip

All of the flow field combinations in this function rely on some combination of the apply(), invert(), and combine_with() methods.

If PURE_PYTORCH is set to False, some of these methods will call scipy.interpolate.griddata(), possibly multiple times, which can be very slow (several seconds) - but the result will be more accurate compared to using the PyTorch-only setting. The table below aids decision-making with regards to which reference a flow field should be provided in to obtain the fastest result.

Calls to scipy.interpolate.griddata()¶

mode    ref = 's'    ref = 't'
1       1            3
2       1            1
3       0            0

All formulas used in this function have been derived from first principles. The base formula is \(flow_1 ⊕ flow_2 = flow_3\), where \(⊕\) is a non-commutative flow composition operation. This can be visualised with the start / end points of the flows as follows:

S = Start point    S1 = S3 ─────── f3 ────────────┐
E = End point       │                             │
f = flow           f1                             v
                    └───> E1 = S2 ── f2 ──> E2 = E3

The main difficulty in combining flow fields is that it would be incorrect to simply add up or subtract flow vectors at one location in the flow field area \(H \times W\). This appears to work given e.g. a translation to the right, and a translation downwards: the result will be the linear combination of the two vectors, or a translation towards the bottom right. However, looking more closely, it becomes evident that this approach isn’t actually correct: A pixel that has been moved from S1 to E1 by the first flow field f1 is then moved from that location by the flow vector of the flow field f2 that corresponds to the new pixel location E1, not the original location S1. If the flow vectors are the same everywhere in the field, the difference will not be noticeable. However, if the flow vectors of f2 vary throughout the field, such as with a rotation around some point, it will!

In this case (corresponding to calling combine_flows(f1, f2, mode=3)), and if the flow reference is s (“source”), the solution is to first apply the inverse of f1 to f2, essentially linking up each location E1 back to S1, and then to add up the flow vectors. Analogous observations apply for the other permutations of flow combinations and references.

Note

This is consistent with the observation that two translations are commutative in their application (the order does not matter, and the vectors can simply be added up at every pixel location), while a translation followed by a rotation is not the same as a rotation followed by a translation: adding up vectors at each pixel cannot be the correct solution, as the result would then not depend on the order of the operations.
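The argument above can be checked numerically with a few lines of plain Python. The sketch treats flows as functions of a point (x, y), uses a translation as f1 and a 90-degree rotation about the origin as f2, and compares the correct combination (f2 sampled at E1) with the naive per-pixel sum (f2 sampled at S1). All helper names are hypothetical and unrelated to the oflibpytorch API.

```python
import math


def translate(point, shift):
    return point[0] + shift[0], point[1] + shift[1]


def rotate(point, angle_deg):
    """Rotate a point (x, y) about the origin."""
    a = math.radians(angle_deg)
    x, y = point
    return x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a)


def f1(point):
    """Flow 1: constant translation by 2 px along x."""
    return (2.0, 0.0)


def f2(point):
    """Flow 2: rotation by 90 degrees about the origin, as a flow vector."""
    rx, ry = rotate(point, 90)
    return rx - point[0], ry - point[1]


s1 = (1.0, 0.0)

# Two translations commute: the order of application does not matter
a = translate(translate(s1, (2, 0)), (0, 3))
b = translate(translate(s1, (0, 3)), (2, 0))

# Correct combination: f2 is sampled at E1, the end point of f1
e1 = translate(s1, f1(s1))
correct = translate(e1, f2(e1))

# Naive combination: both flow vectors are sampled at S1 and added up
naive = translate(translate(s1, f1(s1)), f2(s1))

print(a, b)
print(correct, naive)
```

The two translations end at the same point in either order, while the correct and the naive combination of translation and rotation end at clearly different points.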

Parameters

input_1 – Numpy array or pytorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\), throwing a ValueError if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention: flow_vectors[..., 0] are the horizontal, flow_vectors[..., 1] are the vertical vector components, defined as positive when pointing to the right / down. Can also be a flow object, but this will be deprecated soon
input_2 – Second input flow, same type as input_1
mode –
Integer determining how the input flows are combined, where the number corresponds to the position in the formula \(flow_1 ⊕ flow_2 = flow_3\):
- Mode 1: input_1 corresponds to \(flow_2\), input_2 corresponds to \(flow_3\), the result will be \(flow_1\)
- Mode 2: input_1 corresponds to \(flow_1\), input_2 corresponds to \(flow_3\), the result will be \(flow_2\)
- Mode 3: input_1 corresponds to \(flow_1\), input_2 corresponds to \(flow_2\), the result will be \(flow_3\)
ref – The reference of the input flow fields, either s or t
thresholded – Boolean determining whether flows are thresholded during an internal call to is_zero(), defaults to False

Returns

Flow object if inputs are flow objects (deprecated in future, avoid), Torch tensor of shape \((2, H, W)\) or \((N, 2, H, W)\) as standard

Flow Application¶

oflibpytorch.apply_flow(flow: Union[numpy.ndarray, torch.Tensor], target: torch.Tensor, ref: str, mask: Optional[Union[numpy.ndarray, torch.Tensor]] = None) → torch.Tensor¶

Uses a given flow to warp a target. The flow reference, if not given, is assumed to be t. Optionally, a mask can be passed which (only for flows in s reference) masks undesired (e.g. undefined or invalid) flow vectors.

If PURE_PYTORCH is set to True (default, see also set_pure_pytorch()), the output is differentiable with respect to the input flow and target. If PURE_PYTORCH is False (see also unset_pure_pytorch()) and ref is s, the more accurate function scipy.interpolate.griddata() is used. This is not only significantly slower, but also means the output does not have a grad_fn and is therefore not differentiable in the PyTorch context.

Parameters

flow – Flow field as a numpy array or torch tensor, shape \((2, H, W)\), \((H, W, 2)\), \((N, 2, H, W)\), or \((N, H, W, 2)\)
target – Torch tensor containing the content to be warped, with shape \((H, W)\), \((C, H, W)\), or \((N, C, H, W)\)
ref – Reference of the flow, t or s
mask – Flow mask as numpy array or torch tensor, with shape \((H, W)\) or \((N, H, W)\), matching the flow field. Only relevant for s flows. Defaults to True everywhere

Returns

Torch tensor of the same shape as the target, with the content warped by the flow
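To illustrate why a flow with reference t is the cheap case, the plain-Python sketch below warps a small single-channel image by "pulling": for every target pixel, the flow vector stored there points back to the source pixel whose value it receives. The function warp_with_target_flow is hypothetical, only handles integer flow vectors, and fills unreachable pixels with a constant; the library itself interpolates and is not limited in this way.

```python
from typing import List, Tuple


def warp_with_target_flow(image: List[List[int]],
                          flow: List[List[Tuple[int, int]]],
                          fill: int = 0) -> List[List[int]]:
    """Hypothetical sketch for reference 't': flow[y][x] = (horizontal, vertical)
    integer vector ending at target pixel (y, x)."""
    height, width = len(image), len(image[0])
    warped = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            u, v = flow[y][x]
            # The vector ends at (y, x), so it starts at (y - v, x - u)
            src_y, src_x = y - v, x - u
            if 0 <= src_y < height and 0 <= src_x < width:
                warped[y][x] = image[src_y][src_x]
    return warped


image = [[1, 2, 3],
         [4, 5, 6]]
flow = [[(1, 0)] * 3 for _ in range(2)]  # everything moves 1 px to the right
warped = warp_with_target_flow(image, flow)
print(warped)
```

With reference s the vectors are anchored at the source pixels instead, so the warped values land at scattered target positions and have to be interpolated back onto the regular grid, which is where scipy.interpolate.griddata() comes in when PURE_PYTORCH is False.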

oflibpytorch.track_pts(flow: Union[numpy.ndarray, torch.Tensor], ref: str, pts: torch.Tensor, int_out: Optional[bool] = None) → torch.Tensor¶

Warp input points with a flow field, returning the warped point coordinates as integers if required.

If PURE_PYTORCH is set to True (default, see also set_pure_pytorch()), the output is differentiable with respect to the input flow and pts. If PURE_PYTORCH is False (see also unset_pure_pytorch()) and ref is t, the more accurate function scipy.interpolate.griddata() is used. This is not only significantly slower, but also means the output does not have a grad_fn and is therefore not differentiable in the PyTorch context.

Parameters

flow – Flow field as a numpy array or torch tensor, shape \((2, H, W)\), \((H, W, 2)\), \((N, 2, H, W)\), or \((N, H, W, 2)\)
ref – Flow field reference, either s or t
pts – Torch tensor of shape \((M, 2)\) or \((N, M, 2)\) containing the point coordinates. If a batch dimension is given, it must be 1 or correspond to the flow batch dimension. If the flow is batched but the points are not, the same points are warped by each flow field individually. pts[:, 0] corresponds to the vertical coordinate, pts[:, 1] to the horizontal coordinate
int_out – Boolean determining whether output points are returned as rounded integers, defaults to False

Returns

Torch tensor of warped (‘tracked’) points, tensor device same as the input flow field

Flow Evaluation¶

oflibpytorch.is_zero_flow(flow: Union[numpy.ndarray, torch.Tensor], thresholded: Optional[bool] = None) → torch.Tensor¶

Check whether all flow vectors are zero. Optionally, a threshold flow magnitude value of 1e-3 is used. This can be useful to filter out motions that are equal to very small fractions of a pixel, which might just be a computational artefact to begin with.

Parameters

flow – Flow field as a numpy array or torch tensor, shape \((2, H, W)\), \((H, W, 2)\), \((N, 2, H, W)\), or \((N, H, W, 2)\)
thresholded – Boolean determining whether the flow is thresholded, defaults to True

Returns

Tensor of (batch) shape \((N)\) which is True if the flow field is zero everywhere, otherwise False

oflibpytorch.get_flow_matrix(flow: Union[numpy.ndarray, torch.Tensor], ref: str, dof: Optional[int] = None, method: Optional[str] = None) → torch.Tensor¶

Fit a transformation matrix to the flow field using OpenCV functions

Parameters

flow – Numpy array or PyTorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\), throwing a ValueError if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention: flow_vectors[..., 0] are the horizontal, flow_vectors[..., 1] are the vertical vector components, defined as positive when pointing to the right / down.
ref – Reference of the flow field, s or t
dof –
Integer describing the degrees of freedom in the transformation matrix to be fitted, defaults to 8. Options are:
- 4: Partial affine transform with rotation, translation, scaling
- 6: Affine transform with rotation, translation, scaling, shearing
- 8: Projective transform, i.e. estimation of a homography
method –
String describing the method used by OpenCV to fit the transformation matrix, defaults to ransac. Options are:
- lms: Least mean squares
- ransac: RANSAC-based robust method
- lmeds: Least-Median robust method

Returns

Torch tensor of shape \((3, 3)\) or \((N, 3, 3)\) containing the transformation matrix
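
To make the idea of fitting a matrix to a flow field concrete, the sketch below fits a 6-degree-of-freedom affine matrix with plain least squares in NumPy. This is a swapped-in technique for illustration only: the function documented above relies on OpenCV, and nothing here reproduces its RANSAC or LMedS options. It assumes a single flow of shape (2, H, W) with reference s; the name fit_affine_sketch is hypothetical.

```python
import numpy as np

def fit_affine_sketch(flow):
    """Least-squares affine fit to a flow of shape (2, H, W), reference 's'."""
    _, h, w = flow.shape
    ys, xs = np.mgrid[:h, :w]
    # Source positions in homogeneous coordinates (x, y, 1), shape (H*W, 3)
    src = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)], axis=1)
    # Target positions: source position plus flow vector, shape (H*W, 2)
    dst = src[:, :2] + flow.reshape(2, -1).T
    # Solve src @ a = dst for a (3 x 2) in the least-squares sense
    a, *_ = np.linalg.lstsq(src, dst, rcond=None)
    matrix = np.eye(3)
    matrix[:2] = a.T
    return matrix

# Build a flow from a known matrix, then check that the fit recovers it
true_matrix = np.array([[1.1, 0.0, 3.0],
                        [0.0, 0.9, -2.0],
                        [0.0, 0.0, 1.0]])
ys, xs = np.mgrid[:4, :5]
flow = np.stack([0.1 * xs + 3.0, -0.1 * ys - 2.0])
fitted = fit_affine_sketch(flow)
```

Plain least squares is sensitive to outliers, which is the usual motivation for robust options such as ransac or lmeds.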

oflibpytorch.valid_target(flow: Union[numpy.ndarray, torch.Tensor], ref: str) → torch.Tensor¶

Find the valid area in the target domain

Given a source image and a flow, both of shape \((H, W)\), the target image is created by warping the source with the flow. The valid area is then a boolean tensor of shape \((H, W)\) that is True wherever the value in the target image stems from warping a value from the source, and False where no valid information is known.

Pixels that are False will often be black (or ‘empty’) in the warped target image - but not necessarily, due to warping artefacts etc. The valid area also allows a distinction between pixels that are black due to no actual information being available at this position (validity False), and pixels that are black due to black pixel values having been warped to that (valid) location by the flow.

Parameters

flow – Numpy array or PyTorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\), throwing a ValueError if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention: flow_vectors[..., 0] are the horizontal, flow_vectors[..., 1] are the vertical vector components, defined as positive when pointing to the right / down.
ref – Reference of the flow field, s or t

Returns

Boolean torch tensor of the same shape \((H, W)\) or \((N, H, W)\) as the flow
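
For a flow with reference t and no additional mask, the valid target area can be pictured as follows: each target pixel pulls its value from the source position "target position minus flow vector", and is valid if that position lies inside the source image. The NumPy sketch below illustrates this simplified case only. It is an assumption about the principle rather than the library's code, and valid_target_sketch is a hypothetical name.

```python
import numpy as np

def valid_target_sketch(flow):
    """flow: array of shape (2, H, W), reference 't'. Returns boolean (H, W)."""
    _, h, w = flow.shape
    ys, xs = np.mgrid[:h, :w]
    # Source position each target pixel pulls its value from
    src_x = xs - flow[0]
    src_y = ys - flow[1]
    # Valid where that source position lies inside the image
    return (src_x >= 0) & (src_x <= w - 1) & (src_y >= 0) & (src_y <= h - 1)

flow = np.zeros((2, 3, 4))
flow[0] = 1.0  # everything moves one pixel to the right
valid = valid_target_sketch(flow)
```

Here the leftmost target column is invalid, because no source pixel exists one step further left to supply a value.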

oflibpytorch.valid_source(flow: Union[numpy.ndarray, torch.Tensor], ref: str) → torch.Tensor¶

Finds the area in the source domain that will end up being valid in the target domain (see valid_target()) after warping

Given a source image and a flow, both of shape \((H, W)\), the target image is created by warping the source with the flow. The source area is then a boolean tensor of shape \((H, W)\) that is True wherever the value in the source will end up somewhere inside the valid target area, and False where the value in the source will either be warped outside of the target image, or not be warped at all due to a lack of valid flow vectors connecting to this position.

Parameters

flow – Numpy array or PyTorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\), throwing a ValueError if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention: flow_vectors[..., 0] are the horizontal, flow_vectors[..., 1] are the vertical vector components, defined as positive when pointing to the right / down.
ref – Reference of the flow field, s or t

Returns

Boolean torch tensor of the same shape \((H, W)\) or \((N, H, W)\) as the flow

oflibpytorch.get_flow_padding(flow: Union[numpy.ndarray, torch.Tensor], ref: str) → list¶

Determine necessary padding from the flow field:

When the flow reference is t (“target”), this corresponds to the padding needed in a source image which ensures that every flow vector will find a value in the source domain to warp towards the target domain. I.e. any invalid locations in the area \(H \times W\) of the target domain (see valid_target()) are purely due to no valid flow vector being available to pull a source value to this target location, rather than no source value being available in the first place.
When the flow reference is s (“source”), this corresponds to the padding needed for the flow itself, so that applying it to a source image will result in no input image information being lost in the warped output, i.e. each input image pixel will come to lie inside the padded area.

Parameters

flow – Numpy array or PyTorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\), throwing a ValueError if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention: flow_vectors[..., 0] are the horizontal, flow_vectors[..., 1] are the vertical vector components, defined as positive when pointing to the right / down.
ref – Reference of the flow field, s or t

Returns

A list of shape \((4)\) or \((N, 4)\) with the values [top, bottom, left, right] for each batch member, if applicable
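
The two cases described above can be sketched in NumPy as measuring how far the positions reached by the flow vectors overshoot the image borders. For reference s these are the positions the source pixels land on; for reference t they are the positions the target pixels pull from. The sketch assumes a single unmasked flow of shape (2, H, W); flow_padding_sketch is a hypothetical name and the code is not taken from the library.

```python
import numpy as np

def flow_padding_sketch(flow, ref):
    """flow: array of shape (2, H, W); ref: 's' or 't'.

    Returns the padding as [top, bottom, left, right].
    """
    _, h, w = flow.shape
    ys, xs = np.mgrid[:h, :w]
    # ref 's': grid position plus flow; ref 't': grid position minus flow
    sign = 1 if ref == 's' else -1
    x = xs + sign * flow[0]
    y = ys + sign * flow[1]
    # Distance by which positions exceed each border (negative if they do not)
    overshoot = [-y.min(), y.max() - (h - 1), -x.min(), x.max() - (w - 1)]
    # Round up to whole pixels, never returning negative padding
    return [int(np.ceil(max(o, 0))) for o in overshoot]

flow = np.zeros((2, 3, 4))
flow[0] = 1.5   # 1.5 pixels to the right
flow[1] = -2.0  # 2 pixels up
padding_s = flow_padding_sketch(flow, 's')
padding_t = flow_padding_sketch(flow, 't')
```

For this flow, reference s requires padding at the top and right (where the content moves to), while reference t requires it at the bottom and left (where the content is pulled from).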

Flow Visualisation¶

oflibpytorch.visualise_flow(flow: Union[numpy.ndarray, torch.Tensor], mode: str, range_max: Optional[float] = None, return_tensor: Optional[bool] = None) → Union[numpy.ndarray, torch.Tensor]¶

Visualises the flow as an rgb / bgr / hsv image

Note

This currently runs internally based on NumPy & OpenCV, due to a lack of easily accessible equivalent functions for coordinate and colour space conversions. Therefore, even if the output is a tensor, it will not be differentiable with respect to the input flow tensor.

Parameters

flow – Numpy array or PyTorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\), throwing a ValueError if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention: flow_vectors[..., 0] are the horizontal, flow_vectors[..., 1] are the vertical vector components, defined as positive when pointing to the right / down.
mode – Output mode, options: rgb, bgr, hsv
range_max – Maximum vector magnitude expected, corresponding to the HSV maximum Value of 255 when scaling the flow magnitudes. Defaults to the 99th percentile of the flow field magnitudes
return_tensor – Boolean determining whether the result is returned as a tensor. Note that the result is originally a numpy array. Defaults to True

Returns

Numpy array of shape \((H, W, 3)\) or \((N, H, W, 3)\) or torch tensor of shape \((3, H, W)\) or \((N, 3, H, W)\) containing the flow visualisation
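
The role of range_max is easiest to see in a small sketch of an HSV flow encoding: the direction of each vector sets the hue, and its magnitude, scaled by range_max and clipped, sets the Value channel. The NumPy code below is only an illustration under stated assumptions. In particular, the hue mapping (OpenCV's 8-bit hue range of 0 to 179, with hue 0 pointing right) and full saturation are assumptions, and flow_to_hsv_sketch is a hypothetical name.

```python
import numpy as np

def flow_to_hsv_sketch(flow, range_max=None):
    """flow: array of shape (2, H, W). Returns a uint8 HSV image of shape (H, W, 3)."""
    magnitude = np.hypot(flow[0], flow[1])
    if range_max is None:
        # Default described in the documentation: 99th percentile of the magnitudes
        range_max = np.percentile(magnitude, 99)
    # Direction in [0, 2*pi), mapped to an 8-bit hue in [0, 180) (assumed convention)
    angle = np.arctan2(flow[1], flow[0]) % (2 * np.pi)
    hue = (angle / (2 * np.pi) * 180) % 180
    # Magnitude mapped to Value, saturating at range_max
    value = np.clip(magnitude / max(range_max, 1e-12), 0, 1) * 255
    saturation = np.full_like(value, 255)
    return np.stack([hue, saturation, value], axis=-1).astype(np.uint8)

flow = np.zeros((2, 2, 2))
flow[0, 0, 0] = 4.0  # pixel (0, 0): 4 pixels to the right
flow[1, 0, 1] = 2.0  # pixel (0, 1): 2 pixels down
hsv = flow_to_hsv_sketch(flow, range_max=4.0)
```

A vector whose magnitude equals range_max reaches the maximum Value of 255, a vector of half that magnitude reaches roughly half, and zero flow stays at Value 0.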

oflibpytorch.visualise_flow_arrows(flow: Union[numpy.ndarray, torch.Tensor], ref: str, grid_dist: Optional[int] = None, img: Optional[numpy.ndarray] = None, scaling: Optional[Union[float, int]] = None, colour: Optional[tuple] = None, thickness: Optional[int] = None, return_tensor: Optional[bool] = None) → Union[numpy.ndarray, torch.Tensor]¶

Visualises the flow as arrowed lines

Note

This currently runs internally based on NumPy & OpenCV, due to a lack of easily accessible equivalent functions for coordinate and colour space conversions. Therefore, even if the output is a tensor, it will not be differentiable with respect to the input flow tensor.

Parameters

flow – Numpy array or PyTorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\), throwing a ValueError if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention: flow_vectors[..., 0] are the horizontal, flow_vectors[..., 1] are the vertical vector components, defined as positive when pointing to the right / down.
ref – Reference of the flow field, s or t
grid_dist – Integer of the distance of the flow points to be used for the visualisation, defaults to 20
img – Numpy array with the background image to use (in BGR mode), defaults to white
scaling – Float or int of the flow line scaling, defaults to scaling the 99th percentile of arrowed line lengths to be equal to twice the grid distance (empirical value)
colour – Tuple of the flow arrow colour, defaults to hue based on flow direction as in visualise()
thickness – Integer of the flow arrow thickness, larger than zero. Defaults to 1
return_tensor – Boolean determining whether the result is returned as a tensor. Note that the result is originally a numpy array. Defaults to True

Returns

Numpy array of shape \((H, W, 3)\) or \((N, H, W, 3)\) or torch tensor of shape \((3, H, W)\) or \((N, 3, H, W)\) containing the flow visualisation

oflibpytorch.show_flow(flow: Union[numpy.ndarray, torch.Tensor], wait: Optional[int] = None)¶

Shows the flow in an OpenCV window using visualise()

Parameters

flow – Numpy array or PyTorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) if possible, otherwise as \((H, W, 2)\), throwing a ValueError if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention: flow_vectors[..., 0] are the horizontal, flow_vectors[..., 1] are the vertical vector components, defined as positive when pointing to the right / down.
wait – Integer determining how long to show the flow for, in milliseconds. Defaults to 0, which means it will be shown until the window is closed, or the process is terminated

oflibpytorch.show_flow_arrows(flow: Union[numpy.ndarray, torch.Tensor], ref: str, wait: Optional[int] = None, grid_dist: Optional[int] = None, img: Optional[numpy.ndarray] = None, scaling: Optional[Union[float, int]] = None, colour: Optional[tuple] = None)¶

Shows the flow in an OpenCV window using visualise_arrows()

Parameters

flow – Numpy array or PyTorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) if possible, otherwise as \((H, W, 2)\), throwing a ValueError if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention: flow_vectors[..., 0] are the horizontal, flow_vectors[..., 1] are the vertical vector components, defined as positive when pointing to the right / down.
ref – Reference of the flow field, s or t
wait – Integer determining how long to show the flow for, in milliseconds. Defaults to 0, which means it will be shown until the window is closed, or the process is terminated
grid_dist – Integer of the distance of the flow points to be used for the visualisation, defaults to 20
img – Numpy array with the background image to use (in BGR colour space), defaults to black
scaling – Float or int of the flow line scaling, defaults to scaling the 99th percentile of arrowed line lengths to be equal to twice the grid distance (empirical value)
colour – Tuple of the flow arrow colour, defaults to hue based on flow direction as in visualise()

Utility methods¶

Additionally, some utility methods used in this module are made available here, as they may prove useful to others.

oflibpytorch.to_numpy(tensor: torch.Tensor, switch_channels: Optional[bool] = None) → numpy.ndarray¶

Converts a tensor to a NumPy array, calling .cpu() first if necessary

Parameters

tensor – Input tensor, may have gradient, may be on GPU
switch_channels – Boolean determining whether the channels are moved from the second to the last dimension, assuming the input is of shape \((N, C, H, W)\), changing it to \((N, H, W, C)\). Defaults to False

Returns

Numpy array, with channels switched if required

oflibpytorch.to_tensor(array: numpy.ndarray, switch_channels: Optional[str] = None, device: Optional[Union[torch.device, int, str]] = None) → torch.Tensor¶

Moves a NumPy array to a tensor on the desired device, swapping axis positions if required

Parameters

array – Input array
switch_channels – String determining whether the channels are moved from the last to the first dimension (if ‘single’), or from last to second dimension (if ‘batched’). Defaults to None (no channels moved)
device – Tensor device, either a torch.device or a valid input to torch.device(), such as a string (cpu or cuda). For a device of type cuda, the device index defaults to torch.cuda.current_device(). If the input is None, it defaults to torch.device('cpu')

Returns

Torch tensor, with channels switched if required

oflibpytorch.move_axis(input_tensor: torch.Tensor, source: int, destination: int) → torch.Tensor¶

Helper function to imitate np.moveaxis.

Output tensor differentiable with respect to input tensor.

Parameters

input_tensor – Input torch tensor, e.g. \((N, H, W, C)\)
source – Source position of the dimension to be moved, e.g. -1
destination – Target position of the dimension to be moved, e.g. 1

Returns

Output torch tensor, e.g. \((N, C, H, W)\)
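
Since this helper imitates np.moveaxis, the NumPy call itself shows the expected effect of the example values given above (source -1, destination 1). The array shape used here is arbitrary.

```python
import numpy as np

batch = np.zeros((8, 32, 64, 3))      # (N, H, W, C)
moved = np.moveaxis(batch, -1, 1)     # (N, C, H, W)
restored = np.moveaxis(moved, 1, -1)  # back to (N, H, W, C)
```

All other dimensions keep their relative order, which is what distinguishes this from a plain swap of two axes.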

oflibpytorch.show_masked_image(img: Union[torch.Tensor, numpy.ndarray], mask: Optional[Union[numpy.ndarray, torch.Tensor]] = None) → numpy.ndarray¶

Mimics flow.show() for an input image and a mask, i.e. uses the OpenCV imshow() method to show the image in a window. Useful for debugging purposes or to quickly visualise an image, even when no mask is involved, as it is easier than typing out all the OpenCV commands every time

Parameters

img – Torch tensor of shape \((3, H, W)\) or numpy array of shape \((H, W, 3)\), BGR input image
mask – Torch tensor or numpy array of shape \((H, W)\), boolean mask showing the valid area

Returns

Masked image, in BGR colour space

oflibpytorch.apply_s_flow(flow: torch.Tensor, data: torch.Tensor, mask: Optional[torch.Tensor] = None, occlude_zero_flow: Optional[bool] = None) → torch.Tensor¶

Warp data with a given flow of ref s (“forward flow”), making use of the inverse bilinear interpolation function grid_from_unstructured_data() as a replacement for scipy.interpolate.griddata().

The warped data output is differentiable with respect to the input flow and the input data

Parameters

flow – Float tensor of shape \((N, 2, H, W)\), the input flow with reference s
data – Float tensor of shape \((N, C, H, W)\), the data to be warped
mask – Boolean tensor of shape \((N, H, W)\), the mask belonging to the flow (optional)
occlude_zero_flow – Boolean determining whether data locations with corresponding flow of zero are occluded (overwritten) by other data moved to the same location. Rationale: if the order of objects were reversed, i.e. the zero flow points occlude the non-zero flow points, the latter wouldn’t be known in the first place. This logic breaks down when the flow points concerned have been inferred, e.g. from surrounding non-occluded known points. Setting it to False will likely lead to unusual artefacts in the output. Defaults to True

Returns

Warped data as float tensor of shape \((N, C, H, W)\), mask of where data points were warped to as bool tensor of shape \((N, H, W)\). If occlude_zero_flow is True, this mask does not include zero flow points as they have not been used in the interpolation to avoid artefacts.

oflibpytorch.grid_from_unstructured_data(x: torch.Tensor, y: torch.Tensor, data: torch.Tensor, mask: Optional[torch.Tensor] = None) → tuple¶

Returns unstructured input data interpolated (likely sparsely) onto a regular grid. Replacement for the SciPy griddata function, but less accurate. Equivalent to inverse bilinear interpolation. Credit:

This is based on the algorithm suggested in: Sánchez, J., Salgado de la Nuez, A. J., & Monzón, N., “Direct estimation of the backward flow”, 2013
The code implementation is a heavily reworked version of code suggested by Markus Hofinger in private correspondence, a version of which he first used in: Hofinger, M., Bulò, S. R., Porzi, L., Knapitsch, A., Pock, T., & Kontschieder, P., “Improving optical flow on a pyramid level”. ECCV 2020
Markus Hofinger in turn credits the function _flow2distribution from the HD3 code base as inspiration, used in: Yin, Z., Darrell, T., & Yu, F., “Hierarchical discrete distribution decomposition for match density estimation”, CVPR 2019

The interpolated data as well as the interpolation density outputs are differentiable with respect to the input data position grids and the data itself.

Parameters

x – Horizontal data position grid, shape \((N, H, W)\)
y – Vertical data position grid, shape \((N, H, W)\)
data – Data value grid, shape \((N, C, H, W)\)
mask – Tensor masking data positions that should be ignored for the interpolation, shape \((N, H, W)\)

Returns

Tensor of data interpolated on regular grid of shape \((N, C, H, W)\), Tensor of interpolation density of shape \((N, H, W)\)
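
Inverse bilinear interpolation can be pictured as "splatting": each data point distributes its value to the four surrounding grid nodes using bilinear weights, the weights are accumulated as the interpolation density, and the accumulated values are divided by that density. The NumPy sketch below illustrates this principle for a single unbatched, unmasked input. It is not the library's PyTorch implementation, it is not differentiable, and the name splat_sketch is hypothetical.

```python
import numpy as np

def splat_sketch(x, y, data):
    """Inverse bilinear interpolation of scattered points onto an (H, W) grid.

    x, y: float arrays of shape (H, W), horizontal / vertical data positions
    data: float array of shape (C, H, W), the values located at (x, y)
    Returns (interpolated data of shape (C, H, W), density of shape (H, W)).
    """
    c, h, w = data.shape
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    accum = np.zeros((c, h, w))
    density = np.zeros((h, w))
    # Each point contributes to its four surrounding grid nodes
    for dy in (0, 1):
        for dx in (0, 1):
            xi, yi = x0 + dx, y0 + dy
            # Bilinear weight: 1 at the node itself, falling to 0 one pixel away
            weight = (1 - np.abs(x - xi)) * (1 - np.abs(y - yi))
            # Contributions to nodes outside the grid are dropped
            inside = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
            np.add.at(density, (yi[inside], xi[inside]), weight[inside])
            for ch in range(c):
                np.add.at(accum[ch], (yi[inside], xi[inside]),
                          (weight * data[ch])[inside])
    # Normalise by the accumulated weights wherever any data arrived
    filled = density > 0
    accum[:, filled] /= density[filled]
    return accum, density

# Shift a 3 x 3 grid of values one pixel to the right
ys, xs = np.mgrid[:3, :3].astype(float)
data = np.arange(9, dtype=float).reshape(1, 3, 3)
grid, density = splat_sketch(xs + 1, ys, data)
```

In this example the first output column receives no data (density 0), while the other columns hold the values shifted in from their left neighbours. Grid nodes with zero density are the sparse gaps mentioned in the description.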