Documentation
This documentation goes beyond a simple listing of parameters and explains some of the principles behind the functions. For more details and usage examples, including code and flow field visualisations, see the section “Usage”.
“Pure PyTorch” Setting
PURE_PYTORCH is a toolbox-wide boolean variable. It lets the user choose whether oflibpytorch makes use of a more accurate, but much slower and non-differentiable SciPy-based function, or a less precise, but significantly faster and fully differentiable PyTorch-only function. By default, PURE_PYTORCH is set to True.
-
oflibpytorch.get_pure_pytorch()
Returns the state of the toolbox-wide boolean variable PURE_PYTORCH. If True, a PyTorch-only method replaces scipy.interpolate.griddata(). The latter, while significantly slower and not differentiable, provides a more accurate result.
-
oflibpytorch.set_pure_pytorch(warn: Optional[bool] = None)
Set the state of the toolbox-wide boolean variable PURE_PYTORCH to True. This means a faster PyTorch-only method is used instead of scipy.interpolate.griddata(), giving a speed increase of about an order of magnitude. It also means all main methods that output a tensor are differentiable in the PyTorch context. However, the results are less accurate.
- Parameters
warn – Boolean determining whether a warning is printed to the console. Useful for debugging, defaults to False
-
oflibpytorch.unset_pure_pytorch(warn: Optional[bool] = None)
Set the state of the toolbox-wide boolean variable PURE_PYTORCH to False. This means scipy.interpolate.griddata() is used instead of a faster PyTorch-only method. The results will be more accurate, but computation will be about an order of magnitude slower. Most importantly, not all methods will be differentiable anymore.
- Parameters
warn – Boolean determining whether a warning is printed to the console. Useful for debugging, defaults to False
Using the Flow Class
This section documents the custom flow class and all its class methods. It is the recommended way of using oflibpytorch and makes the full range of functionality available to the user.
Flow Constructors and Operators
-
class oflibpytorch.Flow(flow_vectors: Union[numpy.ndarray, torch.Tensor], ref: Optional[str] = None, mask: Optional[Union[numpy.ndarray, torch.Tensor]] = None, device: Optional[Union[torch.device, int, str]] = None)
-
__init__(flow_vectors: Union[numpy.ndarray, torch.Tensor], ref: Optional[str] = None, mask: Optional[Union[numpy.ndarray, torch.Tensor]] = None, device: Optional[Union[torch.device, int, str]] = None)
Flow object constructor. For a more detailed explanation of the arguments, see the class attributes vecs, ref, mask, and device.
- Parameters
flow_vectors – Numpy array or PyTorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\). A ValueError is thrown if neither interpretation is possible. The dimension of size 2 (the channel dimension) contains the flow vector in OpenCV convention: flow_vectors[..., 0] are the horizontal, flow_vectors[..., 1] are the vertical vector components, defined as positive when pointing to the right / down.
ref – Flow reference, either t for “target”, or s for “source”. Defaults to t
mask – Numpy array or PyTorch tensor of shape \((H, W)\) containing a boolean mask indicating where the flow vectors are valid. Defaults to True everywhere
device – Tensor device, either a torch.device or a valid input to torch.device(), such as a string (cpu or cuda). For a device of type cuda, the device index defaults to torch.cuda.current_device(). If the input is None, it defaults to the device of the given flow_vectors, or torch.device('cpu') if the flow_vectors are a numpy array
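The order in which the constructor tries the possible layouts can be sketched without any dependencies. The helper `interpret_flow_shape` below is hypothetical and not part of oflibpytorch; it only mirrors the rule described for flow_vectors (channel-first layouts are tried first, channel-last layouts second, anything else is rejected).

```python
def interpret_flow_shape(shape):
    """Return the layout implied by the constructor description for a shape.

    Hypothetical helper for illustration only, not the library's own code.
    """
    shape = tuple(shape)
    if len(shape) not in (3, 4):
        raise ValueError("flow vectors need 3 or 4 dimensions")
    # Channel-first is preferred: the third dimension from the end has size 2
    if shape[-3] == 2:
        return "(2, H, W)" if len(shape) == 3 else "(N, 2, H, W)"
    # Otherwise fall back to channel-last: the final dimension has size 2
    if shape[-1] == 2:
        return "(H, W, 2)" if len(shape) == 3 else "(N, H, W, 2)"
    raise ValueError("no dimension of size 2 in a valid channel position")


layout_ambiguous = interpret_flow_shape((2, 5, 2))  # both readings fit, channel-first wins
```

A shape such as \((2, 5, 2)\) fits both readings, so passing an unambiguous layout is the safer choice when \(H\) or \(W\) can be 2.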
-
property vecs
Flow vectors, a torch tensor of shape \((N, 2, H, W)\). The first dimension contains the batch size, the second the flow vectors. These are in the order horizontal component first, vertical component second (OpenCV convention). They are defined as positive towards the right and the bottom, meaning the origin is located in the upper left corner of the \(H \times W\) flow field area.
- Returns
Flow vectors as torch tensor of shape \((N, 2, H, W)\), dtype float, device self.device
-
property vecs_numpy
Convenience property to get the flow vectors as a numpy array of shape \((N, H, W, 2)\). Otherwise same as vecs: the last dimension contains the flow vectors, in the order of horizontal component first, vertical component second (OpenCV convention). They are defined as positive towards the right and the bottom, meaning the origin is located in the upper left corner of the \(H \times W\) flow field area.
- Returns
Flow vectors as a numpy array of shape \((N, H, W, 2)\), dtype float32
-
property ref
Flow reference, a string: either s for “source” or t for “target”. This determines whether the regular grid of shape \((H, W)\) associated with the flow vectors should be understood as the source of the vectors (which then point to any other position), or the target of the vectors (whose start point can then be any other position). The flow reference t is the default, meaning the regular grid holds the positions at which the pixels whose motion is recorded by the vectors end up.
Applying a flow with reference s is known as “forward” warping, while reference t corresponds to what is termed “backward” or “reverse” warping.
Caution
If PURE_PYTORCH is set to False, calling apply() on a flow field with reference s (“source”) requires a call to scipy.interpolate.griddata(), which is quite slow. Using a flow field with reference t avoids this and will therefore be significantly faster. Similarly, calling track() on a flow field with reference t (“target”) also requires a call to scipy.interpolate.griddata(), in which case using a flow field with reference s instead is faster.
If PURE_PYTORCH is True, the call to scipy.interpolate.griddata() is replaced with a PyTorch-only interpolation function which will yield slightly less accurate results, but avoids any speed penalty and, most notably, is differentiable.
Tip
If some algorithm get_flow() is set up to calculate a flow field with reference t (or s) as in flow_one_ref = get_flow(img1, img2), it is very simple to obtain the flow in reference s (or t) instead: simply call the algorithm with the images in the reversed order, and multiply the resulting flow vectors by -1: flow_other_ref = -1 * get_flow(img2, img1)
- Returns
Flow reference, as string of value t or s
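The difference between the two references can be made concrete with a dependency-free sketch. The functions `warp_target_ref` and `warp_source_ref` are hypothetical illustrations, not library code: they only handle integer flow vectors, leave unreached pixels at 0, and do no interpolation. With real-valued vectors, the "push" variant lands between grid points, which is why applying a flow with reference s needs scattered-data interpolation.

```python
def warp_target_ref(image, flow):
    """Backward warping (reference t): each output pixel pulls its value.

    flow[y][x] = (dx, dy) is anchored on the output grid, so the value at
    (x, y) is read from (x - dx, y - dy) in the input image.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            sx, sy = x - dx, y - dy
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out


def warp_source_ref(image, flow):
    """Forward warping (reference s): each input pixel pushes its value.

    flow[y][x] = (dx, dy) is anchored on the input grid, so the value at
    (x, y) is written to (x + dx, y + dy) in the output image.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            tx, ty = x + dx, y + dy
            if 0 <= tx < w and 0 <= ty < h:
                out[ty][tx] = image[y][x]
    return out


image = [[1, 2, 3],
         [4, 5, 6]]
shift_right = [[(1, 0)] * 3 for _ in range(2)]  # one pixel to the right everywhere

pulled = warp_target_ref(image, shift_right)
pushed = warp_source_ref(image, shift_right)
```

For a uniform translation both variants give the same image; they differ as soon as the vectors vary across the field.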
-
property mask
Flow mask as a torch tensor of shape \((N, H, W)\) and type bool. This array indicates, for each flow vector, whether it is considered “valid”. As an example, this allows for masking of the flow based on object segmentations. It is also necessary to keep track of which flow vectors are valid when different flow fields are combined, as those operations often lead to points in the given \(H \times W\) area where the flow vectors are either completely unknown, or will not have valid values.
- Returns
Flow mask as a torch tensor of shape \((N, H, W)\) and type bool
-
property mask_numpy
Convenience property to get the mask as a numpy array of shape \((N, H, W)\). Otherwise same as mask: this array indicates, for each flow vector, whether it is considered “valid”. As an example, this allows for masking of the flow based on object segmentations. It is also necessary to keep track of which flow vectors are valid when different flow fields are combined, as those operations often lead to points in the given \(H \times W\) area where the flow vectors are either completely unknown, or will not have valid values.
- Returns
Mask as a numpy array of shape \((N, H, W)\) and type bool
-
property
device
¶ The device of all flow object tensors, as a
torch.device
- Returns
Tensor device as a
torch.device
-
property
shape
¶ Shape (resolution) \((N, H, W)\) of the flow, corresponding to the batch size (can be 1) and the flow field shape \((H, W)\)
- Returns
Tuple of the shape (resolution) \((N, H, W)\) of the flow object
-
classmethod zero(shape: Union[list, tuple], ref: Optional[str] = None, mask: Optional[Union[numpy.ndarray, torch.Tensor]] = None, device: Optional[Union[torch.device, int, str]] = None) → Flow
Flow object constructor, zero everywhere
- Parameters
shape – List or tuple of the shape \((H, W)\) or \((N, H, W)\) of the flow field
ref – Flow reference, string of value t (“target”) or s (“source”). Defaults to t
mask – Numpy array or torch tensor of shape \((H, W)\) or \((N, H, W)\) and type bool indicating where the flow vectors are valid. Defaults to True everywhere
device – Tensor device, either a torch.device or a valid input to torch.device(), such as a string (cpu or cuda). For a device of type cuda, the device index defaults to torch.cuda.current_device(). If the input is None, it defaults to torch.device('cpu')
- Returns
Flow object, zero everywhere
-
classmethod from_matrix(matrix: Union[numpy.ndarray, torch.Tensor], shape: Union[list, tuple], ref: Optional[str] = None, mask: Optional[Union[numpy.ndarray, torch.Tensor]] = None, device: Optional[Union[torch.device, int, str]] = None, matrix_is_inverse: Optional[bool] = None) → Flow
Flow object constructor, based on transformation matrix input
The output flow vectors are differentiable with respect to the input matrix, if given as a tensor.
- Parameters
matrix – Transformation matrix to be turned into a flow field, as numpy array or torch tensor of shape \((3, 3)\) or \((N, 3, 3)\)
shape – List or tuple of the shape \((H, W)\) of the flow field
ref – Flow reference, string of value t (“target”) or s (“source”). Defaults to t
mask – Numpy array or torch tensor of shape \((H, W)\) and type bool indicating where the flow vectors are valid. Defaults to True everywhere
device – Tensor device, either a torch.device or a valid input to torch.device(), such as a string (cpu or cuda). For a device of type cuda, the device index defaults to torch.cuda.current_device(). If the input is None, it defaults to torch.device('cpu')
matrix_is_inverse – Boolean determining whether the given matrix is already the inverse of the desired transformation. This is useful for a flow with reference t, as it avoids calculating the pseudo-inverse. To prevent accidental usage, a ValueError is thrown if it is used for a flow with reference s. Defaults to False
- Returns
Flow object
-
classmethod from_transforms(transform_list: list, shape: Union[list, tuple], ref: Optional[str] = None, mask: Optional[Union[numpy.ndarray, torch.Tensor]] = None, device: Optional[Union[torch.device, int, str]] = None, padding: Optional[list] = None) → Flow
Flow object constructor, based on a list of transforms. If padding values are given, the given shape is padded accordingly. The transform values are also adjusted, e.g. by shifting scaling and rotation centres.
- Parameters
transform_list –
List of transforms to be turned into a flow field, where each transform is expressed as a list of [transform name, transform value 1, … , transform value n]. Supported options:
Transform translation, with values horizontal shift in px, vertical shift in px
Transform rotation, with values horizontal centre in px, vertical centre in px, angle in degrees, counter-clockwise
Transform scaling, with values horizontal centre in px, vertical centre in px, scaling fraction
shape – List or tuple of the shape \((H, W)\) of the flow field
ref – Flow reference, string of value t (“target”) or s (“source”). Defaults to t
mask – Numpy array or torch tensor of shape \((H, W)\) and type bool indicating where the flow vectors are valid. Defaults to True everywhere
device – Tensor device, either a torch.device or a valid input to torch.device(), such as a string (cpu or cuda). For a device of type cuda, the device index defaults to torch.cuda.current_device(). If the input is None, it defaults to torch.device('cpu')
padding – List or tuple of shape \((4)\) with padding values [top, bot, left, right]
- Returns
Flow object
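To show what such a transform list encodes, the sketch below turns it into a single \(3 \times 3\) matrix in pure Python. All helper names are hypothetical and this is not the library's implementation. Two assumptions are made: transforms are applied in list order, and a positive angle rotates counter-clockwise as seen on screen, with the vertical axis pointing down.

```python
import math


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(dx, dy):
    return [[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]]


def transform_to_matrix(transform):
    """Convert one [name, value 1, ..., value n] entry to a 3x3 matrix."""
    name, *values = transform
    if name == "translation":
        dx, dy = values
        return translation(dx, dy)
    if name not in ("rotation", "scaling"):
        raise ValueError(f"unknown transform: {name}")
    cx, cy, amount = values
    if name == "rotation":
        c = math.cos(math.radians(amount))
        s = math.sin(math.radians(amount))
        # Counter-clockwise on screen when the vertical axis points down
        core = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
    else:
        core = [[amount, 0.0, 0.0], [0.0, amount, 0.0], [0.0, 0.0, 1.0]]
    # Move the centre to the origin, transform, then move it back
    return matmul3(translation(cx, cy), matmul3(core, translation(-cx, -cy)))


def transforms_to_matrix(transform_list):
    """Chain all transforms; assumed here to be applied in list order."""
    matrix = translation(0.0, 0.0)  # identity
    for transform in transform_list:
        matrix = matmul3(transform_to_matrix(transform), matrix)
    return matrix


def apply_matrix(matrix, x, y):
    """Map a point (x horizontal, y vertical) through the matrix."""
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])
```

A matrix built this way has the \((3, 3)\) shape that from_matrix() accepts.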
-
classmethod from_kitti(path: str, load_valid: Optional[bool] = None, device: Optional[Union[torch.device, int, str]] = None) → Flow
Loads the flow field contained in KITTI uint16 png image files, optionally including the valid pixels. Follows the official instructions on how to read the provided .png files on the KITTI optical flow dataset website.
- Parameters
path – String containing the path to the KITTI flow data (uint16, .png file)
load_valid – Boolean determining whether the valid pixels are loaded as the flow mask. Defaults to True
device – Tensor device, either a torch.device or a valid input to torch.device(), such as a string (cpu or cuda). For a device of type cuda, the device index defaults to torch.cuda.current_device(). If the input is None, it defaults to torch.device('cpu')
- Returns
A flow object corresponding to the KITTI flow data, with flow reference s.
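For orientation, the per-pixel decoding described in the KITTI development kit can be sketched as follows. `decode_kitti_pixel` is a hypothetical helper, not the loader used by the library. It assumes the three uint16 channels are passed in file order (R, G, B); image readers that return channels in a different order need the arguments swapped accordingly.

```python
def decode_kitti_pixel(r, g, b):
    """Decode one uint16 RGB pixel of a KITTI flow .png file.

    The first channel stores the horizontal component, the second the
    vertical component, and the third flags pixels with valid ground truth.
    """
    u = (r - 2 ** 15) / 64.0  # horizontal flow in px
    v = (g - 2 ** 15) / 64.0  # vertical flow in px
    valid = b > 0
    return u, v, valid


still_pixel = decode_kitti_pixel(32768, 32768, 1)         # zero motion, valid
moving_pixel = decode_kitti_pixel(32768 + 64, 32768 - 128, 0)  # invalid pixel
```

The valid flag is what load_valid turns into the flow mask.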
-
classmethod from_sintel(path: str, inv_path: Optional[str] = None, device: Optional[Union[torch.device, int, str]] = None) → Flow
Loads the flow field contained in Sintel .flo byte files, including the invalid pixels if required. Follows the official instructions provided alongside the .flo data on the Sintel optical flow dataset website.
- Parameters
path – String containing the path to the Sintel flow data (.flo byte file, little-endian)
inv_path – String containing the path to the Sintel invalid pixel data (.png, black and white)
device – Tensor device, either a torch.device or a valid input to torch.device(), such as a string (cpu or cuda). For a device of type cuda, the device index defaults to torch.cuda.current_device(). If the input is None, it defaults to torch.device('cpu')
- Returns
A flow object corresponding to the Sintel flow data, with flow reference s
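The .flo layout is simple enough to parse with the standard library. The sketch below is an illustration, not the library's loader: `read_flo` is a hypothetical name, and the layout assumed is a little-endian float32 check value of 202021.25, then int32 width and height, then interleaved horizontal and vertical float32 components in row-major order.

```python
import io
import struct

FLO_MAGIC = 202021.25  # sanity-check value at the start of every .flo file


def read_flo(stream):
    """Parse a .flo byte stream into rows of (horizontal, vertical) tuples."""
    (magic,) = struct.unpack("<f", stream.read(4))
    if magic != FLO_MAGIC:
        raise ValueError("invalid .flo header")
    width, height = struct.unpack("<ii", stream.read(8))
    count = 2 * width * height
    values = struct.unpack(f"<{count}f", stream.read(4 * count))
    rows = []
    for y in range(height):
        start = 2 * y * width
        row = values[start:start + 2 * width]
        rows.append([(row[2 * x], row[2 * x + 1]) for x in range(width)])
    return rows


# A hand-built flow field of width 2 and height 1, standing in for a real file
payload = (struct.pack("<f", FLO_MAGIC)
           + struct.pack("<ii", 2, 1)
           + struct.pack("<4f", 1.0, 2.0, -0.5, 0.25))
flow_rows = read_flo(io.BytesIO(payload))
```

For a real file, the stream would come from `open(path, "rb")`.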
-
copy() → Flow
Copy a flow object by constructing a new one with the same vectors vecs, reference ref, mask mask, and device device
The output flow vectors are differentiable with respect to the input flow vectors.
- Returns
Copy of the flow object
-
to_device(device) → Flow
Returns a new flow object on the desired torch device
The output flow vectors are differentiable with respect to the input flow vectors.
- Parameters
device – Tensor device, either a torch.device or a valid input to torch.device(), such as a string (cpu or cuda). For a device of type cuda, the device index defaults to torch.cuda.current_device(). If the input is None, it defaults to torch.device('cpu')
- Returns
New flow object on the desired torch device
-
__str__() → str
Enhanced string representation of the flow object, containing the flow reference ref, shape shape, and device device
- Returns
String representation
-
select
(item: Optional[int] = None) → Flow¶ Returns a single-item flow object from a batched flow object, e.g. for iterating through or visualising
The output flow vectors are differentiable with respect to the input flow vectors.
-
__getitem__(item: Union[int, list, slice]) → Flow
Mimics __getitem__ of a torch tensor, returning a new flow object cut accordingly
The output flow vectors are differentiable with respect to the input flow vectors.
Will throw an error if mask.__getitem__(item) or vecs.__getitem__(item) (corresponding to mask[item] and vecs[item]) throw an error. Also throws an error if the sliced vecs or mask are not suitable for constructing a new flow object, e.g. if the number of dimensions is too low.
- Parameters
item – Slice used to select a part of the flow
- Returns
New flow object cut as a corresponding torch tensor would be cut
-
__add__(other: Union[numpy.ndarray, torch.Tensor, Flow]) → Flow
Adds a flow object, a numpy array, or a torch tensor to a flow object
The output flow vectors are differentiable with respect to the input flow vectors.
Caution
This is not equal to applying the two flows sequentially. For that, use combine_flows() with mode set to 3.
Caution
If this method is used to add two flow objects, there is no check on whether they have the same reference ref.
- Parameters
other – Flow object, numpy array, or torch tensor corresponding to the addend. Adding a flow object will adjust the mask of the resulting flow object to correspond to the logical union of the augend / addend masks. If a batch dimension is given, it has to match the batch dimension of the flow object, or one of them needs to be 1 in order to be broadcast correctly
- Returns
New flow object corresponding to the sum
-
__sub__(other: Union[numpy.ndarray, torch.Tensor, Flow]) → Flow
Subtracts a flow object, a numpy array, or a torch tensor from a flow object
The output flow vectors are differentiable with respect to the input flow vectors.
Caution
This is not equal to subtracting the effects of applying flow fields to an image. For that, use combine_flows() with mode set to 1 or 2.
Caution
If this method is used to subtract two flow objects, there is no check on whether they have the same reference ref.
- Parameters
other – Flow object, numpy array, or torch tensor corresponding to the subtrahend. Subtracting a flow object will adjust the mask of the resulting flow object to correspond to the logical union of the minuend / subtrahend masks. If a batch dimension is given, it has to match the batch dimension of the flow object, or one of them needs to be 1 in order to be broadcast correctly
- Returns
New flow object corresponding to the difference
-
__mul__
(other: Union[float, int, bool, list, numpy.ndarray, torch.Tensor]) → Flow¶ Multiplies a flow object with a single number, a list, a numpy array, or a torch tensor
The output flow vectors are differentiable with respect to the input flow vectors.
- Parameters
other –
Multiplier, options:
can be converted to a float
a list of shape \((2)\)
a numpy array or torch tensor of the same shape \((H, W)\) as the flow object
a numpy array or torch tensor of the same shape \((H, W, 2)\) or \((2, H, W)\) as the flow object
a numpy array or torch tensor of the same shape \((N, 2, H, W)\) as the flow object
- Returns
New flow object corresponding to the product
-
__truediv__
(other: Union[float, int, bool, list, numpy.ndarray, torch.Tensor]) → Flow¶ Divides a flow object by a single number, a list, a numpy array, or a torch tensor
The output flow vectors are differentiable with respect to the input flow vectors.
- Parameters
other –
Divisor, options:
can be converted to a float
a list of shape \((2)\)
a numpy array or torch tensor of the same shape \((H, W)\) as the flow object
a numpy array or torch tensor of the same shape \((H, W, 2)\) or \((2, H, W)\) as the flow object
a numpy array or torch tensor of the same shape \((N, 2, H, W)\) as the flow object
- Returns
New flow object corresponding to the quotient
-
__pow__
(other: Union[float, int, bool, list, numpy.ndarray, torch.Tensor]) → Flow¶ Exponentiates a flow object by a single number, a list, a numpy array, or a torch tensor
The output flow vectors are differentiable with respect to the input flow vectors.
- Parameters
other –
Exponent, options:
can be converted to a float
a list of shape \((2)\)
a numpy array or torch tensor of the same shape \((H, W)\) as the flow object
a numpy array or torch tensor of the same shape \((H, W, 2)\) or \((2, H, W)\) as the flow object
a numpy array or torch tensor of the same shape \((N, 2, H, W)\) as the flow object
- Returns
New flow object corresponding to the power
-
__neg__() → Flow
Returns a new flow object with all the flow vectors inverted
The output flow vectors are differentiable with respect to the input flow vectors.
Caution
This is not equal to inverting the transformation a flow field corresponds to! For that, use invert().
- Returns
New flow object with inverted flow vectors
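Why negating the vectors is not the same as inverting the transformation can be checked by hand on a rotation. The sketch below is a dependency-free illustration with a hypothetical `rotation_flow` function. It describes a flow in the "source" sense: the vector at a point is the displacement of the pixel that starts there.

```python
def rotation_flow(x, y):
    """Displacement of a 90 degree rotation about the origin: (x, y) -> (y, -x)."""
    return (y - x, -x - y)


start = (1.0, 0.0)
dx, dy = rotation_flow(*start)
end = (start[0] + dx, start[1] + dy)  # where the pixel lands

# The true inverse, evaluated at the end point, must lead back to the start
true_inverse_at_end = (start[0] - end[0], start[1] - end[1])

# Negating the flow only flips the vector that is stored at the end point
nx, ny = rotation_flow(*end)
negated_at_end = (-nx, -ny)
```

Here the true inverse at the end point is (1.0, 1.0), while the negated flow gives (1.0, -1.0): the two only agree when the flow is the same everywhere, as with a pure translation.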
-
Manipulating the Flow
-
Flow.resize(scale: Union[float, int, list, tuple]) → Flow
Resize a flow field, scaling the flow vector values vecs accordingly.
The output flow vectors are differentiable with respect to the input flow vectors.
- Parameters
scale –
Scale used for resizing, options:
Integer or float of value scaling applied both vertically and horizontally
List or tuple of shape \((2)\) with values [vertical scaling, horizontal scaling]
- Returns
New flow object scaled as desired
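The effect of resizing on a single flow vector can be sketched as below. `resize_sketch` is a hypothetical helper, not library code: it assumes the output shape is obtained by rounding, and that horizontal components are multiplied by the horizontal scale and vertical components by the vertical scale.

```python
def resize_sketch(shape, vector, scale):
    """Return the resized (H, W) shape and the rescaled (dx, dy) vector."""
    if isinstance(scale, (int, float)):
        scale_v = scale_h = float(scale)
    else:
        # Same order as the scale parameter: [vertical, horizontal]
        scale_v, scale_h = (float(s) for s in scale)
    h, w = shape
    dx, dy = vector  # horizontal component first, vertical second
    new_shape = (round(h * scale_v), round(w * scale_h))
    return new_shape, (dx * scale_h, dy * scale_v)


halved = resize_sketch((100, 200), (4.0, -2.0), 0.5)
anisotropic = resize_sketch((100, 200), (4.0, -2.0), [2, 0.5])
```

Note the differing orders: the scale list is [vertical, horizontal], while the flow vectors store the horizontal component first.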
-
Flow.pad(padding: Optional[list] = None, mode: Optional[str] = None) → Flow
Pad the flow with the given padding. Padded flow vecs values are either constant (set to 0), reflect the existing flow values along the edges, or replicate those edge values. Padded mask values are set to False.
The output flow vectors are differentiable with respect to the input flow vectors.
- Parameters
padding – List or tuple of shape \((4)\) with padding values [top, bot, left, right]
mode – String of the padding mode for the flow vectors, with options constant (fill value 0), reflect, replicate (see documentation for torch.nn.functional.pad()). Defaults to constant
- Returns
New flow object with the padded flow field
-
Flow.unpad(padding: Optional[list] = None) → Flow
Cuts the flow according to the padding values, effectively undoing the effect of pad()
The output flow vectors are differentiable with respect to the input flow vectors.
- Parameters
padding – List or tuple of shape \((4)\) with padding values [top, bot, left, right]
- Returns
New flow object, cut according to the padding values
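The relationship between padding and unpadding can be illustrated on a plain nested-list mask. The helpers `pad_mask` and `unpad_rows` are hypothetical and only mirror the [top, bot, left, right] convention and the False fill value for padded mask entries described above.

```python
def pad_mask(mask, padding):
    """Pad a 2D boolean mask with False, using [top, bot, left, right]."""
    top, bot, left, right = padding
    width = len(mask[0]) + left + right
    rows = [[False] * width for _ in range(top)]
    rows += [[False] * left + list(row) + [False] * right for row in mask]
    rows += [[False] * width for _ in range(bot)]
    return rows


def unpad_rows(rows, padding):
    """Cut the padded border off again, undoing pad_mask."""
    top, bot, left, right = padding
    h, w = len(rows), len(rows[0])
    return [row[left:w - right] for row in rows[top:h - bot]]


mask = [[True, True],
        [True, True]]
padding = [1, 0, 2, 1]
padded = pad_mask(mask, padding)
restored = unpad_rows(padded, padding)
```

Unpadding with the same values restores the original mask, which is the sense in which unpad() undoes pad().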
-
Flow.invert(ref: Optional[str] = None) → Flow
Inverting a flow: img1 – f –> img2 becomes img1 <– f – img2. The smaller the input flow, the closer the inverse is to simply multiplying the flow by -1.
If the toolbox-wide variable PURE_PYTORCH is set to True (default, see also set_pure_pytorch()), the output flow field vectors are differentiable with respect to the input flow field vectors.
- Parameters
ref – Desired reference of the output field, defaults to the reference of the original flow field
- Returns
New flow object, inverse of the original
-
Flow.switch_ref(mode: Optional[str] = None) → Flow
Switch the reference ref between s (“source”) and t (“target”)
If the toolbox-wide variable PURE_PYTORCH is set to True (default, see also set_pure_pytorch()), the output flow field vectors are differentiable with respect to the input flow field vectors.
Caution
Do not use mode=invalid if avoidable: it does not actually change any flow values, and the resulting flow object, when applied to an image, will no longer yield the correct result.
- Parameters
mode –
Mode used for switching, available options:
invalid: just the flow reference attribute is switched without any flow values being changed. This is functionally equivalent to simply assigning flow.ref = 't' for a “source” flow or flow.ref = 's' for a “target” flow
valid: the flow field is switched to the other coordinate reference, with flow vectors recalculated accordingly
- Returns
New flow object with switched coordinate reference
-
Flow.combine(other: Flow, mode: int, ref: Optional[str] = None) → Flow
Returns the result of combining two flow objects of the same shape, in whichever reference ref is required.
If the toolbox-wide variable PURE_PYTORCH is set to True (default, see also set_pure_pytorch()), the output flow field vectors are differentiable with respect to the input flow fields.
Tip
All of the flow field combinations in this function rely on some combination of the apply(), invert(), and switch_ref() methods. If PURE_PYTORCH is set to False, some of these methods will call scipy.interpolate.griddata(), which can be very slow (several seconds) - but the result will be more accurate compared to using the PyTorch-only setting.
All formulas used in this function have been derived from first principles. The base formula is \(flow_1 ⊕ flow_2 = flow_3\), where \(⊕\) is a non-commutative flow composition operation. This can be visualised with the start / end points of the flows as follows:
S = Start point    S1 = S3 ─────── f3 ────────────┐
E = End point       │                             │
f = flow           f1                             v
                    └───> E1 = S2 ── f2 ──> E2 = E3
The main difficulty in combining flow fields is that it would be incorrect to simply add up or subtract flow vectors at one location in the flow field area \(H \times W\). This appears to work given e.g. a translation to the right, and a translation downwards: the result will be the linear combination of the two vectors, or a translation towards the bottom right. However, looking more closely, it becomes evident that this approach isn’t actually correct: A pixel that has been moved from S1 to E1 by the first flow field f1 is then moved from that location by the flow vector of the flow field f2 that corresponds to the new pixel location E1, not the original location S1. If the flow vectors are the same everywhere in the field, the difference will not be noticeable. However, if the flow vectors of f2 vary throughout the field, such as with a rotation around some point, it will!
In this case (corresponding to calling f1.combine(f2, mode=3)), and if the flow reference ref is s (“source”), the solution is to first apply the inverse of f1 to f2, essentially linking up each location E1 back to S1, and then to add up the flow vectors. Analogous observations apply for the other permutations of flow combinations and reference ref values.
Note
This is consistent with the observation that two translations are commutative in their application - the order does not matter, and the vectors can simply be added up at every pixel location -, while a translation followed by a rotation is not the same as a rotation followed by a translation: adding up vectors at each pixel cannot be the correct solution as there wouldn’t be a difference based on the order of vector addition.
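The argument can be verified numerically with two analytic flows. The sketch below is a pure-Python illustration with hypothetical names, not the algorithm used by the library. Flows are modelled in the "source" sense, as functions returning the displacement of the pixel that starts at a given point, so the composed vector at S1 is f1 at S1 plus f2 evaluated at E1.

```python
def translation_flow(x, y):
    """Uniform shift of 10 px to the right."""
    return (10.0, 0.0)


def rotation_flow(x, y):
    """Displacement of a 90 degree rotation about the origin: (x, y) -> (y, -x)."""
    return (y - x, -x - y)


def compose(f1, f2):
    """Flow f3 so that applying f3 equals applying f1, then f2."""
    def f3(x, y):
        d1x, d1y = f1(x, y)
        d2x, d2y = f2(x + d1x, y + d1y)  # f2 is looked up at E1, not at S1
        return (d1x + d2x, d1y + d2y)
    return f3


def naive_sum(f1, f2):
    """Incorrect combination: both flows are looked up at S1."""
    def f3(x, y):
        d1x, d1y = f1(x, y)
        d2x, d2y = f2(x, y)
        return (d1x + d2x, d1y + d2y)
    return f3


point = (1.0, 0.0)
shift_then_rotate = compose(translation_flow, rotation_flow)(*point)
rotate_then_shift = compose(rotation_flow, translation_flow)(*point)
summed = naive_sum(translation_flow, rotation_flow)(*point)
```

At the point (1, 0), shifting then rotating gives (-1.0, -11.0), while the naive sum gives (9.0, -1.0). The naive sum only matches the rotate-then-shift order, where the second flow is uniform.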
- Parameters
other – Flow object to combine with, shape (including batch size) needs to match
mode –
Integer determining how the input flows are combined, where the number corresponds to the position in the formula \(flow_1 ⊕ flow_2 = flow_3\):
Mode 1: self corresponds to \(flow_2\), other corresponds to \(flow_3\), the result will be \(flow_1\)
Mode 2: self corresponds to \(flow_1\), other corresponds to \(flow_3\), the result will be \(flow_2\)
Mode 3: self corresponds to \(flow_1\), other corresponds to \(flow_2\), the result will be \(flow_3\)
ref – Desired output flow reference, defaults to the reference of self
- Returns
New flow object
-
Flow.combine_with(flow: Flow, mode: int, thresholded: Optional[bool] = None) → Flow
Returns the result of combining two flow objects of the same shape and reference ref.
If the toolbox-wide variable PURE_PYTORCH is set to True (default, see also set_pure_pytorch()), the output flow field vectors are differentiable with respect to the input flow fields.
Caution
This method will be deprecated in future in favour of combine(), which uses a more general algorithm that can both combine and output flow objects in any reference frame.
Tip
All of the flow field combinations in this function rely on some combination of the apply() and invert() methods. If PURE_PYTORCH is set to False, and mode is 1 or 2, these methods will call scipy.interpolate.griddata(), which can be very slow (several seconds) - but the result will be more accurate compared to using the PyTorch-only setting.
All formulas used in this function have been derived from first principles. The base formula is \(flow_1 ⊕ flow_2 = flow_3\), where \(⊕\) is a non-commutative flow composition operation. This can be visualised with the start / end points of the flows as follows:
S = Start point    S1 = S3 ─────── f3 ────────────┐
E = End point       │                             │
f = flow           f1                             v
                    └───> E1 = S2 ── f2 ──> E2 = E3
The main difficulty in combining flow fields is that it would be incorrect to simply add up or subtract flow vectors at one location in the flow field area \(H \times W\). This appears to work given e.g. a translation to the right, and a translation downwards: the result will be the linear combination of the two vectors, or a translation towards the bottom right. However, looking more closely, it becomes evident that this approach isn’t actually correct: A pixel that has been moved from S1 to E1 by the first flow field f1 is then moved from that location by the flow vector of the flow field f2 that corresponds to the new pixel location E1, not the original location S1. If the flow vectors are the same everywhere in the field, the difference will not be noticeable. However, if the flow vectors of f2 vary throughout the field, such as with a rotation around some point, it will!
In this case (corresponding to calling f1.combine_with(f2, mode=3)), and if the flow reference ref is s (“source”), the solution is to first apply the inverse of f1 to f2, essentially linking up each location E1 back to S1, and then to add up the flow vectors. Analogous observations apply for the other permutations of flow combinations and reference ref values.
Note
This is consistent with the observation that two translations are commutative in their application - the order does not matter, and the vectors can simply be added up at every pixel location -, while a translation followed by a rotation is not the same as a rotation followed by a translation: adding up vectors at each pixel cannot be the correct solution as there wouldn’t be a difference based on the order of vector addition.
- Parameters
flow – Flow object to combine with, shape (including batch size) needs to match
mode –
Integer determining how the input flows are combined, where the number corresponds to the position in the formula \(flow_1 ⊕ flow_2 = flow_3\):
Mode 1: self corresponds to \(flow_2\), flow corresponds to \(flow_3\), the result will be \(flow_1\)
Mode 2: self corresponds to \(flow_1\), flow corresponds to \(flow_3\), the result will be \(flow_2\)
Mode 3: self corresponds to \(flow_1\), flow corresponds to \(flow_2\), the result will be \(flow_3\)
thresholded – Boolean determining whether flows are thresholded during an internal call to is_zero(), defaults to False
- Returns
New flow object
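The mode numbering is easiest to see in the one case where composition reduces to plain vector arithmetic: uniform translations. The function below is a hypothetical illustration restricted to that special case, with each flow represented by a single (horizontal, vertical) vector. It is not how the library combines general flow fields.

```python
def combine_translations(self_vec, other_vec, mode):
    """Combine two uniform translations following flow_1 (+) flow_2 = flow_3."""
    if mode == 1:
        # self is flow_2, other is flow_3, result is flow_1
        return (other_vec[0] - self_vec[0], other_vec[1] - self_vec[1])
    if mode == 2:
        # self is flow_1, other is flow_3, result is flow_2
        return (other_vec[0] - self_vec[0], other_vec[1] - self_vec[1])
    if mode == 3:
        # self is flow_1, other is flow_2, result is flow_3
        return (self_vec[0] + other_vec[0], self_vec[1] + other_vec[1])
    raise ValueError("mode needs to be 1, 2, or 3")


flow_1 = (3.0, 0.0)  # shift right
flow_2 = (0.0, 2.0)  # shift down
flow_3 = combine_translations(flow_1, flow_2, mode=3)
```

Modes 1 and 2 share the same arithmetic here only because translations commute. For flows that vary across the field, the two modes require different operations.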
-
oflibpytorch.batch_flows(flows: Union[list, tuple]) → Flow
Returns a batched flow object from a list of input flows of the same size and flow reference ref.
The output flow vectors are differentiable with respect to the input flow vectors.
- Parameters
flows – Tuple or list of flow objects. The flow objects need to have the same flow reference ref, as well as the same flow field heights and widths \((H, W)\). They can have any batch size.
- Returns
Single batched flow object, with a batch size equal to the sum of all individual input batch sizes
Applying the Flow
-
Flow.apply(target: Union[torch.Tensor, Flow], target_mask: Optional[torch.Tensor] = None, return_valid_area: Optional[bool] = None, consider_mask: Optional[bool] = None, padding: Optional[list] = None, cut: Optional[bool] = None) → Union[torch.Tensor, Flow, Tuple[Union[torch.Tensor, Flow], torch.Tensor]]
Apply the flow to a target, which can be a torch tensor or a Flow object itself
If PURE_PYTORCH is set to True (default, see also set_pure_pytorch()), the output is differentiable with respect to the flow vectors and the input target, if given as a tensor.
Tip
If PURE_PYTORCH is set to False, calling apply() on a flow field with reference s (“source”) requires a call to scipy.interpolate.griddata(), which is quite slow. Using a flow field with reference t avoids this and will therefore be significantly faster. If PURE_PYTORCH is True, a flow field with reference s will yield less accurate results, but there is no speed penalty - and the output is differentiable.
If the flow shape \((H_{flow}, W_{flow})\) is smaller than the target shape \((H_{target}, W_{target})\), a list of padding values needs to be passed to localise the flow in the larger \(H_{target} \times W_{target}\) area.
The valid image area that can optionally be returned is True where the image values in the function output:
have been affected by flow vectors. If the flow has a reference ref value of t (“target”), this is always True as the target image by default has a corresponding flow vector at each pixel location in \(H \times W\). If the flow has a reference ref value of s (“source”), this is only True for some parts of the image: some target image pixel locations in \(H \times W\) would only be reachable by flow vectors originating outside of the source image area, which is impossible by definition
have been affected by flow vectors that were themselves valid, as determined by the flow mask
Caution
The parameter consider_mask determines whether the invalid flow vectors in a flow field with reference s are removed before application (default behaviour) or not. Doing so results in a smoother flow field, but can cause artefacts to arise where the outline of the area returned by valid_target() is not a convex hull. For a more detailed explanation with an illustrative example, see the section “Applying a Flow” in the usage documentation.
- Parameters
target – Torch tensor of shape \((H, W)\), \((C, H, W)\), or \((N, C, H, W)\), or a flow object of shape \((N, H, W)\) to which the flow should be applied, where \(H\) and \(W\) are equal or larger than the corresponding dimensions of the flow itself
target_mask – Optional torch tensor of shape \((H, W)\) or \((N, H, W)\) and type
bool
that indicates which part of the target is valid (only relevant if target is not a flow object). Defaults to True everywhere
return_valid_area – Boolean determining whether the valid image area is returned (only if the target is a torch tensor), defaults to
False
. The valid image area is returned as a boolean torch tensor of shape \((N, H, W)\).consider_mask – Boolean determining whether the flow vectors are masked before application (only relevant for flows with reference
ref = 's'
). Results in smoother outputs, but more artefacts. Defaults toTrue
padding – List or tuple of shape \((4)\) with padding values
[top, bottom, left, right]
. Required if the flow and the target don’t have the same shape. Defaults toNone
, which means no padding neededcut – Boolean determining whether the warped target is returned cut from \((H_{target}, W_{target})\) to \((H_{flow}, W_{flow})\), in the case that the shapes are not the same. Defaults to
True
- Returns
The warped target of the same shape \((C, H, W)\) or \((N, C, H, W)\) and type as the input (rounded if necessary), except when this is an integer type and
PURE_PYTORCH
isTrue
. In that case, outputs should be differentiable and are therefore kept as floats (but still rounded if the input is an integer type). Optionally also returns the valid area of the flow as a boolean torch tensor of shape \((N, H, W)\).
-
Flow.
track
(pts: torch.Tensor, int_out: Optional[bool] = None, get_valid_status: Optional[bool] = None) → Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]¶ Warp input points with the flow field, returning the warped point coordinates as integers if required
If PURE_PYTORCH is set to True (default, see also set_pure_pytorch()), the output is differentiable with respect to the flow vectors and the input point coordinates.
Tip
If PURE_PYTORCH is set to False, calling track() on a flow field with reference ref t (“target”) requires a call to scipy.interpolate.griddata(), which is quite slow. Using a flow field with reference ref s avoids this and will therefore be significantly faster. If PURE_PYTORCH is True, a flow field with reference ref t will yield less accurate results (by fractions of pixels), but there is no speed penalty - and the output is differentiable.
- Parameters
pts – Torch tensor of shape \((M, 2)\) or \((N, M, 2)\) containing the point coordinates. If a batch dimension is given, it must correspond to the flow batch dimension. If the flow is batched but the points are not, the same points are warped by each flow field individually.
pts[:, 0]
corresponds to the vertical coordinate,pts[:, 1]
to the horizontal coordinateint_out – Boolean determining whether output points are returned as rounded integers, defaults to
False
get_valid_status – Boolean determining whether a tensor of shape \((M)\) or \((N, M)\) is returned, which contains the status of each point. This corresponds to applying
valid_source()
to the point positions, and returns True
for the points that 1) are tracked by valid flow vectors, and 2) end up inside the flow area of \(H \times W\). Defaults to False
- Returns
Torch tensor of warped (‘tracked’) points of the same shape as the input, and optionally a torch tensor of the tracking status per point. The tensor device is the same as the tensor device of the flow field
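The tracking principle can be illustrated with a minimal plain-Python sketch. It is not the library's implementation: it assumes a flow with reference s, integer input points, and no flow mask, and the helper name track_points is hypothetical. Each point (vertical, horizontal) is moved by the flow vector stored at its own location, and the status is True only if the point ends up inside the \(H \times W\) area.

```python
def track_points(flow_u, flow_v, pts):
    """Move integer points (y, x) by the flow vector found at their own location.

    flow_u / flow_v are nested lists of shape (H, W) holding the horizontal /
    vertical vector components (positive when pointing right / down).
    Assumption: reference 's', so the vector at a point belongs to that point.
    """
    height, width = len(flow_u), len(flow_u[0])
    tracked, status = [], []
    for y, x in pts:
        new_y = y + flow_v[y][x]
        new_x = x + flow_u[y][x]
        tracked.append((new_y, new_x))
        # A point counts as valid here only if it stays inside the flow area
        status.append(0 <= new_y <= height - 1 and 0 <= new_x <= width - 1)
    return tracked, status


# A 4 x 5 flow field that shifts everything 2 px to the right and 1 px down
H, W = 4, 5
u = [[2.0] * W for _ in range(H)]
v = [[1.0] * W for _ in range(H)]
tracked_pts, valid = track_points(u, v, [(0, 0), (2, 4), (3, 1)])
```

Only the first point remains inside the flow area; the other two leave it to the right and at the bottom, respectively.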
Evaluating the Flow¶
-
Flow.
is_zero
(thresholded: Optional[bool] = None, masked: Optional[bool] = None) → bool¶ Check whether all flow vectors (where
mask
isTrue
) are zero. Optionally, a threshold flow magnitude value of1e-3
is used. This can be useful to filter out motions that are equal to very small fractions of a pixel, which might just be a computational artefact to begin with.- Parameters
thresholded – Boolean determining whether the flow is thresholded, defaults to
True
masked – Boolean determining whether the flow is masked with
mask
, defaults toTrue
- Returns
Tensor matching the batch dimension, containing
True
for each flow field that is zero everywhere, otherwiseFalse
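As a rough illustration of the thresholded check, the sketch below compares every vector magnitude against 1e-3. It is a plain-Python approximation under stated assumptions: the function name is hypothetical, masking and batching are ignored, and whether the library uses a strict or non-strict comparison is not stated in this documentation.

```python
import math


def is_zero_sketch(flow_u, flow_v, thresholded=True, threshold=1e-3):
    """Return True if every flow vector magnitude is within the allowed limit."""
    limit = threshold if thresholded else 0.0
    return all(
        math.hypot(u, v) <= limit  # assumption: non-strict comparison
        for row_u, row_v in zip(flow_u, flow_v)
        for u, v in zip(row_u, row_v)
    )


# Sub-pixel noise: zero when thresholded, non-zero when checked exactly
noise_u = [[1e-5, 0.0]]
noise_v = [[0.0, -2e-4]]
tiny = is_zero_sketch(noise_u, noise_v)
strict = is_zero_sketch(noise_u, noise_v, thresholded=False)
```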
-
Flow.
matrix
(dof: Optional[int] = None, method: Optional[str] = None, masked: Optional[bool] = None) → numpy.ndarray¶ Fit a transformation matrix to the flow field using OpenCV functions
- Parameters
dof –
Integer describing the degrees of freedom in the transformation matrix to be fitted, defaults to
8
. Options are:
4: Partial affine transform with rotation, translation, scaling
6: Affine transform with rotation, translation, scaling, shearing
8: Projective transform, i.e. estimation of a homography
method –
String describing the method used by OpenCV to fit the transformation matrix, defaults to
ransac
. Options are:
lms: Least mean squares
ransac: RANSAC-based robust method
lmeds: Least-Median robust method
masked – Boolean determining whether the flow mask is used to ignore flow locations where the mask
mask
isFalse
. Defaults toTrue
- Returns
Torch tensor of shape \((N, 3, 3)\) and the same device as the flow object, containing the transformation matrix
-
Flow.
valid_target
(consider_mask: Optional[bool] = None) → torch.Tensor¶ Find the valid area in the target domain
Given a source image and a flow, both of shape \((H, W)\), the target image is created by warping the source with the flow. The valid area is then a boolean torch tensor of shape \((H, W)\) that is
True
wherever the value in the target image stems from warping a value from the source, and False
where no valid information is known.
Pixels that are
False
will often be black (or ‘empty’) in the warped target image - but not necessarily, due to warping artefacts etc. The valid area also allows a distinction between pixels that are black due to no actual information being available at this position (validityFalse
), and pixels that are black due to black pixel values having been warped to that (valid) location by the flow.- Parameters
consider_mask – Boolean determining whether the flow vectors are masked before application (only relevant for flows with reference
ref = 's'
, analogous toapply()
). Results in smoother outputs, but more artefacts. Defaults toTrue
- Returns
Boolean torch tensor of the same shape \((N, H, W)\) as the flow
-
Flow.
valid_source
(consider_mask: Optional[bool] = None) → torch.Tensor¶ Finds the area in the source domain that will end up being valid in the target domain (see
valid_target()
) after warping.
Given a source image and a flow, both of shape \((H, W)\), the target image is created by warping the source with the flow. The source area is then a boolean torch tensor of shape \((H, W)\) that is
True
wherever the value in the source will end up somewhere inside the valid target area, andFalse
where the value in the source will either be warped outside of the target image, or not be warped at all due to a lack of valid flow vectors connecting to this position.- Parameters
consider_mask – Boolean determining whether the flow vectors are masked before application (only relevant for flows with reference
ref = 't'
as their inverse flow will be applied, using the references
; analogous toapply()
). Results in smoother outputs, but more artefacts. Defaults toTrue
- Returns
Boolean torch tensor of the same shape \((N, H, W)\) as the flow
-
Flow.
get_padding
(item: Optional[int] = None) → list¶ Determine necessary padding from the flow field:
When the flow reference
ref
has the valuet
(“target”), this corresponds to the padding needed in a source image which ensures that every flow vector invecs
marked as valid by the maskmask
will find a value in the source domain to warp towards the target domain. I.e. any invalid locations in the area \(H \times W\) of the target domain (seevalid_target()
) are purely due to no valid flow vector being available to pull a source value to this target location, rather than no source value being available in the first place.When the flow reference
ref
has the values
(“source”), this corresponds to the padding needed for the flow itself, so that applying it to a source image will result in no input image information being lost in the warped output, i.e. each input image pixel will come to lie inside the padded area.
- Parameters
item – Element in batch to be selected, as an integer. Defaults to None, in which case the padding for every batch element is returned
- Returns
If no item is selected from the batch, this function returns a list of shape \((N, 4)\), where N is the batch size. If an item is selected, it returns a list of shape \((4)\). Padding values themselves are given in the following order:
[top, bottom, left, right]
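For a flow with reference s, the idea behind the padding can be sketched as follows: find how far beyond each image edge any source pixel is moved, and round that distance up. This is a plain-Python illustration under assumptions (no mask, a single flow field, rounding up to whole pixels), and padding_for_source_flow is a hypothetical helper rather than a library function.

```python
import math


def padding_for_source_flow(flow_u, flow_v):
    """Padding [top, bottom, left, right] so that every warped pixel stays inside."""
    height, width = len(flow_u), len(flow_u[0])
    top = bottom = left = right = 0
    for y in range(height):
        for x in range(width):
            end_y = y + flow_v[y][x]  # where this source pixel ends up
            end_x = x + flow_u[y][x]
            top = max(top, math.ceil(-end_y))
            bottom = max(bottom, math.ceil(end_y - (height - 1)))
            left = max(left, math.ceil(-end_x))
            right = max(right, math.ceil(end_x - (width - 1)))
    return [top, bottom, left, right]


# A 3 x 4 flow moving everything 2.5 px to the right and 1 px up
H, W = 3, 4
u = [[2.5] * W for _ in range(H)]
v = [[-1.0] * W for _ in range(H)]
pad = padding_for_source_flow(u, v)
```

The top row moves one pixel above the image and the right-most column 2.5 pixels past the right edge, so padding is needed at the top and on the right only.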
Visualising the Flow¶
-
Flow.
visualise
(mode: str, show_mask: Optional[bool] = None, show_mask_borders: Optional[bool] = None, range_max: Optional[Union[float, int, list, tuple]] = None, return_tensor: Optional[bool] = None) → Union[numpy.ndarray, torch.Tensor]¶ Visualises the flow as an rgb / bgr / hsv image, optionally showing the outline of the flow mask
mask
as a black line, and the invalid areas greyed out.Note
This currently runs internally based on NumPy & OpenCV, due to a lack of easily accessible equivalent functions for mask border detection and drawing. Therefore, even if the output is a tensor, it will not be differentiable with respect to the flow vector tensor.
- Parameters
mode – Output mode, options:
rgb
,bgr
,hsv
show_mask – Boolean determining whether the flow mask is visualised, defaults to
False
show_mask_borders – Boolean determining whether the flow mask border is visualised, defaults to
False
range_max – Maximum vector magnitude expected, corresponding to the HSV maximum Value of 255 when scaling the flow magnitudes. Can be a list or tuple of the same length as the flow batch size. Defaults to the 99th percentile of the flow field magnitudes, per batch element
return_tensor – Boolean determining whether the result is returned as a tensor. Note that the result is originally a numpy array. Defaults to
True
- Returns
Numpy array of shape \((N, H, W, 3)\) or torch tensor of shape \((N, 3, H, W)\) containing the flow visualisation, where N is the batch size
-
Flow.
visualise_arrows
(grid_dist: Optional[int] = None, img: Optional[Union[numpy.ndarray, torch.Tensor]] = None, scaling: Optional[Union[float, int]] = None, show_mask: Optional[bool] = None, show_mask_borders: Optional[bool] = None, colour: Optional[tuple] = None, thickness: Optional[int] = None, return_tensor: Optional[bool] = None) → Union[numpy.ndarray, torch.Tensor]¶ Visualises the flow as arrowed lines, optionally showing the outline of the flow mask
mask
as a black line, and the invalid areas greyed out.Note
This currently runs internally based on NumPy & OpenCV, due to a lack of easily accessible equivalent functions for arrow drawing and mask border detection. Therefore, even if the output is a tensor, it will not be differentiable with respect to the flow vector tensor.
- Parameters
grid_dist – Integer of the distance of the flow points to be used for the visualisation, defaults to
20
img – Torch tensor of shape \((N, 3, H, W)\) or \((3, H, W)\) or numpy array of shape \((N, H, W, 3)\) or \((H, W, 3)\) containing the background image to use (in BGR mode), defaults to white
scaling – Float or int of the flow line scaling, defaults to scaling the 99th percentile of arrowed line lengths to be equal to twice the grid distance (empirical value)
show_mask – Boolean determining whether the flow mask is visualised, defaults to
False
show_mask_borders – Boolean determining whether the flow mask border is visualised, defaults to
False
colour – Tuple of the flow arrow colour, defaults to hue based on flow direction as in
visualise()
thickness – Integer of the flow arrow thickness, larger than zero. Defaults to
1
return_tensor – Boolean determining whether the result is returned as a tensor. Note that the result is originally a numpy array. Defaults to
True
- Returns
Numpy array of shape \((H, W, 3)\) or torch tensor of shape \((3, H, W)\) containing the flow visualisation, in
bgr
colour space
-
Flow.
show
(elem: Optional[int] = None, wait: Optional[int] = None, show_mask: Optional[bool] = None, show_mask_borders: Optional[bool] = None)¶ Shows the flow in an OpenCV window using
visualise()
- Parameters
elem – Integer determining which batch element is visualised. Defaults to
0
, so for flows with only one element it automatically selects the one available flowwait – Integer determining how long to show the flow for, in milliseconds. Defaults to
0
, which means it will be shown until the window is closed, or the process is terminatedshow_mask – Boolean determining whether the flow mask is visualised, defaults to
False
show_mask_borders – Boolean determining whether flow mask border is visualised, defaults to
False
-
Flow.
show_arrows
(elem: Optional[int] = None, wait: Optional[int] = None, grid_dist: Optional[int] = None, img: Optional[numpy.ndarray] = None, scaling: Optional[Union[float, int]] = None, show_mask: Optional[bool] = None, show_mask_borders: Optional[bool] = None, colour: Optional[tuple] = None)¶ Shows the flow in an OpenCV window using
visualise_arrows()
- Parameters
elem – Integer determining which batch element is visualised. Defaults to
0
, so for flows with only one element it automatically selects the one available flowwait – Integer determining how long to show the flow for, in milliseconds. Defaults to
0
, which means it will be shown until the window is closed, or the process is terminatedgrid_dist – Integer of the distance of the flow points to be used for the visualisation, defaults to
20
img – Numpy array with the background image to use (in BGR colour space), defaults to black
scaling – Float or int of the flow line scaling, defaults to scaling the 99th percentile of arrowed line lengths to be equal to twice the grid distance (empirical value)
show_mask – Boolean determining whether the flow mask is visualised, defaults to
False
show_mask_borders – Boolean determining whether the flow mask border is visualised, defaults to
False
colour – Tuple of the flow arrow colour, defaults to hue based on flow direction as in
visualise()
-
oflibpytorch.
visualise_definition
(mode: str, shape: Optional[Union[list, tuple]] = None, insert_text: Optional[bool] = None, return_tensor: Optional[bool] = None) → Union[numpy.ndarray, torch.Tensor]¶ Return an image that shows the definition of the flow visualisation.
- Parameters
mode – Desired output colour space:
rgb
,bgr
, orhsv
shape – List or tuple of shape \((2)\) containing the desired image shape as values
(H, W)
. Defaults to (601, 601) - do not change if you leave insert_text asTrue
as otherwise the text will appear in the wrong locationinsert_text – Boolean determining whether explanatory text is put on the image (using
cv2.putText()
), defaults toTrue
return_tensor – Boolean determining whether the result is returned as a tensor. Note that the result is originally a numpy array. Defaults to
True
- Returns
Numpy array of shape \((H, W, 3)\) or torch tensor of shape \((3, H, W)\), type
uint8
, showing the colour definition of the flow visualisation
Using Torch tensors & NumPy arrays¶
This section contains functions that take Torch tensors as well as NumPy arrays as inputs, instead of making use of
the custom flow class. On the one hand, this avoids having to define flow objects. On the other hand, it requires
keeping track of flow attributes manually, and it does not avail itself of the full scope of functionality
oflibpytorch
has to offer: most importantly, flow masks are not considered or tracked.
Flow Creation & Loading¶
-
oflibpytorch.
from_matrix
(matrix: Union[numpy.ndarray, torch.Tensor], shape: Union[list, tuple], ref: Optional[str] = None, matrix_is_inverse: Optional[bool] = None) → torch.Tensor¶ Flow vectors calculated from a transformation matrix input.
The output flow vectors are differentiable with respect to the input matrix, if given as a torch tensor.
- Parameters
matrix – Transformation matrix to be turned into a flow field, as numpy array or torch tensor of shape \((3, 3)\) or \((N, 3, 3)\)
shape – List or tuple of the shape \((H, W)\) of the flow field
ref – Flow reference, string of value
t
(“target”) ors
(“source”). Defaults tot
matrix_is_inverse – Boolean determining whether the given matrix is already the inverse of the desired transformation. Is useful for flow with reference
t
to avoid calculation of the pseudo-inverse, but will throw aValueError
if used for flow with references
to avoid accidental usage. Defaults toFalse
- Returns
Flow vectors of shape \((N, 2, H, W)\)
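The relationship between a transformation matrix and a flow field can be sketched in plain Python. The sketch assumes reference s and a matrix acting on homogeneous pixel coordinates \((x, y, 1)\); the helper flow_from_matrix_sketch is hypothetical and does not reproduce the library's batched, differentiable implementation.

```python
def flow_from_matrix_sketch(matrix, shape):
    """Flow vectors (u, v) for reference 's': warped position minus start position."""
    height, width = shape
    flow_u = [[0.0] * width for _ in range(height)]
    flow_v = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Apply the 3 x 3 matrix to (x, y, 1), then normalise by the last entry
            xh = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
            yh = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
            w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
            flow_u[y][x] = xh / w - x
            flow_v[y][x] = yh / w - y
    return flow_u, flow_v


# Translation by 3 px to the right and 2 px up: the same vector everywhere
translation = [[1, 0, 3], [0, 1, -2], [0, 0, 1]]
u, v = flow_from_matrix_sketch(translation, (2, 3))

# Scaling by 2 about the origin: vectors grow with distance from the origin
scaling = [[2, 0, 0], [0, 2, 0], [0, 0, 1]]
su, sv = flow_from_matrix_sketch(scaling, (2, 3))
```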
-
oflibpytorch.
from_transforms
(transform_list: list, shape: Union[list, tuple], ref: Optional[str] = None, padding: Optional[list] = None) → torch.Tensor¶ Flow vectors calculated from a list of transforms. If padding values are given, the given shape is
padded accordingly. The transform values are also adjusted, e.g. by shifting scaling and rotation centres.
- Parameters
transform_list –
List of transforms to be turned into a flow field, where each transform is expressed as a list of [transform name, transform value 1, … , transform value n]. Supported options:
Transform translation, with values horizontal shift in px, vertical shift in px
Transform rotation, with values horizontal centre in px, vertical centre in px, angle in degrees, counter-clockwise
Transform scaling, with values horizontal centre in px, vertical centre in px, scaling fraction
shape – List or tuple of the shape \((H, W)\) of the flow field
ref – Flow reference, string of value
t
(“target”) ors
(“source”). Defaults tot
padding – List or tuple of shape \((4)\) with padding values
[top, bot, left, right]
- Returns
Flow vectors of shape \((N, 2, H, W)\)
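To make the transform list format concrete, the following plain-Python sketch turns such a list into a single 3 x 3 matrix. Several points are assumptions rather than documented behaviour: that transforms are applied in list order, that the matrix acts on \((x, y, 1)\) with the vertical axis pointing down, and the helper name transforms_to_matrix itself.

```python
import math


def transforms_to_matrix(transform_list):
    """Combine [name, value 1, ..., value n] transforms into one 3 x 3 matrix."""

    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
                for i in range(3)]

    def about_centre(core, cx, cy):
        # Shift the centre to the origin, apply the core transform, shift back
        to_origin = [[1, 0, -cx], [0, 1, -cy], [0, 0, 1]]
        back = [[1, 0, cx], [0, 1, cy], [0, 0, 1]]
        return matmul(back, matmul(core, to_origin))

    matrix = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    for name, *values in transform_list:
        if name == 'translation':
            dx, dy = values
            step = [[1, 0, dx], [0, 1, dy], [0, 0, 1]]
        elif name == 'rotation':
            cx, cy, angle = values
            c, s = math.cos(math.radians(angle)), math.sin(math.radians(angle))
            # Counter-clockwise on screen, with the vertical axis pointing down
            step = about_centre([[c, s, 0], [-s, c, 0], [0, 0, 1]], cx, cy)
        elif name == 'scaling':
            cx, cy, factor = values
            step = about_centre([[factor, 0, 0], [0, factor, 0], [0, 0, 1]], cx, cy)
        else:
            raise ValueError('Unknown transform: {}'.format(name))
        matrix = matmul(step, matrix)  # assumption: later entries are applied later
    return matrix


m = transforms_to_matrix([['translation', 10, 0], ['scaling', 0, 0, 2]])
rot = transforms_to_matrix([['rotation', 0, 0, 90]])
```

With these conventions, a point one pixel to the right of the rotation centre ends up one pixel above it after a 90 degree rotation.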
-
oflibpytorch.
load_kitti
(path: str) → Union[List[torch.Tensor], torch.Tensor]¶ Loads the flow field contained in KITTI
uint16
png image files, including the valid pixels. Follows the official instructions on how to read the provided .png files on the KITTI optical flow dataset website.
- Parameters
path – String containing the path to the KITTI flow data (
uint16
, .png file)- Returns
A torch tensor of shape \((3, H, W)\) with the KITTI flow data (with valid pixels in the 3rd channel)
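The KITTI encoding itself is simple enough to sketch on raw values. According to the KITTI development kit, each flow component is stored as a uint16 value offset by 2^15 and scaled by 64, while the third channel flags valid pixels. The helper decode_kitti_pixel below is hypothetical and works on single pixel values rather than reading a file.

```python
def decode_kitti_pixel(raw_u, raw_v, raw_valid):
    """Convert raw uint16 KITTI values into flow components in pixels."""
    u = (raw_u - 2 ** 15) / 64.0
    v = (raw_v - 2 ** 15) / 64.0
    return u, v, bool(raw_valid)


# Raw values chosen so that the decoded flow is 2 px right and 0.5 px up
decoded = decode_kitti_pixel(32768 + 128, 32768 - 32, 1)
```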
-
oflibpytorch.
load_sintel
(path: str) → torch.Tensor¶ Loads the flow field contained in Sintel .flo byte files. Follows the official instructions provided alongside the .flo data on the Sintel optical flow dataset website.
- Parameters
path – String containing the path to the Sintel flow data (.flo byte file, little Endian)
- Returns
A torch tensor of shape \((2, H, W)\) containing the Sintel flow data
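The .flo layout can be illustrated with a small stdlib-only parser working on bytes. It follows the commonly documented format (a float32 magic number 202021.25, then width and height as int32, then interleaved float32 u / v values in row-major order, all little Endian). The function parse_flo is a hypothetical sketch and returns nested lists instead of a tensor.

```python
import struct

FLO_MAGIC = 202021.25  # sanity-check value at the start of every .flo file


def parse_flo(data):
    """Parse .flo bytes into two nested lists (u, v) of shape (H, W)."""
    magic, width, height = struct.unpack('<fii', data[:12])
    if magic != FLO_MAGIC:
        raise ValueError('Not a valid .flo byte stream')
    count = 2 * width * height
    values = struct.unpack('<{}f'.format(count), data[12:12 + 4 * count])
    flow_u = [[values[2 * (y * width + x)] for x in range(width)]
              for y in range(height)]
    flow_v = [[values[2 * (y * width + x) + 1] for x in range(width)]
              for y in range(height)]
    return flow_u, flow_v


# Build a tiny 1 x 2 flow field in memory and read it back
sample = struct.pack('<fii', FLO_MAGIC, 2, 1) + struct.pack('<4f', 1.5, -2.0, 0.25, 4.0)
u, v = parse_flo(sample)
```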
-
oflibpytorch.
load_sintel_mask
(path: str) → torch.Tensor¶ Loads the invalid pixels contained in Sintel .png mask files, as a boolean mask marking valid pixels with
True
. Follows the official instructions provided alongside the .flo data on the Sintel optical flow dataset website.- Parameters
path – String containing the path to the Sintel invalid pixel data (.png, black and white)
- Returns
A torch tensor containing the Sintel valid pixels (mask) data
Flow Manipulation¶
-
oflibpytorch.
resize_flow
(flow: Union[numpy.ndarray, torch.Tensor], scale: Union[float, int, list, tuple]) → torch.Tensor¶ Resize a flow field numpy array or torch tensor, scaling the flow vectors values accordingly.
The output flow field is differentiable with respect to the input flow field, if given as a torch tensor.
- Parameters
flow – Flow field as a numpy array or torch tensor, shape \((2, H, W)\), \((H, W, 2)\), \((N, 2, H, W)\), or \((N, H, W, 2)\)
scale –
Scale used for resizing, options:
Integer or float of value scaling, applied both vertically and horizontally
List or tuple of shape \((2)\) with values [vertical scaling, horizontal scaling]
- Returns
Scaled flow field as a torch tensor, shape \((2, H, W)\) or \((N, 2, H, W)\), depending on input
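Why the vector values need scaling as well can be seen in a simplified plain-Python sketch. It resamples the field with nearest-neighbour lookup, which is an assumption made for brevity (the library's interpolation method is not stated here), and multiplies the horizontal components by the horizontal scale and the vertical components by the vertical scale. The name resize_flow_nearest is hypothetical.

```python
def resize_flow_nearest(flow_u, flow_v, scale):
    """Resize a flow field and scale its vectors; scale is a number or [vertical, horizontal]."""
    scale_y, scale_x = (scale, scale) if isinstance(scale, (int, float)) else scale
    height, width = len(flow_u), len(flow_u[0])
    new_h, new_w = round(height * scale_y), round(width * scale_x)
    out_u, out_v = [], []
    for y in range(new_h):
        src_y = min(int(y / scale_y), height - 1)
        row_u, row_v = [], []
        for x in range(new_w):
            src_x = min(int(x / scale_x), width - 1)
            # A vector spanning 1 px now spans scale px in the resized field
            row_u.append(flow_u[src_y][src_x] * scale_x)
            row_v.append(flow_v[src_y][src_x] * scale_y)
        out_u.append(row_u)
        out_v.append(row_v)
    return out_u, out_v


# Double the height, halve the width of a 1 x 2 flow field
u, v = resize_flow_nearest([[1.0, 2.0]], [[3.0, 4.0]], [2, 0.5])
```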
-
oflibpytorch.
invert_flow
(flow: Union[numpy.ndarray, torch.Tensor], input_ref: str, output_ref: Optional[str] = None) → torch.Tensor¶ Inverting a flow: img1 – f –> img2 becomes img1 <– f – img2. The smaller the input flow, the closer the inverse is to simply multiplying the flow by -1.
The output flow field tensor is differentiable with respect to the input flow field, if given as a tensor.
- Parameters
flow – Numpy array or pytorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\), throwing a
ValueError
if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention:flow_vectors[..., 0]
are the horizontal,flow_vectors[..., 1]
are the vertical vector components, defined as positive when pointing to the right / down.input_ref – Reference of the input flow field, either
s
ort
output_ref – Desired reference of the output field, either
s
ort
. Defaults toinput_ref
- Returns
Flow field as a torch tensor of shape \((2, H, W)\) or \((N, 2, H, W)\)
-
oflibpytorch.
switch_flow_ref
(flow: Union[numpy.ndarray, torch.Tensor], input_ref: str) → torch.Tensor¶ Recalculate flow vectors to correspond to a switched flow reference (see Flow reference
ref
)The output flow field tensor is differentiable with respect to the input flow field, if given as a tensor.
- Parameters
flow – Numpy array or pytorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\), throwing a
ValueError
if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention:flow_vectors[..., 0]
are the horizontal,flow_vectors[..., 1]
are the vertical vector components, defined as positive when pointing to the right / down.input_ref – The reference of the input flow field, either
s
ort
- Returns
Flow field as a torch tensor of shape \((2, H, W)\) or \((N, 2, H, W)\)
-
oflibpytorch.
combine_flows
(input_1: Union[Flow, torch.Tensor, numpy.ndarray], input_2: Union[Flow, torch.Tensor, numpy.ndarray], mode: int, ref: Optional[str] = None, thresholded: Optional[bool] = None) → Union[Flow, torch.Tensor]¶ Returns the result of the combination of two flow fields of the same shape and reference
If the toolbox-wide variable
PURE_PYTORCH
is set toTrue
(default, see alsoset_pure_pytorch()
), the output flow field tensor is differentiable with respect to the input flow fields, if given as tensors.Tip
All of the flow field combinations in this function rely on some combination of the
apply()
,invert()
, andcombine_with()
methods.If
PURE_PYTORCH
is set toFalse
, some of these methods will callscipy.interpolate.griddata()
, possibly multiple times, which can be very slow (several seconds) - but the result will be more accurate compared to using the PyTorch-only setting. The table below aids decision-making with regards to which reference a flow field should be provided in to obtain the fastest result.
mode | ref = 's' | ref = 't'
1    | 1         | 3
2    | 1         | 1
3    | 0         | 0
All formulas used in this function have been derived from first principles. The base formula is \(flow_1 ⊕ flow_2 = flow_3\), where \(⊕\) is a non-commutative flow composition operation. This can be visualised with the start / end points of the flows as follows:
S = Start point    S1 = S3 ─────── f3 ────────────┐
E = End point       │                             │
f = flow            f1                            v
                    └───> E1 = S2 ── f2 ──> E2 = E3
The main difficulty in combining flow fields is that it would be incorrect to simply add up or subtract flow vectors at one location in the flow field area \(H \times W\). This appears to work given e.g. a translation to the right, and a translation downwards: the result will be the linear combination of the two vectors, or a translation towards the bottom right. However, looking more closely, it becomes evident that this approach isn’t actually correct: A pixel that has been moved from S1 to E1 by the first flow field f1 is then moved from that location by the flow vector of the flow field f2 that corresponds to the new pixel location E1, not the original location S1. If the flow vectors are the same everywhere in the field, the difference will not be noticeable. However, if the flow vectors of f2 vary throughout the field, such as with a rotation around some point, it will!
In this case (corresponding to calling
combine_flows(f1, f2, mode=3)
), and if the flow reference iss
(“source”), the solution is to first apply the inverse of f1 to f2, essentially linking up each location E1 back to S1, and then to add up the flow vectors. Analogous observations apply for the other permutations of flow combinations and references.Note
This is consistent with the observation that two translations are commutative in their application - the order does not matter, and the vectors can simply be added up at every pixel location -, while a translation followed by a rotation is not the same as a rotation followed by a translation: adding up vectors at each pixel cannot be the correct solution as there wouldn’t be a difference based on the order of vector addition.
- Parameters
input_1 – Numpy array or pytorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\), throwing a
ValueError
if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention:flow_vectors[..., 0]
are the horizontal,flow_vectors[..., 1]
are the vertical vector components, defined as positive when pointing to the right / down. Can also be a flow object, but this will be deprecated soon
input_2 – Second input flow, same type as
input_1
mode –
Integer determining how the input flows are combined, where the number corresponds to the position in the formula \(flow_1 ⊕ flow_2 = flow_3\):
Mode 1: input_1 corresponds to \(flow_2\), input_2 corresponds to \(flow_3\), the result will be \(flow_1\)
Mode 2: input_1 corresponds to \(flow_1\), input_2 corresponds to \(flow_3\), the result will be \(flow_2\)
Mode 3: input_1 corresponds to \(flow_1\), input_2 corresponds to \(flow_2\), the result will be \(flow_3\)
ref – The reference of the input flow fields, either
s
ort
thresholded – Boolean determining whether flows are thresholded during an internal call to
is_zero()
, defaults toFalse
- Returns
Flow object if inputs are flow objects (deprecated in future, avoid), Torch tensor of shape \((2, H, W)\) or \((N, 2, H, W)\) as standard
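The argument for why flow vectors cannot simply be added can be checked numerically. The plain-Python sketch below represents flows as functions of the pixel position (a simplification; the helper names are hypothetical) and composes them as described above: the vector of f2 is looked up at E1, the end point of f1, not at S1.

```python
import math


def translation(dx, dy):
    """Flow that moves every pixel by the same vector."""
    return lambda x, y: (dx, dy)


def rotation(cx, cy, angle_deg):
    """Flow of a counter-clockwise rotation (vertical axis pointing down)."""
    a = math.radians(angle_deg)

    def flow(x, y):
        rx, ry = x - cx, y - cy
        new_x = cx + math.cos(a) * rx + math.sin(a) * ry
        new_y = cy - math.sin(a) * rx + math.cos(a) * ry
        return (new_x - x, new_y - y)

    return flow


def compose(f1, f2):
    """flow_3 at S1 equals f1 at S1 plus f2 evaluated at E1 = S1 + f1(S1)."""
    def f3(x, y):
        u1, v1 = f1(x, y)
        u2, v2 = f2(x + u1, y + v1)
        return (u1 + u2, v1 + v2)
    return f3


def naive_sum(f1, f2):
    """Incorrect shortcut: add both vectors at the original location S1."""
    def f3(x, y):
        u1, v1 = f1(x, y)
        u2, v2 = f2(x, y)
        return (u1 + u2, v1 + v2)
    return f3


shift = translation(10, 0)
turn = rotation(0, 0, 90)
composed = compose(shift, turn)(5, 0)   # translation first, then rotation
naive = naive_sum(shift, turn)(5, 0)    # simple vector addition
two_shifts = compose(shift, translation(0, 3))(5, 0)
```

For the translation followed by the rotation, the composed vector differs clearly from the naive sum, while two translations combine to the plain sum of their vectors.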
Flow Application¶
-
oflibpytorch.
apply_flow
(flow: Union[numpy.ndarray, torch.Tensor], target: torch.Tensor, ref: str, mask: Optional[Union[numpy.ndarray, torch.Tensor]] = None) → torch.Tensor¶ Uses a given flow to warp a target. The flow reference, if not given, is assumed to be
t
. Optionally, a mask can be passed which (only for flows ins
reference) masks undesired (e.g. undefined or invalid) flow vectors.If
PURE_PYTORCH
is set toTrue
(default, see alsoset_pure_pytorch()
), the output is differentiable with respect to the inputflow
andtarget
. IfPURE_PYTORCH
isFalse
(see alsounset_pure_pytorch()
) andref
iss
, the more accurate functionscipy.interpolate.griddata()
is used. This is not only significantly slower, but also means the output does not have a grad_fn and is therefore not differentiable in the PyTorch context.- Parameters
flow – Flow field as a numpy array or torch tensor, shape \((2, H, W)\), \((H, W, 2)\), \((N, 2, H, W)\), or \((N, H, W, 2)\)
target – Torch tensor containing the content to be warped, with shape \((H, W)\), \((C, H, W)\), or \((N, C, H, W)\)
ref – Reference of the flow,
t
ors
mask – Flow mask as numpy array or torch tensor, with shape \((H, W)\) or \((N, H, W)\), matching the flow field. Only relevant for
s
flows. Defaults toTrue
everywhere
- Returns
Torch tensor of the same shape as the target, with the content warped by the flow
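For a flow with reference t, warping amounts to pulling a value to each output location from the position the flow vector originates at. The plain-Python sketch below shows this with nearest-neighbour lookup on a single-channel image; the fill value of 0 for locations without a source value and the helper name warp_target_ref are assumptions, and the library's interpolation is not reproduced.

```python
def warp_target_ref(image, flow_u, flow_v, fill=0):
    """Warp a single-channel image with a flow of reference 't' (nearest neighbour)."""
    height, width = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # The vector at the target location points from the source to here
            src_y = round(y - flow_v[y][x])
            src_x = round(x - flow_u[y][x])
            if 0 <= src_y < height and 0 <= src_x < width:
                out[y][x] = image[src_y][src_x]
    return out


# Shift a 2 x 3 image by 1 px to the right
image = [[1, 2, 3], [4, 5, 6]]
u = [[1] * 3 for _ in range(2)]
v = [[0] * 3 for _ in range(2)]
warped = warp_target_ref(image, u, v)
```

The left-most column has no source value to pull from and therefore keeps the fill value.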
-
oflibpytorch.
track_pts
(flow: Union[numpy.ndarray, torch.Tensor], ref: str, pts: torch.Tensor, int_out: Optional[bool] = None) → torch.Tensor¶ Warp input points with a flow field, returning the warped point coordinates as integers if required.
If
PURE_PYTORCH
is set toTrue
(default, see alsoset_pure_pytorch()
), the output is differentiable with respect to the inputflow
andpts
. IfPURE_PYTORCH
isFalse
(see alsounset_pure_pytorch()
) andref
ist
, the more accurate functionscipy.interpolate.griddata()
is used. This is not only significantly slower, but also means the output does not have a grad_fn and is therefore not differentiable in the PyTorch context.- Parameters
flow – Flow field as a numpy array or torch tensor, shape \((2, H, W)\), \((H, W, 2)\), \((N, 2, H, W)\), or \((N, H, W, 2)\)
ref – Flow field reference, either
s
ort
pts – Torch tensor of shape \((M, 2)\) or \((N, M, 2)\) containing the point coordinates. If a batch dimension is given, it must be 1 or correspond to the flow batch dimension. If the flow is batched but the points are not, the same points are warped by each flow field individually.
pts[:, 0]
corresponds to the vertical coordinate,pts[:, 1]
to the horizontal coordinateint_out – Boolean determining whether output points are returned as rounded integers, defaults to
False
- Returns
Torch tensor of warped (‘tracked’) points, tensor device same as the input flow field
Flow Evaluation¶
-
oflibpytorch.
is_zero_flow
(flow: Union[numpy.ndarray, torch.Tensor], thresholded: Optional[bool] = None) → torch.Tensor¶ Check whether all flow vectors are zero. Optionally, a threshold flow magnitude value of
1e-3
is used. This can be useful to filter out motions that are equal to very small fractions of a pixel, which might just be a computational artefact to begin with.- Parameters
flow – Flow field as a numpy array or torch tensor, shape \((2, H, W)\), \((H, W, 2)\), \((N, 2, H, W)\), or \((N, H, W, 2)\)
thresholded – Boolean determining whether the flow is thresholded, defaults to
True
- Returns
Tensor of (batch) shape \((N)\) which is
True
if the flow field is zero everywhere, otherwiseFalse
-
oflibpytorch.
get_flow_matrix
(flow: Union[numpy.ndarray, torch.Tensor], ref: str, dof: Optional[int] = None, method: Optional[str] = None) → torch.Tensor¶ Fit a transformation matrix to the flow field using OpenCV functions
- Parameters
flow – Numpy array or pytorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\), throwing a
ValueError
if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention:flow_vectors[..., 0]
are the horizontal,flow_vectors[..., 1]
are the vertical vector components, defined as positive when pointing to the right / down.ref – Reference of the flow field,
s
ort
dof –
Integer describing the degrees of freedom in the transformation matrix to be fitted, defaults to
8
. Options are:
4
: Partial affine transform with rotation, translation, scaling
6
: Affine transform with rotation, translation, scaling, shearing
8
: Projective transform, i.e. estimation of a homography
method –
String describing the method OpenCV uses to fit the transformation matrix, defaults to
ransac
. Options are:
lms
: Least mean squares
ransac
: RANSAC-based robust method
lmeds
: Least-Median robust method
- Returns
Torch tensor of shape \((3, 3)\) or \((N, 3, 3)\) containing the transformation matrix
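Fitting a matrix to a flow field first requires point correspondences. The sketch below shows one way to derive them from a single (H, W, 2) nested-list flow, and checks them against a known translation matrix. The helper names `flow_to_point_pairs` and `apply_matrix` are hypothetical, and the reading of the two references (vectors anchored at the source for `s`, at the target for `t`) is an assumption based on the descriptions in this documentation. The actual fitting step, which the library delegates to OpenCV, is not reproduced here.

```python
def flow_to_point_pairs(flow, ref):
    """Convert an (H, W, 2) nested-list flow into lists of (x, y) source and target points."""
    src_pts, dst_pts = [], []
    for y, row in enumerate(flow):
        for x, (u, v) in enumerate(row):
            if ref == 's':    # vector starts on the regular grid in the source domain
                src_pts.append((x, y))
                dst_pts.append((x + u, y + v))
            elif ref == 't':  # vector ends on the regular grid in the target domain
                src_pts.append((x - u, y - v))
                dst_pts.append((x, y))
            else:
                raise ValueError("ref must be 's' or 't'")
    return src_pts, dst_pts


def apply_matrix(matrix, point):
    """Apply a 3x3 transformation matrix to an (x, y) point, with perspective division."""
    x, y = point
    hx, hy, hw = (row[0] * x + row[1] * y + row[2] for row in matrix)
    return (hx / hw, hy / hw)


# A constant flow of 2 px to the right and 1 px down corresponds to a pure translation
flow = [[(2.0, 1.0)] * 3 for _ in range(2)]
translation = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [0.0, 0.0, 1.0]]
matches = {}
for ref in ('s', 't'):
    src_pts, dst_pts = flow_to_point_pairs(flow, ref)
    matches[ref] = all(apply_matrix(translation, s) == d for s, d in zip(src_pts, dst_pts))
print(matches)
```

For a constant flow, both references yield point pairs that the same translation matrix maps onto each other; for spatially varying flows the two sets of pairs differ.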
-
oflibpytorch.
valid_target
(flow: Union[numpy.ndarray, torch.Tensor], ref: str) → torch.Tensor¶ Find the valid area in the target domain
Given a source image and a flow, both of shape \((H, W)\), the target image is created by warping the source with the flow. The valid area is then a boolean tensor of shape \((H, W)\) that is
True
wherever the value in the target image stems from warping a value from the source, and
False
where no valid information is known.
Pixels that are
False
will often be black (or ‘empty’) in the warped target image, but not necessarily, due to warping artefacts etc. The valid area also allows a distinction between pixels that are black due to no actual information being available at this position (validity
False
), and pixels that are black due to black pixel values having been warped to that (valid) location by the flow.
- Parameters
flow – Numpy array or pytorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\), throwing a
ValueError
if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention:
flow_vectors[..., 0]
are the horizontal,
flow_vectors[..., 1]
are the vertical vector components, defined as positive when pointing to the right / down.
ref – Reference of the flow field,
s
or
t
- Returns
Boolean torch tensor of the same shape \((H, W)\) or \((N, H, W)\) as the flow
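As a rough illustration of the valid target area, the following pure-Python sketch handles only a flow with reference `t`, stored as an (H, W, 2) nested list. It assumes a target pixel is valid exactly when the position it pulls its value from lies inside the source image. The helper name `valid_target_sketch` is hypothetical, and the library's own result may differ near the borders because of interpolation.

```python
def valid_target_sketch(flow):
    """Validity mask for a flow with reference 't', given as an (H, W, 2) nested list."""
    h, w = len(flow), len(flow[0])
    mask = []
    for y, row in enumerate(flow):
        mask_row = []
        for x, (u, v) in enumerate(row):
            # Position in the source domain this target pixel pulls its value from
            src_x, src_y = x - u, y - v
            mask_row.append(0 <= src_x <= w - 1 and 0 <= src_y <= h - 1)
        mask.append(mask_row)
    return mask


# Shift everything one pixel to the right: the left-most target column has no source
flow = [[(1.0, 0.0)] * 3 for _ in range(2)]
mask = valid_target_sketch(flow)
print(mask)
```

The left-most column is marked invalid because nothing in the source image can be moved there by this flow.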
-
oflibpytorch.
valid_source
(flow: Union[numpy.ndarray, torch.Tensor], ref: str) → torch.Tensor¶ Finds the area in the source domain that will end up being valid in the target domain (see
valid_target()
) after warping
Given a source image and a flow, both of shape \((H, W)\), the target image is created by warping the source with the flow. The source area is then a boolean tensor of shape \((H, W)\) that is
True
wherever the value in the source will end up somewhere inside the valid target area, and
False
where the value in the source will either be warped outside of the target image, or not be warped at all due to a lack of valid flow vectors connecting to this position.
- Parameters
flow – Numpy array or pytorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\), throwing a
ValueError
if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention:
flow_vectors[..., 0]
are the horizontal,
flow_vectors[..., 1]
are the vertical vector components, defined as positive when pointing to the right / down.
ref – Reference of the flow field,
s
or
t
- Returns
Boolean torch tensor of the same shape \((H, W)\) or \((N, H, W)\) as the flow
-
oflibpytorch.
get_flow_padding
(flow: Union[numpy.ndarray, torch.Tensor], ref: str) → list¶ Determine necessary padding from the flow field:
When the flow reference is
t
(“target”), this corresponds to the padding needed in a source image which ensures that every flow vector will find a value in the source domain to warp towards the target domain. I.e. any invalid locations in the area \(H \times W\) of the target domain (see
valid_target()
) are purely due to no valid flow vector being available to pull a source value to this target location, rather than no source value being available in the first place.
When the flow reference is
s
(“source”), this corresponds to the padding needed for the flow itself, so that applying it to a source image will result in no input image information being lost in the warped output, i.e. each input image pixel will come to lie inside the padded area.
- Parameters
flow – Numpy array or pytorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\), throwing a
ValueError
if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention:
flow_vectors[..., 0]
are the horizontal,
flow_vectors[..., 1]
are the vertical vector components, defined as positive when pointing to the right / down.
ref – Reference of the flow field,
s
or
t
- Returns
A list of shape \((4)\) or \((N, 4)\) with the values
[top, bottom, left, right]
for each batch member, if applicable
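The two padding cases can be sketched in pure Python for a single (H, W, 2) nested-list flow. For reference `s` the sketch looks at where each pixel is moved to, for reference `t` at where each pixel is pulled from, and measures how far these positions overshoot the image borders. The helper name `flow_padding_sketch` and the rounding up to whole pixels are assumptions for illustration, not a statement about the library's internals.

```python
import math


def flow_padding_sketch(flow, ref):
    """Return [top, bottom, left, right] padding for an (H, W, 2) nested-list flow."""
    if ref not in ('s', 't'):
        raise ValueError("ref must be 's' or 't'")
    h, w = len(flow), len(flow[0])
    sign = 1 if ref == 's' else -1  # 's': where pixels go, 't': where pixels come from
    top = bottom = left = right = 0.0
    for y, row in enumerate(flow):
        for x, (u, v) in enumerate(row):
            px, py = x + sign * u, y + sign * v
            top = max(top, -py)                 # overshoot above row 0
            bottom = max(bottom, py - (h - 1))  # overshoot below row H - 1
            left = max(left, -px)               # overshoot left of column 0
            right = max(right, px - (w - 1))    # overshoot right of column W - 1
    return [math.ceil(p) for p in (top, bottom, left, right)]


# Constant flow of 1.5 px to the right and 2 px up on a 3 x 4 field
flow = [[(1.5, -2.0)] * 4 for _ in range(3)]
padding_s = flow_padding_sketch(flow, 's')
padding_t = flow_padding_sketch(flow, 't')
print(padding_s, padding_t)
```

The same flow produces mirrored padding for the two references, since the positions of interest lie on opposite sides of the image.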
Flow Visualisation¶
-
oflibpytorch.
visualise_flow
(flow: Union[numpy.ndarray, torch.Tensor], mode: str, range_max: Optional[float] = None, return_tensor: Optional[bool] = None) → Union[numpy.ndarray, torch.Tensor]¶ Visualises the flow as an rgb / bgr / hsv image
Note
This currently runs internally based on NumPy & OpenCV, due to a lack of easily accessible equivalent functions for coordinate and colour space conversions. Therefore, even if the output is a tensor, it will not be differentiable with respect to the input flow tensor.
- Parameters
flow – Numpy array or pytorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\), throwing a
ValueError
if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention:
flow_vectors[..., 0]
are the horizontal,
flow_vectors[..., 1]
are the vertical vector components, defined as positive when pointing to the right / down.
mode – Output mode, options:
rgb
,
bgr
,
hsv
range_max – Maximum vector magnitude expected, corresponding to the HSV maximum Value of 255 when scaling the flow magnitudes. Defaults to the 99th percentile of the flow field magnitudes
return_tensor – Boolean determining whether the result is returned as a tensor. Note that the result is originally a numpy array. Defaults to
True
- Returns
Numpy array of shape \((H, W, 3)\) or \((N, H, W, 3)\) or torch tensor of shape \((3, H, W)\) or \((N, 3, H, W)\) containing the flow visualisation
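The colour coding can be illustrated for a single flow vector using only the Python standard library. The sketch assumes that hue encodes the vector direction, saturation is kept at its maximum, and the HSV value scales linearly with the magnitude up to `range_max`. The helper name `flow_vector_to_rgb` and the exact hue mapping are assumptions, so colours may not match the library's output.

```python
import colorsys
import math


def flow_vector_to_rgb(u, v, range_max):
    """Map one flow vector to an 8-bit RGB triple: hue = direction, value = magnitude."""
    angle = math.atan2(v, u) % (2 * math.pi)        # direction in radians, wrapped to one turn
    hue = angle / (2 * math.pi)                     # colorsys expects hue as a fraction of a turn
    value = min(math.hypot(u, v) / range_max, 1.0)  # clip magnitudes above range_max
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, value)
    return tuple(round(255 * c) for c in (r, g, b))


full = flow_vector_to_rgb(10.0, 0.0, range_max=10.0)     # magnitude equal to range_max
still = flow_vector_to_rgb(0.0, 0.0, range_max=10.0)     # zero flow
clipped = flow_vector_to_rgb(20.0, 0.0, range_max=10.0)  # magnitude above range_max
print(full, still, clipped)
```

Vectors longer than `range_max` saturate at full brightness, which is why choosing `range_max` from a high percentile of the magnitudes keeps a few outliers from darkening the rest of the image.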
-
oflibpytorch.
visualise_flow_arrows
(flow: Union[numpy.ndarray, torch.Tensor], ref: str, grid_dist: Optional[int] = None, img: Optional[numpy.ndarray] = None, scaling: Optional[Union[float, int]] = None, colour: Optional[tuple] = None, thickness: Optional[int] = None, return_tensor: Optional[bool] = None) → Union[numpy.ndarray, torch.Tensor]¶ Visualises the flow as arrowed lines
Note
This currently runs internally based on NumPy & OpenCV, due to a lack of easily accessible equivalent functions for coordinate and colour space conversions. Therefore, even if the output is a tensor, it will not be differentiable with respect to the input flow tensor.
- Parameters
flow – Numpy array or pytorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) or \((N, 2, H, W)\) if possible, otherwise as \((H, W, 2)\) or \((N, H, W, 2)\), throwing a
ValueError
if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention:
flow_vectors[..., 0]
are the horizontal,
flow_vectors[..., 1]
are the vertical vector components, defined as positive when pointing to the right / down.
ref – Reference of the flow field,
s
or
t
grid_dist – Integer of the distance of the flow points to be used for the visualisation, defaults to
20
img – Numpy array with the background image to use (in BGR mode), defaults to white
scaling – Float or int of the flow line scaling, defaults to scaling the 99th percentile of arrowed line lengths to be equal to twice the grid distance (empirical value)
colour – Tuple of the flow arrow colour, defaults to hue based on flow direction as in
visualise()
thickness – Integer of the flow arrow thickness, larger than zero. Defaults to
1
return_tensor – Boolean determining whether the result is returned as a tensor. Note that the result is originally a numpy array. Defaults to
True
- Returns
Numpy array of shape \((H, W, 3)\) or \((N, H, W, 3)\) or torch tensor of shape \((3, H, W)\) or \((N, 3, H, W)\) containing the flow visualisation
-
oflibpytorch.
show_flow
(flow: Union[numpy.ndarray, torch.Tensor], wait: Optional[int] = None)¶ Shows the flow in an OpenCV window using
visualise()
- Parameters
flow – Numpy array or pytorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) if possible, otherwise as \((H, W, 2)\), throwing a
ValueError
if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention:
flow_vectors[..., 0]
are the horizontal,
flow_vectors[..., 1]
are the vertical vector components, defined as positive when pointing to the right / down.
wait – Integer determining how long to show the flow for, in milliseconds. Defaults to
0
, which means it will be shown until the window is closed, or the process is terminated
-
oflibpytorch.
show_flow_arrows
(flow: Union[numpy.ndarray, torch.Tensor], ref: str, wait: Optional[int] = None, grid_dist: Optional[int] = None, img: Optional[numpy.ndarray] = None, scaling: Optional[Union[float, int]] = None, colour: Optional[tuple] = None)¶ Shows the flow in an OpenCV window using
visualise_arrows()
- Parameters
flow – Numpy array or pytorch tensor with 3 or 4 dimensions. The shape is interpreted as \((2, H, W)\) if possible, otherwise as \((H, W, 2)\), throwing a
ValueError
if this isn’t possible either. The dimension that is 2 (the channel dimension) contains the flow vector in OpenCV convention:
flow_vectors[..., 0]
are the horizontal,
flow_vectors[..., 1]
are the vertical vector components, defined as positive when pointing to the right / down.
ref – Reference of the flow field,
s
or
t
wait – Integer determining how long to show the flow for, in milliseconds. Defaults to
0
, which means it will be shown until the window is closed, or the process is terminated
grid_dist – Integer of the distance of the flow points to be used for the visualisation, defaults to
20
img – Numpy array with the background image to use (in BGR colour space), defaults to black
scaling – Float or int of the flow line scaling, defaults to scaling the 99th percentile of arrowed line lengths to be equal to twice the grid distance (empirical value)
colour – Tuple of the flow arrow colour, defaults to hue based on flow direction as in
visualise()
Utility methods¶
Additionally, some utility methods used in this module are made available here, as they may prove useful to others.
-
oflibpytorch.
to_numpy
(tensor: torch.Tensor, switch_channels: Optional[bool] = None) → numpy.ndarray¶ Tensor to numpy, calls .cpu() if necessary
- Parameters
tensor – Input tensor, may have gradient, may be on GPU
switch_channels – Boolean determining whether the channels are moved from the second to the last dimension, assuming the input is of shape \((N, C, H, W)\), changing it to \((N, H, W, C)\). Defaults to
False
- Returns
Numpy array, with channels switched if required
-
oflibpytorch.
to_tensor
(array: numpy.ndarray, switch_channels: Optional[str] = None, device: Optional[Union[torch.device, int, str]] = None) → torch.Tensor¶ Moves a NumPy array to a tensor on the desired device, swapping axis positions if required
- Parameters
array – Input array
switch_channels – String determining whether the channels are moved from the last to the first dimension (if ‘single’), or from last to second dimension (if ‘batched’). Defaults to
None
(no channels moved)
device – Tensor device, either a
torch.device
or a valid input to
torch.device()
, such as a string (
cpu
or
cuda
). For a device of type
cuda
, the device index defaults to
torch.cuda.current_device()
. If the input is
None
, it defaults to
torch.device('cpu')
- Returns
Torch tensor, with channels switched if required
-
oflibpytorch.
move_axis
(input_tensor: torch.Tensor, source: int, destination: int) → torch.Tensor¶ Helper function to imitate np.moveaxis.
Output tensor differentiable with respect to input tensor.
- Parameters
input_tensor – Input torch tensor, e.g. \((N, H, W, C)\)
source – Source position of the dimension to be moved, e.g.
-1
destination – Target position of the dimension to be moved, e.g.
1
- Returns
Output torch tensor, e.g. \((N, C, H, W)\)
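Moving an axis amounts to computing a new dimension order. The pure-Python sketch below derives that order for the example given above; the helper name `moveaxis_order` is hypothetical and only mirrors the behaviour of `np.moveaxis` for a single axis.

```python
def moveaxis_order(ndim, source, destination):
    """Return the dimension order that moves axis `source` to position `destination`."""
    source %= ndim       # allow negative indices such as -1
    destination %= ndim
    order = [d for d in range(ndim) if d != source]
    order.insert(destination, source)
    return order


# Move the channel axis of an (N, H, W, C) shape to position 1
order = moveaxis_order(4, source=-1, destination=1)
shape_nhwc = (8, 32, 64, 3)
shape_nchw = tuple(shape_nhwc[d] for d in order)
print(order, shape_nchw)
```

In PyTorch, an order like this could be passed to `torch.Tensor.permute`, which returns a view and therefore keeps the result differentiable with respect to the input.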
-
oflibpytorch.
show_masked_image
(img: Union[torch.Tensor, numpy.ndarray], mask: Optional[Union[numpy.ndarray, torch.Tensor]] = None) → numpy.ndarray¶ Mimics flow.show() for an input image and a mask, i.e. uses the OpenCV
imshow()
method to show the image in a window. Useful for debugging purposes or to quickly visualise an image, even when no mask is involved, as it is easier than typing out all the OpenCV commands every time
- Parameters
img – Torch tensor of shape \((3, H, W)\) or numpy array of shape \((H, W, 3)\), BGR input image
mask – Torch tensor or numpy array of shape \((H, W)\), boolean mask showing the valid area
- Returns
Masked image, in BGR colour space
-
oflibpytorch.
apply_s_flow
(flow: torch.Tensor, data: torch.Tensor, mask: Optional[torch.Tensor] = None, occlude_zero_flow: Optional[bool] = None) → torch.Tensor¶ Warp data with a given flow of
ref
s
(“forward flow”), making use of the inverse bilinear interpolation function
grid_from_unstructured_data()
as a replacement for
scipy.interpolate.griddata()
.
The warped data output is differentiable with respect to the input flow and input data
- Parameters
flow – Float tensor of shape \((N, 2, H, W)\), the input flow with reference
s
data – Float tensor of shape \((N, C, H, W)\), the data to be warped
mask – Boolean tensor of shape \((N, H, W)\), the mask belonging to the flow (optional)
occlude_zero_flow – Boolean determining whether data locations with corresponding flow of zero are occluded (overwritten) by other data moved to the same location. Rationale: if the order of objects were reversed, i.e. the zero flow points occlude the non-zero flow points, the latter wouldn’t be known in the first place. This logic breaks down when the flow points concerned have been inferred, e.g. from surrounding non-occluded known points. Setting it to False will likely lead to unusual artefacts in the output. Defaults to True
- Returns
Warped data as float tensor of shape \((N, C, H, W)\), mask of where data points were warped to as bool tensor of shape \((N, H, W)\). If occlude_zero_flow is
True
, this mask does not include zero flow points as they have not been used in the interpolation to avoid artefacts.
-
oflibpytorch.
grid_from_unstructured_data
(x: torch.Tensor, y: torch.Tensor, data: torch.Tensor, mask: Optional[torch.Tensor] = None) → tuple¶ Returns unstructured input data interpolated (likely sparsely) on to a regular grid. Replacement for the SciPy
griddata
function, but less accurate. Equivalent to inverse bilinear interpolation. Credit:
This is based on the algorithm suggested in: Sánchez, J., Salgado de la Nuez, A. J., & Monzón, N., “Direct estimation of the backward flow”, 2013
The code implementation is a heavily reworked version of code suggested by Markus Hofinger in private correspondence, a version of which he first used in: Hofinger, M., Bulò, S. R., Porzi, L., Knapitsch, A., Pock, T., & Kontschieder, P., “Improving optical flow on a pyramid level”. ECCV 2020
Markus Hofinger in turn credits the function _flow2distribution from the HD3 code base as inspiration, used in: Yin, Z., Darrell, T., & Yu, F., “Hierarchical discrete distribution decomposition for match density estimation”, CVPR 2019
The interpolated data as well as the interpolation density outputs are differentiable with respect to the input data position grids and the data itself.
- Parameters
x – Horizontal data position grid, shape \((N, H, W)\)
y – Vertical data position grid, shape \((N, H, W)\)
data – Data value grid, shape \((N, C, H, W)\)
mask – Tensor masking data positions that should be ignored for the interpolation, shape \((N, H, W)\)
- Returns
Tensor of data interpolated on regular grid of shape \((N, C, H, W)\), Tensor of interpolation density of shape \((N, H, W)\)
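Inverse bilinear interpolation can be sketched in pure Python for a single channel: each scattered data point distributes its value to the four surrounding grid nodes using bilinear weights, and the accumulated weights form the interpolation density. The helper name `splat_to_grid`, the omission of the mask and batch dimensions, and the normalisation of the summed values by the density are all assumptions made for illustration.

```python
import math


def splat_to_grid(xs, ys, values, height, width):
    """Inverse bilinear interpolation of scattered points onto an (H, W) grid."""
    acc = [[0.0] * width for _ in range(height)]      # weighted value sums per grid node
    density = [[0.0] * width for _ in range(height)]  # summed weights per grid node
    for x, y, val in zip(xs, ys, values):
        x0, y0 = math.floor(x), math.floor(y)
        fx, fy = x - x0, y - y0
        # Bilinear weights of the four grid nodes surrounding the point
        neighbours = (
            (x0, y0, (1 - fx) * (1 - fy)),
            (x0 + 1, y0, fx * (1 - fy)),
            (x0, y0 + 1, (1 - fx) * fy),
            (x0 + 1, y0 + 1, fx * fy),
        )
        for gx, gy, wgt in neighbours:
            if 0 <= gx < width and 0 <= gy < height and wgt > 0:
                acc[gy][gx] += wgt * val
                density[gy][gx] += wgt
    # Normalise by the density; nodes that received no data stay at zero
    grid = [[acc[r][c] / density[r][c] if density[r][c] > 0 else 0.0
             for c in range(width)] for r in range(height)]
    return grid, density


# Two points on a 1 x 2 grid: one halfway between the nodes, one exactly on the right node
grid, density = splat_to_grid([0.5, 1.0], [0.0, 0.0], [4.0, 8.0], height=1, width=2)
print(grid, density)
```

Grid nodes with a density of zero received no data at all, which is why the interpolated result is likely to be sparse when the input positions are unevenly spread.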