Validation and Limitations
Validation
How do you know the detected interaction is real? Here we describe methods to validate your results.
Ground truth
Voxel level
If you have voxel-level ground truth, you are unlikely to need the contact detection at all. This is the easiest case to validate, but also the rarest.
Image level
Here you have weak supervision: for example, you know that in cell A there is a difference in interaction. You can then optimize the detection's parameters to maximize that difference.
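As a minimal sketch of such an optimization, the grid search below picks the parameter pair that maximizes the difference in a mean summary value between two groups of cells. The function `measure` is a hypothetical stand-in for running the detection on one cell and reducing it to a single number; the parameter names `alpha` and `w` are used for illustration only. Tuning and reporting on the same cells risks overfitting, so it is safer to confirm the chosen parameters on cells that were held out.

```python
import itertools
import numpy as np

def best_parameters(measure, cells_a, cells_b, alphas, windows):
    """Grid search for the (alpha, w) pair that maximizes the group difference.

    measure(cell, alpha, w) is a hypothetical callable returning one number
    per cell, e.g. a summary of the detected contacts.
    Returns (difference, alpha, w) for the best pair.
    """
    best = None
    for alpha, w in itertools.product(alphas, windows):
        mean_a = np.mean([measure(c, alpha, w) for c in cells_a])
        mean_b = np.mean([measure(c, alpha, w) for c in cells_b])
        diff = abs(mean_a - mean_b)
        if best is None or diff > best[0]:
            best = (diff, alpha, w)
    return best

# Toy usage: the "cells" are plain numbers and the measure is a product.
toy_measure = lambda cell, alpha, w: cell * alpha * w
result = best_parameters(toy_measure, [1, 1], [2, 2], [0.1, 0.5], [1, 2])
print(result)
```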
Simulation
If you know what biological interaction you wish to observe and under what modality, you can simulate this and test if the detection can detect the simulated interaction, and to what extent.
Stability
Stability of a function $f$ is defined as $\vert f(x + \epsilon) - f(x) \vert \leq \delta$ for $\epsilon, \delta \geq 0, s = \vert \epsilon -\delta \vert$.
If $s$ is small for all values, then you have a stable function. For example, stability can be with respect to parameters or noise. Stability matters for validation because you do not want to publish results that only exist for $\alpha=0.003$ and $w=2$, and for no other values. Ideally, on representative (median) cells, you run a parameter sweep to show a consistent difference, or a consistent recall/precision with respect to ground truth.
We made the simplification of using an additive perturbation here; that is rarely true in practice, but the principle holds. In addition, you do not need $\delta$ to be small so much as predictable. For example, as $\epsilon$ increases, how does $\delta$ increase? If there is a limiting function (linear, quadratic, ...), then that too is consistent. If the relationship is random, that is the true worst case.
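The sketch below illustrates how the relation between $\epsilon$ and $\delta$ could be probed empirically. It assumes an additive, uniformly distributed perturbation (the same simplification as above) and a scalar-valued function `f`; both function names are hypothetical helpers, not part of the detection itself. `stability_profile` records the largest observed $\delta$ for each $\epsilon$, and `growth_exponent` fits a slope on a log-log scale, where a slope near 1 suggests linear growth and a slope near 2 suggests quadratic growth.

```python
import numpy as np

def stability_profile(f, x, epsilons, n_trials=20, seed=0):
    """Largest observed |f(x + noise) - f(x)| for each perturbation size.

    The noise is uniform in [-eps, eps] and added to x (an assumption).
    """
    rng = np.random.default_rng(seed)
    base = f(x)
    deltas = []
    for eps in epsilons:
        worst = 0.0
        for _ in range(n_trials):
            noise = rng.uniform(-eps, eps, size=np.shape(x))
            worst = max(worst, abs(f(x + noise) - base))
        deltas.append(worst)
    return np.array(deltas)

def growth_exponent(epsilons, deltas):
    """Slope of log(delta) against log(epsilon); needs strictly positive values."""
    slope, _intercept = np.polyfit(np.log(epsilons), np.log(deltas), 1)
    return slope

# Toy usage: a squared-mean summary stands in for the real measurement.
x = np.linspace(0.0, 1.0, 50)
eps_values = [0.05, 0.1, 0.2, 0.4]
deltas = stability_profile(lambda v: float(np.mean(v) ** 2), x, eps_values)
print(deltas, growth_exponent(eps_values, deltas))
```

An erratic profile, with no clear trend across repeated seeds, would correspond to the random worst case described above.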
Phantoms
You can test the detection on phantoms, e.g. physically or biologically induced changes (such as SPLICS), where you alter the biology in a way that forces organelles or proteins to interact.
Limitations
Several factors affect the outcome of the algorithm; we briefly describe each of them, with mitigations. For a description of confounding factors in microscopy in the context of interaction, see this paper, where we describe the factors and list methods to resolve them.
Signal to noise ratio
In microscopy, noise is a complex, largely unknown, and non-additive perturbation to the image.
Effect
Noise, especially noise that introduces pixelation effects, will disrupt the detection. This will first be visible in the $\alpha$-filtered values: as SNR goes from, say, 2 to 1, you may need to increase $\alpha$ (significance).
Mitigation
Apply deconvolution, ideally in combination with empirical resolution estimation tools such as FRC/FSC. You would pick the deconvolution parameters that give the smallest average FRC resolution estimate.
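For orientation, a minimal NumPy sketch of a 2D Fourier ring correlation is given below. It is not the implementation of any specific FRC tool. It assumes two independent, same-sized, square acquisitions of the same field of view, and it uses the commonly applied fixed 1/7 threshold, which is an assumption here; other threshold criteria exist. The names `frc_curve` and `frc_resolution` are hypothetical.

```python
import numpy as np

def frc_curve(img_a, img_b):
    """Fourier ring correlation between two same-sized 2D images."""
    fa = np.fft.fftshift(np.fft.fft2(img_a))
    fb = np.fft.fftshift(np.fft.fft2(img_b))
    ny, nx = img_a.shape
    yy, xx = np.indices((ny, nx))
    # Integer ring index: distance from the zero-frequency pixel.
    ring = np.hypot(yy - ny // 2, xx - nx // 2).astype(int).ravel()
    n_rings = min(ny, nx) // 2
    cross = np.bincount(ring, weights=(fa * np.conj(fb)).real.ravel())[:n_rings]
    pow_a = np.bincount(ring, weights=(np.abs(fa) ** 2).ravel())[:n_rings]
    pow_b = np.bincount(ring, weights=(np.abs(fb) ** 2).ravel())[:n_rings]
    return cross / np.sqrt(pow_a * pow_b)

def frc_resolution(curve, image_size_px, pixel_size_nm, threshold=1.0 / 7.0):
    """Resolution in nm at the first ring where the curve drops below threshold.

    Returns None if the curve never drops below the threshold.
    """
    below = np.nonzero(curve < threshold)[0]
    if below.size == 0:
        return None
    ring = max(int(below[0]), 1)
    return image_size_px * pixel_size_nm / ring

# Toy usage: two independent noise images share no structure beyond their mean.
rng = np.random.default_rng(0)
a = rng.normal(size=(128, 128)) + 1.0
b = rng.normal(size=(128, 128)) + 1.0
print(frc_resolution(frc_curve(a, b), image_size_px=128, pixel_size_nm=100.0))
```

To compare deconvolution settings, one could compute this estimate for each setting over several image pairs and average the results.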
Registration
Objects can move between scanlines or acquisitions; to correct for this, you need registration.
Effect
The contact detection cannot infer that registration is needed; this is modality specific.
Mitigation
Apply registration, but be very careful: registration algorithms can assume that the registered objects should align perfectly, which destroys the interaction.
Resolution
The resolution limits what can be observed within one voxel, in one channel.
Effect
Contact detection uses spatial-temporal context to improve on what can be detected, but there are limits. If you have an empirical resolution of say 250nm, and no other confounding factors.
Mitigation
First, measure the resolution using e.g. FRC. Note that FRC tends to give slightly optimistic resolution estimates. Second, use deconvolution or preprocessing to improve the SNR and resolution. Third, validate your results.
Empirical resolution is the resolution defined by your actual data and acquisition; theoretical resolution is a best, worst, or average case based on specific calibrated issues, assumptions, and theoretical models of the microscope.
Statistics
If you acquire interactions of two organelles across a whole 3D cell, be forewarned that the data will likely be very imbalanced. While the human visual system is very good at focusing on entropy-rich differences, statistics is (mostly) frequency driven.
Mitigation
Carefully frame your research question to take imbalance into account. You can use long-tail robust modelling, stratification, or extreme value theory models to account for this. A simple example is looking only at the QXX quantiles of a measure to adaptively isolate the long tail of a distribution.
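The quantile approach can be sketched in a few lines of NumPy. The helper name `tail_above_quantile` and the default of 0.95 are illustrative assumptions only; the appropriate quantile depends on your data and research question.

```python
import numpy as np

def tail_above_quantile(values, q=0.95):
    """Return the values at or above the q-th quantile, and the threshold used."""
    values = np.asarray(values, dtype=float)
    threshold = np.quantile(values, q)
    return values[values >= threshold], threshold

# Toy usage: a skewed sample where most values are small and a few are large.
rng = np.random.default_rng(0)
sample = rng.exponential(scale=1.0, size=10_000)
tail, threshold = tail_above_quantile(sample, q=0.95)
print(tail.size, threshold)
```

Because the threshold is computed per sample, the cut adapts to each distribution rather than relying on a fixed absolute value.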