Spectral images are a generalisation of colour images, where each channel represents a well-defined band of energy.
Ideally spectral channels don't overlap in energy, unlike many colour image filters
We normally can't sample both spatially and spectrally simultaneously, so we create images channel-by-channel (e.g. EFTEM) or position-by-position (e.g. STEM-EELS)
There are many conventions for storing numbers as binary; here are some common ones used in images. Usage depends on your camera electronics and what processing you do.

| Name | Size (bits) | Min | Max | `0100000001001001 0000111111010000` interpreted as |
|---|---|---|---|---|
| Binary: `bool` | 8 | 0 | 1 | - |
| Unsigned integer: `uint8`, `ubyte` | 8 | 0 | 255 | 64, 73, 15, 208 |
| `uint16` | 16 | 0 | 65,535 | 18752, 53263 |
| Integer: `int16`, `short` | 16 | -32,768 | 32,767 | 18752, -12273 |
| `int32`, `long` | 32 | -2,147,483,648 | 2,147,483,647 | -804304576 |
| Floating (decimal): `float32`, `float` | 32 | -3.40E+38 | 3.40E+38 | 3.14159 |
| `float64`, `double` | 64 | -1.70E+308 | 1.70E+308 | - |
| Complex: `complex64` | 64 | -3.40E+38 | 3.40E+38 | - |
Number types - Notes
Digital numbers are stored in a fixed amount of space - exceeding the min or max for a type can cause wrapping, e.g. 200 + 100 = 44!
uint8 wraps modulo 256 (its maximum value is 255), so 200 + 100 ⇒ 300 mod 256 = 44.
The size of the number = the space it requires in memory and on disk
No reason to store 8-byte float64 if values are only 0 or 1
Often larger types ⇒ slower computation
Floating point numbers have variable precision, i.e. they can represent very large or very small values, but cannot represent a large number with a small fraction:
float32(324,000) + float32(0.0055) = float32(324,000.0) and not float32(324,000.0055)
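A minimal numpy sketch of both effects, using the same values as above:

```python
import numpy as np

# uint8 arithmetic wraps modulo 256 rather than raising an error
a = np.array([200], dtype=np.uint8) + np.array([100], dtype=np.uint8)
print(a)  # [44], since 300 mod 256 = 44

# float32 cannot hold a large number and a small fraction at the same time
b = np.float32(324_000.0) + np.float32(0.0055)
print(b)  # 324000.0 -- the 0.0055 is below float32 precision at this magnitude
```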
A 2D image is usually ordered row-by-row, or column-by-column, by convention. As each number occupies a fixed number of bits in the sequence, we can find the value of any pixel by row/col arithmetic and then indexing according to the layout in memory.
If an image is large and >2D, e.g. a spectrum image, then memory layout can heavily affect processing time. Jumping between memory locations is very slow compared to sequentially reading memory, so it pays to store data in the way it will be processed.
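A small numpy illustration of layout and access cost (the array shape is arbitrary):

```python
import numpy as np

img_c = np.zeros((1024, 1024), dtype=np.float32, order="C")  # rows contiguous in memory
img_f = np.asfortranarray(img_c)                             # columns contiguous in memory

print(img_c.strides)  # (4096, 4): moving one row jumps 4096 bytes, one column 4 bytes
print(img_f.strides)  # (4, 4096): the opposite

# Summing along the contiguous axis reads memory sequentially and is fast;
# summing across it forces large jumps between reads.
row_sums = img_c.sum(axis=1)
col_sums = img_c.sum(axis=0)
```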
Multi-image data, stacks, 4D-STEM
Tomography can add an extra [tilt] dimension to all of the above!
Coordinate systems
Depending on the tool or programming language, image coordinate systems vary
Matrix notation in 2D: [row, column]
Python is 0-indexed
image[0, 0] is the first pixel
MATLAB is 1-indexed:
image[1, 1] is the first pixel
Extra dimensions e.g. channel, scan are according to convention (and sometimes also memory-layout).
Typically row == 0/1 at the top when displayed, with positive-down
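A quick numpy sketch of the zero-indexed [row, column] convention:

```python
import numpy as np

image = np.arange(12).reshape(3, 4)  # 3 rows, 4 columns

print(image[0, 0])   # 0 -> first pixel, top-left when displayed
print(image[2, 1])   # 9 -> third row down, second column across
print(image.shape)   # (3, 4) == (rows, columns)
```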
Maths with images
As an image is just a list of numbers, we can do arithmetic or more complex operations on images to yield new images or other results. For example:
import numpy as np

image = image - image.min()          # subtract the minimum value in the image from every pixel
px_sum = image[5, 7] + image[2, 4]   # sum the values in two pixels
image = np.log(image)                # take the natural logarithm of every pixel
wavefront = np.exp(-1j * image)      # interpret the image as phase and create a complex wavefront
sum_image = image + other_image      # sum two images together
Note that when operating on pairs of images they must have the same shape for the elementwise calculation to be defined.
Images can be stored in many ways, depending on how they are used
.jpg, .png, .gif: colour RGB uint8 images, compressed for small file size, open anywhere without special software, not for scientific data, just visualisation
.tif: a general-purpose image format, can hold most number types and shapes
TIFF files with strange data (floating point) may need special software
Can hold additional metadata (e.g. calibrations), can be compressed
Proprietary formats like .dm3/4, .mib, .emd, .blo: specific to a certain camera or software, not always readable elsewhere
General array formats: .mat, .npy, .hdf5, .zarr: flexible, can be compressed, can hold stacks / nD data and metadata, need compatible code/software
import imageio.v3 as iio
import numpy as np

# read n single frames and stack them into one (n, height, width, 3) array
frames = np.stack(
    [iio.imread(f"{x}.jpg") for x in range(n)],
)
iio.imwrite("test.gif", frames, fps=10)
Image: nVidia
Graphics Processing Units (GPUs)
A Graphics Processing Unit (GPU) is a computation accelerator which can be added to a computer, and can be used to speed up many forms of scientific computation.
GPUs are specialised to perform simple math operations in parallel on multi-dimensional arrays of data (such as images)
Operations for 3D graphics (coordinate transformations, filtering, raytracing), the original usage for GPUs, are frequently identical to math needed in scientific computing (FFTs, convolutions, matrix algebra and inversion).
At least in Python, it is reasonably trivial to make code run on a GPU rather than the CPU, thanks to the standardisation efforts of the teams behind numpy and cupy.
Many math functions have been re-implemented on GPU, and are used nearly the same as the CPU equivalent
import cupy as cp
img_fft = cp.fft.fft2(large_image)
and runs in ~1 ms on a large GPU.
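A minimal sketch of the full CPU → GPU → CPU round trip with cupy (assumes a CUDA-capable GPU; the array size is arbitrary):

```python
import numpy as np
import cupy as cp

large_image = np.random.random((4096, 4096)).astype(np.float32)

gpu_image = cp.asarray(large_image)   # copy host -> GPU memory (this transfer has a cost)
gpu_fft = cp.fft.fft2(gpu_image)      # same call as np.fft.fft2, but runs on the GPU
img_fft = cp.asnumpy(gpu_fft)         # copy the result back to host memory when needed
```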
Visualising images
Image histograms
An image histogram represents the frequency of intensity values in an image. It is a useful way to visualise the separation between background and content, and to see outlier pixels.
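A minimal histogram sketch with numpy/matplotlib (the synthetic Poisson image stands in for real data):

```python
import numpy as np
import matplotlib.pyplot as plt

image = np.random.default_rng(0).poisson(50, size=(256, 256))  # stand-in for recorded counts

plt.hist(image.ravel(), bins=256)
plt.xlabel("Intensity")
plt.ylabel("Number of pixels")
plt.show()
```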
Data → Screen
Screens usually display uint8 RGB colour (3 values of 0-255 per pixel, also known as 24-bit colour). Unless our image was acquired in these three channels, we need to transform our data from recorded intensities to screen RGB.
If we recorded intensity, then setting R = G = B on a screen gives a colourless grey
This limits us to only 256 levels of intensity to display all of the data range
If our data are more than 8-bit, need to sacrifice detail or clip values
We can use artificial colour, via a lookup table or colormap, to achieve more on-screen contrast; there are many choices for different applications.
Choice of data transformation or colormap can massively influence how data are perceived.
Image: Jean-Luc Rouvière
Transformation: Brightness + Contrast
The basic data-to-screen transformation is linear: [img.min(), img.max()] → [0, 255].
The brightness/contrast transform chooses two other values and clips pixels outside their range to 0 or 255. This removes detail in some regions while increasing it in others.
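A minimal sketch of this linear mapping with clipping (the function name and defaults are illustrative):

```python
import numpy as np

def to_uint8(img, vmin=None, vmax=None):
    """Linearly map [vmin, vmax] to [0, 255]; values outside the range clip to 0 or 255."""
    vmin = img.min() if vmin is None else vmin
    vmax = img.max() if vmax is None else vmax
    scaled = (img.astype(np.float64) - vmin) / (vmax - vmin)
    return (np.clip(scaled, 0.0, 1.0) * 255).astype(np.uint8)
```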
Dynamic range
Dynamic range usually refers to the difference between the minimum and maximum value that an image could possibly represent, i.e. how much depth of intensity we can store without saturating at the top-end, or recording only zeros at the bottom.
In microscopy we often have data which span orders of magnitude in intensity (e.g. diffraction patterns).
When brightness/contrast cannot cover the dynamic range of an image, a non-linear mapping between data and colour can be used, trading local for global contrast:
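Two common non-linear mappings, sketched in numpy (the synthetic array stands in for high-dynamic-range data; the result would then be rescaled to 0-255 as above):

```python
import numpy as np

image = np.random.default_rng(0).random((256, 256)) * 1e4  # stand-in for a high-dynamic-range pattern

log_view = np.log1p(image)     # log(1 + x): compresses bright pixels, safe at zero
gamma_view = image ** 0.5      # gamma < 1 also gives faint detail more of the display range
```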
Colourmaps are critical to how we interpret visual data. It is important that features we see are from the data and not the map.
Some colourmaps are made to be perceptually uniform - a given step in the data is perceived as the same visual change to our eyes, across the whole range of the colourmap.
Non-uniform colourmaps can create visual features which do not exist, or hide real information.
Certain colour blindness forms are experienced in 1-5% of the population (biased towards males). Choice of colourmap can hugely impact the perception of data for these groups.
In particular, try to avoid using Red–Green to draw distinctions, as this is the most common form of colour blindness
A digital image is a sampling of a continuous world onto a discrete grid. The step- or pixel size limits what information can be captured by the image.
Conversely, increasing pixel density adds value only if the information is there to sample:
A smooth ramp in intensity is fully defined by two points - we can interpolate between them to get the same quality as a densely sampled image
If the optics of the microscope cannot cleanly resolve the detail we want to see, more camera pixels will not help, we'll just have better sampled blur
For a periodic feature like atomic columns, 2 samples-per-period are sufficient according to Nyquist-Shannon, if we impose the right model when displaying the data. Though this wouldn't make very interesting images!
With reduced sampling, the denser areas of the signal are not resolved.
With extra sampling, no additional detail is added
FFTs are subject to ringing, an artefact due to aperiodicity of the input signal. Content cut by the edge of the image is a step function, and so contributes to all frequencies. This can be reduced by applying a window function which tapers to zero at the edges, for example.
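A minimal sketch of applying a 2D Hann window before the FFT (the synthetic array stands in for real data):

```python
import numpy as np

image = np.random.default_rng(0).random((256, 256))  # stand-in for real data

# outer product of two 1D Hann windows tapers the image to zero at the edges
window = np.outer(np.hanning(image.shape[0]), np.hanning(image.shape[1]))
fft_windowed = np.fft.fftshift(np.fft.fft2(image * window))
```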
A discrete image can be interpolated into a continuous coordinate system so that it can be re-sampled at new coordinates.
Interpolation does not add additional information, but can reconstruct a higher-fidelity version of the image if we have a good model of the true signal.
Interpolation is one method to achieve sub-pixel resolution in measurements, for example finding the position of intensity peaks in an image
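A sketch of resampling an image along an arbitrary, non pixel-aligned line with scipy (the endpoint coordinates and number of samples are illustrative):

```python
import numpy as np
from scipy import ndimage

image = np.random.default_rng(0).random((128, 128))       # stand-in for real data

start, end = (10.3, 4.7), (80.1, 60.2)                     # (row, col) endpoints, sub-pixel positions
rows = np.linspace(start[0], end[0], 200)
cols = np.linspace(start[1], end[1], 200)
profile = ndimage.map_coordinates(image, [rows, cols], order=3)  # cubic-spline interpolation
```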
A signal sampled at lower than its highest frequency can be subject to aliasing. The samples will ambiguously fit both the true signal and other signals at combinations of the true and sampling frequency.
Moiré patterns
A moiré pattern is a form of interference between two periodic signals, or between a signal and a sampling rate.
It is a form of aliasing in that the pattern is a product of the two characteristic frequencies.
Moiré imaging is the intentional undersampling of a periodic structure, e.g. a lattice, to record a lower-frequency alias.
The information in an image exists on a coordinate grid. We can map it onto a new grid using a transform, and so translate, stretch, rotate, shear, or generally warp the data.
Filters enhance certain information in an image, compensate for issues in the imaging system or highlight properties of the image which are beyond a simple intensity distribution.
Filtering is usually a pre-processing step before applying other methods.
Patch-based filters
A simple type of filter is patch-based. These run a procedure in the vicinity of each pixel to generate a new value for that pixel.
Edges need special treatment as their neighbourhood is limited, else the filtered image becomes smaller. Padding with zeros, periodic boundaries or reflecting the boundary are common ways to handle this.
The Median filter is a patch-based filter which is quite useful for removing extreme values, for example hot or dead pixels. A Gaussian blur would incorporate these unwanted extremes into the blurred image.
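A small scipy sketch contrasting the two filters on simulated hot pixels:

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
noisy = rng.normal(100.0, 5.0, size=(256, 256))
noisy[rng.integers(0, 256, 50), rng.integers(0, 256, 50)] = 10_000.0  # simulated hot pixels

median = ndimage.median_filter(noisy, size=3)        # hot pixels removed cleanly
blurred = ndimage.gaussian_filter(noisy, sigma=1.0)  # hot pixels smeared into their neighbours
```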
Convolutional filters are a class of patch-based filters using elementwise multiplication and summation with a small kernel to compute each new pixel value.
They can be efficiently computed using a Fourier transform, since convolution in real space is equivalent to multiplication in Fourier space.
GPUs are very efficient at computing image convolutions.
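A sketch of direct versus FFT-based convolution with scipy (the 5x5 mean kernel is arbitrary; both calls give the same filtered image to floating-point precision):

```python
import numpy as np
from scipy import signal

image = np.random.default_rng(0).random((512, 512))
kernel = np.ones((5, 5)) / 25.0                           # simple 5x5 mean (box) kernel

direct = signal.convolve2d(image, kernel, mode="same")    # direct patch-by-patch sum
via_fft = signal.fftconvolve(image, kernel, mode="same")  # same filter via the FFT, faster for large kernels
```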
Edge filters respond to sharp transitions in image intensity, or large image gradient, and are useful in applications like peak finding or contour detection for metrology.
The size of the filter influences whether it catches sharp edges or soft edges.
Frequency space filtering
Zero-ing or modifying frequencies in the FFT of an image acts as a filtering process.
The most well-known are:
Low-pass or high-cut, which retain low-frequency information like gradients
Block the FFT far from the centre
High-pass or low-cut, which retain high-frequency information like edges
Block the central part of the FFT
Band-pass → cut both high-frequency and low-frequency information
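A minimal numpy sketch of low-pass filtering by masking the centred FFT (the cutoff radius is arbitrary):

```python
import numpy as np

image = np.random.default_rng(0).random((256, 256))       # stand-in for real data

fft = np.fft.fftshift(np.fft.fft2(image))                 # zero frequency at the centre

ry = np.arange(image.shape[0]) - image.shape[0] // 2
rx = np.arange(image.shape[1]) - image.shape[1] // 2
radius = np.hypot(ry[:, None], rx[None, :])               # distance from the FFT centre

low_pass_mask = radius <= 30                              # keep only low frequencies
filtered = np.fft.ifft2(np.fft.ifftshift(fft * low_pass_mask)).real
```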
Image segmentation algorithms label pixels of an image based on what they each represent
Poly-crystal phase and orientation mapping is a form of image segmentation, for example to measure a grain size distribution
Segmentation algorithms can use local- and non-local information to label a pixel:
Intensity of the pixel and its neighbours
Location of a pixel with respect to edges / shapes
Texture in the region of the pixel
Binary thresholding
The simplest segmentation is a hard cut in intensity: above the cut is assigned category 1 or True, below a 0 or False. For simple, high-contrast data this is often sufficient.
The connected components algorithm can be used to number isolated regions in a binary image, allowing us to count and measure properties like area and diameter.
The algorithm propagates the label of adjacent True pixels, or creates a new label, until no unlabelled pixels remain.
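A sketch of thresholding followed by connected-component labelling with scipy (the cut value is a placeholder; see the note on Otsu's method later):

```python
import numpy as np
from scipy import ndimage

image = np.random.default_rng(0).random((256, 256))   # stand-in for real data

binary = image > 0.8                                   # hard cut in intensity (placeholder value)
labels, n_regions = ndimage.label(binary)              # isolated regions numbered 1..n_regions

areas = np.bincount(labels.ravel())[1:]                # pixel count (area) of each region
```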
When an image contains intensity gradients or noise then threshold-based segmentation can be impossible.
More advanced algorithms compute feature vectors on the data - combining intensity, edges, textures etc. - to distinguish categories which share properties.
Image segmentation was an early application of convolutional neural networks (CNNs), particularly as image features are difficult to construct. The model can instead learn optimal features for the data it is trained on.
The most well-known, albeit now quite old, architecture is the U-Net, which is designed to combine information at multiple image scales to inform the segmentation.
Image restoration refers to techniques to remove artefacts or noise from an image while preserving the content.
Filtering is a type of image restoration, but is simpler. Restoration frequently estimates how an image would have been without its artefacts.
In low-dose, low signal-to-noise data, denoising is of particular interest.
Binning
Many modern electron cameras are built with dense pixel arrays - 2K or 4K images are not unusual. Summing the recorded intensity within non-overlapping patches improves signal-to-noise.
This is practically equivalent to a camera with larger but fewer pixels
Most cameras can do this during acquisition
Loss of spatial resolution might be important in certain cases
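A numpy sketch of 2x2 binning by reshaping (assumes the image dimensions are divisible by the bin factor):

```python
import numpy as np

image = np.random.default_rng(0).poisson(5, size=(2048, 2048)).astype(np.float64)

b = 2                                                            # bin factor
h, w = image.shape
binned = image.reshape(h // b, b, w // b, b).sum(axis=(1, 3))    # sum each non-overlapping 2x2 patch
```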
When acquisition conditions allow, taking multiple rapid images to form an image stack is advantageous. We can compute statistics for each pixel and exclude clear outliers.
Stacking can also help with problems such as drift during long acquisitions.
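A sketch of per-pixel statistics on a stack with a simple 3-sigma outlier rejection (the stack is synthetic and the cut is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
stack = rng.normal(100.0, 10.0, size=(32, 256, 256))   # (n_frames, height, width)

mu, sigma = stack.mean(axis=0), stack.std(axis=0)
keep = np.abs(stack - mu) <= 3 * sigma                 # reject per-pixel 3-sigma outliers
surviving = np.where(keep, stack, np.nan)
result = np.nanmean(surviving, axis=0)                 # mean of the surviving samples
```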
Principal Component Analysis is a well-known tool to decompose data into a set of components that each capture the maximum variance for the data they represent.
Excluding smaller components excludes outliers and noise, since each only explains a small portion of the whole dataset
PCA is a matrix factorisation and so is very computationally intensive on large images
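A sketch of PCA denoising of a spectrum image with scikit-learn (the array name, shape and number of retained components are illustrative):

```python
import numpy as np
from sklearn.decomposition import PCA

si = np.random.default_rng(0).poisson(5, size=(64, 64, 1024)).astype(np.float64)  # (y, x, channels)

flat = si.reshape(-1, si.shape[-1])        # (n_pixels, n_channels)
pca = PCA(n_components=16)                 # how many components to keep is data-dependent
denoised = pca.inverse_transform(pca.fit_transform(flat)).reshape(si.shape)
```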
Non-local means: rather than a simple average of local patches around each pixel, average all pixels in the image weighted by their similarity to the pixel being denoised.
The BM3D algorithm improves non-local means by grouping similar image patches and filtering them together. Going beyond a simple weighted average greatly improves edge and texture preservation. Extensions exist for denoising time-series and hyperspectral data.
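A sketch of non-local means denoising with scikit-image (the parameter values are illustrative; a separate `bm3d` package exists for the BM3D variant):

```python
import numpy as np
from skimage.restoration import denoise_nl_means, estimate_sigma

noisy = np.random.default_rng(0).normal(0.5, 0.1, size=(256, 256))   # stand-in for real data

sigma = estimate_sigma(noisy)   # rough estimate of the noise level
denoised = denoise_nl_means(noisy, h=1.15 * sigma, patch_size=5, patch_distance=6)
```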
Take care with scientific images as common models are not trained on these domains, and the "invented" data are likely misleading!
Image: CEA + Tescan
Pattern matching and image alignment
A common need in microscopy is to locate some image feature - an edge, a spot, a corner - in order to measure something about it. This is an application of pattern matching.
A related problem is image alignment, where two-or-more images are separated by acquisition drift or change of scale, but we would like to compare the data from both images on the same grid or plot, requiring us to transform one image into the coordinate system of the other(s). Image alignment is also often referred to as image registration.
When the feature to detect is a local minimum or maximum in the intensity image, we can use peak-finding to locate it. A simple algorithm uses a maximum filter:
Peak-finding in 1D, example
In practice with noisy data it is also necessary to smooth the image first and to reject weak maxima below a threshold, as sketched below:
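A minimal scipy sketch of maximum-filter peak finding with smoothing and a threshold (the filter sizes and the 3-sigma cut are illustrative):

```python
import numpy as np
from scipy import ndimage

image = np.random.default_rng(0).random((256, 256))            # stand-in for real data

smoothed = ndimage.gaussian_filter(image, sigma=2.0)           # suppress pixel-level noise
is_max = smoothed == ndimage.maximum_filter(smoothed, size=9)  # local maxima within a 9x9 window
strong = smoothed > smoothed.mean() + 3 * smoothed.std()       # reject weak maxima
peak_rows, peak_cols = np.nonzero(is_max & strong)
```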
When the feature to find is not a local maximum, or we need to detect a particular pattern in the intensity rather than a point, one approach is template matching, based on the correlation between our target image and the template or pattern that we want to find.
Checking all template positions generates a 2D correlation map with peaks at all "good" matches. Then use a peak-finding algorithm (with refinement) to locate the best positions.
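A sketch using scikit-image's normalised cross-correlation (here the template is simply a crop of the example image):

```python
import numpy as np
from skimage import data
from skimage.feature import match_template

image = data.camera()                  # example image shipped with scikit-image
template = image[100:140, 200:240]     # illustrative 40x40 patch as the template

corr = match_template(image, template, pad_input=True)    # correlation map, same shape as image
best_row, best_col = np.unravel_index(np.argmax(corr), corr.shape)
```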
Template matching: filtering
Template matching is very sensitive to both template choice and image quality.
It is often useful to filter the target image to achieve sharper peaks in the correlation image, leading to more precise results.
When aligning or comparing pairs of images, it is useful to have metrics which describe if two images are "close" to each other in some way. A direct subtraction of images is rarely useful as intensity levels vary and noise influences the mean.
Structural Similarity Index (SSI) tries to account for similarity of structure, contrast and intensity level separately, and was designed to replicate how humans perceive similarity between images.
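A sketch of computing the index with scikit-image (the two images here are a reference and a noisy copy of it):

```python
import numpy as np
from skimage import data
from skimage.metrics import structural_similarity

image_a = data.camera().astype(np.float64)
image_b = image_a + np.random.default_rng(0).normal(0, 10, image_a.shape)

score = structural_similarity(image_a, image_b, data_range=image_a.max() - image_a.min())
```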
Summary
Digital images underpin almost all of modern microscopy, and influence how data are acquired, interpreted and perceived.
This presentation was a very rapid overview of a lot of topics, and should be seen as a starting point for what could be done with your data.
The interactive components are based on Panel and Bokeh, which can be used both in standalone web-pages and within Jupyter to put interactivity in-line with your analysis.
A microscope lets us see, for a time
An image lets us look again
I mostly work with Python so I am afraid these slides are biased, if you work more in Matlab just add one...
I'll just take a moment to mention where I work
The French atomic energy agency, CEA has a site in Grenoble with a strong focus on nano-materials development and characterisation
There is a medium-sized clean room, and an advanced nano-characterisation platform, the PFNC, including a number of fairly recent corrected TEMs, with sample preparation facilities. It is part of a number of research networks including METSA in France, so there are possibilities to come use the platform if your project fits within the right criteria.
More than just TEMs, as I said, there are over 100 people working on all types of nano-characterisation under one roof, including surface science, x-rays, optics, NMR, ions, etc.
So, on to the content. We're going to cover a lot of topics from a high-level as you can see. The slides may be way too long, I didn't have any opportunity to practice to get a feel for the timing, but I'll try not to rush and if we don't make it to the end then no worries, they're all online if ever some of you want to see the rest.
I made the conscious choice to not include many equations or derivations of the different approaches I show, because in practice I find that I rarely if-ever have to write the basics from scratch. Instead it's about knowing what exists and combining the building blocks in clever ways. So what I've done is include links in the header of every slide that introduces a tool, and those links point to the (usually) Python library that implements that tool, usually with an example which is much better than what I could show you here.
Again I'll mention that the slides, figures and all the source code are online at the address below if ever you want to flick ahead or through the content, or use the figures for something else, it's all MIT license which means free for any use with no restrictions on modification or adaptation.
- Found it quite hard to introduce digital images without talking about images more generally first
- Sit through yet another introduction to image formation in optics and how a camera works
- At least, though, it'll be brief - I'm not the person to talk in detail about ray diagrams and sensor design
- The way I see Image formation is through the idea of ambiguity.
- Every point being hit by light is effectively a point source sending more light in all directions
- If there is a place where all of the light from a given point converges again to a point, then here we have an image of that point
- If a lot of points share a surface on which their light converges to points, then this is an image plane and here we could record that light to create what we'd call in normal speak an image, e.g. a photograph or a selfie or whatever
That said in the real world we never have perfectly flat objects, or perfect lenses, so there is always some ambiguity of where the light is coming from, we just set up the system so that the light we want overpowers the light we don't, and our image is only slightly flawed as a result.
So to create what we'd call an image in normal day-to-day speak we need to record a lightfield
Back in the day this would be done with film, where light-sensitive chemical reactions did their thing in response to the local intensity of light. Usually some kind of metal salt is crystallised onto a substrate; this is light sensitive, with the light initiating the formation of the metal itself, which is of course opaque.
The fidelity of the recording was very much dependent on the chemistry, temperature, formulation, age of the film, particle size distribution etc
Just as an aside, I wanted to share the existence of this video by Ben of the Applied Science channel on YouTube, where he describes in great detail the process of taking photographs using Geranium leaves and photosynthesis as the reaction. The development process is a bit complex as you have to wash out the chlorophyll in a particular way, but he got pretty good results if you ask me. I highly recommend his channel for lots of interesting experiments.
So here we are, the conversion from rays of light to numbers in a very schematic way, incident intensity is proportional to voltage or some other physical quantity, this is digitized using an Analog-to-Digital converter, sometimes directly in the pixel sometimes row-by-row, or zone-by-zone, that depends on the camera hardware, but at the end you have a big grid of numbers that we hope is representative of the light-that-was.
You have certainly heard the word Pixel, and I wanted to note that is a contraction of the words "picture element" and dates from the 50s or 60s, and is the name for one of the values we record on a digital camera. Of course the word didn't exist before digital imaging, as photographic film is not structured on a grid or structured in any way, it's hard to define one value in a photographic image.
Resolution, then, is either the number of pixels you have in your image, or on your camera or screen, and their shape, especially if you're talking to a regular person. In some more technical contexts resolution might also be the pixel spatial density, as it's related to how finely we sample the lightfield and so how much detail we can ultimately capture.
It should go without saying that there is a lot of conversion involved between the light and the digital image on a computer, and so pixel values don't really have meaningful units out-of-the box. You might be lucky and have metadata for an image that tells you things like the pixel spacing, readout gain or offset, but rarely will this let you generate a true "number of photons per mm2" value.
I wanted to mention how true colour images work, even if in electron microscopy there isn't really such a thing. A lot of colour sensors use a thing called a Bayer Filter, which selects light from certain wavelength bands in the red, green and blue parts of the spectrum for individual pixels. Green is recorded twice as our eyes are particularly sensitive here. This forms a single gray image with a different response in different pixels, but we can electronically separate the different sub-images and recombine them to display something like the colour we would have seen. Of course there is a slight offset between the four sub-images, a kind of intrinsic chromatic aberration, but if they are recombined intelligently then we don't really notice, especially if the pixel grid is dense enough.
A bit closer to home, spectral images are just colour images but acquired with some kind of filtering, usually with a bit better separation between channels than RGB colour. It would be rare to see something like a Bayer filter in science, too, rather we have something like a spectrometer to record only one energy at a time, and we spread the recording over multiple acquisitions while politely asking the sample to not move.
At the end of the day, whether colour or spectral imaging, we still have a list of sampled numbers, now with an extra dimension of energy or channel, as well as height and width.
Right, we've been talking like an image is essentially a photograph but digital, but in reality a computer is not built to work with data that has such a meaning. A computer holds an image as a one-dimensional list of binary numbers of a certain length, and we apply a set of conventions to these numbers to make it behave like the image we recorded. We have to tell the computer somehow how many binary numbers make up a single pixel, in what encoding, and also how we ordered our pixels, whether row-by-row or column-by-column, or even completely randomly or by block if you want to make life hard. Only with these rules will any operations on the numbers give the right results, or will the image display on the screen correctly. On the screen you see a representation of a hand-drawn number 2 (though very small) in binary, and also how it is meant to be interpreted on the left, and a really bad interpretation on the right, but take note that the data in both cases are identical!
Luckily in practice we rarely have to think about this kind of thing, there are enough conventions and common practice that we can just expect things to work most of the time, and when it doesn't work right it gives people like me something to do or to complain about.
I mentioned that binary numbers are what a computer sees, and these only have meaning to us if we apply some information about how to interpret them. I won't go into much detail but we have different rules for storing integer and decimal numbers, with impact on the number of digits needed to store such a format and the minimum and maximum values we can represent with such rules. The usage of all of these depends on your application, but typically unsigned integers come out of cameras, as we only have a limited range of values the sensor can produce and these are rarely negative, while floating and complex numbers are of course more meaningful when we start doing operations on the image.
The choice of number type has implications for what you can do with it, a common issue with small types like 8-bit integers is "wrapping", i.e. unexpected results when an operation overflows the maximum or minimum value a type can hold. This is much less common with languages such as Python which try to protect you as much as possible from low-level details like this, but it can still happen! That might even be worse, as you might not be expecting it.
Take note of the point that number type influences the storage requirements of an image as well as the time to process it. If you notice that suddenly an image has doubled in size on the disk you might have accidentally changed its number type from 16 to 32 bits per number, for example, without really much justification.
A reality of decimal or floating point numbers, even in Python, is loss of precision when trying to work with both small and large numbers at the same time - as you can see here adding 324k and 0.0055 together as float32 just drops the 0.0055, which may or may not be important to your application. Doing this using float64 would have worked fine, but takes up twice the space in memory and is usually slower to compute.
Complex numbers are stored as a pair of floating numbers representing real and imaginary parts, there are no native complex number formats
So here you have a representation of an image in 2D, with height and width or rows and columns, then two ways you could unpack that image into a linear sequence in memory, rowwise or columnwise. The consequence is that any given pixel in the 2D representation will exist at a different location in the memory according to how we unpacked it, and we need to do the correct arithmetic to find it again.
This is particularly important for spectral data and 4D-STEM as the way we unpack the data into a sequence in memory determines how we can access it later. It's important to take away that accessing data which is sequential in the memory is very very fast, while jumping around to pick out every 100th value is very slow. If the processing to do is usually on whole spectra then it pays to store the values spectrum-by-spectrum, while if you need whole images for a given channel, it pays to store it image-by-image.
Of course, you rarely get a say in this with modern hardware and software, which is sometimes a shame, but I point it out in case any of you are confronted with processing which is bizarrely slow, it's something to think about.
With all that said here are some representations of different data, with their shapes and sizes. You might be able to appreciate that if I need to extract the central pixel from every single frame in a 4D STEM dataset then there is a lot of jumping around in memory, especially if we are talking about data on a hard drive where we might only be able to traverse the data at a few hundred MB/s.
And just to prove my point, here we can try some maths with images live, right now we've computed the image times -1, note the colourbars on the right. We can try adding 5 to the images, ... see. If I copy this example here we will multiply our image by a ramp from left to right. ... We can also add or subtract images together, for example the image and the inverse image will give 0 everywhere, I hope ..., or if we add another image we see details from both.
In microscopy we come across a panoply of different image file formats, and everyone has their favourite. A lot of the unusual ones are specific to a particular camera or equipment vendor, and can normally only be opened in their software or a few open-source packages. Then you have more general formats for binary data, which won't necessarily re-create your image for free, you need to supply the interpretation. TIFF is probably the only format capable of both representing scientific data well while also being openable on most devices without special software - though not always. It can hold decimal and complex numbers but Windows Image Viewer won't do well with that kind of file. Then at the bottom of the ranking you have general colour image formats, these can rarely if ever hold anything other than 8-bit integers, so 0-255, which are good for visualisations or PDFs, but not at all good for analysis.
I wanted to mention sparse images since not everyone is aware of them. A sparse image is a way of storing an array of data where most of the pixels are some constant value, usually zero, and so rather than unravel this array into a long sequence of zeros in memory, we just don't bother and store only values which are non-zero as well as where they should be.
As you can see on this tiny example we saved about 75% of the storage space, depending on the sparsity and how you store the data this can rise to 90-95%. An added benefit is that certain operations are much faster when we can ignore zero values - for example an element-by-element dot product between one image and another, we can cut down the number of multiply and add operations by 95% if we just don't bother to process zero-value pixels.
So I've been asked to speak about GPUs, which I am sure most of you have heard of especially with the recent boom in so-called AI.
GPUs are computation accelerators: they let us do fairly basic maths in a very parallel way on one device. They enabled the current AI boom because most AI training can be reduced to trillions of multiply-and-addition operations, and GPUs can do this in parallel on thousands of input values in one step. Lucky for us we can get the same boosts with images, especially as GPUs were originally designed to generate and work on image data for video games; things like coordinate transformations and filtering are essentially what we do in scientific computing, and so GPUs are great when the computation to do is quite heavy.
There's not much point with small images or small data, however, as GPUs have a certain overhead. An operation which takes 1 ms on a normal CPU will probably take much longer on a GPU unless the data are already inside the GPU - the time to transfer the data to the accelerator will massively outweigh the processing time if that is all we are doing.
Luckily, at least in Python, there is very little we have to do to make code run on a GPU as opposed to CPU, unless you want to extract the maximum performance you can more-or-less change one letter and add a few magic incantations and suddenly your code will use the GPU if desired.
Perhaps a bit basic, but worth mentioning - image histograms show us the distribution of intensity values in an image, they are ubiquitous in image software, and as long as the axis limits are clear they can be useful in spotting when an image is saturated or clipping, or if we are using the full dynamic range of the image type correctly.
Something about other colourspaces?
Here we have a demonstration of brightness and contrast adjustment on an image display - note that the data are not changing, only the mapping from data to the colours shown on screen. Our data are in the range .., while the colours are 0 to 255, and the slope and position of the line determines the map between a section of the data range and the colourmap. Any values outside the range that is on the slope are clipped to 0 or 255 depending on whether they are before or after it. As I adjust the contrast note how parts of the image become pure white or black, and the rest becomes sharper.
Here we have a logarithm transform on the data before display - note how low values are given more of the vertical range between 0 and 255 while higher values are on the flatter part of the curve so are represented by fewer values. The effect is we can see more detail at the low end at the cost of effective saturation at the high end. The Gamma transform does more or less the same thing for certain values of gamma, but now we can choose how the values are mapped.
Choosing a good colourmap for the data is really important, as it can massively affect interpretation. More contrast is not always better, if all we are showing is quasi-random noise on a rainbow colour scale. But even if sometimes a rainbow map is a good choice, I want you to take away that there is more than one way to build a rainbow, some of which are more honest than others. See here for example the two different colourmaps displaying a simple linear ramp ... without other context, which of the two looks like more interesting data? We have a way to visualise this a bit, which is to superimpose a sine wave on top of the ramp, which means we make the colour oscillate along the map. That oscillation represents our eye's ability to distinguish a value change between two different points on the colourmap, and ideally we should feel like the oscillations are "equal" everywhere. So a 0.05 change around 0.2 is visually the same as a 0.05 change at 0.9. This is called perceptual uniformity. Another example are the two colourmaps Hot and Fire, both well-used in some fields, but only one was designed to be uniform - can you tell which one? Finally there are specialised colourmaps when the data warrant it: look here at this data with both negative and positive features at different scales, a normal colourmap makes it hard to know where the zero level is, while a diverging colourmap forced to be symmetric around 0 makes the zero level immediately clear.
A factor which is often overlooked in data visualisation is the fact that not all of us see colour in the same way, and up to 5% of people, mostly male, have some form of altered vision. The most common form is weak red-green vision, meaning that red and green tones appear similar, as in this visualisation. This would be quite extreme but note that the simulation is effectively two-tone, blue and yellow. Given the size of this room it would not be surprising if a few people here are partially colourblind. There are a lot of colourmaps which are mostly colourblind friendly, including Viridis which is the default in a lot of packages. In general if a colourmap looks like a rainbow, there is a greater chance it will be a bad choice for some people.
Transparency is another lever along with colour that we can pull to visualise our data. In practice it just means that a pixel doesn't just replace what was already on the screen, rather it is combined in a weighted sum with what was already there. It's not particularly useful to denote data itself, as we have a hard time interpreting the transparency once it is drawn, but it can be used to help denote a lack of data or density of sampling.
Vector values are harder to display with a colourmap, and in some cases it is easier to use a different style of plot like oriented arrows where length encodes magnitude. A cyclic colourmap wraps around to the same colour at 2pi, which can represent direction, and in some cases we can also represent magnitude using saturation of the colour. This becomes quite complex to visualise unless the viewer is used to it, though!
Here you see a visualisation of a complex image from a hologram - on the left we have the real and imaginary parts displayed separately, which don't convey much information on their own. When we display combinations of these, though, we see clearly the amplitude of the wave and the phase, which in itself is quite complex because of wrapping of values around 2pi. There are two ways to solve this, either use a cyclic colourmap which means we don't see the wrapping as a hard boundary, though we lose any concept of a phase ramp, or we unwrap the phase data, which is not really a visualisation technique but rather an analysis technique to try to coherently increment the phase value at each jump so that we get a smooth image with the correct phase ramp. I'll let someone else explain phase unwrapping, probably in the context of holography.
Hopefully this comes across on the projector, but this is a case where you can see the effects of undersampling to resolve a feature. In the corners where we have very finely spaced circles the 128x128 case simply doesn't accurately show the signal, instead we get a sort of gray blur. More pressingly but harder to see are additional ring-like artefacts that seem to be interfering with the true signal; these are due to an effect called aliasing, which is problematic in measurements because you might not be able to distinguish real signal from sampling errors.
So, this is my one real equation and even then it's mostly for show. You're all probably familiar with Fourier series, which represent periodic functions as a sum of trigonometric functions. And I would imagine you've also come across the more general Fourier Transform, on the screen now, which is an integral to compute a decomposition of a non-periodic function in terms of complex exponentials of different frequency. Well, I'm stating that you can extend the 1D Fourier Transform to 2D by integrating over two orthogonal frequency axes, and even compute the transform quickly using what is essentially matrix multiplication, making the operation really quite efficient on images.
A Fourier transform of an image is usually not directly interpretable, especially as it is a complex array. This is my test image which has a lot of well-oriented sharp edges, and hopefully you can see there are rays in the magnitude of the FFT which are kind-of oriented perpendicular to these edges. This is about as far as we can go with direct interpretation of an FFT.
To give you a feel for what the magnitude of the FFT represents here is a zoom on the centre of the roof image FFT, where I've picked out two pixels and plotted their contribution to the Fourier transform sent back to real space. You can see that each pixel contributes an oriented sinusoid in the direction matching where it is relative to the centre, and the distance from the centre gives the frequency of that sinusoid. The image is then essentially made up of a huge weighted sum of all these sinusoids. The phase of the FFT pixels determines a phase shift of each sinusoid (making it more of a co-sinusoid) to achieve the correct sum. Note that the central pixel is special and contains the zero-frequency component, without a meaningful phase, and it encodes a constant offset which is just the mean value of all the pixels in the image.
Interpolation schemes can be very simple, the simplest being nearest-neighbour, which is to say the values at coordinates between 4 pixels is just the value of the nearest pixel to that coordinate. Bilinear essentially fits a 2D plane to each group of four pixels, then reads the interpolated value of the corresponding plane. And bicubic uses sequential polynomial fits to the region around each point until it can create a polynomial going directly through the point to interpolate, which can then be evaluated.
A quick example of what we can do with interpolation, namely if we want to extract a line profile in an image which is not aligned with the pixel grid then we have to interpolate it, doubly so if we also want to do averaging perpendicular to that line as you might be able to do in certain software. Here we can see the effect of the sampling frequency again, if I drop this value down then you see we lose a lot of fidelity and we start to show an alias of the true signal. Increasing the value beyond a certain point has no effect aside from perhaps making my laptop a bit hotter.
Speaking of aliasing, here is a little direct visualisation of that, at least in 1D. Here you can see how we are sampling a periodic signal
---
## Moiré imaging
The alias represents the lattice, but each period covers multiple periods in the true signal.
The Moiré is **very sensitive** to small changes in lattice spacing, and so with appropriate interpretation we can very precisely measure strain or locate defects over a large field of view.
---
## Threshold choice
The right cutoff depends on the data, its range, and the intended analysis.
Algorithms exist to automatically threshold an image, e.g. [Otsu's method](https://en.wikipedia.org/wiki/Otsu%27s_method).
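A minimal scikit-image sketch of Otsu's method:

```python
from skimage import data
from skimage.filters import threshold_otsu

image = data.camera()       # example image shipped with scikit-image
t = threshold_otsu(image)   # threshold that maximises inter-class variance
binary = image > t
```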
---
# Extra topics
A bit more physics ??
- PSF
- STEM scan patterns ?
- GPA
- Phase Reconstruction (holo etc)
- Distortion correction
TODO:
Generally add code examples
Raw data for peak finding figure is not instructive
Add diagram of GPU computation, or code examples
Explain phase unwrapping
Fix colourmap dropdown display
Add image of SSI
U-Net example image