morpheus.core package¶

Submodules¶

morpheus.core.base_model module¶

A base class for building neural network models in TensorFlow.

class morpheus.core.base_model.Model(dataset: tensorflow.python.data.ops.dataset_ops.DatasetV2, data_format: str = 'channels_last')[source]¶

Bases: object

Base class for models.

dataset¶

Dataset Object for training

Type:	tf.data.Dataset

is_training¶

indicates if the model is training

Type:	bool

data_format¶

‘channels_first’ or ‘channels_last’

Type:	str

Required methods to override:

model_fn: the graph function

Optional methods to override:

train_metrics: adds metrics during training
test_metrics: adds metrics during testing, can be the same as train_metrics
optimizer: updates params based on a loss tensor
loss_func: defines a loss value given an x and a y tensor
inference: by default, applies softmax to the tensor from model_fn

build_graph(inputs: tensorflow.python.framework.ops.Tensor, is_training: bool) → tensorflow.python.framework.ops.Tensor[source]¶

Base function that returns model_fn evaluated on x. Don’t Override!

Parameters:

inputs (tf.Tensor) – the tensor to be processed, i.e. a placeholder
is_training (bool) – whether or not the model is training; useful for things like batch normalization or dropout

Returns:	the tensor that represents the result of model_fn evaluated on the input tensor

Raises: NotImplementedError if Model.model_fn() is not overridden

inference(*args)¶: Placeholder function used as default in __init__

loss_func(*args)¶: Placeholder function used as default in __init__

model_fn(inputs: tensorflow.python.framework.ops.Tensor, is_training: bool) → function[source]¶

Function that defines model. Needs to be Overridden!

Parameters:

inputs (tf.Tensor) – the input tensor
is_training (bool) – boolean to indicate if in training phase

Returns:	Should return a function that takes two inputs tf.Tensor and bool
Raises:	NotImplementedError if not overridden

optimizer(*args)¶: Placeholder function used as default in __init__

test() -> (<class 'tensorflow.python.framework.ops.Tensor'>, <class 'tensorflow.python.framework.ops.Tensor'>)[source]¶

Builds the testing routine tensors. Don’t Override!

Returns:	the result of the self.build_graph and self.test_metrics respectively
Return type:	(logits, metrics)

Raises: NotImplementedError if Model.model_fn() is not overridden

test_metrics(*args)¶: Placeholder function used as default in __init__

train() -> (<class 'tensorflow.python.framework.ops.Tensor'>, <class 'tensorflow.python.framework.ops.Tensor'>)[source]¶

Builds the training routine tensors. Don’t Override!

Returns:	the result of self.optimizer and self.train_metrics respectively
Return type:	(optimize, metrics)

Raises: NotImplementedError if Model.model_fn() is not overridden

train_metrics(*args)¶: Placeholder function used as default in __init__
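The override contract described above (model_fn must be overridden, build_graph must not) can be illustrated without TensorFlow. The following is only a minimal sketch: SketchModel and Doubler are hypothetical stand-ins, not the library's classes, and plain lists take the place of tensors.

```python
class SketchModel:
    """Hypothetical stand-in for the documented base class (no TensorFlow)."""

    def __init__(self, dataset, data_format="channels_last"):
        self.dataset = dataset
        self.data_format = data_format
        self.is_training = True

    def model_fn(self, inputs, is_training):
        # Subclasses are expected to override this; the base version only raises.
        raise NotImplementedError("model_fn must be overridden")

    def build_graph(self, inputs, is_training):
        # Not meant to be overridden: record the phase, then delegate to model_fn.
        self.is_training = is_training
        return self.model_fn(inputs, is_training)


class Doubler(SketchModel):
    def model_fn(self, inputs, is_training):
        # Toy "graph": double every input value.
        return [2 * x for x in inputs]


model = Doubler(dataset=None)
result = model.build_graph([1, 2, 3], is_training=False)
```

Calling build_graph on the un-subclassed SketchModel raises NotImplementedError, which mirrors the Raises entries documented for build_graph, train and test.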

morpheus.core.helpers module¶

Helper classes used in Morpheus.

class morpheus.core.helpers.FitsHelper[source]¶

Bases: object

A class that handles basic FITS file functions.

static create_file(file_name: str, data_shape: tuple, dtype) → None[source]¶

Creates a fits file without loading it into memory.

This is a helper method to create large FITS files without loading an array into memory. The method follows the direction given at: http://docs.astropy.org/en/stable/generated/examples/io/skip_create-large-fits.html

Parameters:

file_name (str) – the complete path to the file to be created
data_shape (tuple) – a tuple describing the shape of the file to be created
dtype (numpy datatype) – the numpy datatype used in the array

Raises:	ValueError if dtype is not one of: np.uint8, np.int16, np.int32, np.float32, np.float64

TODO: Figure out why this occasionally throws a warning about size when files created by it are opened

static create_mean_var_files(shape: List[int], out_dir: str) -> (typing.List[astropy.io.fits.hdu.hdulist.HDUList], typing.List[numpy.ndarray])[source]¶

Creates the output fits files for the mean/variance morpheus output.

Parameters:

shape (List[int]) – The shape to use when making the FITS files
out_dir (str) – the directory to place the files in. Will make it if it doesn’t already exist.

Returns:

List[fits.HDUList]: the HDULs for the created files
Dict(str, np.ndarray): a dictionary where the key is the data descriptor and the value is the memmapped data numpy array

static create_n_file(shape: List[int], out_dir: str) -> (typing.List[astropy.io.fits.hdu.hdulist.HDUList], typing.List[numpy.ndarray])[source]¶

Creates the output fits file for the count ‘n’ morpheus output.

Parameters:

shape (List[int]) – The shape to use when making the FITS files
out_dir (str) – the directory to place the files in. Will make it if it doesn’t already exist.

Returns:

List[fits.HDUList]: the HDULs for the created files
Dict(str, np.ndarray): a dictionary where the key is the data descriptor and the value is the memmapped data numpy array

static create_rank_vote_files(shape: List[int], out_dir: str) -> (typing.List[astropy.io.fits.hdu.hdulist.HDUList], typing.List[numpy.ndarray])[source]¶

Creates the output fits files for the rank vote morpheus output.

Parameters:

shape (List[int]) – The shape to use when making the FITS files
out_dir (str) – the directory to place the files in. Will make it if it doesn’t already exist.

Returns:

List[fits.HDUList]: the HDULs for the created files
Dict(str, np.ndarray): a dictionary where the key is the data descriptor and the value is the memmapped data numpy array

static get_files(file_names: List[str], mode: str = 'readonly') -> (typing.List[astropy.io.fits.hdu.hdulist.HDUList], typing.List[numpy.ndarray])[source]¶

Gets the HDULs and data handles for all the files in file_names.

This is a convenience function for opening multiple FITS files using memmap.

Parameters:

file_names (List[str]) – a list of file names, including paths, of the FITS files
mode (str) – the mode to pass to fits.open

Returns:	A tuple of a list of numpy arrays that are the mmapped data handles for each of the FITS files and the HDULs that go along with them

class morpheus.core.helpers.LabelHelper[source]¶

Bases: object

Class to help with label updates.

Class Variables:

UPDATE_MASK (np.ndarray): the (40, 40) integer array that indicates which parts of the output of the model to include in the calculations. default: innermost (30, 30)
UPDATE_MASK_N (np.ndarray): the (40, 40) integer array that indicates which parts of the count ‘n’ to update. default: all (40, 40)

MORPHOLOGIES = ['spheroid', 'disk', 'irregular', 'point_source', 'background']¶

UPDATE_MASK = array([[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]], dtype=int16)¶

UPDATE_MASK_N = array([[1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], ..., [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1]], dtype=int16)¶
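As an illustration of the two masks, the sketch below builds equivalent structures with plain nested lists. It assumes the innermost (30, 30) region is centred in the (40, 40) window, leaving a 5-pixel border; the library itself stores numpy int16 arrays, and make_update_mask is a hypothetical helper name.

```python
SIZE = 40   # window side length, from the documented (40, 40) shape
INNER = 30  # side length of the region flagged for updates


def make_update_mask(size=SIZE, inner=INNER):
    """Return a size x size grid with ones in the (assumed centred) inner square."""
    border = (size - inner) // 2
    lo, hi = border, size - border
    return [
        [1 if lo <= r < hi and lo <= c < hi else 0 for c in range(size)]
        for r in range(size)
    ]


update_mask = make_update_mask()                   # ones only in the inner 30 x 30
update_mask_n = [[1] * SIZE for _ in range(SIZE)]  # ones everywhere
```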

static finalize_rank_vote(data: dict) → None[source]¶

Finalize the rank vote by dividing by n.

Parameters:	data (dict) – a dict of numpy arrays containing the data

TODO: Refactor to accommodate large files

Returns:	None

static finalize_variance(n: numpy.ndarray, curr_sn: numpy.ndarray, final_map: List[Tuple[int, int]])[source]¶

The second of two methods used to calculate the variance online.

This method calculates the final variance value using equation 25 from http://people.ds.cam.ac.uk/fanf2/hermes/doc/antiforgery/stats.pdf but without performing the square root.

Parameters:

n (np.ndarray) – the current number of values included in the calculation
curr_sn (np.ndarray) – the current $S_n$ values
final_map (List[Tuple[int, int]]) – a list of indices to calculate the final variance for

Returns:	A np.ndarray with the current $S_n$ values and variance values for all indices in final_map

static get_final_map(shape: List[int], y: int, x: int)[source]¶

Creates a pixel mapping that flags pixels that won’t be updated again.

Parameters:

shape (List[int]) – the shape of the array that x and y are indexing
y (int) – the current y index
x (int) – the current x index

Returns:	A list of relative indices that won’t be updated again.

static index_generator(dim0: int, dim1: int) → Iterable[Tuple[int, int]][source]¶

Creates a generator that returns indices to iterate over a 2d array.

Parameters:

dim0 (int) – The upper limit to iterate up to for the first dimension
dim1 (int) – The upper limit to iterate up to for the second dimension

Returns:	A generator that yields indices to iterate over a 2d array with shape [dim0, dim1]
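A generator with this behaviour can be sketched in a few lines. The row-by-row iteration order and the name index_pairs are assumptions made for illustration; the documentation only states that every index of a [dim0, dim1] array is produced.

```python
from itertools import product


def index_pairs(dim0, dim1):
    """Yield (i, j) for every cell of a [dim0, dim1] array, row by row."""
    yield from product(range(dim0), range(dim1))


pairs = list(index_pairs(2, 3))
```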

static iterative_mean(n: numpy.ndarray, curr_mean: numpy.ndarray, x_n: numpy.ndarray, update_mask: numpy.ndarray)[source]¶

Calculates the mean of a collection in an online fashion.

The values are calculated using the following equation: http://people.ds.cam.ac.uk/fanf2/hermes/doc/antiforgery/stats.pdf, eq. 4

Parameters:

n (np.ndarray) – a 2d array containing the number of terms in the mean so far
curr_mean (np.ndarray) – the current calculated mean
x_n (np.ndarray) – the new values to add to the mean
update_mask (np.ndarray) – a 2d boolean array indicating which indices in the array should be updated

Returns:	An array with the same shape as the curr_mean with the newly calculated mean values.
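For a single pixel, an online mean update can be sketched as below. This assumes the cited equation is the standard incremental-mean recurrence, mean_n = mean_{n-1} + (x_n - mean_{n-1}) / n; the scalar function online_mean is a hypothetical simplification, since the documented method works on whole arrays and skips entries where update_mask is false.

```python
def online_mean(n, prev_mean, x_n):
    """Return the mean after adding x_n as the n-th value (n counts from 1)."""
    return prev_mean + (x_n - prev_mean) / n


mean = 0.0
for n, x in enumerate([2.0, 4.0, 9.0], start=1):
    mean = online_mean(n, mean, x)
```

After the loop, mean matches the ordinary average of the three values, without the values ever being stored together.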

static iterative_rank_vote(x_n: numpy.ndarray, prev_count: numpy.ndarray, update_mask: numpy.ndarray)[source]¶

Calculates the updated values for the rank vote labels for one class.

Parameters:

x_n (np.ndarray) – the current rank vote values for the class being updated
prev_count (np.ndarray) – the array containing the running totals, should be shaped as [labels, height, width]
update_mask (np.ndarray) – a boolean array indicating which values to update

Returns:	A numpy array containing the updated count values
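One plausible reading of the rank vote, sketched for a single pixel: each new model output casts one vote for its top-scoring class, and the running counts are later divided by n (as finalize_rank_vote describes). The voting rule and the helper names rank_vote_update and finalize are assumptions for illustration, not taken from the library source.

```python
def rank_vote_update(counts, scores):
    """Add one vote to the top-scoring class and return the new counts."""
    winner = max(range(len(scores)), key=scores.__getitem__)
    return [c + (1 if i == winner else 0) for i, c in enumerate(counts)]


def finalize(counts, n):
    """Turn vote counts into fractions by dividing by the number of outputs."""
    return [c / n for c in counts]


counts = [0, 0, 0]
outputs = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.1, 0.8, 0.1]]
for scores in outputs:
    counts = rank_vote_update(counts, scores)
fractions = finalize(counts, len(outputs))
```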

static iterative_variance(prev_sn: numpy.ndarray, x_n: numpy.ndarray, prev_mean: numpy.ndarray, curr_mean: numpy.ndarray, update_mask: numpy.ndarray)[source]¶

The first of two methods used to calculate the variance online.

This method specifically calculates the $S_n$ value as indicated in equation 24 from: http://people.ds.cam.ac.uk/fanf2/hermes/doc/antiforgery/stats.pdf

Parameters:

prev_sn (np.ndarray) – the $S_n$ value from the previous step
x_n (np.ndarray) – the current incoming values
prev_mean (np.ndarray) – the mean that was previously calculated
curr_mean (np.ndarray) – the mean, including the current values
update_mask (np.ndarray) – a boolean mask indicating which values to update

Returns:	An np.ndarray containing the current value for $S_n$
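The two-step variance calculation (iterative_variance followed by finalize_variance) can be sketched for a single pixel as follows. The sketch assumes the cited equations are the usual Welford-style recurrence, S_n = S_{n-1} + (x_n - mean_{n-1})(x_n - mean_n), with the final variance taken as S_n / n; update_sn and final_variance are hypothetical scalar helpers, and the mask handling of the documented methods is left out.

```python
def update_sn(prev_sn, x_n, prev_mean, curr_mean):
    """First step: fold x_n into the running sum of squared deviations."""
    return prev_sn + (x_n - prev_mean) * (x_n - curr_mean)


def final_variance(n, sn):
    """Second step: variance without the square root (assumed S_n / n)."""
    return sn / n


values = [2.0, 4.0, 9.0]
mean, sn = 0.0, 0.0
for n, x in enumerate(values, start=1):
    prev_mean = mean
    mean = prev_mean + (x - prev_mean) / n  # online mean update
    sn = update_sn(sn, x, prev_mean, mean)
variance = final_variance(len(values), sn)
```

The result agrees with the population variance computed directly from the three values, which is a convenient check when adapting the sketch to arrays.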

static make_mean_var_arrays(shape: Tuple[int, int]) → dict[source]¶

Create output arrays for use in in-memory classification.

Parameters:	shape (Tuple[int, int]) – the 2d (width, height) shape to use when creating the arrays

Returns: A dictionary with keys being the array’s description and values being the array itself

static make_n_array(shape: Tuple[int, int]) → dict[source]¶

Create an output array for use in in-memory classification.

Parameters:	shape (Tuple[int, int]) – the 2d (width, height) shape to use when creating the array

Returns: A dictionary with keys being the array’s description and values being the array itself

static make_rank_vote_arrays(shape: Tuple[int, int]) → dict[source]¶

Create output arrays for use in in-memory classification.

Parameters:	shape (Tuple[int, int]) – the 2d (width, height) shape to use when creating the arrays

Returns: A dictionary with keys being the array’s description and values being the array itself

static update_labels(data: dict, labels: numpy.ndarray, batch_idx: List[Tuple[int, int]], out_type: str) → None[source]¶

Updates the running total label values with the new output values.

Parameters:

data (dict) – a dict of numpy arrays containing the data
labels (np.ndarray) – the new output from the model
batch_idx (List[Tuple[int, int]]) – a list of indices to update
out_type (str) – indicates which type of output to update; must be one of [‘mean_var’, ‘rank_vote’, ‘both’]

Returns:	None

static update_mean_var(data: dict, labels: numpy.ndarray, batch_idx: List[Tuple[int, int]])[source]¶

Updates the mean and variance outputs with the new model values.

Parameters:

data (dict) – a dict of numpy arrays containing the data
labels (np.ndarray) – the new output from the model
batch_idx (List[Tuple[int, int]]) – a list of indices to update

Returns:	None

static update_ns(data: dict, batch_idx: List[Tuple[int, int]], inc: int = 1) → None[source]¶

Updates the n values by inc.

Parameters:

data (dict) – a dictionary of numpy arrays containing the data
batch_idx (List[Tuple[int, int]]) – a list of indices to update
inc (int) – the number to increment n by. Default=1

Returns:	None

static update_rank_vote(data: dict, labels: numpy.ndarray, batch_idx: List[Tuple[int, int]]) → None[source]¶

Updates the rank vote values with the new output.

Parameters:

data (dict) – a dict of numpy arrays containing the data
labels (np.ndarray) – the new output from the model
batch_idx (List[Tuple[int, int]]) – a list of indices to update

Returns:	None

static windowed_index_generator(dim0: int, dim1: int) → Iterable[Tuple[int, int]][source]¶

Creates a generator that returns window limited indices over a 2d array.

The generator returned by this method will yield the indices for the use of a sliding window of size UPDATE_MASK_N.shape over a 2d array with the size (dim0, dim1).

Parameters:

dim0 (int) – The upper limit to iterate up to for the first dimension
dim1 (int) – The upper limit to iterate up to for the second dimension

Returns:	A generator that yields indices to iterate over a 2d array with shape [dim0, dim1]
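A window-limited generator might look like the sketch below. It assumes that the yielded pairs are top-left window corners, that the window moves one pixel at a time, and that the window is (40, 40) as documented for the mask shape; windowed_index_pairs is a hypothetical name.

```python
from itertools import product

WINDOW = (40, 40)  # assumed window size, taken from the documented mask shape


def windowed_index_pairs(dim0, dim1, window=WINDOW):
    """Yield top-left (i, j) corners for which the window stays inside the array."""
    rows = max(dim0 - window[0] + 1, 0)
    cols = max(dim1 - window[1] + 1, 0)
    yield from product(range(rows), range(cols))


corners = list(windowed_index_pairs(41, 42))
```

An array smaller than the window yields no corners at all under these assumptions.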

class morpheus.core.helpers.OptionalFunc(warn_msg: str, init_func: function = None)[source]¶

Bases: object

Descriptor protocol for functions that don’t have to be overridden.

This is a helper class that is used to stub methods that don’t have to be overridden.

placeholder(*args)[source]¶: Placeholder function used as default in __init__
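To show how a descriptor can stub an optional method, here is a minimal sketch. OptionalFuncSketch is a hypothetical re-creation built only from the documented signature; it uses the standard warnings module where the library presumably uses its own logger, and it keeps its state on the descriptor, so the state is shared by all instances of the owning class.

```python
import warnings


class OptionalFuncSketch:
    """Hypothetical descriptor that serves a stub until a real function is set."""

    def __init__(self, warn_msg, init_func=None):
        self.warn_msg = warn_msg
        self.func = init_func if init_func is not None else self.placeholder
        self.is_default = True

    def placeholder(self, *args):
        # Default body: accept any arguments and do nothing.
        return None

    def __get__(self, obj, objtype=None):
        # Warn each time the stub (rather than an override) is fetched.
        if self.is_default:
            warnings.warn(self.warn_msg)
        return self.func

    def __set__(self, obj, func):
        # Assigning to the attribute installs the override and silences the warning.
        self.func = func
        self.is_default = False


class Demo:
    train_metrics = OptionalFuncSketch("train_metrics was not overridden")


demo = Demo()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    default_result = demo.train_metrics(1, 2)     # stub: warns, returns None
    demo.train_metrics = lambda *args: sum(args)  # install an override
    custom_result = demo.train_metrics(1, 2)      # override: no warning
```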

class morpheus.core.helpers.TFLogger[source]¶

Bases: object

A helper class to color the logging text in TensorFlow.

BLUE()¶

GREEN()¶

LIGHTRED()¶

RED()¶

YELLOW()¶

static debug(msg: str) → None[source]¶

Log at debug level in yellow.

Parameters:	msg (str) – The string to be logged
Returns:	None

static error(msg: str)[source]¶

Log at error level in red.

Parameters:	msg (str) – The string to be logged
Returns:	None

static info(msg: str) → None[source]¶

Log at info level in green.

Parameters:	msg (str) – The string to be logged
Returns:	None

static tensor_shape(tensor: tensorflow.python.framework.ops.Tensor, log_func=None, format_str='[{}]::{}') → None[source]¶

Log the shape of the tensor.

Parameters:

tensor (tf.Tensor) – a TensorFlow Tensor
log_func (func) – the logging function to use, default tf_logger.debug
format_str (str) – a string that will have .format called on it with two arguments, in the following order: tensor_name, tensor_shape

Returns:	None

static warn(msg: str) → None[source]¶

Log at warn level in lightred.

Parameters:	msg (str) – The string to be logged
Returns:	None
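The level-to-colour mapping and the default format_str can be tried out with the sketch below. It assumes ANSI escape codes are used for colouring, which the documentation does not confirm; colorize and shape_message are hypothetical helpers, not part of TFLogger.

```python
# ANSI escape sequences (an assumption; the library may colour text differently).
ANSI = {
    "yellow": "\033[33m",
    "red": "\033[31m",
    "green": "\033[32m",
    "lightred": "\033[91m",
}
RESET = "\033[0m"

# Colours per level, as stated in the method descriptions above.
LEVEL_COLORS = {"debug": "yellow", "info": "green", "warn": "lightred", "error": "red"}


def colorize(level, msg):
    """Wrap msg in the colour documented for the given log level."""
    return f"{ANSI[LEVEL_COLORS[level]]}{msg}{RESET}"


def shape_message(name, shape, format_str="[{}]::{}"):
    """Build the text tensor_shape would log: name first, then shape."""
    return format_str.format(name, shape)


line = colorize("debug", shape_message("conv1", [None, 40, 40, 16]))
```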

morpheus.core.model module¶

Contains model code for Morpheus.

class morpheus.core.model.Morpheus(hparams: morpheus.core.hparams.HParams, dataset: tensorflow.python.data.ops.dataset_ops.DatasetV1, data_format: str)[source]¶

Bases: morpheus.core.unet.Model

The main class for the Morpheus model.

This class takes a HParams object as an argument; it should contain the properties listed under Required HParams below.

Note: if you are using pretrained weights for inference only, you need to mock the dataset object and use the default hparams.

You can mock the dataset object by calling Morpheus.mock_dataset().

You can get the default HParams by calling Morpheus.inference_hparams().

An example call for inference only:

>>> dataset = Morpheus.mock_dataset()
>>> hparams = Morpheus.inference_hparams()
>>> data_format = 'channels_last'
>>> morph = Morpheus(hparams, dataset, data_format)

Required HParams:

inference (bool): true if using pretrained model
down_filters (list): number of filters for each down conv section
num_down_convs (int): number of conv ops per down conv section
up_filters (list): number of filters for each up conv section
num_up_convs (int): number of conv ops per up conv section
batch_norm (bool): use batch normalization
dropout (bool): use dropout

Optional HParams:

learning_rate (float): learning rate for training, required if inference is set to false
dropout_rate (float): the percentage of neurons to drop [0.0, 1.0]

Parameters:

hparams (morpheus.core.hparams.HParams) – Model Hyperparameters
dataset (tf.data.Dataset) – dataset to use for training
data_format (str) – channels_first or channels_last

Todo

Make optimizer a parameter

static eval_metrics(yh: tensorflow.python.framework.ops.Tensor, y: tensorflow.python.framework.ops.Tensor) → dict[source]¶

Function to generate metrics for evaluation during training.

Parameters:

yh (tf.Tensor) – network output [n,h,w,c]
y (tf.Tensor) – labels [n,h,w,c]

Returns:	A dictionary collection of (tf.Tensor, tf.Tensor), where the keys are the names of the metrics and the values are running metric pairs. More info on running accuracy metrics here: https://www.tensorflow.org/api_docs/python/tf/metrics/accuracy

static get_weights_dir() → str[source]¶: Returns the location of the weights for tf.Saver.

inference(inputs: tensorflow.python.framework.ops.Tensor) → tensorflow.python.framework.ops.Tensor[source]¶

Performs inference on input.

Parameters:	inputs (tf.Tensor) – input tensor with shape [batch_size, width, height, 5]
Returns:	A tf.Tensor of [batch_size, width, height, 5] representing the output of the model, with the softmax function applied.
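For reference, the softmax step applied to the five class scores of a single pixel can be sketched in plain Python. This only illustrates the function itself; the library applies it to whole tensors through TensorFlow, and the example scores are made up.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to one."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for the five morphology classes of one pixel.
probs = softmax([2.0, 1.0, 0.5, 0.0, -1.0])
```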

static inference_hparams() → morpheus.core.hparams.HParams[source]¶

Generates the default hyperparameters for inference.

Returns:	a morpheus.core.hparams.HParams object with the settings for inference

loss_func(logits: tensorflow.python.framework.ops.Tensor, labels: tensorflow.python.framework.ops.Tensor) → tensorflow.python.framework.ops.Tensor[source]¶

Defines the loss function used in training.

The loss function is defined by combining cross entropy loss calculated against all 5 classes and dice loss calculated against just the background class.

Parameters:

logits (tf.Tensor) – output tensor from the graph; should be [batch_size, width, height, 5]
labels (tf.Tensor) – labels used in training; should be [batch_size, width, height, 5]

Returns:	Tensor representing loss function.
Return type:	tf.Tensor
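The combination described above can be sketched on a handful of pixels. Several things here are assumptions: the two terms are added without weights, the Dice term uses the common 1 - 2|A∩B| / (|A| + |B|) form, background is taken to be class index 4 (its position in MORPHOLOGIES), and the functions take probabilities rather than logits. None of the function names come from the library.

```python
import math

BACKGROUND = 4  # assumed channel index, from the order of MORPHOLOGIES
EPS = 1e-12     # guards against log(0) and division by zero


def cross_entropy(probs, labels):
    """Mean cross entropy over pixels; inputs are shaped [pixels][classes]."""
    per_pixel = [
        -sum(y * math.log(p + EPS) for p, y in zip(ps, ys))
        for ps, ys in zip(probs, labels)
    ]
    return sum(per_pixel) / len(per_pixel)


def dice_loss(pred, target):
    """One minus the Dice coefficient of two flat lists with values in [0, 1]."""
    overlap = sum(p * t for p, t in zip(pred, target))
    return 1.0 - (2.0 * overlap + EPS) / (sum(pred) + sum(target) + EPS)


def combined_loss(probs, labels):
    """Cross entropy on all classes plus Dice loss on the background class."""
    bkg_pred = [ps[BACKGROUND] for ps in probs]
    bkg_true = [ys[BACKGROUND] for ys in labels]
    return cross_entropy(probs, labels) + dice_loss(bkg_pred, bkg_true)


labels = [[0, 0, 0, 0, 1], [1, 0, 0, 0, 0]]
perfect_loss = combined_loss([[0.0, 0.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 0.0, 0.0]], labels)
uniform_loss = combined_loss([[0.2] * 5, [0.2] * 5], labels)
```

A prediction that matches the labels exactly gives a loss of essentially zero, while a uniform prediction is penalised by both terms.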

static mock_dataset() → collections.namedtuple[source]¶

Generates a mock dataset for inference.

Returns:	A collections.namedtuple object that can be passed in place of a tf.data.Dataset for ‘dataset’ argument in the constructor

optimizer(loss: tensorflow.python.framework.ops.Tensor) → tensorflow.python.framework.ops.Tensor[source]¶

Overrides the optimizer func in morpheus.core.unet

Parameters:	loss (tf.Tensor) – The loss function tensor to pass to the optimizer
Returns:	the Tensor result of optimizer.minimize()
Return type:	tf.Tensor

test_metrics(logits: tensorflow.python.framework.ops.Tensor, labels: tensorflow.python.framework.ops.Tensor) -> ((typing.List[str], typing.List[tensorflow.python.framework.ops.Tensor]), typing.List[tensorflow.python.framework.ops.Tensor])[source]¶

Overrides the test_metrics func in morpheus.core.unet

Parameters:

logits (tf.Tensor) – the output logits from the model
labels (tf.Tensor) – the labels used during training

Returns:

Tuple(Tuple(List(str): names of metrics, List(tf.Tensor): tensors for metrics), List(tf.Tensor): tensors for updating running metrics)

train_metrics(logits: tensorflow.python.framework.ops.Tensor, labels: tensorflow.python.framework.ops.Tensor) -> ((typing.List[str], typing.List[tensorflow.python.framework.ops.Tensor]), typing.List[tensorflow.python.framework.ops.Tensor])[source]¶

Overrides the train_metrics func in morpheus.core.unet

Parameters:

logits (tf.Tensor) – the output logits from the model
labels (tf.Tensor) – the labels used during training

Returns:

Tuple(Tuple(List(str): names of metrics, List(tf.Tensor): tensors for metrics), List(tf.Tensor): tensors for updating running metrics)

morpheus.core.unet module¶

Implements variations of the U-Net architecture.

class morpheus.core.unet.Model(hparams: morpheus.core.hparams.HParams, dataset: tensorflow.python.data.ops.dataset_ops.DatasetV1, data_format='channels_last')[source]¶

Bases: morpheus.core.base_model.Model

Based on U-Net (https://arxiv.org/abs/1505.04597).

Parameters:

hparams (morpheus.core.hparams.HParams) – Hyperparameters to use
dataset (tf.data.Dataset) – dataset to use for training
data_format (str) – channels_first or channels_last

Required HParams:

down_filters (list): number of filters for each down conv section
num_down_convs (int): number of conv ops per down conv section
up_filters (list): number of filters for each up conv section
num_up_convs (int): number of conv ops per up conv section
batch_norm (bool): use batch normalization
dropout (bool): use dropout

Optional HParams:

dropout_rate (float): the percentage of neurons to drop [0.0, 1.0]

batch_norm(inputs: tensorflow.python.framework.ops.Tensor, is_training: bool)[source]¶

block_op(inputs: tensorflow.python.framework.ops.Tensor, num_filters: int, is_training: bool) → tensorflow.python.framework.ops.Tensor[source]¶

Basic unit of work batch_norm->conv->dropout.

Batch normalization and dropout are conditioned on the object’s HParams.

Parameters:

inputs (tf.Tensor) – input tensor
num_filters (int) – number of filters for the conv operation
is_training (bool) – indicates if the model is training

Returns:	the output tensor from the block operation
Return type:	tf.Tensor

conv(inputs, num_filters, padding='same', strides=1, activation=<function relu>, name='conv', kernel_size=3)[source]¶

down_sample(inputs)[source]¶

Reduces the input’s width and height by half.

Parameters:	inputs (tf.Tensor) – input tensor
Returns:	input tensor downsampled

dropout(inputs: tensorflow.python.framework.ops.Tensor, is_training: bool)[source]¶

model_fn(inputs: tensorflow.python.framework.ops.Tensor, is_training: bool) → tensorflow.python.framework.ops.Tensor[source]¶

Defines U-Net graph using HParams.

Parameters:

inputs (tf.Tensor) – The input tensor to the graph
is_training (bool) – indicates if the model is in the training phase

Returns:	the output tensor from the graph
Return type:	tf.Tensor

TODO: add input shape check for incompatible tensor shapes

up_sample(inputs)[source]¶

Doubles the input’s width and height.

Transposes the input if necessary for tf.image.resize_images

Parameters:	inputs (tf.Tensor) – input tensor
Returns:	input tensor upsampled
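To see how down_filters and up_filters interact with the halving and doubling steps, the sketch below traces tensor shapes through a U-Net-style stack. It assumes 'same' padding (so convolutions keep height and width), a down-sample after every down section, an up-sample before every up section, and it ignores skip connections; trace_shapes is a hypothetical helper and the filter counts are made up.

```python
def trace_shapes(height, width, down_filters, up_filters):
    """Return (height, width, channels) after each conv section."""
    shapes = []
    for filters in down_filters:
        shapes.append((height, width, filters))  # conv section keeps the size
        height, width = height // 2, width // 2  # down_sample halves both
    for filters in up_filters:
        height, width = height * 2, width * 2    # up_sample doubles both
        shapes.append((height, width, filters))
    return shapes


shapes = trace_shapes(40, 40, down_filters=[16, 32], up_filters=[32, 16])
```

With equally long filter lists, the output regains the input's height and width, which is what a per-pixel classifier needs.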

Module contents¶