morpheus.core package

Submodules

morpheus.core.base_model module

A base class for building neural network models in TensorFlow.

class morpheus.core.base_model.Model(dataset: tensorflow.python.data.ops.dataset_ops.DatasetV2, data_format: str = 'channels_last')[source]

Bases: object

Base class for models.

dataset

Dataset Object for training

Type:tf.data.Dataset
is_training

indicates if the model is training

Type:bool
data_format

‘channels_first’ or ‘channels_last’

Type:str
Required methods to override:
  • model_fn: the graph function
Optional methods to override:
  • train_metrics: to add metrics during training
  • test_metrics: to add metrics during testing, can be the same as train_metrics
  • optimizer: updates parameters based on a loss tensor
  • loss_func: defines a loss value given x and y tensors
  • inference: by default applies softmax to the tensor from model_fn
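A minimal sketch (not from the source) of a subclass that overrides model_fn, loss_func, and optimizer. The single conv layer, the loss, and the optimizer choice are illustrative, my_dataset stands in for a real tf.data.Dataset, and the wiring the base class performs between the dataset, loss_func, and optimizer is assumed rather than documented here:

>>> import tensorflow as tf
>>> from morpheus.core.base_model import Model
>>> class TinyModel(Model):
...     def model_fn(self, inputs: tf.Tensor, is_training: bool) -> tf.Tensor:
...         # a single conv layer standing in for a real architecture
...         return tf.layers.conv2d(inputs, filters=5, kernel_size=3, padding='same')
...     def loss_func(self, logits, labels):
...         return tf.reduce_mean(
...             tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels, logits=logits))
...     def optimizer(self, loss):
...         return tf.train.AdamOptimizer(1e-4).minimize(loss)
>>> model = TinyModel(my_dataset, data_format='channels_last')
>>> optimize_op, metrics = model.train()  # builds the training routine tensors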
build_graph(inputs: tensorflow.python.framework.ops.Tensor, is_training: bool) → tensorflow.python.framework.ops.Tensor[source]

Base function that returns model_fn evaluated on the input tensor. Don’t Override!

Parameters:
  • inputs (tf.Tensor) – The tensor to be processed, i.e. a placeholder
  • is_training (bool) – whether or not the model is training; useful for things like batch normalization or dropout
Returns:

returns the tensor that represents the result of model_fn evaluated on the input tensor

Raises:
NotImplementedError if Model.model_fn() is not overridden
inference(*args)

Placeholder function used as default in __init__

loss_func(*args)

Placeholder function used as default in __init__

model_fn(inputs: tensorflow.python.framework.ops.Tensor, is_training: bool) → function[source]

Function that defines model. Needs to be Overridden!

Parameters:
  • inputs (tf.Tensor) – the input tensor
  • is_training (bool) – boolean to indicate if in training phase
Returns:

Should return a function that takes two inputs, a tf.Tensor and a bool

Raises:

NotImplementedError if not overridden

optimizer(*args)

Placeholder function used as default in __init__

test() -> (<class 'tensorflow.python.framework.ops.Tensor'>, <class 'tensorflow.python.framework.ops.Tensor'>)[source]

Builds the testing routine tensors. Don’t Override!

Returns:
the results of self.build_graph and
self.test_metrics, respectively
Return type:(logits, metrics)
Raises:
NotImplementedError if Model.model_fn() is not overridden
test_metrics(*args)

Placeholder function used as default in __init__

train() -> (<class 'tensorflow.python.framework.ops.Tensor'>, <class 'tensorflow.python.framework.ops.Tensor'>)[source]

Builds the training routine tensors. Don’t Override!

Returns:
the results of self.optimizer and
self.train_metrics, respectively
Return type:(optimize, metrics)
Raises:
NotImplementedError if Model.model_fn() is not overridden
train_metrics(*args)

Placeholder function used as default in __init__

morpheus.core.helpers module

Helper classes used in Morpheus.

class morpheus.core.helpers.FitsHelper[source]

Bases: object

A class that handles basic FITS file functions.

static create_file(file_name: str, data_shape: tuple, dtype) → None[source]

Creates a fits file without loading it into memory.

This is a helper method to create large FITS files without loading an array into memory. The method follows the direction given at: http://docs.astropy.org/en/stable/generated/examples/io/skip_create-large-fits.html

Parameters:
  • file_name (str) – the complete path to the file to be created.
  • data_shape (tuple) – a tuple describing the shape of the file to be created
  • dtype (numpy datatype) – the numpy datatype used in the array
Raises:

ValueError – if dtype is not one of: np.uint8, np.int16, np.int32, np.float32, np.float64

TODO: Figure out why this occasionally throws a warning about size
when files created by it are opened
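
A hedged usage sketch: create a large FITS file on disk without holding the full array in memory. The file name and shape are illustrative:

>>> import numpy as np
>>> from morpheus.core.helpers import FitsHelper
>>> FitsHelper.create_file('large_output.fits', data_shape=(20000, 20000), dtype=np.float32)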
static create_mean_var_files(shape: List[int], out_dir: str) -> (typing.List[astropy.io.fits.hdu.hdulist.HDUList], typing.List[numpy.ndarray])[source]

Creates the output fits files for the mean/variance morpheus output.

Parameters:
  • shape (List[int]) – The shape to use when making the FITS files
  • out_dir (str) – the directory to place the files in. Will make it if it doesn’t already exist.
Returns:

the HDULs for the created files and a dictionary where the key is the
data descriptor and the value is the memmapped numpy data array

Return type:

(List[fits.HDUList], Dict[str, np.ndarray])

static create_n_file(shape: List[int], out_dir: str) -> (typing.List[astropy.io.fits.hdu.hdulist.HDUList], typing.List[numpy.ndarray])[source]

Creates the output FITS file for the morpheus ‘n’ count output.

Parameters:
  • shape (List[int]) – The shape to use when making the FITS files
  • out_dir (str) – the directory to place the files in. Will make it if it doesn’t already exist.
Returns:

the HDULs for the created files and a dictionary where the key is the
data descriptor and the value is the memmapped numpy data array

Return type:

(List[fits.HDUList], Dict[str, np.ndarray])

static create_rank_vote_files(shape: List[int], out_dir: str) -> (typing.List[astropy.io.fits.hdu.hdulist.HDUList], typing.List[numpy.ndarray])[source]

Creates the output fits files for the rank vote morpheus output.

Parameters:
  • shape (List[int]) – The shape to use when making the FITS files
  • out_dir (str) – the directory to place the files in. Will make it if it doesn’t already exist.
Returns:

the HDULs for the created files and a dictionary where the key is the
data descriptor and the value is the memmapped numpy data array

Return type:

(List[fits.HDUList], Dict[str, np.ndarray])

static get_files(file_names: List[str], mode: str = 'readonly') -> (typing.List[astropy.io.fits.hdu.hdulist.HDUList], typing.List[numpy.ndarray])[source]

Gets the HDULs and data handles for all the files in file_names.

This is a convenience function for opening multiple FITS files using memmap.

Parameters:
  • file_names (List[str]) – a list of file names including paths to FITS files
  • mode (str) – the mode to pass to fits.open
Returns:

A tuple of the HDULs for the FITS files and the memmapped numpy data handles that go along with them
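
A hedged sketch of opening several FITS files memmapped; the file names are illustrative and the return order follows the signature above:

>>> from morpheus.core.helpers import FitsHelper
>>> hduls, arrays = FitsHelper.get_files(['h.fits', 'j.fits', 'v.fits', 'z.fits'])
>>> for arr in arrays:
...     print(arr.shape, arr.dtype)  # memmapped handles; data is only read when sliced
>>> for hdul in hduls:
...     hdul.close()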

class morpheus.core.helpers.LabelHelper[source]

Bases: object

Class to help with label updates.

Class Variables:

MORPHOLOGIES (List[str]): ['spheroid', 'disk', 'irregular', 'point_source', 'background']

UPDATE_MASK (np.ndarray): the (40, 40) int16 array that indicates which
parts of the output of the model to include in the calculations.
Default: the innermost (30, 30) region.

UPDATE_MASK_N (np.ndarray): the (40, 40) int16 array that indicates which
parts of the count ‘n’ to update. Default: all (40, 40).
static finalize_rank_vote(data: dict) → None[source]

Finalize the rank vote by dividing by n.

Parameters:data (dict) – a dict of numpy arrays containing the data

TODO: Refactor to accommodate large files

Returns:None
static finalize_variance(n: numpy.ndarray, curr_sn: numpy.ndarray, final_map: List[Tuple[int, int]])[source]

The second of two methods used to calculate the variance online.

This method calculates the final variance value using equation 25 from

http://people.ds.cam.ac.uk/fanf2/hermes/doc/antiforgery/stats.pdf

but without performing the square root.

Parameters:
  • n (np.ndarray) – the current number of values included in the calculation
  • curr_sn (np.ndarray) – the current $S_n$ values
  • final_map (List[Tuple[int, int]]) – a list of indices to calculate the final variance for
Returns:

A np.ndarray with the current $S_n$ values and variance values for all indices in final_map
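
For reference, a hedged sketch of the finalization this docstring describes, i.e. the standard Welford-style estimate S_n / (n - 1) from eq. 25 of the cited note, without the square root; the per-index loop is an assumption about how final_map is applied:

>>> import numpy as np
>>> def finalize_variance_sketch(n, curr_sn, final_map):
...     out = curr_sn.copy()
...     for y, x in final_map:
...         out[y, x] = curr_sn[y, x] / (n[y, x] - 1)  # sample variance, no sqrt
...     return out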

static get_final_map(shape: List[int], y: int, x: int)[source]

Creates a pixel mapping that flags pixels that won’t be updated again.

Parameters:
  • shape (List[int]) – the shape of the array that x and y are indexing
  • y (int) – the current y index
  • x (int) – the current x index
Returns:

A list of relative indices that won’t be updated again.

static index_generator(dim0: int, dim1: int) → Iterable[Tuple[int, int]][source]

Creates a generator that returns indices to iterate over a 2d array.

Parameters:
  • dim0 (int) – The upper limit to iterate up to for the first dimension
  • dim1 (int) – The upper limit to iterate up to for the second dimension
Returns:

A generator that yields indices to iterate over a 2d array with shape [dim0, dim1]
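
A hedged sketch of what the generator is assumed to yield: row-major (index0, index1) pairs, as itertools.product would produce:

>>> from itertools import product
>>> list(product(range(2), range(3)))  # dim0=2, dim1=3
[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]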

static iterative_mean(n: numpy.ndarray, curr_mean: numpy.ndarray, x_n: numpy.ndarray, update_mask: numpy.ndarray)[source]

Calculates the mean of a collection in an online fashion.

The values are calculated using the following equation: http://people.ds.cam.ac.uk/fanf2/hermes/doc/antiforgery/stats.pdf, eq. 4

Parameters:
  • n (np.ndarray) – a 2d array containing the number of terms in the mean so far
  • curr_mean (np.ndarray) – the currently calculated mean
  • x_n (np.ndarray) – the new values to add to the mean
  • update_mask (np.ndarray) – a 2d boolean array indicating which indices in the array should be updated.
Returns:

An array with the same shape as the curr_mean with the newly calculated mean values.
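
A hedged sketch of the update this performs, i.e. eq. 4 of the cited note applied only where update_mask is set; treating n as already including the incoming value is an assumption:

>>> import numpy as np
>>> def iterative_mean_sketch(n, curr_mean, x_n, update_mask):
...     mask = update_mask.astype(bool)
...     new_mean = curr_mean.copy()
...     new_mean[mask] = curr_mean[mask] + (x_n[mask] - curr_mean[mask]) / n[mask]
...     return new_mean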

static iterative_rank_vote(x_n: numpy.ndarray, prev_count: numpy.ndarray, update_mask: numpy.ndarray)[source]

Calculates the updated values of the rank vote labels for a single class.

Parameters:
  • x_n (np.ndarray) – the current rank vote values for the class being updated
  • prev_count (np.ndarray) – the array containing the running totals, should be shaped as [labels, height, width]
  • update_mask (np.ndarray) – a boolean array indicating which values to update
Returns:

A numpy array containing the updated count values

static iterative_variance(prev_sn: numpy.ndarray, x_n: numpy.ndarray, prev_mean: numpy.ndarray, curr_mean: numpy.ndarray, update_mask: numpy.ndarray)[source]

The first of two methods used to calculate the variance online.

This method specifically calculates the $S_n$ value as indicated in equation 24 from:

http://people.ds.cam.ac.uk/fanf2/hermes/doc/antiforgery/stats.pdf

Parameters:
  • prev_sn (np.ndarray) – the $S_n$ value from the previous step
  • x_n (np.ndarray) – the current incoming values
  • prev_mean (np.ndarray) – the mean that was previously calculated
  • curr_mean (np.ndarray) – the mean, including the current values
  • update_mask (np.ndarray) – a boolean mask indicating which values to update
Returns:

An np.ndarray containing the current value for $S_n$
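
A hedged sketch of the $S_n$ update described here (eq. 24 of the cited note); restricting the update to update_mask is an assumption about how the mask is applied:

>>> import numpy as np
>>> def iterative_variance_sketch(prev_sn, x_n, prev_mean, curr_mean, update_mask):
...     mask = update_mask.astype(bool)
...     curr_sn = prev_sn.copy()
...     curr_sn[mask] = prev_sn[mask] + (x_n[mask] - prev_mean[mask]) * (x_n[mask] - curr_mean[mask])
...     return curr_sn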

static make_mean_var_arrays(shape: Tuple[int, int]) → dict[source]

Create output arrays for use in in-memory classification.

Parameters:shape (Tuple[int]) – the 2d (width, height) shape of the arrays to create
Returns:
A dictionary whose keys are the array descriptions and whose values are the arrays themselves
static make_n_array(shape: Tuple[int, int]) → dict[source]

Create an output array for use in in-memory classification.

Parameters:shape (Tuple[int]) – the 2d (width, height) shape of the array to create
Returns:
A dictionary whose key is the array description and whose value is the array itself
static make_rank_vote_arrays(shape: Tuple[int, int]) → dict[source]

Create output arrays for use in in-memory classification.

Parameters:shape (Tuple[int]) – the 2d (width, height) shape of the arrays to create
Returns:
A dictionary whose keys are the array descriptions and whose values are the arrays themselves
static update_labels(data: dict, labels: numpy.ndarray, batch_idx: List[Tuple[int, int]], out_type: str) → None[source]

Updates the running total label values with the new output values.

Parameters:
  • data (dict) – a dict of numpy arrays containing the data
  • labels (np.ndarray) – the new output from the model
  • batch_idx (List[Tuple[int, int]]) – a list of indices to update
  • out_type (str) – indicates which type of output to update; must be one of [‘mean_var’, ‘rank_vote’, ‘both’]
Returns:

None

static update_mean_var(data: dict, labels: numpy.ndarray, batch_idx: List[Tuple[int, int]])[source]

Updates the mean and variance outputs with the new model values.

Parameters:
  • data (dict) – a dict of numpy arrays containing the data
  • labels (np.ndarray) – the new output from the model
  • batch_idx (List[Tuple[int, int]]) – a list of indices to update
Returns:

None

static update_ns(data: dict, batch_idx: List[Tuple[int, int]], inc: int = 1) → None[source]

Updates the n values by inc.

Parameters:
  • data (dict) – a dictionary of numpy arrays containing the data
  • batch_idx (List[Tuple[int, int]]) – a list of indices to update
  • inc (int) – the number to increment n by. Default=1
Returns:
None
static update_rank_vote(data: dict, labels: numpy.ndarray, batch_idx: List[Tuple[int, int]]) → None[source]

Updates the rank vote values with the new output.

Parameters:
  • data (dict) – a dict of numpy arrays containing the data
  • labels (np.ndarray) – the new output from the model
  • batch_idx (List[Tuple[int, int]]) – a list of indices to update
Returns:

None

static windowed_index_generator(dim0: int, dim1: int) → Iterable[Tuple[int, int]][source]

Creates a generator that returns window-limited indices over a 2d array.

The generator returned by this method yields the indices for a sliding window of size UPDATE_MASK_N.shape over a 2d array with the size (dim0, dim1).

Parameters:
  • dim0 (int) – The upper limit to iterate up to for the first dimension
  • dim1 (int) – The upper limit to iterate up to for the second dimension
Returns:

A generator that yields indices to iterate over a 2d array with shape [dim0, dim1]
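
A hedged end-to-end sketch of how these helpers are assumed to fit together for in-memory rank-vote classification; classify_window is a hypothetical stand-in for running the model on one (40, 40) window, and the exact keys of the data dict come from the make_* helpers above:

>>> from morpheus.core.helpers import LabelHelper
>>> shape = (200, 200)
>>> data = {**LabelHelper.make_rank_vote_arrays(shape), **LabelHelper.make_n_array(shape)}
>>> for y, x in LabelHelper.windowed_index_generator(*shape):
...     window_output = classify_window(y, x)  # hypothetical model call
...     batch_idx = [(y, x)]
...     LabelHelper.update_labels(data, window_output, batch_idx, out_type='rank_vote')
...     LabelHelper.update_ns(data, batch_idx)
>>> LabelHelper.finalize_rank_vote(data)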

class morpheus.core.helpers.OptionalFunc(warn_msg: str, init_func: function = None)[source]

Bases: object

Descriptor protocol for functions that don’t have to be overridden.

This is a helper class that is used to stub methods that don’t have to be overridden.

placeholder(*args)[source]

Placeholder function used as default in __init__

class morpheus.core.helpers.TFLogger[source]

Bases: object

A helper class to color the logging text in TensorFlow.

BLUE()
GREEN()
LIGHTRED()
RED()
YELLOW()
static debug(msg: str) → None[source]

Log at debug level in yellow.

Parameters:msg (str) – The string to be logged
Returns:None
static error(msg: str)[source]

Log at error level in red.

Parameters:msg (str) – The string to be logged
Returns:None
static info(msg: str) → None[source]

Log at info level in green.

Parameters:msg (str) – The string to be logged
Returns:None
static tensor_shape(tensor: tensorflow.python.framework.ops.Tensor, log_func=None, format_str='[{}]::{}') → None[source]

Log the the shape of tensor ‘t’.

Parameters:
  • tensor (tf.Tensor) – A tensorflow Tensor
  • log_func (func) – the logging function to use, default tf_logger.debug
  • format_str (str) – a string that will have .format called on it with two arguments, in the following order: tensor_name, tensor_shape
Returns:

None

static warn(msg: str) → None[source]

Log at warn level in lightred.

Parameters:msg (str) – The string to be logged
Returns:None
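
A hedged usage sketch of the logging helpers; the placeholder shape is arbitrary and the exact rendered output format is not shown:

>>> import tensorflow as tf
>>> from morpheus.core.helpers import TFLogger
>>> TFLogger.info('building graph')
>>> x = tf.placeholder(tf.float32, shape=[None, 40, 40, 4], name='inputs')
>>> TFLogger.tensor_shape(x)  # logs the name and shape of x at debug level
>>> TFLogger.warn('finished building graph')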

morpheus.core.model module

Contains model code for Morpheus.

class morpheus.core.model.Morpheus(hparams: morpheus.core.hparams.HParams, dataset: tensorflow.python.data.ops.dataset_ops.DatasetV1, data_format: str)[source]

Bases: morpheus.core.unet.Model

The main class for the Morpheus model.

This class takes an HParams object as an argument; it should contain the properties listed under Required HParams below.

Note: if you are using pretrained weights for inference only, you need to mock the dataset object and use the default hparams.

You can mock the dataset object by calling Morpheus.mock_dataset().

You can get the default HParams by calling Morpheus.inference_hparams().

An example call for inference only:

>>> dataset = Morpheus.mock_dataset()
>>> hparams = Morpheus.inference_hparams()
>>> data_format = 'channels_last'
>>> morph = Morpheus(hparams, dataset, data_format)
Required HParams:
  • inference (bool): true if using pretrained model
  • down_filters (list): number of filters for each down conv section
  • num_down_convs (int): number of conv ops per down conv section
  • up_filters (list): number of filters for each up conv section
  • num_up_convs (int): number of conv ops per up conv section
  • batch_norm (bool): use batch normalization
  • dropout (bool): use dropout
Optional HParams:
  • learning_rate (float): learning rate for training, required if inference is set to false
  • dropout_rate (float): the percentage of neurons to drop [0.0, 1.0]
Parameters:
  • hparams (morpheus.core.hparams.HParams) – Model Hyperparameters
  • dataset (tf.data.Dataset) – dataset to use for training
  • data_format (str) – ‘channels_first’ or ‘channels_last’

Todo

  • Make optimizer a parameter
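
Building on the inference example above, a hedged sketch of restoring the pretrained weights and running the softmaxed output. The placeholder shape and band count are illustrative, the assumption that inference() is applied to the build_graph output follows the base-class notes, and the weights are assumed to be a standard tf.train.Saver checkpoint under get_weights_dir():

>>> import numpy as np
>>> import tensorflow as tf
>>> n_bands = 4  # hypothetical band count
>>> x = tf.placeholder(tf.float32, shape=[None, 40, 40, n_bands])
>>> outputs = morph.inference(morph.build_graph(x, False))
>>> saver = tf.train.Saver()
>>> with tf.Session() as sess:
...     saver.restore(sess, tf.train.latest_checkpoint(Morpheus.get_weights_dir()))
...     probs = sess.run(outputs, feed_dict={x: np.zeros([1, 40, 40, n_bands], np.float32)})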
static eval_metrics(yh: tensorflow.python.framework.ops.Tensor, y: tensorflow.python.framework.ops.Tensor) → dict[source]

Function to generate metrics for evaluation during training.

Parameters:
  • yh (tf.Tensor) – network output [n,h,w,c]
  • y (tf.Tensor) – labels [n,h,w,c]
Returns:

A dictionary of (tf.Tensor, tf.Tensor) pairs, where the keys are the names of the metrics and the values are the running metric pairs. More info on running accuracy metrics here: https://www.tensorflow.org/api_docs/python/tf/metrics/accuracy

static get_weights_dir() → str[source]

Returns the location of the weights for tf.Saver.

inference(inputs: tensorflow.python.framework.ops.Tensor) → tensorflow.python.framework.ops.Tensor[source]

Performs inference on input.

Parameters:inputs (tf.Tensor) – input tensor with shape [batch_size, width, height, 5]
Returns:A tf.Tensor of [batch_size, width, height, 5] representing the output of the model, with the softmax function applied.
static inference_hparams() → morpheus.core.hparams.HParams[source]

Generates the default HParams for inference.

Returns:a morpheus.core.hparams.HParams object with the settings for inference
loss_func(logits: tensorflow.python.framework.ops.Tensor, labels: tensorflow.python.framework.ops.Tensor) → tensorflow.python.framework.ops.Tensor[source]

Defines the loss function used in training.

The loss function is defined by combining cross entropy loss calculated against all 5 classes and dice loss calculated against just the background class.

Parameters:
  • logits (tf.Tensor) – output tensor from the graph; should be [batch_size, width, height, 5]
  • labels (tf.Tensor) – labels used in training; should be [batch_size, width, height, 5]
Returns:

Tensor representing loss function.

Return type:

tf.Tensor
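
A hedged sketch (not the library's exact implementation) of the combination the docstring describes: softmax cross entropy over all 5 classes plus a dice term on the background channel. The background index, the epsilon, and the equal weighting of the two terms are assumptions:

>>> import tensorflow as tf
>>> def combined_loss_sketch(logits, labels, bg_index=4, eps=1e-6):
...     xent = tf.reduce_mean(
...         tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels, logits=logits))
...     probs_bg = tf.nn.softmax(logits)[..., bg_index]
...     labels_bg = labels[..., bg_index]
...     intersection = tf.reduce_sum(probs_bg * labels_bg)
...     dice = (2.0 * intersection + eps) / (tf.reduce_sum(probs_bg) + tf.reduce_sum(labels_bg) + eps)
...     return xent + (1.0 - dice)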

static mock_dataset() → collections.namedtuple[source]

Generates a mock dataset for inference.

Returns:A collections.namedtuple object that can be passed in place of a tf.data.Dataset for the ‘dataset’ argument in the constructor
optimizer(loss: tensorflow.python.framework.ops.Tensor) → tensorflow.python.framework.ops.Tensor[source]

Overrides the optimizer func in morpheus.core.unet

Parameters:loss (tf.Tensor) – The loss function tensor to pass to the optimizer
Returns:the Tensor result of optimizer.minimize()
Return type:tf.Tensor
test_metrics(logits: tensorflow.python.framework.ops.Tensor, labels: tensorflow.python.framework.ops.Tensor) -> ((typing.List[str], typing.List[tensorflow.python.framework.ops.Tensor]), typing.List[tensorflow.python.framework.ops.Tensor])[source]

Overrides the test_metrics func in morpheus.core.unet

Parameters:
  • logits (tf.Tensor) – the output logits from the model
  • labels (tf.Tensor) – the labels used during training
Returns:

Tuple(
    Tuple(List(str): names of metrics, List(tf.Tensor): tensors for metrics),
    List(tf.Tensor): tensors for updating running metrics
)

train_metrics(logits: tensorflow.python.framework.ops.Tensor, labels: tensorflow.python.framework.ops.Tensor) -> ((typing.List[str], typing.List[tensorflow.python.framework.ops.Tensor]), typing.List[tensorflow.python.framework.ops.Tensor])[source]

Overrides the train_metrics func in morpheus.core.unet

Parameters:
  • logits (tf.Tensor) – the output logits from the model
  • labels (tf.Tensor) – the labels used during training
Returns:

Tuple(
    Tuple(List(str): names of metrics, List(tf.Tensor): tensors for metrics),
    List(tf.Tensor): tensors for updating running metrics
)

morpheus.core.unet module

Implements variations of the U-Net architecture.

class morpheus.core.unet.Model(hparams: morpheus.core.hparams.HParams, dataset: tensorflow.python.data.ops.dataset_ops.DatasetV1, data_format='channels_last')[source]

Bases: morpheus.core.base_model.Model

Based on U-Net (https://arxiv.org/abs/1505.04597).

Parameters:
  • hparams (morpheus.core.hparams.HParams) – Hyperparamters to use
  • dataset (tf.data.Dataset) – dataset to use for training
  • data_format (str) – channels_first or channels_last
Required HParams:
  • down_filters (list): number of filters for each down conv section
  • num_down_convs (int): number of conv ops per down conv section
  • up_filters (list): number of filters for each up conv section
  • num_up_convs (int): number of conv ops per up conv section
  • batch_norm (bool): use batch normalization
  • dropout (bool): use dropout
Optional HParams:
  • dropout_rate (float): the percentage of neurons to drop [0.0, 1.0]
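
A hedged sketch of assembling these hyperparameters and constructing the model, assuming morpheus.core.hparams.HParams accepts keyword arguments in the style of tf.contrib.training.HParams; the values and training_dataset are illustrative only:

>>> from morpheus.core.hparams import HParams
>>> from morpheus.core.unet import Model
>>> hparams = HParams(down_filters=[16, 32, 64], num_down_convs=2,
...                   up_filters=[64, 32, 16], num_up_convs=2,
...                   batch_norm=True, dropout=True, dropout_rate=0.2)
>>> unet = Model(hparams, training_dataset, data_format='channels_last')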
batch_norm(inputs: tensorflow.python.framework.ops.Tensor, is_training: bool)[source]
block_op(inputs: tensorflow.python.framework.ops.Tensor, num_filters: int, is_training: bool) → tensorflow.python.framework.ops.Tensor[source]

Basic unit of work: batch_norm -> conv -> dropout.

Batch normalization and dropout are conditioned on the object’s HParams.

Parameters:
  • inputs (tf.Tensor) – input tensor
  • num_filters (int) – number of filters for the conv operation
  • is_training (bool) – indicates if the model is training
Returns:

the output tensor from the block operation

Return type:

tf.Tensor

conv(inputs, num_filters, padding='same', strides=1, activation=<function relu>, name='conv', kernel_size=3)[source]
down_sample(inputs)[source]

Reduces the input’s width and height by half.

Parameters:inputs (tf.Tensor) – input tensor
Returns:input tensor downsampled
dropout(inputs: tensorflow.python.framework.ops.Tensor, is_training: bool)[source]
model_fn(inputs: tensorflow.python.framework.ops.Tensor, is_training: bool) → tensorflow.python.framework.ops.Tensor[source]

Defines U-Net graph using HParams.

Parameters:
  • inputs (tf.Tensor) – The input tensor to the graph
  • is_training (bool) – indicates if the model is in the training phase
Returns:

the output tensor from the graph

Return type:

tf.Tensor

TODO: add input shape check for incompatible tensor shapes

up_sample(inputs)[source]

Doubles the input’s width and height.

Transposes the input if necessary for tf.image.resize_images

Parameters:inputs (tf.Tensor) – input tensor
Returns:input tensor upsampled

Module contents