morpheus.core package¶
Submodules¶
morpheus.core.base_model module¶
A base class for building neural network models in TensorFlow.
-
class
morpheus.core.base_model.
Model
(dataset: tensorflow.python.data.ops.dataset_ops.DatasetV2, data_format: str = 'channels_last')[source]¶ Bases:
object
Base class for models.
-
dataset
¶ Dataset Object for training
Type: tf.data.Dataset
-
is_training
¶ indicates if the model is training
Type: bool
-
data_format
¶ ‘channels_first’ or ‘channels_last’
Type: str
- Required methods to override:
- model_fn: the graph function
- Optional methods to override:
- train_metrics: to add metrics during training
- test_metrics: to add metrics during testing, can be the same as train_metrics
- optimizer: updates params based on a loss tensor
- loss_func: defines a loss value given an x and a y tensor
- inference: by default applies softmax to the tensor from model_fn
-
build_graph
(inputs: tensorflow.python.framework.ops.Tensor, is_training: bool) → tensorflow.python.framework.ops.Tensor[source]¶ Base function that returns model_fn evaluated on x. Don’t Override!
Parameters: - inputs (tf.Tensor) – The tensor to be processed, i.e. a placeholder
- is_training (bool) – whether or not the model is training; useful for things like batch normalization or dropout
Returns: returns the tensor that represents the result of model_fn evaluated on the input tensor
Raises: NotImplementedError if Model.model_fn() is not overridden
-
inference
(*args)¶ Placeholder function used as default in __init__
-
loss_func
(*args)¶ Placeholder function used as default in __init__
-
model_fn
(inputs: tensorflow.python.framework.ops.Tensor, is_training: bool) → function[source]¶ Function that defines model. Needs to be Overridden!
Parameters: - inputs (tf.Tensor) – the input tensor
- is_training (bool) – boolean to indicate if in training phase
Returns: Should return a function that takes two inputs tf.Tensor and bool
Raises: NotImplementedError if not overridden
-
optimizer
(*args)¶ Placeholder function used as default in __init__
-
test
() -> (<class 'tensorflow.python.framework.ops.Tensor'>, <class 'tensorflow.python.framework.ops.Tensor'>)[source]¶ Builds the testing routine tensors. Don’t Override!
Returns: - the result of the self.build_graph and
- self.test_metrics respectively
Return type: (logits, metrics)
Raises: NotImplementedError if Model.model_fn() is not overridden
-
test_metrics
(*args)¶ Placeholder function used as default in __init__
-
train
() -> (<class 'tensorflow.python.framework.ops.Tensor'>, <class 'tensorflow.python.framework.ops.Tensor'>)[source]¶ Builds the training routine tensors. Don’t Override!
Returns: - the result of self.optimizer and
- self.train_metrics respectively
Return type: (optimize, metrics)
Raises: NotImplementedError if Model.model_fn() is not overridden
-
train_metrics
(*args)¶ Placeholder function used as default in __init__
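As a rough illustration of the override contract described for this class, the sketch below mirrors it in plain Python with no TensorFlow dependency. `SketchModel` and `DoubleModel` are hypothetical names introduced only for this example, and the real `Model` works on tensors, not lists.

```python
class SketchModel:
    """Hypothetical stand-in for the base class; not the Morpheus implementation."""

    def model_fn(self, inputs, is_training):
        # Subclasses are required to override this method.
        raise NotImplementedError("model_fn must be overridden")

    def build_graph(self, inputs, is_training):
        # Not meant to be overridden: it only delegates to model_fn.
        return self.model_fn(inputs, is_training)


class DoubleModel(SketchModel):
    def model_fn(self, inputs, is_training):
        # Toy "graph": double every input value.
        return [2 * value for value in inputs]


outputs = DoubleModel().build_graph([1, 2, 3], is_training=False)
```

Calling `build_graph` on the base class itself raises `NotImplementedError`, which is the behaviour the entries above document.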
-
morpheus.core.helpers module¶
Helper classes used in Morpheus.
-
class
morpheus.core.helpers.
FitsHelper
[source]¶ Bases:
object
A class that handles basic FITS file functions.
-
static
create_file
(file_name: str, data_shape: tuple, dtype) → None[source]¶ Creates a fits file without loading it into memory.
This is a helper method to create large FITS files without loading an array into memory. The method follows the direction given at: http://docs.astropy.org/en/stable/generated/examples/io/skip_create-large-fits.html
Parameters: - file_name (str) – the complete path to the file to be created.
- data_shape (tuple) – a tuple describing the shape of the file to be created
- dtype (numpy datatype) – the numpy datatype used in the array
Raises: ValueError if dtype is not one of: np.uint8, np.int16, np.int32, np.float32, np.float64
TODO: Figure out why this occasionally throws a warning about size when files created by it are opened
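The part of the linked recipe that avoids allocating the array is, in essence, extending the file to its final size by seeking to the last byte and writing a single zero. The sketch below shows only that step using the standard library. `reserve_file` and its arguments are hypothetical, and no FITS header is written, so this is not a substitute for `create_file`.

```python
import os
import tempfile


def reserve_file(file_name, data_shape, bytes_per_value):
    """Create a file large enough for data_shape without building an array."""
    n_values = 1
    for dim in data_shape:
        n_values *= dim
    total_bytes = n_values * bytes_per_value
    with open(file_name, "wb") as handle:
        if total_bytes > 0:
            # Seek to the final byte and write one zero; the OS fills the gap.
            handle.seek(total_bytes - 1)
            handle.write(b"\0")
    return total_bytes


path = os.path.join(tempfile.mkdtemp(), "reserved.bin")
size = reserve_file(path, (100, 200), bytes_per_value=4)  # 4 bytes, e.g. float32
```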
-
static
create_mean_var_files
(shape: List[int], out_dir: str) -> (typing.List[astropy.io.fits.hdu.hdulist.HDUList], typing.List[numpy.ndarray])[source]¶ Creates the output fits files for the mean/variance morpheus output.
Parameters: - shape (List[int]) – The shape to use when making the FITS files
- out_dir (str) – the directory to place the files in. Will make it if it doesn’t already exist.
Returns: - List[fits.HDUList]: the HDULs for the created files
- Dict(str, np.ndarray): a dictionary where the key is the data descriptor and the value is the memmapped data numpy array
-
static
create_n_file
(shape: List[int], out_dir: str) -> (typing.List[astropy.io.fits.hdu.hdulist.HDUList], typing.List[numpy.ndarray])[source]¶ Creates the output fits files for the rank vote morpheus output.
Parameters: - shape (List[int]) – The shape to use when making the FITS files
- out_dir (str) – the directory to place the files in. Will make it if it doesn’t already exist.
Returns: - List[fits.HDUList]: the HDULs for the created files
- Dict(str, np.ndarray): a dictionary where the key is the data descriptor and the value is the memmapped data numpy array
-
static
create_rank_vote_files
(shape: List[int], out_dir: str) -> (typing.List[astropy.io.fits.hdu.hdulist.HDUList], typing.List[numpy.ndarray])[source]¶ Creates the output fits files for the rank vote morpheus output.
Parameters: - shape (List[int]) – The shape to use when making the FITS files
- out_dir (str) – the directory to place the files in. Will make it if it doesn’t already exist.
Returns: - List[fits.HDUList]: the HDULs for the created files
- Dict(str, np.ndarray): a dictionary where the key is the data descriptor and the value is the memmapped data numpy array
-
static
get_files
(file_names: List[str], mode: str = 'readonly') -> (typing.List[astropy.io.fits.hdu.hdulist.HDUList], typing.List[numpy.ndarray])[source]¶ Gets the HDULs and data handles for all the files in file_names.
This is a convenience function for opening multiple FITS files using memmap.
Parameters: - file_names (List[str]) – a list of file names including paths to FITS files
- mode (str) – the mode to pass to fits.open
Returns: Tuple of a list of numpy arrays that are the mmapped data handles for each of the FITS files and a list of the HDULs that go along with them
-
class
morpheus.core.helpers.
LabelHelper
[source]¶ Bases:
object
Class to help with label updates.
Class Variables: - UPDATE_MASK (np.ndarray): the (40, 40) integer array that indicates which parts of the output of the model to include in the calculations. default: innermost (30, 30)
- UPDATE_MASK_N (np.ndarray): the (40, 40) integer array that indicates which parts of the count ‘n’ to update. default: all (40, 40)
-
MORPHOLOGIES
= ['spheroid', 'disk', 'irregular', 'point_source', 'background']¶
-
UPDATE_MASK
= array([[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]], dtype=int16)¶
-
UPDATE_MASK_N
= array([[1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], ..., [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1]], dtype=int16)¶
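The two masks can be pictured with a small sketch. It assumes that "innermost (30, 30)" means a centred block, which leaves a 5-pixel border of zeros on each side; that reading, the helper name `centered_mask`, and the use of nested lists in place of NumPy arrays are all assumptions made for illustration.

```python
def centered_mask(size, inner):
    """Return a size x size grid of 0s with a centred inner x inner block of 1s."""
    border = (size - inner) // 2
    upper = border + inner
    return [
        [1 if border <= row < upper and border <= col < upper else 0
         for col in range(size)]
        for row in range(size)
    ]


update_mask = centered_mask(40, 30)            # analogous to UPDATE_MASK
update_mask_n = [[1] * 40 for _ in range(40)]  # analogous to UPDATE_MASK_N
```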
-
static
finalize_rank_vote
(data: dict) → None[source]¶ Finalize the rank vote by dividing by n.
Parameters: data (dict) – a dict of numpy arrays containing the data
TODO: Refactor to accommodate large files
Returns: None
-
static
finalize_variance
(n: numpy.ndarray, curr_sn: numpy.ndarray, final_map: List[Tuple[int, int]])[source]¶ The second of two methods used to calculate the variance online.
This method calculates the final variance value using equation 25 from
http://people.ds.cam.ac.uk/fanf2/hermes/doc/antiforgery/stats.pdf
but without performing the square root.
Parameters: - n (np.ndarray) – the current number of values included in the calculation
- curr_sn (np.ndarray) – the current $S_n$ values
- final_map (List[Tuple[int, int]]) – a list of indices to calculate the final variance for
Returns: A np.ndarray with the current $S_n$ values and variance values for all indices in final_map
-
static
get_final_map
(shape: List[int], y: int, x: int)[source]¶ Creates a pixel mapping that flags pixels that won’t be updated again.
Parameters: - shape (List[int]) – the shape of the array that x and y are indexing
- y (int) – the current y index
- x (int) – the current x index
Returns: A list of relative indices that won’t be updated again.
-
static
index_generator
(dim0: int, dim1: int) → Iterable[Tuple[int, int]][source]¶ Creates a generator that returns indices to iterate over a 2d array.
Parameters: - dim0 (int) – The upper limit to iterate up to for the first dimension
- dim1 (int) – The upper limit to iterate up to for the second dimension
Returns: A generator that yields indices to iterate over a 2d array with shape [dim0, dim1]
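A generator of this kind can be sketched in a few lines. The name `iter_indices` is hypothetical, and the row-major order (first dimension varies slowest) is an assumption, since the entry above does not state the iteration order.

```python
from itertools import product
from typing import Iterator, Tuple


def iter_indices(dim0: int, dim1: int) -> Iterator[Tuple[int, int]]:
    """Yield (i, j) for every cell of a [dim0, dim1] array in row-major order."""
    yield from product(range(dim0), range(dim1))


indices = list(iter_indices(2, 3))
```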
-
static
iterative_mean
(n: numpy.ndarray, curr_mean: numpy.ndarray, x_n: numpy.ndarray, update_mask: numpy.ndarray)[source]¶ Calculates the mean of a collection in an online fashion.
The values are calculated using the following equation: http://people.ds.cam.ac.uk/fanf2/hermes/doc/antiforgery/stats.pdf, eq. 4
Parameters: - n (np.ndarray) – a 2d array containing the number of terms in the mean so far
- curr_mean (np.ndarray) – the current calculated mean
- x_n (np.ndarray) – the new values to add to the mean
- update_mask (np.ndarray) – a 2d boolean array indicating which indices in the array should be updated.
Returns: An array with the same shape as the curr_mean with the newly calculated mean values.
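The online mean update can be sketched as follows, assuming the cited equation is the usual incremental form, mean_n = mean_{n-1} + (x_n - mean_{n-1}) / n, with n counting the new value. `online_mean` is a hypothetical name, and flat Python lists stand in for the 2d NumPy arrays; entries where the mask is false keep their previous mean.

```python
def online_mean(n, curr_mean, x_n, update_mask):
    """Elementwise incremental mean; masked-out entries are left unchanged."""
    return [
        mean + (x - mean) / count if keep else mean
        for count, mean, x, keep in zip(n, curr_mean, x_n, update_mask)
    ]


# First observation for both entries.
means = online_mean([1, 1], [0.0, 0.0], [2.0, 4.0], [True, True])
# Second observation, but only the first entry is updated.
means = online_mean([2, 2], means, [4.0, 4.0], [True, False])
```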
-
static
iterative_rank_vote
(x_n: numpy.ndarray, prev_count: numpy.ndarray, update_mask: numpy.ndarray)[source]¶ Calculates the updated values for the rank vote labels for one class.
Parameters: - x_n (np.ndarray) – the current rank vote values for the class being updated
- prev_count (np.ndarray) – the array containing the running totals, should be shaped as [labels, height, width]
- update_mask (np.ndarray) – a boolean array indicating which values to update
Returns: A numpy array containing the updated count values
-
static
iterative_variance
(prev_sn: numpy.ndarray, x_n: numpy.ndarray, prev_mean: numpy.ndarray, curr_mean: numpy.ndarray, update_mask: numpy.ndarray)[source]¶ The first of two methods used to calculate the variance online.
This method specifically calculates the $S_n$ value as indicated in equation 24 from:
http://people.ds.cam.ac.uk/fanf2/hermes/doc/antiforgery/stats.pdf
Parameters: - prev_sn (np.ndarray) – the $S_n$ value from the previous step
- x_n (np.ndarray) – the current incoming values
- prev_mean (np.ndarray) – the mean that was previously calculated
- curr_mean (np.ndarray) – the mean, including the current values
- update_mask (np.ndarray) – a boolean mask indicating which values to update
Returns: An np.ndarray containing the current value for $S_n$
-
static
make_mean_var_arrays
(shape: Tuple[int, int]) → dict[source]¶ Create output arrays for use in in-memory classification.
Parameters: shape (Tuple[int]) – The 2d (width, height) shape used to create the arrays
Returns: A dictionary with keys being the array’s description and values being the array itself
-
static
make_n_array
(shape: Tuple[int, int]) → dict[source]¶ Create an output array for use in in-memory classification.
Parameters: shape (Tuple[int]) – The 2d (width, height) shape used to create the array
Returns: A dictionary with keys being the array’s description and values being the array itself
-
static
make_rank_vote_arrays
(shape: Tuple[int, int]) → dict[source]¶ Create output arrays for use in in-memory classification.
Parameters: shape (Tuple[int]) – The 2d (width, height) shape used to create the arrays
Returns: A dictionary with keys being the array’s description and values being the array itself
-
static
update_labels
(data: dict, labels: numpy.ndarray, batch_idx: List[Tuple[int, int]], out_type: str) → None[source]¶ Updates the running total label values with the new output values.
Parameters: - data (dict) – a dict of numpy arrays containing the data
- labels (np.ndarray) – the new output from the model
- batch_idx (List[Tuple[int, int]]) – a list of indices to update
- out_type (str) – indicates which type of output to update; must be one of [‘mean_var’, ‘rank_vote’, ‘both’]
Returns: None
-
static
update_mean_var
(data: dict, labels: numpy.ndarray, batch_idx: List[Tuple[int, int]])[source]¶ Updates the mean and variance outputs with the new model values.
Parameters: - data (dict) – a dict of numpy arrays containing the data
- labels (np.ndarray) – the new output from the model
- batch_idx (List[Tuple[int, int]]) – a list of indices to update
Returns: None
-
static
update_ns
(data: dict, batch_idx: List[Tuple[int, int]], inc: int = 1) → None[source]¶ Updates the n values by inc.
Parameters: - data (dict) – a dictionary of numpy arrays containing the data
- batch_idx (List[Tuple[int, int]]) – a list of indices to update
- inc (int) – the number to increment n by. Default=1
Returns: None
-
static
update_rank_vote
(data: dict, labels: numpy.ndarray, batch_idx: List[Tuple[int, int]]) → None[source]¶ Updates the rank vote values with the new output.
Parameters: - data (dict) – a dict of numpy arrays containing the data
- labels (np.ndarray) – the new output from the model
- batch_idx (List[Tuple[int, int]]) – a list of indices to update
Returns: None
-
static
windowed_index_generator
(dim0: int, dim1: int) → Iterable[Tuple[int, int]][source]¶ Creates a generator that returns window limited indices over a 2d array.
The generator returned by this method will yield the indices for the use of a sliding window of size UPDATE_MASK_N.shape over a 2d array with the size (dim0, dim1).
Parameters: - dim0 (int) – The upper limit to iterate up to for the first dimension
- dim1 (int) – The upper limit to iterate up to for the second dimension
Returns: A generator that yields indices to iterate over a 2d array with shape [dim0, dim1]
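One way to picture window-limited indices is to yield only the top-left corners at which a window still fits inside the array. The sketch below does that for a default (40, 40) window. The name `window_corners`, the `window` parameter, and the choice to include the last fitting position are assumptions, not details confirmed by the entry above.

```python
def window_corners(dim0, dim1, window=(40, 40)):
    """Yield top-left (i, j) corners for which the window fits in [dim0, dim1]."""
    win0, win1 = window
    for i in range(dim0 - win0 + 1):
        for j in range(dim1 - win1 + 1):
            yield (i, j)


corners = list(window_corners(42, 41))
```

For an array smaller than the window the ranges are empty, so nothing is yielded.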
-
class
morpheus.core.helpers.
OptionalFunc
(warn_msg: str, init_func: function = None)[source]¶ Bases:
object
Descriptor protocol for functions that don’t have to be overridden.
This is a helper class that is used to stub methods that don’t have to be overridden.
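A descriptor that stubs an optional method might look roughly like the sketch below. `OptionalStub` is a hypothetical class written for illustration. Its constructor mirrors the `warn_msg` and `init_func` arguments listed above, but its behaviour (warn on access until a function is assigned) is an assumption about how such a stub could work, not the Morpheus implementation.

```python
import warnings


class OptionalStub:
    """Hypothetical descriptor: warns while an optional method is not overridden."""

    def __init__(self, warn_msg, init_func=None):
        self.warn_msg = warn_msg
        # Fall back to a function that returns its arguments unchanged.
        self.func = init_func if init_func is not None else (lambda *args: args)
        self.overridden = init_func is not None

    def __get__(self, instance, owner):
        if not self.overridden:
            warnings.warn(self.warn_msg)
        return self.func

    def __set__(self, instance, value):
        # Assigning a function counts as overriding the stub.
        self.func = value
        self.overridden = True


class Demo:
    train_metrics = OptionalStub("train_metrics is not overridden")


demo = Demo()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    demo.train_metrics(1, 2)                      # emits the warning
    demo.train_metrics = lambda *args: sum(args)  # override the stub
    total = demo.train_metrics(1, 2)              # no warning this time
```

Note that this sketch stores the replacement on the descriptor, so it is shared by all instances of `Demo`; a per-instance version would need to keep the function on the instance.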
-
class
morpheus.core.helpers.
TFLogger
[source]¶ Bases:
object
A helper class to color the logging text in TensorFlow.
-
BLUE
()¶
-
GREEN
()¶
-
LIGHTRED
()¶
-
RED
()¶
-
YELLOW
()¶
-
static
debug
(msg: str) → None[source]¶ Log at debug level in yellow.
Parameters: msg (str) – The string to be logged
Returns: None
-
static
error
(msg: str)[source]¶ Log at error level in red.
Parameters: msg (str) – The string to be logged
Returns: None
-
static
info
(msg: str) → None[source]¶ Log at info level in green.
Parameters: msg (str) – The string to be logged
Returns: None
-
static
tensor_shape
(tensor: tensorflow.python.framework.ops.Tensor, log_func=None, format_str='[{}]::{}') → None[source]¶ Log the shape of a tensor.
Parameters: - tensor (tf.Tensor) – A tensorflow Tensor
- log_func (func) – logging function to use, default tf_logger.debug
- format_str (str) – A string that will have .format called on it and given two arguments in the following order: tensor_name, tensor_shape
Returns: None
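The default `format_str` and the idea of colored log text can be illustrated without TensorFlow. In the sketch below, the ANSI escape codes, the helper names `colorize` and `shape_message`, and the tensor name and shape are all assumptions chosen for the example; the codes Morpheus actually uses are not stated here.

```python
YELLOW = "\033[33m"  # standard ANSI code for yellow foreground
RESET = "\033[0m"    # resets all terminal attributes


def colorize(msg, color):
    """Wrap msg in an ANSI color code followed by a reset."""
    return f"{color}{msg}{RESET}"


def shape_message(name, shape, format_str="[{}]::{}"):
    """Fill format_str with the tensor name first and the shape second."""
    return format_str.format(name, shape)


line = colorize(shape_message("conv1", [None, 40, 40, 5]), YELLOW)
```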
-
morpheus.core.model module¶
Contains model code for Morpheus.
-
class
morpheus.core.model.
Morpheus
(hparams: morpheus.core.hparams.HParams, dataset: tensorflow.python.data.ops.dataset_ops.DatasetV1, data_format: str)[source]¶ Bases:
morpheus.core.unet.Model
The main class for the Morpheus model.
This class takes an HParams object as an argument, which should contain the properties listed under Required HParams.
Note: if you are using pretrained weights for inference only, you need to mock the dataset object and use the default hparams.
You can mock the dataset object calling Morpheus.mock_dataset().
You can get the default HParams by calling Morpheus.inference_hparams().
An example call for inference only:
>>> dataset = Morpheus.mock_dataset()
>>> hparams = Morpheus.inference_hparams()
>>> data_format = 'channels_last'
>>> morph = Morpheus(hparams, dataset, data_format)
- Required HParams:
- inference (bool): true if using pretrained model
- down_filters (list): number of filters for each down conv section
- num_down_convs (int): number of conv ops per down conv section
- up_filters (list): number of filters for each up conv section
- num_up_convs (int): number of conv ops per up conv section
- batch_norm (bool): use batch normalization
- dropout (bool): use dropout
- Optional HParams:
- learning_rate (float): learning rate for training, required if inference is set to false
- dropout_rate (float): the percentage of neurons to drop [0.0, 1.0]
Parameters: - hparams (morpheus.core.hparams.HParams) – Model Hyperparameters
- dataset (tf.data.Dataset) – dataset to use for training
- data_format (str) – channels_first or channels_last
Todo
- Make optimizer a parameter
-
static
eval_metrics
(yh: tensorflow.python.framework.ops.Tensor, y: tensorflow.python.framework.ops.Tensor) → dict[source]¶ Function to generate metrics for evaluation during training.
Parameters: - yh (tf.Tensor) – network output [n,h,w,c]
- y (tf.Tensor) – labels [n,h,w,c]
Returns: A dictionary collection of (tf.Tensor, tf.Tensor), where the keys are the names of the metrics and the values are running metric pairs. More information on running accuracy metrics here: https://www.tensorflow.org/api_docs/python/tf/metrics/accuracy
-
inference
(inputs: tensorflow.python.framework.ops.Tensor) → tensorflow.python.framework.ops.Tensor[source]¶ Performs inference on input.
Parameters: inputs (tf.Tensor) – input tensor with shape [batch_size, width, height, 5]
Returns: A tf.Tensor of [batch_size, width, height, 5] representing the output of the model, which includes applying the softmax function.
-
static
inference_hparams
() → morpheus.core.hparams.HParams[source]¶ Generates the default hyperparameters for inference.
Returns: a morpheus.core.hparams.HParams object with the settings for inference
-
loss_func
(logits: tensorflow.python.framework.ops.Tensor, labels: tensorflow.python.framework.ops.Tensor) → tensorflow.python.framework.ops.Tensor[source]¶ Defines the loss function used in training.
The loss function is defined by combining cross entropy loss calculated against all 5 classes and dice loss calculated against just the background class.
Parameters: - logits (tf.Tensor) – output tensor from graph should be [batch_size, width, height, 5]
- labels (tf.Tensor) – labels used in training should be [batch_size, width, height, 5]
Returns: Tensor representing loss function.
Return type: tf.Tensor
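To make the combination concrete, here is a plain-Python sketch of a cross entropy term over all 5 classes plus a Dice term on the background channel only. How Morpheus weights and reduces the two terms is not stated above, so the simple sum, the smoothing constant `eps`, the background index, and the function names are assumptions.

```python
import math

BACKGROUND = 4  # assumes the class order of LabelHelper.MORPHOLOGIES


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combined_loss(logits, labels, eps=1e-7):
    """logits and labels are lists of per-pixel 5-vectors."""
    probs = [softmax(pixel) for pixel in logits]
    # Mean cross entropy over pixels, summed over all 5 classes.
    cross_entropy = -sum(
        y * math.log(p + eps)
        for pixel_p, pixel_y in zip(probs, labels)
        for p, y in zip(pixel_p, pixel_y)
    ) / len(probs)
    # Dice loss restricted to the background channel.
    bkg_p = [pixel[BACKGROUND] for pixel in probs]
    bkg_y = [pixel[BACKGROUND] for pixel in labels]
    overlap = sum(p * y for p, y in zip(bkg_p, bkg_y))
    dice = 1.0 - (2.0 * overlap + eps) / (sum(bkg_p) + sum(bkg_y) + eps)
    return cross_entropy + dice


labels = [[0, 0, 0, 0, 1], [1, 0, 0, 0, 0]]
good_loss = combined_loss([[0, 0, 0, 0, 10], [10, 0, 0, 0, 0]], labels)
bad_loss = combined_loss([[10, 0, 0, 0, 0], [0, 0, 0, 0, 10]], labels)
```

Predictions that match the labels give a loss near zero, while swapped predictions are penalised by both terms.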
-
static
mock_dataset
() → collections.namedtuple[source]¶ Generates a mock dataset for inference.
Returns: A collections.namedtuple object that can be passed in place of a tf.data.Dataset for ‘dataset’ argument in the constructor
-
optimizer
(loss: tensorflow.python.framework.ops.Tensor) → tensorflow.python.framework.ops.Tensor[source]¶ Overrides the optimizer func in morpheus.core.unet
Parameters: loss (tf.Tensor) – The loss function tensor to pass to the optimizer
Returns: the Tensor result of optimizer.minimize()
Return type: tf.Tensor
-
test_metrics
(logits: tensorflow.python.framework.ops.Tensor, labels: tensorflow.python.framework.ops.Tensor) -> ((typing.List[str], typing.List[tensorflow.python.framework.ops.Tensor]), typing.List[tensorflow.python.framework.ops.Tensor])[source]¶ Overrides the test_metrics func in morpheus.core.unet
Parameters: - logits (tf.Tensor) – the output logits from the model
- labels (tf.Tensor) – the labels used during training
Returns: Tuple(Tuple(List(str): names of metrics, List(tf.Tensor): tensors for metrics), List(tf.Tensor): tensors for updating running metrics)
-
train_metrics
(logits: tensorflow.python.framework.ops.Tensor, labels: tensorflow.python.framework.ops.Tensor) -> ((typing.List[str], typing.List[tensorflow.python.framework.ops.Tensor]), typing.List[tensorflow.python.framework.ops.Tensor])[source]¶ Overrides the train_metrics func in morpheus.core.unet
Parameters: - logits (tf.Tensor) – the output logits from the model
- labels (tf.Tensor) – the labels used during training
Returns: Tuple(Tuple(List(str): names of metrics, List(tf.Tensor): tensors for metrics), List(tf.Tensor): tensors for updating running metrics)
morpheus.core.unet module¶
Implements variations of the U-Net architecture.
-
class
morpheus.core.unet.
Model
(hparams: morpheus.core.hparams.HParams, dataset: tensorflow.python.data.ops.dataset_ops.DatasetV1, data_format='channels_last')[source]¶ Bases:
morpheus.core.base_model.Model
Based on U-Net (https://arxiv.org/abs/1505.04597).
Parameters: - hparams (morpheus.core.hparams.HParams) – Hyperparameters to use
- dataset (tf.data.Dataset) – dataset to use for training
- data_format (str) – channels_first or channels_last
- Required HParams:
- down_filters (list): number of filters for each down conv section
- num_down_convs (int): number of conv ops per down conv section
- up_filters (list): number of filters for each up conv section
- num_up_convs (int): number of conv ops per up conv section
- batch_norm (bool): use batch normalization
- dropout (bool): use dropout
- Optional HParams:
- dropout_rate (float): the percentage of neurons to drop [0.0, 1.0]
-
block_op
(inputs: tensorflow.python.framework.ops.Tensor, num_filters: int, is_training: bool) → tensorflow.python.framework.ops.Tensor[source]¶ Basic unit of work batch_norm->conv->dropout.
Batch normalization and dropout are conditioned on the object’s HParams.
Parameters: - inputs (tf.Tensor) – input tensor
- num_filters (int) – number of filters for the conv operation
- is_training (bool) – indicates if the model is training
Returns: the output tensor from the block operation
Return type: tf.Tensor
-
conv
(inputs, num_filters, padding='same', strides=1, activation=<function relu>, name='conv', kernel_size=3)[source]¶
-
down_sample
(inputs)[source]¶ Reduces inputs width and height by half.
Parameters: inputs (tf.Tensor) – input tensor
Returns: input tensor downsampled
-
model_fn
(inputs: tensorflow.python.framework.ops.Tensor, is_training: bool) → tensorflow.python.framework.ops.Tensor[source]¶ Defines U-Net graph using HParams.
Parameters: - inputs (tf.Tensor) – The input tensor to the graph
- is_training (bool) – indicates if the model is in the training phase
Returns: the output tensor from the graph
Return type: tf.Tensor
TODO: add input shape check for incompatible tensor shapes