Developer Guide
Note
Please refer to the Installation guide to configure Cosmic-CoNN.
LCO CR masks reduction
cosmic_conn/cr_pipeline holds the source code for the CR labeling pipeline. In the first phase of the two-phase reduction, the pipeline searches for individual LCO imaging frames that are consecutive exposures and aligns them by reprojection using the astropy/reproject package. Consecutive frames are merged into a single FITS file that includes three image extensions and a valid mask (see Data Structure). Results from phase one are saved in input_path/aligned_fits.
Note
We provide the reprojected data in the released LCO CR dataset, so phase one can be skipped by passing the --aligned flag.
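The frame search in phase one can be pictured with a minimal sketch. It only illustrates the idea of collecting consecutive exposures into sets of three (one per image extension) and is not the pipeline's actual code: the Frame fields, the max_gap threshold, and the grouping rule are all assumptions made for illustration. Only the min_exptime cut mirrors an argument of the reduction script.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    name: str       # hypothetical file name
    start: float    # exposure start time in seconds (hypothetical field)
    exptime: float  # exposure time in seconds


def group_consecutive(frames, group_size=3, max_gap=120.0, min_exptime=99.0):
    """Group frames into runs of `group_size` consecutive exposures.

    `max_gap` (seconds allowed between the end of one exposure and the start
    of the next) is an assumed criterion, not a documented pipeline setting.
    """
    # Reject short exposures first, then order the rest by start time.
    usable = sorted(
        (f for f in frames if f.exptime >= min_exptime), key=lambda f: f.start
    )
    groups, run = [], []
    for frame in usable:
        # A long pause breaks the sequence, so start a new run.
        if run and frame.start - (run[-1].start + run[-1].exptime) > max_gap:
            run = []
        run.append(frame)
        if len(run) == group_size:
            groups.append([f.name for f in run])
            run = []
    return groups
```

Each returned group would then be reprojected and merged into one aligned FITS file. Incomplete runs are dropped because a merged file needs all three extensions.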
In phase two, the pipeline takes an aligned FITS file from phase one as input, appends the corresponding CR masks, and writes the masked FITS file to input_path/masked_fits. cosmic_conn/reduce_cr.py is the entry file for CR reduction. An example reduction script:
$ bash scripts/reduce_lco_cr.sh
# includes the following arguments:
$ python cosmic_conn/reduce_cr.py \
--data data/demo_data \ # path to data directory
--snr_thres 5 \ # threshold to detect CR with sigma > 5
--snr_thres_low 2.5 \ # lower threshold to include CR's peripheral pixels
--dilation 5 \ # dilation range for peripheral pixels
--min_exptime 99. \ # reject frames with short exposures
--min_cr_size 2 \ # a minimum CR size of 2 ignores isolated hot pixels
--cpus 8 \ # cores used for multiprocessing acceleration
--aligned \ # ignore phase one with this flag
--no_png \ # do not output png preview with this flag
--comment dilation5-SNR5-2.5 # png preview files suffix
The reduction configuration and log are saved in CR_reduction_log.txt.
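The way the thresholds in the example script interact can be sketched as follows. This is a dependency-free illustration under stated assumptions, not the code in cosmic_conn/cr_pipeline: it assumes a per-pixel SNR map is already available, uses 8-connectivity for both dilation and component sizes, and represents the mask as a set of (row, col) coordinates. The function name cr_mask is hypothetical.

```python
from collections import deque

# 8-connected neighbourhood offsets (an assumed connectivity).
NEIGHBORS = [
    (dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)
]


def cr_mask(snr, snr_thres=5.0, snr_thres_low=2.5, dilation=5, min_cr_size=2):
    """Return the set of (row, col) pixels flagged as cosmic rays."""
    rows, cols = len(snr), len(snr[0])
    # 1. Seed pixels: significant detections above the high threshold.
    seeds = {
        (r, c) for r in range(rows) for c in range(cols) if snr[r][c] > snr_thres
    }
    # 2. Grow the seeds by `dilation` pixels to define the peripheral region.
    region, frontier = set(seeds), set(seeds)
    for _ in range(dilation):
        grown = set()
        for r, c in frontier:
            for dr, dc in NEIGHBORS:
                p = (r + dr, c + dc)
                if 0 <= p[0] < rows and 0 <= p[1] < cols and p not in region:
                    grown.add(p)
        region |= grown
        frontier = grown
    # 3. Inside that region, also accept pixels above the lower threshold.
    mask = seeds | {p for p in region if snr[p[0]][p[1]] > snr_thres_low}
    # 4. Drop connected components smaller than `min_cr_size`.
    kept, unvisited = set(), set(mask)
    while unvisited:
        start = unvisited.pop()
        component, queue = {start}, deque([start])
        while queue:
            r, c = queue.popleft()
            for dr, dc in NEIGHBORS:
                p = (r + dr, c + dc)
                if p in unvisited:
                    unvisited.remove(p)
                    component.add(p)
                    queue.append(p)
        if len(component) >= min_cr_size:
            kept |= component
    return kept
```

With min_cr_size set to 2, a single bright pixel with no flagged neighbour is discarded, which is how isolated hot pixels are ignored in this sketch. A faint pixel far from any seed is never considered, because the lower threshold applies only inside the dilated region.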
Train a new model with the Cosmic-CoNN framework
cosmic_conn/dl_framework holds the source code for the Cosmic-CoNN deep-learning framework. train.py is the entry file to start a new training run. We recommend starting by modifying an example script; e.g., the script that trains an LCO imaging model can be found at scripts/train_lco.sh:
$ python cosmic_conn/train.py
# basic training settings:
--data path to data directory
--mode train | inference, inference mode does not create checkpoint or log
--seed assign manual seed
--random_seed if flagged, initialize the model multiple times to find
the best random seed
--max_train_size number of samples randomly drawn in each epoch, 0 uses the entire dataset
--lr LR learning rate, 0.001 by default
--milestones [MILESTONES ...] milestones to reduce the learning rate,
e.g. '1000 2000 3000', '0' keeps LR constant
--min_exposure minimum exposure time when sampling training data
--crop training input stamp size
--batch training batch size
--comment comment is appended to the checkpoint directory name
# define the model
--model lco | hst | nres, dataset specific dataloader
--loss bce | median_bce | dice | mse, loss function used for training
--imbalance_alpha number of iterations for the Median Weighted BCE Loss
to linearly increase the lower bound alpha to 1. See
paper for detail.
--norm batch | group | instance, feature normalization method
--n_group fixed group number for group normalization
--gn_channel fixed channel number, >0 will override n_group, 0 uses
fixed group number
--conv_type unet | resnet, types for convolution module
--up_type deconv | upscale, types for deconvolution module
--down_type maxpool | avgpool | stride, types for the pooling layer
--deeper deeper network, one more downsample and upsample layer
--hidden channel # of first conv layer
--epoch total training epochs
--eval_epoch number of phase0 training epochs, only applies to batch normalization
# settings for model validation during the training
--validate_freq number of epochs between model validations
--validRatio the ratio of training data reserved for validation, 0.2 by default
--max_valid_size the number of samples reserved for validation, >0 will
override validRatio
--valid_crop stamp size for the center-cropping during validation
# to continue a previous training, use the following arguments
--continue_train to continue a previous training, provide the checkpoint directory name
--continue_epoch the number of epoch to continue
# only called during inference
--load_model path to load a model for inference
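Several of the options above override or modify one another. The sketch below spells out one reading of those rules. It is illustrative only and does not reproduce the framework's code: the helper names, the decay factor gamma, the default n_group of 8, and the clamping behaviour are assumptions, while the 0.2 validation ratio, the 0.001 learning rate, and the override rules come from the option descriptions.

```python
def validation_size(n_samples, valid_ratio=0.2, max_valid_size=0):
    """Samples reserved for validation; max_valid_size > 0 overrides the ratio."""
    if max_valid_size > 0:
        # Clamping to the dataset size is an assumption.
        return min(max_valid_size, n_samples)
    return int(n_samples * valid_ratio)


def group_count(channels, n_group=8, gn_channel=0):
    """Number of groups for group normalization.

    gn_channel > 0 fixes the channels per group and overrides n_group;
    gn_channel == 0 uses the fixed group number. The default n_group of 8
    is an assumed value.
    """
    if gn_channel > 0:
        return max(1, channels // gn_channel)
    return n_group


def learning_rate(epoch, base_lr=0.001, milestones=(0,), gamma=0.1):
    """Learning rate at a given epoch for a milestone schedule.

    milestones of (0,) keeps the rate constant. The decay factor gamma is an
    assumption; the option description does not state it.
    """
    if tuple(milestones) == (0,):
        return base_lr
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed
```

For example, with --milestones 1000 2000 3000 this reading keeps the rate at 0.001 until epoch 1000 and multiplies it by gamma at each milestone. With --gn_channel 4, a 64-channel layer is split into 16 groups regardless of --n_group.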