Written by Chris Barnes, updates by Sam Robson


Collection of C++ normalization and preprocessing tools. Most of the tools were designed with CNV analyses in mind, but some may be useful in more general applications. The emphasis is on flexibility: the programs mimic Unix tool behaviour, reading input from stdin and writing to stdout, so that data can be piped efficiently from one tool to the next.


1) To check out the package use anonymous CVS access:

cvs -z3 co -P Normtools

2) Change into the Normtools directory and type make:

cd Normtools
make

This should build the applications univariate_quantile_norm, medianIQR, make_matrix and population_medianIQR.

3) Run the tests to make sure everything is working:

make test

You should see "Test completed successfully". Look in the script test/ for examples of data formats and of how to run the programs.


univariate_quantile_norm performs quantile normalization on a univariate signal. Quantile normalization is described in this paper:

A comparison of normalization methods for high density oligonucleotide array data based on variance and bias.
Bolstad BM, Irizarry RA, Astrand M, Speed TP.
Bioinformatics. 2003 Jan 22;19(2):185-93.


Quantile normalisation is essentially a two-step process: first the target distribution is generated, then each sample is normalized to the target.
In generation mode (-gen, see below) the input via stdin is a list of full paths to the data files from which the target distribution is estimated. The data files are assumed to contain tab-delimited columns of data.
In correction mode the input via stdin is the data file containing the distribution to be normalized to the target distribution.
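
The two steps can be sketched in C++ as follows. This is an illustration of quantile normalization in general, not the code of univariate_quantile_norm: the function names make_target and normalize_to_target are hypothetical, all samples are assumed to have the same length, and ties are broken by input order rather than averaged.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Step 1 (generation): sort each sample and average the values rank by rank.
// Hypothetical helper; assumes all samples have the same length.
std::vector<double> make_target(const std::vector<std::vector<double>>& samples) {
    if (samples.empty()) return {};
    const std::size_t n = samples.front().size();
    std::vector<double> target(n, 0.0);
    for (std::vector<double> s : samples) {  // copy, so the caller's data stays unsorted
        if (s.size() != n) throw std::invalid_argument("samples differ in length");
        std::sort(s.begin(), s.end());
        for (std::size_t i = 0; i < n; ++i) target[i] += s[i];
    }
    for (double& t : target) t /= static_cast<double>(samples.size());
    return target;
}

// Step 2 (correction): replace each value by the target value of the same rank.
std::vector<double> normalize_to_target(const std::vector<double>& sample,
                                        const std::vector<double>& target) {
    if (sample.size() != target.size())
        throw std::invalid_argument("sample and target differ in length");
    // order[rank] is the index in `sample` of the value with that rank.
    std::vector<std::size_t> order(sample.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&sample](std::size_t a, std::size_t b) { return sample[a] < sample[b]; });
    std::vector<double> out(sample.size());
    for (std::size_t rank = 0; rank < order.size(); ++rank) out[order[rank]] = target[rank];
    return out;
}
```

After correction every sample holds exactly the values of the target, arranged in the rank order of the original sample.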


In generation mode the target distribution is written to the file given by -targetfile [file].
In correction mode the corrected distribution (along with any columns specified by -ext_columns [n1 n2 n3]) is written to stdout.



medianIQR performs median and/or interquartile normalisation within samples.
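
As a rough illustration, one common reading of this normalisation is to subtract the sample median and/or divide by the interquartile range (IQR). The sketch below assumes that reading. The names sorted_quantile and median_iqr_normalize are hypothetical, and the quantile definition (linear interpolation between order statistics) is an assumption that may differ from what the tool uses.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Quantile of a sorted, non-empty vector by linear interpolation between
// neighbouring order statistics (an assumed definition; others exist).
double sorted_quantile(const std::vector<double>& sorted, double p) {
    const double pos = p * static_cast<double>(sorted.size() - 1);
    const std::size_t lo = static_cast<std::size_t>(std::floor(pos));
    const std::size_t hi = std::min(lo + 1, sorted.size() - 1);
    return sorted[lo] + (pos - static_cast<double>(lo)) * (sorted[hi] - sorted[lo]);
}

// Hypothetical helper: optionally centre on the median, optionally scale by the IQR.
std::vector<double> median_iqr_normalize(std::vector<double> x, bool centre, bool scale) {
    if (x.empty()) return x;
    std::vector<double> sorted = x;
    std::sort(sorted.begin(), sorted.end());
    const double median = sorted_quantile(sorted, 0.5);
    const double iqr = sorted_quantile(sorted, 0.75) - sorted_quantile(sorted, 0.25);
    for (double& v : x) {
        if (centre) v -= median;
        if (scale && iqr > 0.0) v /= iqr;  // a zero IQR leaves the scale untouched
    }
    return x;
}
```

The two flags mirror the "and/or" in the description: either step can be applied on its own.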



make_matrix combines data from individuals to make matrices of intensities.



population_medianIQR applies a cross-population median and/or interquartile normalization to matrices of intensities.
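
One plausible reading of the median step is to centre each row of the matrix (a probe) on its median across the columns (the individuals). The sketch below assumes that reading and that orientation. The names median_of and centre_rows_on_median are hypothetical, and scaling by the IQR is left out to keep the example short.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Median of a non-empty vector; takes a copy because nth_element reorders it.
double median_of(std::vector<double> v) {
    const auto mid = static_cast<std::ptrdiff_t>(v.size() / 2);
    std::nth_element(v.begin(), v.begin() + mid, v.end());
    const double upper = v[static_cast<std::size_t>(mid)];
    if (v.size() % 2 == 1) return upper;
    // Even length: average with the largest element of the lower half.
    const double lower = *std::max_element(v.begin(), v.begin() + mid);
    return 0.5 * (lower + upper);
}

// Hypothetical helper: centre each row (probe) on its median across columns (individuals).
void centre_rows_on_median(std::vector<std::vector<double>>& matrix) {
    for (std::vector<double>& row : matrix) {
        if (row.empty()) continue;
        const double m = median_of(row);
        for (double& v : row) v -= m;
    }
}
```

Unlike normalisation within a sample, each value here is adjusted relative to the same probe in the other individuals.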