                            ======================
                            TREESCAPER README FILE
                            ======================

============================================================================================
          Version 0.1
    Release date: July 2011
============================================================================================

This README file describes how to compile and run TreeScaper. The program is written in C++.
The random number generator was written by Mutsuo Saito and Makoto Matsumoto
(www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html). The CLAPACK library
(http://www.netlib.org/clapack/) is also used.

==============================================================================

The following files are included:

Xtest.out
test.out

randgen.h
warray.h
wDimEst.h
wfile.h
wimport_form.h
wmapping.h
wmatrix.h
wmix.h
wNLDR.h
wstring.h
main.cpp
randgen.cpp
warray.cpp
wDimEst.cpp
wfile.cpp
wimport_form.cpp
wmapping.cpp
wmatrix.cpp
wmix.cpp
wNLDR.cpp
wstring.cpp

dimest_parameters.csv
nldr_parameters.csv 
Makefile (example)
README.txt
TreeScaper

==============================================================================

The steps are, in brief:

1, download clapack and compile it.
2, write a Makefile
3, run it
4, output files

==============================================================================

1, download clapack and compile it.
The official webpage of CLAPACK is http://www.netlib.org/clapack/.
Download it and follow the instructions there to compile it.
The CLAPACK directory contains a folder called INCLUDE. Compiling also
generates some library files.

2, Makefile:
----------------------------------
CC = g++
ROOTPATH = path of CLAPACK
INCDIRS = -I$(ROOTPATH)/INCLUDE
# In the next two lines, replace * so that the names match the library files on your computer.
CLAPLIB = $(ROOTPATH)/lapack_*.a
BLASLIB = $(ROOTPATH)/blas_*.a
F2CLIB = $(ROOTPATH)/F2CLIBS/libf2c.a

LDLIBS  = $(CLAPLIB) $(BLASLIB) $(F2CLIB) -lm

TreeScaper:
				$(CC) main.cpp randgen.cpp wstring.cpp warray.cpp wmapping.cpp wmix.cpp wfile.cpp wimport_form.cpp wDimEst.cpp wNLDR.cpp $(LDLIBS) $(INCDIRS) -o TreeScaper

clean:
				rm -f TreeScaper

---------------------------------
Then run:
	make TreeScaper   // generate the binary file
	make clean        // delete the binary file


3, run it.
		NLDR
			Options:
				-f: data file name.
				-t: 'DIS' means the data are an upper triangular matrix.
						'COR' means the data are Euclidean coordinates.
				-d: the dimension of the Euclidean representation.
				-c: cost function. For example, CLASSIC_MDS, KRUSKAL1, NORMALIZED, SAMMON, CCA.
				-a: algorithm. For example, LINEAR_ITERATION, MAJORIZATION, GAUSS_SEIDEL, STOCHASTIC.
				-i: initial coordinates. 'RAND' generates them randomly.
						'CLASSIC_MDS' computes a low-dimensional representation by MDS and
						uses it as the initial coordinates.
				-o: suffix of output file names.
				-s: random seed, used if the initial coordinates are generated randomly.

		Dimension estimator.
			-dimest: run the dimension estimator instead of NLDR.
			Options:
				-f: data file name
				-e: 'CORR_DIM' selects the correlation dimension estimator
						'NN_DIM' selects the nearest neighbor estimator
						'MLE_DIM' selects the maximum likelihood estimator
				-i: 'DIS' means the data are an upper triangular matrix.
						'COR' means the data are Euclidean coordinates.
						
		For example, the following commands run the test file.
		NLDR:
			./TreeScaper -f test.out -t DIS -d 2 -c CCA -a STOCHASTIC -i RAND -o 0 -s 1
		Dim:
			./TreeScaper -dimest -f test.out -e CORR_DIM -i DIS
		
	Note: The files 'dimest_parameters.csv' and 'nldr_parameters.csv' contain the parameters
				used by the algorithms.

4, output files
		NLDR:
			*_*D_*_COR_*.out: the *s stand for the file name, dimension, method and algorithm.
												This file contains the coordinates. Each row represents a sample.
			*_*D_*_DIS_*.out: the distance matrix of the resulting coordinates.
			*_*D_*_STR_*.out: the value of the stress function after optimization.
			*_*D_*_TIM_*.out: the time cost.
			*_*D_*_1NN_*.out: the first element is the 1NN percentage of the original distance matrix.
												The second element is the 1NN percentage of the resulting distance matrix.
			*_*D_*_CON_*.out: the continuity. The first column is the number of neighbors considered.
												The second column is the corresponding continuity.
			*_*D_*_TRU_*.out: the trustworthiness. The first column is the number of neighbors considered.
												The second column is the corresponding trustworthiness.
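
		The relationship between the COR and DIS files can be sketched in C++. This is only a
		minimal sketch, assuming that the DIS file holds the pairwise Euclidean distances between
		the rows of the COR file. The names Matrix and pairwiseDistances are hypothetical and are
		not taken from TreeScaper's source code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical type for illustration: one row per sample, one column per dimension.
using Matrix = std::vector<std::vector<double>>;

// Return the full symmetric matrix of Euclidean distances between the rows
// of coords. Assumes every row has the same number of columns.
Matrix pairwiseDistances(const Matrix& coords) {
    const std::size_t n = coords.size();
    Matrix dist(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < coords[i].size(); ++k) {
                const double diff = coords[i][k] - coords[j][k];
                sum += diff * diff;
            }
            // The matrix is symmetric, so fill both halves at once.
            dist[i][j] = dist[j][i] = std::sqrt(sum);
        }
    }
    return dist;
}
```

		For the two samples (0, 0) and (3, 4), the entry in row 0, column 1 of the returned
		matrix is 5.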
			
		Dimension estimator:
			*_CORR_DIM_logvslog.out : the result of the correlation dimension estimator. Plot the first
																column versus the second column. The slope of this curve is the correlation dimension.
			*_CORR_DIM_deri_logvsl*	: the slope of *_CORR_DIM_logvslog.out, computed with the
																central difference method.
			*_MLE_DIM_k.out					: the result of the maximum likelihood estimator. Each row gives the dimension
																estimated by considering the k nearest neighbors. The dimension is the mean over
																some interval of the neighborhood.
			*_MLE_DIM_dim.out				: the result of the maximum likelihood estimator. It is the mean of *_MLE_DIM_k.out.
			*_NN_DIM_logvslog.out		: the result of the nearest neighbor estimator. Plot the first column versus
																the second column. The slope of this curve is the NN dimension.
			*_NN_DIM_deri_logvslog*	: the slope of *_NN_DIM_logvslog.out, computed with the
																central difference method.
			
