      SUBROUTINE SPLIT2(MM, M, N, A, CLAB, RLAB, TITLE, KD, TH, IORD,
     *                  DMIWRK, IWORK, DMWORK, WORK, IERR, OUNIT)
C
C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
C
C     PURPOSE
C     -------
C
C     SPLITS MATRIX OF CASE-BY-VARIABLE DATA VALUES INTO BLOCKS UNTIL
C     ALL WITHIN-BLOCK VARIANCES ARE LESS THAN A GIVEN THRESHOLD.
C     INCLUDES USER-CONTROLLED CONSTRAINTS.
C
C     DESCRIPTION
C     -----------
C
C     1. THE THRESHOLD IS THE LARGEST VARIANCE ALLOWED FOR THE DATA
C        VALUES WITHIN A BLOCK. THE VARIABLES SHOULD BE SCALED
C        SIMILARLY (A CLUSTER SUBROUTINE CAN BE USED TO STANDARDIZE
C        THE VARIABLES). THE ROUTINE STARTS WITH THE DATA MATRIX AS
C        ONE BLOCK. THEN THE BLOCK WITH THE LARGEST VARIANCE IS
C        CHOSEN AND, IF THAT VARIANCE IS LARGER THAN THE THRESHOLD,
C        THE BLOCK IS OPTIMALLY SPLIT BY BOTH CASES AND VARIABLES.
C        THE VARIANCES FOR THE NEW BLOCKS ARE DETERMINED AND THE
C        PROCESS REPEATS BY FINDING THE NEW LARGEST VARIANCE. ONCE
C        THE LARGEST VARIANCE IS LESS THAN THE THRESHOLD, THE
C        RESULTS ARE PRINTED IN A BLOCK DIAGRAM ON FORTRAN UNIT
C        OUNIT. CHOOSE THE THRESHOLD WITH CARE: A LARGE THRESHOLD
C        PRODUCES A FEW LARGE BLOCKS AND A SMALL THRESHOLD PRODUCES
C        MANY SMALL BLOCKS.
C
C     2. MISSING VALUES SHOULD BE REPRESENTED BY 99999.
C
C     3. THE CASES AND/OR VARIABLES CAN BE CONSTRAINED BY THE IORD
C        PARAMETER. SETTING IORD = 0 LEAVES BOTH CASES AND VARIABLES
C        UNCONSTRAINED; SETTING IORD = 1 CONSTRAINS ONLY CASES;
C        SETTING IORD = 2 CONSTRAINS ONLY VARIABLES; AND SETTING
C        IORD = 3 CONSTRAINS BOTH CASES AND VARIABLES.
C
C     4. THE BLOCK DIAGRAM IS THE DATA MATRIX WITH THE DATA VALUES
C        MULTIPLIED BY 10. THE BLOCKS ARE OUTLINED BY VERTICAL AND
C        HORIZONTAL LINES.
C
C     INPUT PARAMETERS
C     ----------------
C
C     MM     INTEGER SCALAR (UNCHANGED ON OUTPUT).
C            THE LEADING DIMENSION OF MATRIX A. MUST BE AT LEAST M.
C
C     M      INTEGER SCALAR (UNCHANGED ON OUTPUT).
C            THE NUMBER OF CASES (OBJECTS).
C
C     N      INTEGER SCALAR (UNCHANGED ON OUTPUT).
C            THE NUMBER OF VARIABLES.
C
C     A      REAL MATRIX WHOSE FIRST DIMENSION MUST BE MM AND SECOND
C            DIMENSION MUST BE AT LEAST N (CHANGED ON OUTPUT).
C            THE DATA MATRIX.
C
C            A(I,J) IS THE VALUE FOR THE J-TH VARIABLE FOR THE I-TH
C            CASE.
C
C     CLAB   VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST N
C            (CHANGED ON OUTPUT).
C            ORDERED LABELS OF THE COLUMNS.
C
C     RLAB   VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST M
C            (CHANGED ON OUTPUT).
C            ORDERED LABELS OF THE ROWS.
C
C     TITLE  10-CHARACTER VARIABLE (UNCHANGED ON OUTPUT).
C            TITLE OF DATA SET.
C
C     KD     INTEGER SCALAR (UNCHANGED ON OUTPUT).
C            MAXIMUM NUMBER OF BLOCKS. SHOULD BE BETWEEN M AND N*M.
C
C     TH     REAL SCALAR (UNCHANGED ON OUTPUT).
C            THRESHOLD VARIANCE FOR DATA VALUES WITHIN A BLOCK.
C
C     IORD   INTEGER SCALAR (UNCHANGED ON OUTPUT).
C            ORDERING PARAMETER.
C
C            IORD = 0   CASES AND VARIABLES ARE UNCONSTRAINED
C            IORD = 1   CONSTRAIN CASES
C            IORD = 2   CONSTRAIN VARIABLES
C            IORD = 3   CASES AND VARIABLES ARE CONSTRAINED
C
C     DMIWRK INTEGER SCALAR (UNCHANGED ON OUTPUT).
C            THE LEADING DIMENSION OF MATRIX IWORK. MUST BE AT LEAST 4.
C
C     IWORK  INTEGER MATRIX WHOSE FIRST DIMENSION MUST BE DMIWRK AND
C            SECOND DIMENSION MUST BE AT LEAST KD.
C            WORK MATRIX.
C
C     DMWORK INTEGER SCALAR (UNCHANGED ON OUTPUT).
C            THE LEADING DIMENSION OF MATRIX WORK. MUST BE AT LEAST 18.
C
C     WORK   REAL MATRIX WHOSE FIRST DIMENSION MUST BE DMWORK AND SECOND
C            DIMENSION MUST BE AT LEAST MAX(M,N).
C            WORK MATRIX.
C
C     OUNIT  INTEGER SCALAR (UNCHANGED ON OUTPUT).
C            UNIT NUMBER FOR OUTPUT.
C
C     OUTPUT PARAMETER
C     ----------------
C
C     IERR   INTEGER SCALAR.
C            ERROR FLAG.
C
C            IERR = 0, NO ERRORS WERE DETECTED DURING EXECUTION.
C
C            IERR = 1, THE NUMBER OF BLOCKS NEEDED WAS LARGER THAN
C                      THE NUMBER OF BLOCKS ALLOCATED. EXECUTION IS
C                      TERMINATED. INCREASE KD.
C
C            IERR = 2, EITHER THE FIRST AND LAST CASES OR THE
C                      CLUSTER DIAMETER FOR A CLUSTER IS OUT OF
C                      BOUNDS. THE CLUSTER AND ITS BOUNDARIES ARE
C                      PRINTED ON UNIT OUNIT. EXECUTION WILL CONTINUE
C                      WITH QUESTIONABLE RESULTS FOR THAT CLUSTER.
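C
C     EXAMPLE
C     -------
C
C     A MINIMAL CALLING SKETCH, NOT PART OF THE ORIGINAL ROUTINE. THE
C     PROGRAM NAME, DATA VALUES, LABELS, THRESHOLD, AND UNIT NUMBER
C     ARE ILLUSTRATIVE ASSUMPTIONS. IWORK IS GIVEN KD COLUMNS ON THE
C     ASSUMPTION THAT ITS SECOND DIMENSION FOLLOWS THE MAXIMUM NUMBER
C     OF BLOCKS. THE STATEMENTS ARE SHOWN AS COMMENT TEXT; REMOVING
C     THE FIRST SIX CHARACTERS OF EACH STATEMENT LINE RESTORES THE
C     FIXED-FORM COLUMNS.
C
C     DECLARATIONS FOR 3 CASES AND 4 VARIABLES. WORK HAS MAX(M,N) = 4
C     COLUMNS AND KD = 12 LIES BETWEEN M AND N*M.
C
C           PROGRAM DEMO
C           INTEGER MM, M, N, KD, DMIWRK, DMWORK
C           PARAMETER (MM = 3, M = 3, N = 4, KD = 12)
C           PARAMETER (DMIWRK = 4, DMWORK = 18)
C           REAL A(MM,N), WORK(DMWORK,4), TH
C           INTEGER IWORK(DMIWRK,KD), IORD, IERR, OUNIT
C           CHARACTER*4 CLAB(N), RLAB(M)
C           CHARACTER*10 TITLE
C
C     DATA ARE STORED COLUMN BY COLUMN, SO EACH ROW OF VALUES BELOW
C     HOLDS ONE VARIABLE FOR CASES 1 TO 3. THE VALUE 99999. MARKS A
C     MISSING ENTRY.
C
C           DATA A /1.0, 1.2, 5.0,
C          *        2.0, 2.1, 6.0,
C          *        0.5, 0.4, 3.0,
C          *        4.0, 99999., 9.0/
C           DATA CLAB /'VAR1', 'VAR2', 'VAR3', 'VAR4'/
C           DATA RLAB /'OBS1', 'OBS2', 'OBS3'/
C
C     IORD = 0 LEAVES CASES AND VARIABLES UNCONSTRAINED. UNIT 6 IS
C     ASSUMED TO BE CONNECTED TO STANDARD OUTPUT.
C
C           TITLE = 'DEMO DATA '
C           TH = 0.5
C           IORD = 0
C           OUNIT = 6
C           CALL SPLIT2(MM, M, N, A, CLAB, RLAB, TITLE, KD, TH, IORD,
C          *            DMIWRK, IWORK, DMWORK, WORK, IERR, OUNIT)
C           IF (IERR .NE. 0) WRITE (OUNIT,*) 'SPLIT2 IERR = ', IERR
C           END
C
C     A, CLAB, AND RLAB ARE CHANGED ON OUTPUT, SO PASS COPIES IF THE
C     ORIGINAL VALUES ARE NEEDED AFTER THE CALL.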
C
C     REFERENCES
C     ----------
C
C     HARTIGAN, J. A. (1972). "DIRECT CLUSTERING OF A DATA MATRIX."
C        JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION. VOL. 67,
C        PAGES 123-129.
C
C     HARTIGAN, J. A. (1975). CLUSTERING ALGORITHMS, JOHN WILEY &
C        SONS, INC., NEW YORK. PAGES 251-277.
C
C     HARTIGAN, J. A. (1975). "PRINTER GRAPHICS FOR CLUSTERING."
C        JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION. VOL. 4,
C        PAGES 187-213.
C
C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
C