program cvt_basis_main ! !******************************************************************************* ! !! CVT_BASIS_MAIN seeks to minimize the cost of clustering data. ! ! ! Discussion: ! ! What we really want to do is take a thousand points and put ! them into 8 or 9 clusters. Each cluster will be represented ! by a special generator point (which need not be one of the original ! data points). The cost of a particular clustering is the ! sum of the squares of the distances of each data point to its ! generator point. ! ! The goal of this particular application is then to use the ! generator points as a good basis for representing the dynamics ! of the system that generated the original data. ! ! Or, in other words, we're going to solve a PDE, save a bunch ! of solutions, pick out a few representative ones, and assume ! that the most interesting behavior of solutions to the PDE ! is incorporated in the representatives. ! ! ! The data to be examined is assumed to be stored in a file. ! ! The file is assumed to contain a number of records, with each ! record stored on its own line. ! ! Each record, in turn, contains a fixed number of data values ! that describe a particular gene expression experiment. ! ! Each record will be regarded as a point in N dimensional space. ! ! The program will try to cluster the data, that is, to organize ! the data by defining a number of cluster centers, which are ! also points in N dimensional space, and assigning each record ! to the cluster associated with a particular center. ! ! The method of assigning data aims to minimize the cluster energy, ! which is taken to be the sum of the squares of the distances of ! each data point from its cluster center. ! ! In some contexts, it makes sense to use the usual Euclidean sort ! of distance. In others, it may make more sense to replace each ! data record by a normalized version, and to assign distance ! by computing angles between the unit vectors. ! ! Diary: ! ! 22 June 2002: Lee says drop double cell problem, add INOUT ! (same form as very first problem, but now more correct(??), and ! CAVITY. ! ! 21 June 2002: Corrected allocation of CLUSTER from CLUSTER_HI ! to POINT_NUM. ! ! Added run type 9 for Lee's 600 solutions of "double" cell problem. ! ! 20 June 2002: Lee requested double precision calculations. ! Strangely enough, printing out the generators with the D25.15 ! format seemed to cause some data to print as 0.xxxx-313 instead of ! 0.xxxxE-313. I tried to fix this. ! ! 20 June 2002: Lee has 500 solutions for a new region, the T-Cell. ! Preprocessing is to subtract 5/3 of steady from 1-250, and ! 1/3 of steady from 251 to 500. Also, normalize the result. ! This will be Run type 6. ! ! 19 June 2002: Max requests a run in which data, after preprocessing, ! is normalized, that is, simply divided by its length. OK. ! ! 19 June 2002: Lee requests no "#" comment records in output files. OK. ! ! 24 April 2002: Lee says I should have subtracted STEADY.DAT, not ! UP000.DAT. OK, redo. ! ! 23 April 2002: Got new data from Lee, reran Run #3 and sent results. ! There are no singleton clusters anymore, and the clusters actually ! comprise time segments. ! ! 19 April 2002: Lee complains that some of the generators do not ! have zero boundary conditions as they should. I believe that if ! all points have zero boundary conditions, then their generators ! will as well. So now I have to go back and figure out: ! * are these generators meaningless or discountable for some reason? ! * did he give me the wrong ranges or coefficients, so that the ! preprocessing I did did not fix the BC? ! * or is there something else going on. ! ! ...I discovered that the bad generators are #4 and #6, and that ! these both correspond to clusters of 1 point, and that these ! are data points #100 and #1 respectively. Highly nonrandom values. ! I asked Lee to ponder this. ! ! ...I discovered I was accidentally NOT preprocessing data point #1. ! I'll rerun, but why #100 wasn't getting fixed I still don't know. ! ! 18 April 2002: I still can't locate the problem. I found a version ! of the code that ran on March 30. I recompiled it and ran it, ! and it runs fine. I note that the problem seems to be occurring ! in HMEANS, and in particular in the first step, identification ! of the nearest cluster. Let me see if I can fix that. ! ! ...Looks like I forgot to allocate CLUSTER, which I just moved ! into the main program for some good reason. Let's see. ! ! Another problem may be the treatment of null clusters. I've ! added the variable NULL_CLUSTER_POLICY, currently set to 0, ! which means just ignore null clusters. ! ! ...FINALLY, things got better. The code was running OVERNIGHT ! and not getting results. Now it runs in 5 minutes. ! ! I had to be extra careful about null clusters - stupid junk things! ! Now, under policy 0, I do not adjust the centers of null clusters ! after detecting them, and I just make sure I divide the coordinates ! by max ( cluster_pop, 1 ) to avoid division by 0 problems. Also, ! I am starting to think about policy 1 (replace center by a data point) ! and 2 (replace center by a random point in convex hull of data), ! but after this SECOND miserable experience with trying to avoid ! zero clusters, I say just run more damn initial random configurations ! and lump it. ! ! ! 17 April 2002: Trying to reason out the normalized data code. ! The KMEANS algorithm has to change a lot. ! ! ! 16 April 2002: For Lee's Run #3, the code is running forever. ! I'm worried that the code to avoid empty clusters is the cause ! of this. I just killed a run that had taken more than 6 hours ! of time without result yet. I will turn off zero-cluster avoidance, ! and increase from 15 to 30 random starts, and I bet it runs ! in less than half an hour. The zero cluster business may have ! been exacerbated the simple normalization that Lee prescribed, ! subtracting a multiple of the steady state, which probably brings ! the solution data closer together. ! ! ! 15 April 2002: Modified code by inserting DATA_SIZE and DATA_D2_READ, ! which allow blanks and # comments in data file. This is because ! I'm starting to have multiple data files, and I would like internal ! comments. ! ! 22 March 2002: Returning to code after having worked on KMEANS ! algorithms. HMEANS is the generic name for the cluster/average ! method and KMEANS for the approach which considers moving each ! point to every cluster. I'm sticking with the RAW approach for now. ! ! 06 March 2002: With the current CVT algorithm, Mr Lee's data ! finds a local minimum most of the time, with an energy of about ! 180, and a much better minimum occasionally, with an energy of 63. ! I would estimate 90% of the solutions are the local minimum. ! Whether or not this is wise, I have spent a lot of time trying ! to think of improvements to the algorithm. I have hunted up ! two Applied Statistics algorithms, and looked at HMEANS/KMEANS ! from the Melendez book, and tried to think of better initial ! guesses for the cluster centers. Progress has been uncertain and ! slow. ! ! 28 February 2002: Max says the CVT iteration should converge. ! Rather than computing 50 random initial configurations to ! 30 iterations, try running one configuration to convergence. ! Interesting things: ! * about how many steps does this take? ! * Are we sure that pretty much any initial configuration will drive ! down to the same energy? ! Also, send Mr Lee the first set of generators, so he can get ! an idea of what's coming. ! ! Modified: ! ! 20 June 2002 ! ! Author: ! ! John Burkardt ! implicit none ! double precision a_dot_p integer, allocatable, dimension ( : ) :: cluster double precision, allocatable, dimension (:,:) :: cluster_center integer, parameter :: cluster_max = 10 integer cluster_hi integer cluster_it_max integer cluster_lo integer cluster_num integer cluster_range double precision, allocatable, dimension ( : ) :: cluster_size double precision, allocatable, dimension ( : ) :: cluster_size_inv character ( len = 100 ) data_file integer dim_num integer distance_type double precision, allocatable, dimension ( : ) :: energy integer energy_it_max logical file_exist character ( len = 100 ) gen_file integer gen_unit integer i integer ierror integer inunit integer j integer node_num double precision norm integer normal integer, parameter :: null_cluster_policy = 0 double precision, allocatable, dimension ( :, :) :: point integer point_num integer point_num2 integer run_type double precision dvec_norm2 character ( len = 11 ) s_of_i integer seed character ( len = 100 ) steady_file double precision steady_max double precision steady_norm integer swap_num integer thin double precision, allocatable, dimension ( : ) :: u double precision, allocatable, dimension ( : ) :: u_steady character ( len = 100 ) uv_file integer uv_file_num character ( len = 100 ) uv0_file double precision, allocatable, dimension ( : ) :: v double precision, allocatable, dimension ( : ) :: v_steady double precision, allocatable, dimension ( : ) :: x double precision x_max double precision x_min character ( len = 100 ) xy_file integer xy_lines integer xy_values double precision, allocatable, dimension ( : ) :: y double precision y_max double precision y_min ! call timestamp ( ) write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'CVT_BASIS' write ( *, '(a)' ) ' Arrange a set of PDE solution data into clusters.' write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' Null cluster policy:' write ( *, '(a)' ) ' 0, do nothing, accept null clusters;' write ( *, '(a)' ) ' 1, reset center to a random data point;' write ( *, '(a)' ) ' 2, reset center to random point in hull;' write ( *, '(a)' ) ' ' write ( *, '(a,i6)' ) ' NULL_CLUSTER_POLICY = ', null_cluster_policy seed = 12345678 call random_initialize ( seed ) ! ! Get the run type ! call i_input ( 'What is the run type?', run_type, ierror ) write ( *, '(a)' ) ' ' write ( *, '(a,i6)' ) ' RUN_TYPE = ', run_type if ( run_type == 1 ) then else if ( run_type == 2 ) then else if ( run_type == 3 ) then write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' For RUN_TYPE = 3, H C Lee has asked that we' write ( *, '(a)' ) ' read in the steady state solution from "STEADY.DAT"' write ( *, '(a)' ) ' and, letting SS be the steady state solution,' write ( *, '(a)' ) ' subtract 1/3 SS from solution 1' write ( *, '(a)' ) ' subtract 5/3 SS from solutions 2 through 201' write ( *, '(a)' ) ' subtract 1/3 SS from solutions 202 through 401.' write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' The clustering process is then carried out' write ( *, '(a)' ) ' on the modified data.' else if ( run_type == 4 ) then write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' For RUN_TYPE = 4, Max Gunzburger wants the same as' write ( *, '(a)' ) ' RUN_TYPE 3, but we want to skip half the data.' write ( *, '(a)' ) ' In other words, we read files 1, 3, 5, ..., 399, 401.' else if ( run_type == 5 ) then write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' For RUN_TYPE = 5, Max Gunzburger wants the same as' write ( *, '(a)' ) ' RUN_TYPE 3, but each solution is processed as in #3' write ( *, '(a)' ) ' AND then normalized as well.' else if ( run_type == 6 ) then write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' For RUN_TYPE = 6,' write ( *, '(a)' ) ' read in the steady state solution from "STEADY.TXT"' write ( *, '(a)' ) ' and, letting SS be the steady state solution,' write ( *, '(a)' ) ' subtract 5/3 SS from solutions 1 through 250' write ( *, '(a)' ) ' subtract 1/3 SS from solutions 251 through 500.' write ( *, '(a)' ) ' We do NOT normalize the result.' else if ( run_type == 7 ) then write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' For RUN_TYPE = 7,' write ( *, '(a)' ) ' read in the steady state solution from "STEADY.TXT"' write ( *, '(a)' ) ' and, letting SS be the steady state solution,' write ( *, '(a)' ) ' subtract 5/3 SS from solutions 1 through 250' write ( *, '(a)' ) ' subtract 1/3 SS from solutions 251 through 500.' write ( *, '(a)' ) ' We NORMALIZE the result.' else if ( run_type == 8 ) then write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' For RUN_TYPE = 8,' write ( *, '(a)' ) ' read in the steady state solution from "STEADY.TXT"' write ( *, '(a)' ) ' and, letting SS be the steady state solution,' write ( *, '(a)' ) ' subtract 5/3 SS from solutions 1 through 250' write ( *, '(a)' ) ' subtract 1/3 SS from solutions 251 through 500.' write ( *, '(a)' ) ' We DROP the odd numbered solutions.' write ( *, '(a)' ) ' We do NOT normalize the result.' end if ! ! Get the XY data file. ! call s_input ( 'What is the name of the XY data file?', xy_file, ierror ) call data_size ( xy_file, xy_lines, xy_values ) write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' The file "' // trim ( xy_file ) // '" contains ' & // trim ( s_of_i ( xy_lines ) ) // ' lines.' ! ! Allocate space for some arrays. ! node_num = xy_lines allocate ( u(1:node_num) ) allocate ( u_steady(1:node_num) ) allocate ( v(1:node_num) ) allocate ( v_steady(1:node_num) ) allocate ( x(1:node_num) ) allocate ( y(1:node_num) ) ! ! Read in X and Y. ! call data_d2_read ( xy_file, node_num, x, y ) ! ! Extract some interesting data. ! x_min = minval ( x(1:node_num) ) x_max = maxval ( x(1:node_num) ) y_min = minval ( y(1:node_num) ) y_max = maxval ( y(1:node_num) ) write ( *, '(a)' ) ' ' write ( *, '(a,g14.6)' ) ' X minimum : ', x_min write ( *, '(a,g14.6)' ) ' X maximum : ', x_max write ( *, '(a,g14.6)' ) ' Y minimum : ', y_min write ( *, '(a,g14.6)' ) ' Y maximum : ', y_max ! ! Get the steady state file name. ! call s_input ( 'What is the name of the steady state file?', steady_file, & ierror ) ! ! Read in the steady state solution. ! call data_d2_read ( steady_file, node_num, u_steady, v_steady ) steady_max = & maxval ( sqrt ( u_steady(1:node_num)**2 + v_steady(1:node_num)**2 ) ) steady_norm = & sqrt ( sum ( u_steady(1:node_num)**2 + v_steady(1:node_num)**2 ) ) write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' Steady state information was read from' write ( *, '(a)' ) ' the file "' // trim ( steady_file ) // '".' write ( *, '(a)' ) ' ' write ( *, '(a,g14.6)' ) ' Steady max norm = ', steady_max write ( *, '(a,g14.6)' ) ' Steady l2 norm = ', steady_norm ! ! Get the UV0 file name. ! call s_input ( 'What is the name of the first solution file?', uv0_file, & ierror ) ! ! Presumably, all the solution files have the same name as the first ! solution file, but with a numerical increment. To begin with, simply count ! the number of files. ! uv_file = uv0_file uv_file_num = 0 do if ( .not. file_exist ( uv_file ) ) then exit end if uv_file_num = uv_file_num + 1 call file_name_inc ( uv_file ) end do if ( uv_file_num == 0 ) then write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'CVT_BASIS - Fatal error!' write ( *, '(a)' ) ' There do not seem to be any solution files;' write ( *, '(a)' ) ' that is, files whose names are "incremented"' write ( *, '(a)' ) ' versions of the steady state file name.' write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' The first file we looked for was "' // & trim ( uv_file ) // '".' stop end if write ( *, '(a)' ) ' ' write ( *, '(a,i6)' ) & 'We believe the number of solution files is ', uv_file_num ! ! Now we have enough information to set up a data structure. ! ! Determine the spatial dimension (columns) and number of points (rows). ! dim_num = 2 * node_num point_num = uv_file_num write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' The data is stored in an M by N matrix.' write ( *, '(a)' ) ' ' write ( *, '(a,i6)' ) ' The "spatial" dimension M is ', dim_num write ( *, '(a,i6)' ) ' The number of data points N is ', point_num ! ! Allocate space for the POINT array. ! allocate ( point(1:dim_num,1:point_num) ) ! ! Now read the data from the individual files, process it if necessary, ! and gather it into a single array called POINT. ! uv_file = uv0_file do j = 1, point_num call data_d2_read ( uv_file, node_num, u, v ) ! ! For RUN_TYPE = 3 (or 4 or 5), H C Lee wants to ! subtract 1/3 of SS from solution 1. ! subtract 5/3 of SS from solutions 2-201, ! subtract 1/3 of SS from solutions 202-401. ! if ( run_type == 3 .or. run_type == 4 .or. run_type == 5 ) then if ( 1 <= j .and. j <= 1 ) then u(1:node_num) = u(1:node_num) - u_steady(1:node_num) / 3.0D+00 v(1:node_num) = v(1:node_num) - v_steady(1:node_num) / 3.0D+00 else if ( 2 <= j .and. j <= 201 ) then u(1:node_num) = u(1:node_num) - 5.0D+00 * u_steady(1:node_num) / 3.0D+00 v(1:node_num) = v(1:node_num) - 5.0D+00 * v_steady(1:node_num) / 3.0D+00 else if ( 202 <= j .and. j <= 401 ) then u(1:node_num) = u(1:node_num) - u_steady(1:node_num) / 3.0D+00 v(1:node_num) = v(1:node_num) - v_steady(1:node_num) / 3.0D+00 end if ! ! For RUN_TYPE = 6 or 7 or 8, H C Lee wants to ! subtract 5/3 of SS from solutions 1-250, ! subtract 1/3 of SS from solutions 251-500. ! else if ( run_type == 6 .or. run_type == 7 .or. run_type == 8 ) then if ( 1 <= j .and. j <= 250 ) then u(1:node_num) = u(1:node_num) - 5.0D+00 * u_steady(1:node_num) / 3.0D+00 v(1:node_num) = v(1:node_num) - 5.0D+00 * v_steady(1:node_num) / 3.0D+00 else if ( 251 <= j .and. j <= 500 ) then u(1:node_num) = u(1:node_num) - u_steady(1:node_num) / 3.0D+00 v(1:node_num) = v(1:node_num) - v_steady(1:node_num) / 3.0D+00 end if end if point(1:2*node_num-1:2,j) = u(1:node_num) point(2:2*node_num :2,j) = v(1:node_num) ! ! 19 June 2002: At Max's request, normalize each vector after it's ! been processed. ! if ( run_type == 5 .or. run_type == 7 ) then norm = sqrt ( sum ( point(1:2*node_num,j)**2 ) ) if ( norm > 0.0D+00 ) then point(1:2*node_num,j) = point(1:2*node_num,j) / norm end if end if call file_name_inc ( uv_file ) end do write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'All the data has been read into POINT.' ! ! For RUN_TYPE = 4, thin the data by dropping EVEN values. ! if ( run_type == 4 ) then thin = 2 point_num2 = 1 + ( point_num - 1 ) / thin point(1:2*node_num,1:point_num2) = point(1:2*node_num,1:point_num:thin) write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' RUN_TYPE = 4:' write ( *, '(a)' ) ' Thin out the input data points.' write ( *, '(a)' ) ' ' write ( *, '(a,i6)' ) ' Thinning increment is ', thin write ( *, '(a,i6)' ) ' Original input data size is ', point_num write ( *, '(a,i6)' ) ' Thinned data size is ', point_num2 point_num = point_num2 ! ! For RUN_TYPE = 8, thin the data by dropping ODD values. ! else if ( run_type == 8 ) then thin = 2 point_num2 = point_num / thin point(1:2*node_num,1:point_num2) = point(1:2*node_num,2:point_num:thin) write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' RUN_TYPE = 8:' write ( *, '(a)' ) ' Thin out the input data points.' write ( *, '(a)' ) ' ' write ( *, '(a,i6)' ) ' Thinning increment is ', thin write ( *, '(a,i6)' ) ' Original input data size is ', point_num write ( *, '(a,i6)' ) ' Thinned data size is ', point_num2 point_num = point_num2 end if ! ! Get the range of cluster sizes to check. ! call i_range_input ( 'Enter lower and upper number of clusters', & cluster_lo, cluster_hi, ierror ) if ( cluster_hi > point_num ) then write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' CLUSTER_HI exceeds number of points.' cluster_hi = point_num end if write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' The program will try to determine the minimum energy' write ( *, '(a)' ) ' of a clustering, for cluster sizes in the range:' write ( *, '(2x,2i6)' ) cluster_lo, cluster_hi allocate ( cluster(1:point_num) ) allocate ( cluster_center(1:dim_num,1:cluster_hi) ) allocate ( energy(1:cluster_hi) ) ! ! Get the number of different random starting cluster configurations. ! call i_input ( & 'Enter the number of different random cluster configurations to check', & cluster_it_max, ierror ) write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' For each number of clusters, the number of' write ( *, '(a)' ) ' distinct initial random configurations to be checked' write ( *, '(a,i6)' ) ' will be ', cluster_it_max ! ! Get the number of energy iterations for a particular random start: ! call i_input ( 'Enter the number of energy iterations', & energy_it_max, ierror ) write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' For each initial random configuration, the number of' write ( *, '(a)' ) ' times the program will recompute the cluster centers,' write ( *, '(a,i6)' ) ' cluster components, and energy is ', energy_it_max ! ! Get the normalization option: ! call i_input ( 'Enter 0 to use raw data, 1 to use normalized data.', & normal, ierror ) ! !---------------------------------------------------------------------------- ! EITHER: ! ! A) Compute the CVT of the raw data ! ! OR ! ! B) Compute the CVT of normalized data ! !---------------------------------------------------------------------------- ! if ( normal == 0 ) then write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'NORMAL = 0' write ( *, '(a)' ) ' Data will NOT be normalized.' ! ! Analyze the data. ! call analysis_raw ( dim_num, point_num, point, cluster_lo, & cluster_hi, cluster_it_max, energy_it_max, energy, cluster_center, & null_cluster_policy ) ! ! Preprocess the data. ! In particular, we plan to consider linear combinations of solutions, ! with the steady state solution being our fundamental solution. ! !---------------------------------------------------------------------------- ! ! THOUGHTS: ! ! Now that I think about it, shouldn't column 1 be completely 0? ! Does the steady state solution have any special status? ! It does if I'm subtracting it from all the others. ! It may deserve that status, particularly from physical reasons. ! Then I'm doing a Voronoi tesselation of some affine space. ! But in that case, I don't want to keep the steady state solution ! in column 1, but rather, all 0's. ! ! So I think I've convinced myself that what I am computing are ! generators for the affine space of perturbations from the steady state. ! ! I save a copy of UV0 separately. ! I include 0 as a data point in column 1. ! ! In fact, now I think I need to constrain the code to always include ! 0 as a generator. ! !---------------------------------------------------------------------------- ! else if ( normal == 1 ) then write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'NORMAL = 1' write ( *, '(a)' ) ' Data WILL be normalized.' ! ! Normalize the steady state solution (column 1). ! call dvec_unit_euclidean ( dim_num, point(1:dim_num,1) ) ! ! Subtract the projection of the normalized steady state solution ! from the later solutions, and normalize them. ! do j = 2, point_num a_dot_p = dot_product ( point(1:dim_num,j), point(1:dim_num,1) ) point(1:dim_num,j) = point(1:dim_num,j) - a_dot_p * point(1:dim_num,1) call dvec_unit_euclidean ( dim_num, point(1:dim_num,j) ) end do ! ! Now zap column 1. ! point(1:dim_num,1) = 0.0D+00 ! ! Report any zero columns. ! write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'Zero columns:' write ( *, '(a)' ) '(Expecting only column 1)' write ( *, '(a)' ) ' ' do j = 1, point_num if ( dvec_norm2 ( dim_num, point(1:dim_num,j) ) == 0.0D+00 ) then write ( *, '(i6)' ) j end if end do ! ! Analyze the data. ! call analysis_normal ( dim_num, point_num, point, cluster_lo, & cluster_hi, cluster_it_max, energy_it_max, energy, cluster_center, & null_cluster_policy ) ! ! Bad value of NORMAL. ! else write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'CVT_BASIS - Fatal error!' write ( *, '(a,i6)' ) ' Bad value of NORMAL = ', normal stop end if ! ! Multiple cluster values, presumably we want an energy plot. ! if ( cluster_lo < cluster_hi ) then allocate ( cluster_size(1:cluster_hi) ) allocate ( cluster_size_inv(1:cluster_hi) ) do i = 1, cluster_hi cluster_size(i) = dble ( i ) end do cluster_size_inv(1:cluster_hi) = 1.0D+00 / cluster_size(1:cluster_hi) cluster_range = cluster_hi + 1 - cluster_lo call data_to_gnuplot ( 'raw.txt', cluster_range, & cluster_size(cluster_lo:cluster_hi), energy(cluster_lo:cluster_hi) ) call data_to_gnuplot ( 'raw2.txt', cluster_range, & cluster_size_inv(cluster_lo:cluster_hi), energy(cluster_lo:cluster_hi) ) deallocate ( cluster_size ) deallocate ( cluster_size_inv ) ! ! If there was only one cluster value, presumably we want to ! * print out the population of each cluster; ! * write out the generators. ! else cluster_num = cluster_hi if ( normal == 0 ) then cluster(1:point_num) = 1 call nearest_cluster_raw ( dim_num, point_num, cluster_num, & cluster_center, point, cluster, swap_num ) else ! call nearest_cluster_normal ( dim_num, point_num, cluster_num, & ! cluster_center, point, cluster, swap_num ) end if call cluster_census ( dim_num, point_num, cluster_num, cluster_center, & point, cluster ) if ( .false. ) then call cluster_list ( point_num, cluster ) end if write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'CVT_BASIS:' write ( *, '(a)' ) ' Writing cluster generators to individual files.' write ( *, '(a)' ) ' ' gen_file = 'gen_000.txt' call get_unit ( gen_unit ) do j = 1, cluster_hi call file_name_inc ( gen_file ) write ( *, '(a)' ) ' Write file ' // trim ( gen_file ) u(1:node_num) = cluster_center(1:2*node_num-1:2,j) v(1:node_num) = cluster_center(2:2*node_num :2,j) call data_d2_write ( gen_file, node_num, u, v ) end do end if deallocate ( cluster ) deallocate ( cluster_center ) deallocate ( energy ) deallocate ( point ) deallocate ( u ) deallocate ( u_steady ) deallocate ( v ) deallocate ( v_steady ) deallocate ( x ) deallocate ( y ) write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'CVT_BASIS' write ( *, '(a)' ) ' Normal end of execution.' write ( *, '(a)' ) ' ' call timestamp ( ) stop end subroutine analysis_normal ( dim_num, point_num, point, cluster_lo, & cluster_hi, cluster_it_max, it_max, energy, cluster_center, & null_cluster_policy ) ! !******************************************************************************* ! !! ANALYSIS_NORMAL computes the energy for a range of number of clusters. ! ! ! Discussion: ! ! This version of the analysis routine is for "normalized" data. That is, ! all the solution vectors are replaced by their normalized differences ! with the steady state solution. Thus, our space is essentially ! a sphere, and the distance between two points is now the angle ! between them (in the plane defined by the two points and the ! steady state solution). ! ! Modified: ! ! 17 April 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer NORMAL: ! 0, analyze the raw data. ! 1, analyze the normalized data. ! ! Input, integer DIM_NUM, the number of spatial dimensions. ! ! Input, integer POINT_NUM, the number of data points. ! ! Input, double precision POINT(DIM_NUM,POINT_NUM), the data points. ! ! Input, integer CLUSTER_LO, CLUSTER_HI, the low and high ! cluster numbers that define a range of clusters to check. ! ! Input, integer CLUSTER_IT_MAX, the number of different random ! startup configurations to try. ! ! Input, integer ENERGY_IT_MAX, the maximum number of energy iterations. ! ! Output, double precision ENERGY(CLUSTER_HI), contains in entries CLUSTER_LO ! through CLUSTER_HI the estimated minimum energy over all clusterings ! of this size. ! ! Output, double precision CLUSTER_CENTER(DIM_NUM,CLUSTER_HI), contains the ! generators for the minimum cluster energy calculation, but only ! for the last cluster size considered, namely, CLUSTER_HI. ! ! Input, integer NULL_CLUSTER_POLICY, specifies what to do if a ! null cluster is encountered. ! 0, do nothing. ! 1, reset center of null cluster to a random data point; ! 2, reset center of null cluster to a random point in the data hull; ! implicit none ! integer cluster_hi integer dim_num integer point_num ! integer, dimension ( point_num ) :: cluster double precision, dimension (dim_num,cluster_hi) :: cluster_center double precision, dimension (dim_num,cluster_hi) :: cluster_center_test integer cluster_it integer cluster_it_max integer cluster_lo integer cluster_num double precision, dimension (cluster_hi) :: energy double precision energy_max(cluster_hi) double precision energy_min(cluster_hi) double precision energy_test integer i integer ierror integer it integer it_max integer j integer k double precision norm integer null_cluster_policy double precision, dimension (dim_num,point_num) :: point double precision, dimension (dim_num) :: r_min double precision, dimension (dim_num) :: r_max double precision t ! if ( .false. ) then write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'ANALYSIS_NORMAL:' write ( *, '(a)' ) ' Point data:' write ( *, '(a)' ) ' ' call point_print ( dim_num, point_num, point ) end if ! ! Compute the minimum and maximum component values. ! do i = 1, dim_num r_min(i) = minval ( point(i,1:point_num) ) r_max(i) = maxval ( point(i,1:point_num) ) end do if ( dim_num <= 20 ) then write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' Minimum and maximum data values:' write ( *, '(a)' ) ' ' do i = 1, dim_num write ( *, '(i6,2f10.4)' ) i, r_min(i), r_max(i) end do end if ! ! Consider a range of clusters. ! do cluster_num = cluster_lo, cluster_hi ! ! For each cluster size, try several random starting configurations. ! energy_min(cluster_num) = huge ( energy_min ) energy_max(cluster_num) = 0.0D+00 do cluster_it = 1, cluster_it_max call hmeans_normal ( dim_num, point_num, cluster_num, it_max, it, & point, cluster, cluster_center, energy, null_cluster_policy ) call kmeans_normal ( dim_num, point_num, cluster_num, it_max, it, & point, cluster, cluster_center, energy, null_cluster_policy ) if ( energy_test < energy_min(cluster_num) ) then energy_min(cluster_num) = energy_test cluster_center(1:dim_num,1:cluster_hi) = & cluster_center_test(1:dim_num,1:cluster_hi) end if energy_max(cluster_num) = max ( energy_max(cluster_num), energy_test ) end do energy(cluster_num) = energy_min(cluster_num) end do ! ! Report energy ranges for the various cluster sizes. ! write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'ANALYSIS_NORMAL:' write ( *, '(a)' ) ' Computed energy range for given cluster size:' write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' (The minimum and maximum should be close if' write ( *, '(a)' ) ' we''re taking enough iterations.)' write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' Cluster Minimum Maximum' write ( *, '(a)' ) ' Size Energy Energy' write ( *, '(a)' ) ' ' do cluster_num = cluster_lo, cluster_hi write ( *, '(i9,2f12.4)' ) & cluster_num, energy_min(cluster_num), energy_max(cluster_num) end do ! ! Report best energy for the various cluster sizes. ! write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' Energy table:' write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'Cluster Energy' write ( *, '(a)' ) 'Size Energy /point Sqrt(E/Pt)' write ( *, '(a)' ) ' ' do cluster_num = cluster_lo, cluster_hi t = energy(cluster_num) / dble ( point_num ) write ( *, '(i9,3f12.4)' ) cluster_num, energy(cluster_num), t, sqrt ( t ) end do return end subroutine analysis_raw ( dim_num, point_num, point, cluster_lo, cluster_hi, & cluster_it_max, energy_it_max, energy, cluster_center, null_cluster_policy ) ! !******************************************************************************* ! !! ANALYSIS_RAW computes the energy for a range of number of clusters. ! ! ! Discussion: ! ! This version of the analysis routine is for "raw" data. That is, ! all the solution vectors are treated as points in Euclidean space ! with the usual distance. ! ! Modified: ! ! 17 April 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer NORMAL: ! 0, analyze the raw data. ! 1, analyze the normalized data. ! ! Input, integer DIM_NUM, the number of spatial dimensions. ! ! Input, integer POINT_NUM, the number of data points. ! ! Input, double precision POINT(DIM_NUM,POINT_NUM), the data points. ! ! Input, integer CLUSTER_LO, CLUSTER_HI, the low and high ! cluster numbers that define a range of clusters to check. ! ! Input, integer CLUSTER_IT_MAX, the number of different random ! startup configurations to try. ! ! Input, integer ENERGY_IT_MAX, the maximum number of energy iterations. ! ! Output, double precision ENERGY(CLUSTER_HI), contains in entries CLUSTER_LO ! through CLUSTER_HI the estimated minimum energy over all clusterings ! of this size. ! ! Output, double precision CLUSTER_CENTER(DIM_NUM,CLUSTER_HI), contains the ! generators for the minimum cluster energy calculation, but only ! for the last cluster size considered, namely, CLUSTER_HI. ! ! Input, integer NULL_CLUSTER_POLICY, specifies what to do if a ! null cluster is encountered. ! 0, do nothing. ! 1, reset center of null cluster to a random data point; ! 2, reset center of null cluster to a random point in the data hull; ! implicit none ! integer cluster_hi integer dim_num integer point_num ! integer, dimension ( point_num ) :: cluster double precision, dimension (dim_num,cluster_hi) :: cluster_center double precision, dimension (dim_num,cluster_hi) :: cluster_center_test integer cluster_it integer cluster_it_max integer cluster_lo integer cluster_num logical, parameter :: debug = .true. double precision, dimension (cluster_hi) :: energy integer energy_it_max double precision energy_max(cluster_hi) double precision energy_min(cluster_hi) double precision energy_test integer i integer ierror integer it integer j integer k double precision norm integer null_cluster_policy double precision, dimension (dim_num,point_num) :: point double precision, dimension (dim_num) :: r_min double precision, dimension (dim_num) :: r_max double precision t ! if ( .false. ) then write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'ANALYSIS_RAW:' write ( *, '(a)' ) ' Point data:' write ( *, '(a)' ) ' ' call point_print ( dim_num, point_num, point ) end if ! ! Compute the minimum and maximum component values. ! do i = 1, dim_num r_min(i) = minval ( point(i,1:point_num) ) r_max(i) = maxval ( point(i,1:point_num) ) end do if ( dim_num <= 20 ) then write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' Minimum and maximum data values:' write ( *, '(a)' ) ' ' do i = 1, dim_num write ( *, '(i6,2f10.4)' ) i, r_min(i), r_max(i) end do end if ! ! Consider a range of clusters. ! do cluster_num = cluster_lo, cluster_hi write ( *, '(a)' ) ' ' write ( *, '(a,i6)' ) 'Number of clusters allowed: ', cluster_num ! ! For each cluster size, try several random starting configurations. ! energy_min(cluster_num) = huge ( energy_min ) energy_max(cluster_num) = 0.0D+00 do cluster_it = 1, cluster_it_max write ( *, '(a)' ) ' ' write ( *, '(i6)' ) cluster_it call cluster_initialize_raw ( dim_num, point_num, cluster_num, point, & cluster, cluster_center_test, energy_test ) it = 0 write ( *, '(a,g14.6,i6)' ) 'Initial_RAW ', energy_test, it call hmeans_raw ( dim_num, point_num, cluster_num, energy_it_max, & it, point, cluster, cluster_center_test, energy_test, & null_cluster_policy ) write ( *, '(a,g14.6,i6)' ) 'HMEANS_RAW ', energy_test, it call kmeans_raw ( dim_num, point_num, cluster_num, energy_it_max, & it, point, cluster, cluster_center_test, energy_test, & null_cluster_policy ) write ( *, '(a,g14.6,i6)' ) 'KMEANS_RAW ', energy_test, it if ( energy_test < energy_min(cluster_num) ) then energy_min(cluster_num) = energy_test cluster_center(1:dim_num,1:cluster_hi) = & cluster_center_test(1:dim_num,1:cluster_hi) end if energy_max(cluster_num) = max ( energy_max(cluster_num), energy_test ) end do energy(cluster_num) = energy_min(cluster_num) end do ! ! Report energy ranges for the various cluster sizes. ! write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'ANALYSIS_RAW:' write ( *, '(a)' ) ' Computed energy range for given cluster size:' write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' (The minimum and maximum should be close if' write ( *, '(a)' ) ' we''re taking enough iterations.)' write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' Cluster Minimum Maximum' write ( *, '(a)' ) ' Size Energy Energy' write ( *, '(a)' ) ' ' do cluster_num = cluster_lo, cluster_hi write ( *, '(i9,2f14.4)' ) & cluster_num, energy_min(cluster_num), energy_max(cluster_num) end do ! ! Report best energy for the various cluster sizes. ! write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' Energy table:' write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'Cluster Energy' write ( *, '(a)' ) 'Size Energy /point Sqrt(E/Pt)' write ( *, '(a)' ) ' ' do cluster_num = cluster_lo, cluster_hi t = energy(cluster_num) / dble ( point_num ) write ( *, '(i9,3f14.4)' ) cluster_num, energy(cluster_num), t, sqrt ( t ) end do return end subroutine ch_cap ( c ) ! !******************************************************************************* ! !! CH_CAP capitalizes a single character. ! ! ! Modified: ! ! 19 July 1998 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input/output, character C, the character to capitalize. ! implicit none ! character c integer itemp ! itemp = ichar ( c ) if ( 97 <= itemp .and. itemp <= 122 ) then c = char ( itemp - 32 ) end if return end function ch_eqi ( c1, c2 ) ! !******************************************************************************* ! !! CH_EQI is a case insensitive comparison of two characters for equality. ! ! ! Examples: ! ! CH_EQI ( 'A', 'a' ) is .TRUE. ! ! Modified: ! ! 28 July 2000 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, character C1, C2, the characters to compare. ! ! Output, logical CH_EQI, the result of the comparison. ! implicit none ! character c1 character c1_cap character c2 character c2_cap logical ch_eqi ! c1_cap = c1 c2_cap = c2 call ch_cap ( c1_cap ) call ch_cap ( c2_cap ) if ( c1_cap == c2_cap ) then ch_eqi = .true. else ch_eqi = .false. end if return end function ch_is_digit ( c ) ! !******************************************************************************* ! !! CH_IS_DIGIT returns .TRUE. if a character is a decimal digit. ! ! ! Modified: ! ! 09 August 1999 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, character C, the character to be analyzed. ! ! Output, logical CH_IS_DIGIT, .TRUE. if C is a digit, .FALSE. otherwise. ! implicit none ! character c logical ch_is_digit ! if ( lge ( c, '0' ) .and. lle ( c, '9' ) ) then ch_is_digit = .true. else ch_is_digit = .false. end if return end subroutine ch_to_digit ( c, digit ) ! !******************************************************************************* ! !! CH_TO_DIGIT returns the integer value of a base 10 digit. ! ! ! Example: ! ! C DIGIT ! --- ----- ! '0' 0 ! '1' 1 ! ... ... ! '9' 9 ! ' ' 0 ! 'X' -1 ! ! Modified: ! ! 04 August 1999 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, character C, the decimal digit, '0' through '9' or blank ! are legal. ! ! Output, integer DIGIT, the corresponding integer value. If C was ! 'illegal', then DIGIT is -1. ! implicit none ! character c integer digit ! if ( lge ( c, '0' ) .and. lle ( c, '9' ) ) then digit = ichar ( c ) - 48 else if ( c == ' ' ) then digit = 0 else digit = -1 end if return end subroutine cluster_census ( dim_num, point_num, cluster_num, cluster_center, & point, cluster ) ! !******************************************************************************* ! !! CLUSTER_CENSUS computes and prints the population of each cluster. ! ! ! Modified: ! ! 24 April 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer DIM_NUM, the number of spatial dimensions. ! ! Input, integer POINT_NUM, the number of data points. ! ! Input, integer CLUSTER_NUM, the number of clusters. ! ! Input, double precision CLUSTER_CENTER(DIM_NUM,CLUSTER_NUM), the cluster ! generators. ! ! Input, double precision POINT(DIM_NUM,POINT_NUM), the data points. ! ! Input, integer CLUSTER(POINT_NUM), the cluster to which each ! point is assigned. ! implicit none ! integer cluster_num integer dim_num integer point_num ! integer cluster(point_num) double precision, dimension (dim_num,cluster_num) :: cluster_center double precision, dimension ( cluster_num ) :: cluster_energy integer, dimension ( cluster_num ) :: cluster_max integer, dimension ( cluster_num ) :: cluster_min integer, dimension ( cluster_num ) :: cluster_population integer i integer j integer percent1 integer percent2 double precision, dimension (dim_num,point_num) :: point integer swap_num double precision total_energy ! call nearest_cluster_raw ( dim_num, point_num, cluster_num, & cluster_center, point, cluster, swap_num ) cluster_energy(1:cluster_num) = 0.0D+00 cluster_population(1:cluster_num) = 0 do i = 1, point_num j = cluster(i) cluster_population(j) = cluster_population(j) + 1 cluster_energy(j) = cluster_energy(j) & + sum ( ( cluster_center(1:dim_num,j) - point(1:dim_num,i) )**2 ) end do total_energy = sum ( cluster_energy(1:cluster_num) ) cluster_min(1:cluster_num) = point_num + 1 cluster_max(1:cluster_num) = 0 do i = 1, point_num j = cluster(i) cluster_min(j) = min ( cluster_min(j), i ) cluster_max(j) = max ( cluster_max(j), i ) end do write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'CLUSTER_CENSUS' write ( *, '(a)' ) ' Individual cluster population and energy' write ( *, '(a)' ) ' ' write ( *, '(a)' ) & ' Index Population Percentage Energy Percentage Min Max' write ( *, '(a)' ) ' ' do i = 1, cluster_num percent1 = int ( dble ( 100 * cluster_population(i) ) / dble ( point_num ) ) percent2 = int ( 100.0D+00 * cluster_energy(i) ) / total_energy write ( *, '(1x,i6,8x,i6,10x,i3,g14.6,4x,i3,2i5)' ) i, cluster_population(i), & percent1, cluster_energy(i), percent2, cluster_min(i), cluster_max(i) end do percent1 = 100 percent2 = 100 write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' ------ --- ------------ ---' write ( *, '(a)' ) ' ' write ( *, '(2x,a5,8x,i6,10x,i3,g14.6,4x,i3,2i5)' ) 'Total', & sum ( cluster_population(1:cluster_num) ), & percent1, sum ( cluster_energy(1:cluster_num) ), percent2, 1, point_num return end subroutine cluster_initialize_raw ( dim_num, point_num, cluster_num, point, & cluster, cluster_center, energy ) ! !******************************************************************************* ! !! CLUSTER_INITIALIZE_RAW initializes the cluster centers to random values. ! ! ! Discussion: ! ! In this case, each cluster center is a random convex combination ! of the data points. ! ! Modified: ! ! 22 March 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer DIM_NUM, the number of spatial dimensions. ! ! Input, integer POINT_NUM, the number of points. ! ! Input, integer CLUSTER_NUM, the number of clusters. ! ! Input, double precision POINT(DIM_NUM,POINT_NUM), the coordinates of ! the points. ! ! Output, integer CLUSTER(POINT_NUM), the clusters to which points ! are assigned. ! ! Output, double precision CLUSTER_CENTER(DIM_NUM,CLUSTER_NUM), ! the coordinates of the cluster centers. ! ! Output, double precision ENERGY, the energy of the clustering. ! implicit none ! integer cluster_num integer dim_num integer point_num ! integer cluster(point_num) double precision cluster_center(dim_num,cluster_num) double precision column_sum double precision energy double precision factor(point_num,cluster_num) integer i integer j double precision point(dim_num,point_num) integer swap_num ! ! Get a PxC block of random factors. ! call random_number ( harvest = factor(1:point_num,1:cluster_num) ) ! ! Make each column of factors have unit sum. ! do j = 1, cluster_num column_sum = sum ( factor(1:point_num,j) ) factor(1:point_num,j) = factor(1:point_num,j) / column_sum end do ! ! Set centers = points * factors. ! cluster_center(1:dim_num,1:cluster_num) = & matmul ( point(1:dim_num,1:point_num), factor(1:point_num,1:cluster_num) ) ! ! Assign points to the nearest centers. ! call nearest_cluster_raw ( dim_num, point_num, cluster_num, & cluster_center, point, cluster, swap_num ) ! ! Determine the energy of the clustering. ! call energy_raw ( dim_num, point_num, cluster_num, point, & cluster_center, cluster, energy ) return end subroutine cluster_list ( point_num, cluster ) ! !******************************************************************************* ! !! CLUSTER_LIST prints out the assignments. ! ! ! Modified: ! ! 19 April 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer POINT_NUM, the number of data points. ! ! Input, integer CLUSTER(POINT_NUM), the cluster to which each ! point is assigned. ! implicit none ! integer point_num ! integer cluster(point_num) integer i ! write ( *, '(a)' ) ' ' write ( *, '(a)' ) ' I Cluster(I)' write ( *, '(a)' ) ' ' do i = 1, point_num write ( *, '(i4,i8x,i4)' ) i, cluster(i) end do return end subroutine data_d2_read ( file_name, n, x, y ) ! !******************************************************************************* ! !! DATA_D2_READ reads a data set of pairs of double precision numbers stored in a file. ! ! ! Discussion: ! ! The data set can be thought of as a double precision M by 2 array. ! ! Each row of the array corresponds to one data "item". ! ! The data is stored in a file, one row (pair of values) at a time. ! ! Each row (pair of values) begins on a new line. ! ! Blank lines and comment lines (beginning with '#') are ignored. ! ! Individual data values should be separated by spaces or commas. ! ! Modified: ! ! 15 April 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, character ( len = * ) FILE_NAME, the name of the file to read. ! ! Input, integer N, the number of data items. ! ! Output, double precision X(N), Y(N), the data values. ! implicit none ! integer n ! character ( len = * ) file_name integer ierror integer input integer ios integer last integer length character ( len = 80 ) line integer n2 double precision x(n) double precision y(n) ! call get_unit ( input ) open ( unit = input, file = file_name, status = 'old' ) x(1:n) = huge ( x(1) ) y(1:n) = huge ( y(1) ) n2 = 0 do read ( input, '(a)', iostat = ios ) line if ( ios /= 0 ) then exit end if if ( len_trim ( line ) == 0 ) then else if ( line(1:1) == '#' ) then else n2 = n2 + 1 last = 0 call s_to_d ( line(last+1:), x(n2), ierror, length ) if ( ierror /= 0 ) then exit end if last = last + length call s_to_d ( line(last+1:), y(n2), ierror, length ) if ( ierror /= 0 ) then exit end if if ( n2 == n ) then exit end if end if end do close ( unit = input ) return end subroutine data_d2_write ( file_name, m, x1, x2 ) ! !******************************************************************************* ! !! DATA_D2_WRITE writes a data set of pairs of double precision numbers into a file. ! ! ! Discussion: ! ! The data set can be thought of as a double precision M by 2 array. ! ! Each row of the array corresponds to one data "item". ! ! The data is stored in a file, one row (pair of values) at a time. ! ! Each row (pair of values) begins on a new line. ! ! Blank lines and comment lines (beginning with '#') are ignored. ! ! Individual data values should be separated by spaces or commas. ! ! Modified: ! ! 20 June 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, character ( len = * ) FILE_NAME, the name of the file to write. ! ! Input, integer M, the number of data items. ! ! Input, double precision X1(M), X2(M), the data values. ! implicit none ! integer m ! character ( len = * ) file_name integer i integer output double precision u double precision v double precision x1(m) double precision x2(m) ! call get_unit ( output ) open ( unit = output, file = file_name, status = 'replace' ) ! ! These comment lines turned off at Mr Lee's request. ! ! write ( output, '(a)' ) '# ' // trim ( file_name ) ! write ( output, '(a)' ) '#' ! write ( output, '(a)' ) '# Created by DATA_D2_WRITE.' ! write ( output, '(a,i6)' ) '# Number of data records is ', m ! write ( output, '(a)' ) '#' do i = 1, m ! ! Mr Lee requests D25.15 format. ! ! write ( output, '(2g14.6)' ) x1(i), x2(i) if ( abs ( x1(i) ) < 1.0D-10 ) then u = 0.0D+00 else u = x1(i) end if if ( abs ( x2(i) ) < 1.0E-10 ) then v = 0.0D+00 else v = x2(i) end if write ( output, '(2d25.15)' ) u, v end do close ( unit = output ) return end subroutine data_size ( file_name, m, n ) ! !******************************************************************************* ! !! DATA_SIZE counts the size of a data set stored in a file. ! ! ! Discussion: ! ! Blank lines and comment lines (which begin with '#') are ignored). ! ! All other lines are assumed to represent data items. ! ! This routine assumes that each data line contains exactly the ! same number of values, which are separated by spaces. ! ! (This means this routine cannot handle cases where a data item ! extends over more than one line, or cases where data is squeezed ! together with no spaces, or where commas are used as separators, ! but with space on both sides.) ! ! Modified: ! ! 11 April 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, character ( len = * ) FILE_NAME, the name of the file to read. ! ! Output, integer M, the number of nonblank, noncomment lines. ! ! Output, integer N, the number of values per line. ! implicit none ! character ( len = * ) file_name integer input integer ios character ( len = 80 ) line integer m integer n integer n_max integer n_min integer n_word ! m = 0 n_max = - huge ( n_max ) n_min = huge ( n_min ) call get_unit ( input ) open ( unit = input, file = file_name, status = 'old' ) do read ( input, '(a)', iostat = ios ) line if ( ios /= 0 ) then exit end if if ( len_trim ( line ) == 0 ) then else if ( line(1:1) == '#' ) then else m = m + 1 call word_count ( line, n_word ) n_max = max ( n_max, n_word ) n_min = min ( n_min, n_word ) end if end do if ( n_max /= n_min ) then write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'DATA_SIZE - Fatal error!' write ( *, '(a)' ) ' Number of words per line varies.' write ( *, '(a,i6)' ) ' Minimum is ', n_min write ( *, '(a,i6)' ) ' Maximum is ', n_max n = 0 else n = n_min end if close ( unit = input ) return end subroutine data_to_gnuplot ( file_name, n, x, y ) ! !******************************************************************************* ! !! DATA_TO_GNUPLOT writes data to a file suitable for processing by GNUPLOT. ! ! ! Discussion: ! ! Once the data file is written, the GNUPLOT program can be used ! to display the data, using commands like: ! ! set term post default ! set grid ! set yrnage [0:*] ! set title "Number of Clusters vs Total Energy, Normalized Data" ! plot 'file_name' ! quit ! ! Modified: ! ! 12 September 2001 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, character ( len = * ) FILE_NAME, the name of the file ! to which the data is written. ! ! Input, integer N, the number of data values. ! ! Input, double precision X(N), Y(N), the data. ! integer n ! character ( len = * ) file_name integer i integer iunit double precision x(n) double precision y(n) ! call get_unit ( iunit ) open ( unit = iunit, file = file_name, status = 'replace' ) do i = 1, n write ( iunit, '(g16.8,2x,g16.8)' ) x(i), y(i) end do close ( unit = iunit ) return end subroutine digit_inc ( c ) ! !******************************************************************************* ! !! DIGIT_INC increments a decimal digit. ! ! ! Example: ! ! Input Output ! ----- ------ ! '0' '1' ! '1' '2' ! ... ! '8' '9' ! '9' '0' ! 'A' 'A' ! ! Modified: ! ! 04 August 1999 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input/output, character C, a digit to be incremented. ! implicit none ! character c integer digit ! call ch_to_digit ( c, digit ) if ( digit == -1 ) then return end if digit = digit + 1 if ( digit == 10 ) then digit = 0 end if call digit_to_ch ( digit, c ) return end subroutine digit_to_ch ( digit, c ) ! !******************************************************************************* ! !! DIGIT_TO_CH returns the character representation of a decimal digit. ! ! ! Example: ! ! DIGIT C ! ----- --- ! 0 '0' ! 1 '1' ! ... ... ! 9 '9' ! 17 '*' ! ! Modified: ! ! 04 August 1999 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer DIGIT, the digit value between 0 and 9. ! ! Output, character C, the corresponding character, or '*' if DIGIT ! was illegal. ! implicit none ! character c integer digit ! if ( 0 <= digit .and. digit <= 9 ) then c = char ( digit + 48 ) else c = '*' end if return end function distance_normal_sq ( n, x, y ) ! !******************************************************************************* ! !! DISTANCE_NORMAL_SQ computes the distance between normalized vectors. ! ! ! Discussion: ! ! Actually, it computes the SQUARE of the distance. ! ! For "normalized" vectors, we assume that an appropriate multiple ! of the steady state solution has been subtracted, and that what ! remains can be regarded as perturbations from 0 whose direction ! (but not magnitude) is of importance. ! ! The distance between two such vectors, then, is the angle between ! them, or, for our purposes, the absolute value of the sine of the ! angle. (This allows us to identify vectors V and -V, for instance). ! ! Modified: ! ! 16 April 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer N, the dimension of the vectors. ! ! Input, double precision X(N), Y(N), the vectors whose distance is ! to be computed. ! ! Output, double precision DISTANCE_NORMAL_SQ, the square of the ! "distance" between the vectors. ! integer n ! double precision cosine double precision distance_normal_sq double precision x(n) double precision x_norm double precision y(n) double precision y_norm ! x_norm = sqrt ( dot_product ( x(1:n), x(1:n) ) ) y_norm = sqrt ( dot_product ( y(1:n), y(1:n) ) ) cosine = dot_product ( x(1:n), y(1:n) ) / ( x_norm * y_norm ) distance_normal_sq = 1.0D+00 - cosine**2 return end subroutine energy_normal ( dim_num, point_num, cluster_num, point, & cluster_center, cluster, energy ) ! !******************************************************************************* ! !! ENERGY_NORMAL computes the total energy of a given clustering. ! ! ! Discussion: ! ! For "normalized" vectors, we assume that an appropriate multiple ! of the steady state solution has been subtracted, and that what ! remains can be regarded as perturbations from 0 whose direction ! (but not magnitude) is of importance. ! ! The total energy function is the sum of the cluster energies. ! ! The energy of a cluster is the sum of the "distances" of ! each point in the cluster to the center point of the cluster. ! ! Modified: ! ! 16 April 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer DIM_NUM, the number of spatial dimensions. ! ! Input, integer POINT_NUM, the number of data points. ! ! Input, integer CLUSTER_NUM, the number of clusters. ! ! Input, double precision POINT(DIM_NUM,POINT_NUM), the data points. ! ! Input, double precision CLUSTER_CENTER(DIM_NUM,CLUSTER_NUM), the ! center points. ! ! Input, integer CLUSTER(POINT_NUM), indicates the cluster to which ! each data point belongs. ! ! Output, double precision ENERGY, the total energy of the clustering. ! implicit none ! integer cluster_num integer dim_num integer point_num ! integer, dimension ( point_num ) :: cluster double precision, dimension ( dim_num, cluster_num ) :: cluster_center double precision, dimension ( cluster_num ) :: cluster_energy double precision distance_normal_sq double precision energy integer i integer j double precision, dimension ( dim_num, point_num ) :: point ! cluster_energy(1:cluster_num) = 0.0D+00 do i = 1, point_num j = cluster(i) cluster_energy(j) = cluster_energy(j) & + distance_normal_sq ( dim_num, point(1,i), cluster_center(1,j) ) end do energy = sum ( cluster_energy(1:cluster_num) ) return end subroutine energy_raw ( dim_num, point_num, cluster_num, point, & cluster_center, cluster, energy ) ! !******************************************************************************* ! !! ENERGY_RAW computes the total energy of a given clustering. ! ! ! Discussion: ! ! This routine is used with the raw data. No normalization is ! done to the data. ! ! The total energy function is the sum of the cluster energies. ! ! The energy of a cluster is the sum of the squares of the distances of ! each point in the cluster to the center point of the cluster. ! ! Modified: ! ! 28 March 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer DIM_NUM, the number of spatial dimensions. ! ! Input, integer POINT_NUM, the number of data points. ! ! Input, integer CLUSTER_NUM, the number of clusters. ! ! Input, double precision POINT(DIM_NUM,POINT_NUM), the data points. ! ! Input, double precision CLUSTER_CENTER(DIM_NUM,CLUSTER_NUM), the ! center points. ! ! Input, integer CLUSTER(POINT_NUM), indicates the cluster to which ! each data point belongs. ! ! Output, double precision ENERGY, the total energy of the clustering. ! implicit none ! integer cluster_num integer dim_num integer point_num ! integer, dimension ( point_num ) :: cluster double precision, dimension ( dim_num, cluster_num ) :: cluster_center double precision, dimension ( cluster_num ) :: cluster_energy double precision energy integer i integer j double precision, dimension ( dim_num, point_num ) :: point ! cluster_energy(1:cluster_num) = 0.0D+00 do i = 1, point_num j = cluster(i) cluster_energy(j) = cluster_energy(j) + sum ( & ( cluster_center(1:dim_num,j) - point(1:dim_num,i) )**2 ) end do energy = sum ( cluster_energy(1:cluster_num) ) return end subroutine file_column_count ( file_name, ncolumn ) ! !******************************************************************************* ! !! FILE_COLUMN_COUNT counts the number of columns in the first line of a file. ! ! ! Discussion: ! ! The file is assumed to be a simple text file. ! ! Most lines of the file is presumed to consist of NCOLUMN words, separated ! by spaces. There may also be some blank lines, and some comment lines, ! which have a "#" in column 1. ! ! The routine tries to find the first non-comment non-blank line and ! counts the number of words in that line. ! ! If all lines are blanks or comments, it goes back and tries to analyze ! a comment line. ! ! Modified: ! ! 21 June 2001 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, character ( len = * ) FILE_NAME, the name of the file. ! ! Output, integer NCOLUMN, the number of columns assumed to be in the file. ! implicit none ! character ( len = * ) file_name logical got_one integer ios integer iunit character ( len = 256 ) line integer ncolumn ! ! Open the file. ! call get_unit ( iunit ) open ( unit = iunit, file = file_name, status = 'old', form = 'formatted', & access = 'sequential', iostat = ios ) if ( ios /= 0 ) then ncolumn = - 1 write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'FILE_COLUMN_COUNT - Fatal error!' write ( *, '(a)' ) ' Could not open the file:' write ( *, '(a)' ) ' ' // trim ( file_name ) return end if ! ! Read one line, but skip blank lines and comment lines. ! got_one = .false. do read ( iunit, '(a)', iostat = ios ) line if ( ios /= 0 ) then exit end if if ( len_trim ( line ) == 0 ) then cycle end if if ( line(1:1) == '#' ) then cycle end if got_one = .true. exit end do if ( .not. got_one ) then rewind ( iunit ) do read ( iunit, '(a)', iostat = ios ) line if ( ios /= 0 ) then exit end if if ( len_trim ( line ) == 0 ) then cycle end if got_one = .true. exit end do end if close ( unit = iunit ) if ( .not. got_one ) then write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'FILE_COLUMN_COUNT - Warning!' write ( *, '(a)' ) ' The file does not seem to contain any data.' ncolumn = 0 return end if call word_count ( line, ncolumn ) return end function file_exist ( file_name ) ! !******************************************************************************* ! !! FILE_EXIST reports whether a file exists. ! ! ! Modified: ! ! 19 February 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, character ( len = * ) FILE_NAME, the name of the file. ! ! Output, logical FILE_EXIST, is TRUE if the file exists. ! implicit none ! character ( len = * ) file_name logical file_exist ! inquire ( file = file_name, exist = file_exist ) return end subroutine file_line_count ( file_name, nline ) ! !******************************************************************************* ! !! FILE_LINE_COUNT counts the number of lines in a file. ! ! ! Discussion: ! ! The file is assumed to be a simple text file. ! ! Blank lines and comment lines, which begin with '#', are not counted. ! ! Modified: ! ! 21 June 2001 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, character ( len = * ) FILE_NAME, the name of the file. ! ! Output, integer NLINE, the number of lines found in the file. ! implicit none ! character ( len = * ) file_name integer ios integer iunit character ( len = 256 ) line integer nline ! nline = 0 ! ! Open the file. ! call get_unit ( iunit ) open ( unit = iunit, file = file_name, status = 'old', form = 'formatted', & access = 'sequential', iostat = ios ) if ( ios /= 0 ) then nline = - 1 write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'FILE_LINE_COUNT - Fatal error!' write ( *, '(a)' ) ' Could not open the file:' // trim ( file_name ) return end if ! ! Count the lines. ! do read ( iunit, '(a)', iostat = ios ) line if ( ios /= 0 ) then exit end if if ( len_trim ( line ) == 0 ) then cycle end if if ( line(1:1) == '#' ) then cycle end if nline = nline + 1 end do close ( unit = iunit ) return end subroutine file_name_inc ( file_name ) ! !******************************************************************************* ! !! FILE_NAME_INC generates the next filename in a series. ! ! ! Discussion: ! ! It is assumed that the digits in the name, whether scattered or ! connected, represent a number that is to be increased by 1 on ! each call. If this number is all 9's on input, the output number ! is all 0's. Non-numeric letters of the name are unaffected, and ! if the name contains no digits, then nothing is done. ! ! Examples: ! ! Input Output ! ----- ------ ! a7to11.txt a7to12.txt ! a7to99.txt a8to00.txt ! a9to99.txt a0to00.txt ! cat.txt cat.txt ! ! Modified: ! ! 09 August 1999 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input/output, character ( len = * ) FILE_NAME. ! On input, a character string to be incremented. ! On output, the incremented string. ! implicit none ! character c logical ch_is_digit character ( len = * ) file_name integer i integer lens ! lens = len_trim ( file_name ) do i = lens, 1, -1 c = file_name(i:i) if ( ch_is_digit ( c ) ) then call digit_inc ( c ) file_name(i:i) = c if ( c /= '0' ) then return end if end if end do return end subroutine get_unit ( iunit ) ! !******************************************************************************* ! !! GET_UNIT returns a free FORTRAN unit number. ! ! ! Discussion: ! ! A "free" FORTRAN unit number is an integer between 1 and 99 which ! is not currently associated with an I/O device. A free FORTRAN unit ! number is needed in order to open a file with the OPEN command. ! ! Modified: ! ! 02 March 1999 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Output, integer IUNIT. ! ! If IUNIT = 0, then no free FORTRAN unit could be found, although ! all 99 units were checked (except for units 5 and 6). ! ! Otherwise, IUNIT is an integer between 1 and 99, representing a ! free FORTRAN unit. Note that GET_UNIT assumes that units 5 and 6 ! are special, and will never return those values. ! implicit none ! integer i integer ios integer iunit logical lopen iunit = 0 do i = 1, 99 if ( i /= 5 .and. i /= 6 ) then inquire ( unit = i, opened = lopen, iostat = ios ) if ( ios == 0 ) then if ( .not. lopen ) then iunit = i return end if end if end if end do return end subroutine hmeans_normal ( dim_num, point_num, cluster_num, it_max, it, & point, cluster, cluster_center, energy, null_cluster_policy ) ! !******************************************************************************* ! !! HMEANS_NORMAL seeks the minimal energy of a cluster of a given size. ! ! ! Discussion: ! ! This routine works with the normalized data. ! ! The data for the H-Means problem is a set of N points X in ! M-dimensions, and a desired number of clusters K. ! ! The goal is to determine K points Z, called cluster centers, so that ! if we associate each point X with its nearest Z value, we minimize ! the standard deviation or cluster energy. Writing CLUSTER(I) to ! indicate the index of the nearest cluster center to point X(I), the ! energy can be written as: ! ! Energy = Sum ( 1 <= I <= N ) ( distance ( X(I), Z(CLUSTER(I)) ) )**2 ! ! where ! ! distance ( X, Z ) = | sine ( angle between X and Z ) | ! ! Modified: ! ! 17 April 2002 ! ! Reference: ! ! Wendy Martinez and Angel Martinez, ! Computational Statistics Handbook with MATLAB, ! pages 373-376, ! Chapman and Hall / CRC, 2002. ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer DIM_NUM, the number of spatial dimensions. ! ! Input, integer POINT_NUM, the number of data points. ! ! Input, integer CLUSTER_NUM, the number of clusters. ! ! Input, integer IT_MAX, the maximum number of iterations allowed. ! ! Output, integer IT, the number of iterations taken. ! ! Input, double precision POINT(DIM_NUM,POINT_NUM), the data points. ! ! Output, integer CLUSTER(POINT_NUM), the cluster to which each ! data point belongs. ! ! Input/output, double precision CLUSTER_CENTER(DIM_NUM,CLUSTER_NUM), ! the centers associated with the minimal energy clustering. ! ! Output, double precision ENERGY, the total energy associated with ! the minimal energy clustering. ! ! Input, integer NULL_CLUSTER_POLICY, specifies what to do if a ! null cluster is encountered. ! 0, do nothing. ! 1, reset center of null cluster to a random data point; ! 2, reset center of null cluster to a random point in the data hull; ! implicit none ! integer cluster_num integer dim_num integer point_num ! integer, dimension ( point_num ) :: cluster double precision, dimension ( dim_num, cluster_num ) :: cluster_center integer, dimension ( cluster_num ) :: cluster_population double precision column_sum logical, parameter :: debug = .false. double precision energy double precision factor(point_num) integer i integer it integer it_max integer j integer null_cluster_num integer null_cluster_policy double precision, dimension ( dim_num, point_num ) :: point integer swap ! do it = 1, it_max ! ! #1: Assign each point to the cluster of its nearest center. ! swap = 0 do i = 1, point_num call nearest_cluster_normal ( dim_num, cluster_num, cluster_center, & point(1:dim_num,i), j ) if ( j /= cluster(i) ) then cluster(i) = j swap = swap + 1 end if end do ! ! #2: Determine the energy of the new clustering with the current centers. ! call energy_normal ( dim_num, point_num, cluster_num, point, & cluster_center, cluster, energy ) if ( .false. ) then write ( *, '(i6,3x,14x,3x,g14.6)' ) it, energy end if ! ! #3: Determine the populations of the new clusters. ! If null clusters are not OK, we need to handle any empty clusters. ! If any cluster is empty, set its center to a random point within ! the convex hull of the data, reassign points, and repeat the entire ! process. ! do cluster_population(1:cluster_num) = 0 do i = 1, point_num j = cluster(i) cluster_population(j) = cluster_population(j) + 1 end do null_cluster_num = 0 do j = 1, cluster_num if ( cluster_population(j) == 0 ) then null_cluster_num = null_cluster_num + 1 call random_number ( harvest = factor(1:point_num) ) column_sum = sum ( factor(1:point_num) ) factor(1:point_num) = factor(1:point_num) / column_sum cluster_center(1:dim_num,j) = & matmul ( point(1:dim_num,1:point_num), factor(1:point_num) ) end if end do if ( null_cluster_num == 0 ) then exit end if if ( null_cluster_policy == 0 ) then exit end if if ( .false. ) then write ( *, '(a,i6)' ) & 'HMEANS_NORMAL, number of empty clusters = ', null_cluster_num end if ! ! Resort the points. ! do i = 1, point_num call nearest_cluster_normal ( dim_num, cluster_num, cluster_center, & point(1:dim_num,i), j ) cluster(i) = j end do end do ! ! #4: Recompute the cluster centers as the centroids of the points ! in the cluster. ! cluster_center(1:dim_num,1:cluster_num) = 0.0D+00 do i = 1, point_num j = cluster(i) cluster_center(1:dim_num,j) = cluster_center(1:dim_num,j) & + point(1:dim_num,i) end do do i = 1, cluster_num cluster_center(1:dim_num,i) = cluster_center(1:dim_num,i) & / dble ( max ( cluster_population(i), 1 ) ) end do ! ! #5: Determine the energy of the current clustering with the new centers. ! call energy_normal ( dim_num, point_num, cluster_num, point, & cluster_center, cluster, energy ) if ( .false. ) then write ( *, '(i6,3x,g14.6)' ) it, energy end if if ( swap == 0 .and. it > 1 ) then exit end if end do return end subroutine hmeans_raw ( dim_num, point_num, cluster_num, it_max, it, & point, cluster, cluster_center, energy, null_cluster_policy ) ! !******************************************************************************* ! !! HMEANS_RAW seeks the minimal energy of a cluster of a given size. ! ! ! Discussion: ! ! This routine works with the raw data, and does not do any ! normalization. ! ! The data for the H-Means problem is a set of N points X in ! M-dimensions, and a desired number of clusters K. ! ! The goal is to determine K points Z, called cluster centers, so that ! if we associate each point X with its nearest Z value, we minimize ! the standard deviation or cluster energy. Writing CLUSTER(I) to ! indicate the index of the nearest cluster center to point X(I), the ! energy can be written as: ! ! Energy = Sum ( 1 <= I <= N ) distance ( X(I), Z(CLUSTER(I)) )**2 ! ! where ! ! distance ( X - Z ) = Sqrt ( Sum ( 1 <= J <= M ) ( X(J) - Z(J) )**2 ) ! ! Modified: ! ! 18 April 2002 ! ! Reference: ! ! Wendy Martinez and Angel Martinez, ! Computational Statistics Handbook with MATLAB, ! pages 373-376, ! Chapman and Hall / CRC, 2002. ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer DIM_NUM, the number of spatial dimensions. ! ! Input, integer POINT_NUM, the number of data points. ! ! Input, integer CLUSTER_NUM, the number of clusters. ! ! Input, integer IT_MAX, the maximum number of iterations allowed. ! ! Output, integer IT, the number of iterations taken. ! ! Input, double precision POINT(DIM_NUM,POINT_NUM), the data points. ! ! Output, integer CLUSTER(POINT_NUM), the cluster to which each ! data point belongs. ! ! Input/output, double precision CLUSTER_CENTER(DIM_NUM,CLUSTER_NUM), ! the centers associated with the minimal energy clustering. ! ! Output, double precision ENERGY, the total energy associated with ! the minimal energy clustering. ! ! Input, integer NULL_CLUSTER_POLICY, specifies what to do if a ! null cluster is encountered. ! 0, do nothing. ! 1, reset center of null cluster to a random data point; ! 2, reset center of null cluster to a random point in the data hull; ! implicit none ! integer cluster_num integer dim_num integer point_num ! integer, dimension ( point_num ) :: cluster double precision, dimension ( dim_num, cluster_num ) :: cluster_center integer, dimension ( cluster_num ) :: cluster_population double precision column_sum logical, parameter :: debug = .true. double precision energy double precision factor(point_num) integer i integer it integer it_max integer j integer k integer null_cluster_num integer null_cluster_policy double precision, dimension ( dim_num, point_num ) :: point integer swap_num ! do it = 1, it_max ! ! #1: Assign each point to the cluster of its nearest center. ! call nearest_cluster_raw ( dim_num, point_num, cluster_num, & cluster_center, point, cluster, swap_num ) ! ! #2: Determine the energy of the new clustering with the current centers. ! call energy_raw ( dim_num, point_num, cluster_num, point, & cluster_center, cluster, energy ) if ( .false. ) then write ( *, '(i6,3x,14x,3x,g14.6)' ) it, energy end if ! ! #3: Determine the populations of the new clusters. ! If null clusters are not OK, we need to handle any empty clusters. ! If any cluster is empty, set its center to a random point within ! the convex hull of the data, reassign points, and repeat the entire ! process. ! do cluster_population(1:cluster_num) = 0 do i = 1, point_num j = cluster(i) cluster_population(j) = cluster_population(j) + 1 end do null_cluster_num = 0 do j = 1, cluster_num if ( cluster_population(j) == 0 ) then null_cluster_num = null_cluster_num + 1 end if end do if ( null_cluster_num == 0 ) then exit end if if ( null_cluster_policy == 0 ) then exit end if if ( .false. ) then write ( *, '(a,i6)' ) & 'HMEANS_RAW, number of empty clusters = ', null_cluster_num end if if ( null_cluster_policy == 1 ) then do j = 1, cluster_num if ( cluster_population(j) == 0 ) then call i_random ( 1, point_num, k ) cluster_center(1:dim_num,j) = point(1:dim_num,k) end if end do else if ( null_cluster_policy == 2 ) then do j = 1, cluster_num if ( cluster_population(j) == 0 ) then call random_number ( harvest = factor(1:point_num) ) column_sum = sum ( factor(1:point_num) ) factor(1:point_num) = factor(1:point_num) / column_sum cluster_center(1:dim_num,j) = & matmul ( point(1:dim_num,1:point_num), factor(1:point_num) ) end if end do end if ! ! Resort the points. ! call nearest_cluster_raw ( dim_num, point_num, cluster_num, & cluster_center, point, cluster, swap_num ) end do ! ! #4: Recompute the cluster centers as the centroids of the points ! in the cluster. ! cluster_center(1:dim_num,1:cluster_num) = 0.0D+00 do i = 1, point_num j = cluster(i) cluster_center(1:dim_num,j) = cluster_center(1:dim_num,j) & + point(1:dim_num,i) end do do i = 1, cluster_num cluster_center(1:dim_num,i) = cluster_center(1:dim_num,i) & / dble ( max ( cluster_population(i), 1 ) ) end do ! ! #5: Determine the energy of the current clustering with the new centers. ! call energy_raw ( dim_num, point_num, cluster_num, point, & cluster_center, cluster, energy ) if ( .false. ) then write ( *, '(i6,3x,g14.6)' ) it, energy end if if ( swap_num == 0 .and. it > 1 ) then exit end if end do return end subroutine i_input ( string, value, ierror ) ! !******************************************************************************* ! !! I_INPUT prints a prompt string and reads an integer from the user. ! ! ! Discussion: ! ! If the input line starts with a comment character ('#') or is ! blank, the routine ignores that line, and tries to read the next one. ! ! Modified: ! ! 27 March 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, character ( len = * ) STRING, the prompt string. ! ! Output, integer VALUE, the value input by the user. ! ! Output, integer IERROR, an error flag, which is zero if no error occurred. ! implicit none ! integer ierror integer last character ( len = 80 ) line character ( len = * ) string integer value ! ierror = 0 value = huge ( value ) ! ! Write the prompt. ! write ( *, '(a)' ) ' ' write ( *, '(a)' ) trim ( string ) do read ( *, '(a)', iostat = ierror ) line if ( ierror /= 0 ) then return end if ! ! If the line begins with a comment character, go back and read the next line. ! if ( line(1:1) == '#' ) then cycle end if if ( len_trim ( line ) == 0 ) then cycle end if ! ! Extract integer information from the string. ! call s_to_i ( line, value, ierror, last ) if ( ierror /= 0 ) then value = huge ( value ) return end if exit end do return end subroutine i_random ( ilo, ihi, i ) ! !******************************************************************************* ! !! I_RANDOM returns a random integer in a given range. ! ! ! Modified: ! ! 23 September 2000 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer ILO, IHI, the minimum and maximum acceptable values. ! ! Output, integer I, the randomly chosen integer. ! implicit none ! integer i integer ihi integer ilo double precision r double precision, parameter :: rhi = 1.0D+00 double precision, parameter :: rlo = 0.0D+00 ! call d_random ( rlo, rhi, r ) i = ilo + int ( r * dble ( ihi + 1 - ilo ) ) ! ! In case of oddball events at the boundary, enforce the limits. ! i = max ( i, ilo ) i = min ( i, ihi ) return end subroutine i_range_input ( string, value1, value2, ierror ) ! !******************************************************************************* ! !! I_RANGE_INPUT reads a pair of integers from the user, representing a range. ! ! ! Discussion: ! ! If the input line starts with a comment character ('#') or is blank, ! the routine ignores that line, and tries to read the next one. ! ! The pair of integers may be separated by spaces or a comma or both. ! ! Modified: ! ! 27 March 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, character ( len = * ) STRING, the prompt string. ! ! Output, integer VALUE1, VALUE2, the values entered by the user. ! ! Output, integer IERROR, an error flag, which is zero if no error occurred. ! implicit none ! character, parameter :: comma = ',' integer ierror integer last integer last2 character ( len = 80 ) line character, parameter :: space = ' ' character ( len = * ) string integer value1 integer value2 ! ierror = 0 value1 = huge ( value1 ) value2 = huge ( value2 ) ! ! Write the prompt. ! write ( *, '(a)' ) ' ' write ( *, '(a)' ) trim ( string ) do read ( *, '(a)', iostat = ierror ) line if ( ierror /= 0 ) then return end if ! ! If the line begins with a comment character, go back and read the next line. ! if ( line(1:1) == '#' ) then cycle end if if ( len_trim ( line ) == 0 ) then cycle end if ! ! Remove commas. ! call s_rep_ch ( line, comma, space ) ! ! Extract integer information from the string. ! call s_to_i ( line, value1, ierror, last ) if ( ierror /= 0 ) then value1 = huge ( value1 ) return end if call s_to_i ( line(last+1:), value2, ierror, last2 ) if ( ierror /= 0 ) then value2 = huge ( value2 ) return end if exit end do return end subroutine kmeans_normal ( dim_num, point_num, cluster_num, it_max, it, & point, cluster, cluster_center, energy, null_cluster_policy ) ! !******************************************************************************* ! !! KMEANS_NORMAL tries to improve a partition of points. ! ! ! Discussion: ! ! This routine works with the "normalized" data. ! ! For "normalized" vectors, we assume that an appropriate multiple ! of the steady state solution has been subtracted, and that what ! remains can be regarded as perturbations from 0 whose direction ! (but not magnitude) is of importance. ! ! The data for the K-Means problem is a set of N points X in ! M-dimensions, and a desired number of clusters K. ! ! The goal is to determine K points Z, called cluster centers, so that ! if we associate each point X with its nearest Z value, we minimize ! the standard deviation or cluster energy. Writing CLUSTER(I) to ! indicate the index of the nearest cluster center to point X(I), the ! energy can be written as: ! ! Energy = Sum ( 1 <= I <= N ) ( distance ( X(I), Z(CLUSTER(I)) ) )**2 ! ! where ! ! distance ( X, Z ) = | sine ( angle between X and Z ) | ! ! Reference: ! ! Wendy Martinez and Angel Martinez, ! Computational Statistics Handbook with MATLAB, ! pages 373-376, ! Chapman and Hall / CRC, 2002 ! ! Modified: ! ! 17 April 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer DIM_NUM, the number of spatial dimensions. ! ! Input, integer POINT_NUM, the number of data points. ! ! Input, integer CLUSTER_NUM, the number of clusters. ! ! Input, integer IT_MAX, the maximum number of iterations allowed. ! ! Output, integer IT, the number of iterations taken. ! ! Input, double precision POINT(DIM_NUM,POINT_NUM), the data points. ! ! Input/output, integer CLUSTER(POINT_NUM), the cluster to which ! each point belongs. On output, these may have been altered. ! ! Input/output, double precision CLUSTER_CENTER(DIM_NUM,CLUSTER_NUM), ! the centers associated with the clustering. On output, these may ! have been altered. ! ! Input, integer NULL_CLUSTER_POLICY, specifies what to do if a ! null cluster is encountered. ! 0, do nothing. ! 1, reset center of null cluster to a random data point; ! 2, reset center of null cluster to a random point in the data hull; ! implicit none ! integer cluster_num integer dim_num integer point_num ! integer ci integer cj integer, dimension ( point_num ) :: cluster double precision, dimension ( dim_num, cluster_num ) :: cluster_center double precision, dimension ( cluster_num ) :: cluster_energy integer, dimension ( cluster_num ) :: cluster_population double precision column_sum logical, parameter :: debug = .false. double precision, dimension ( cluster_num ) :: distsq double precision energy double precision, dimension ( point_num ) :: factor integer i integer it integer it_max integer j integer list(1) integer null_cluster_num integer null_cluster_policy double precision, dimension ( dim_num, point_num ) :: point integer swap integer swap_total ! ! For each observation, calculate the distance from each cluster ! center, and assign to the nearest. ! do do i = 1, point_num call nearest_cluster_normal ( dim_num, cluster_num, cluster_center, & point(1:dim_num,i), j ) if ( j /= cluster(i) ) then cluster(i) = j end if end do ! ! Determine the cluster populations. ! cluster_population(1:cluster_num) = 0 do i = 1, point_num j = cluster(i) cluster_population(j) = cluster_population(j) + 1 end do ! ! If a cluster is empty, give it a new cluster center, and restart ! the process. ! WARNING: This can take a long time! ! null_cluster_num = 0 do j = 1, cluster_num if ( cluster_population(j) == 0 ) then null_cluster_num = null_cluster_num + 1 call random_number ( harvest = factor(1:point_num) ) column_sum = sum ( factor(1:point_num) ) factor(1:point_num) = factor(1:point_num) / column_sum cluster_center(1:dim_num,j) = & matmul ( point(1:dim_num,1:point_num), factor(1:point_num) ) end if end do if ( null_cluster_num == 0 ) then exit end if if ( null_cluster_policy == 0 ) then exit end if if ( debug ) then write ( *, '(a,i6)' ) & 'KMEANS_NORMAL - Number of null clusters = ', null_cluster_num end if end do ! ! Recompute the cluster centers as the centroids of the points ! in the cluster. ! cluster_center(1:dim_num,1:cluster_num) = 0.0D+00 do i = 1, point_num j = cluster(i) cluster_center(1:dim_num,j) = cluster_center(1:dim_num,j) & + point(1:dim_num,i) end do do i = 1, cluster_num cluster_center(1:dim_num,i) = cluster_center(1:dim_num,i) & / dble ( max ( cluster_population(i), 1 ) ) end do ! ! Carry out the iteration. ! it = 0 swap_total = 0 do call energy_normal ( dim_num, point_num, cluster_num, point, & cluster_center, cluster, energy ) if ( .false. ) then write ( *, '(i6,3x,14x,3x,g14.6)' ) it, energy end if if ( it >= it_max ) then exit end if it = it + 1 swap = 0 do i = 1, point_num ci = cluster(i) if ( cluster_population(ci) <= 1 ) then cycle end if do cj = 1, cluster_num if ( cj == ci ) then distsq(cj) = sum ( & ( point(1:dim_num,i) - cluster_center(1:dim_num,cj) )**2 ) & * dble ( cluster_population(cj) ) & / dble ( cluster_population(cj) - 1 ) else if ( cluster_population(cj) == 0 ) then cluster_center(1:dim_num,cj) = point(1:dim_num,i) distsq(cj) = 0.0D+00 else distsq(cj) = sum ( & ( point(1:dim_num,i) - cluster_center(1:dim_num,cj) )**2 ) & * dble ( cluster_population(cj) ) & / dble ( cluster_population(cj) + 1 ) end if end do ! ! Find the index of the minimum value of DISTSQ. ! list = minloc ( distsq(1:cluster_num) ) ! ! If that's not the cluster to which point I now belongs, move it there. ! if ( list(1) == ci ) then cycle end if cj = list(1) cluster_center(1:dim_num,ci) = & ( dble ( cluster_population(ci) ) * cluster_center(1:dim_num,ci) & - point(1:dim_num,i) ) / dble ( cluster_population(ci) - 1 ) cluster_center(1:dim_num,cj) = & ( dble ( cluster_population(cj) ) * cluster_center(1:dim_num,cj) & + point(1:dim_num,i) ) / dble ( cluster_population(cj) + 1 ) cluster_population(ci) = cluster_population(ci) - 1 cluster_population(cj) = cluster_population(cj) + 1 cluster(i) = cj swap = swap + 1 swap_total = swap_total + 1 end do if ( swap == 0 ) then exit end if end do return end subroutine kmeans_raw ( dim_num, point_num, cluster_num, it_max, it, & point, cluster, cluster_center, energy, null_cluster_policy ) ! !******************************************************************************* ! !! KMEANS_RAW tries to improve a partition of points. ! ! ! Discussion: ! ! This routine works with the raw data, and does not do any ! normalization. ! ! The data for the K-Means problem is a set of N points X in ! M-dimensions, and a desired number of clusters K. ! ! The goal is to determine K points Z, called cluster centers, so that ! if we associate each point X with its nearest Z value, we minimize ! the standard deviation or cluster energy. Writing CLUSTER(I) to ! indicate the index of the nearest cluster center to point X(I), the ! energy can be written as: ! ! Energy = Sum ( 1 <= I <= N ) distance ( X(I), Z(CLUSTER(I)) )**2 ! ! where ! ! distance ( X - Z ) = Sqrt ( Sum ( 1 <= J <= M ) ( X(J) - Z(J) )**2 ) ! ! Reference: ! ! Wendy Martinez and Angel Martinez, ! Computational Statistics Handbook with MATLAB, ! pages 373-376, ! Chapman and Hall / CRC, 2002 ! ! Modified: ! ! 28 March 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer DIM_NUM, the number of spatial dimensions. ! ! Input, integer POINT_NUM, the number of data points. ! ! Input, integer CLUSTER_NUM, the number of clusters. ! ! Input, integer IT_MAX, the maximum number of iterations allowed. ! ! Output, integer IT, the number of iterations taken. ! ! Input, double precision POINT(DIM_NUM,POINT_NUM), the data points. ! ! Input/output, integer CLUSTER(POINT_NUM), the cluster to which ! each point belongs. On output, these may have been altered. ! ! Input/output, double precision CLUSTER_CENTER(DIM_NUM,CLUSTER_NUM), ! the centers associated with the clustering. On output, these may ! have been altered. ! ! Input, integer NULL_CLUSTER_POLICY, specifies what to do if a ! null cluster is encountered. ! 0, do nothing. ! 1, reset center of null cluster to a random data point; ! 2, reset center of null cluster to a random point in the data hull; ! implicit none ! integer cluster_num integer dim_num integer point_num ! integer ci integer cj integer, dimension ( point_num ) :: cluster double precision, dimension ( dim_num, cluster_num ) :: cluster_center double precision, dimension ( cluster_num ) :: cluster_energy integer, dimension ( cluster_num ) :: cluster_population double precision column_sum logical, parameter :: debug = .false. double precision, dimension ( cluster_num ) :: distsq double precision energy double precision, dimension ( point_num ) :: factor integer i integer it integer it_max integer j integer k integer list(1) integer null_cluster_num integer null_cluster_policy double precision, dimension ( dim_num, point_num ) :: point integer swap integer swap_num integer swap_total ! ! For each observation, calculate the distance from each cluster ! center, and assign to the nearest. ! do call nearest_cluster_raw ( dim_num, point_num, cluster_num, & cluster_center, point, cluster, swap_num ) ! ! Determine the cluster populations. ! cluster_population(1:cluster_num) = 0 do i = 1, point_num j = cluster(i) cluster_population(j) = cluster_population(j) + 1 end do ! ! If a cluster is empty, give it a new cluster center, and restart ! the process. ! WARNING: This can take a long time! ! null_cluster_num = 0 do j = 1, cluster_num if ( cluster_population(j) == 0 ) then null_cluster_num = null_cluster_num + 1 end if end do if ( null_cluster_num == 0 ) then exit end if if ( null_cluster_policy == 0 ) then exit end if if ( debug ) then write ( *, '(a,i6)' ) & 'KMEANS_RAW - Number of null clusters = ', null_cluster_num end if if ( null_cluster_policy == 1 ) then do j = 1, cluster_num if ( cluster_population(j) == 0 ) then call i_random ( 1, point_num, k ) cluster_center(1:dim_num,j) = point(1:dim_num,k) end if end do else if ( null_cluster_policy == 2 ) then do j = 1, cluster_num if ( cluster_population(j) == 0 ) then call random_number ( harvest = factor(1:point_num) ) column_sum = sum ( factor(1:point_num) ) factor(1:point_num) = factor(1:point_num) / column_sum cluster_center(1:dim_num,j) = & matmul ( point(1:dim_num,1:point_num), factor(1:point_num) ) end if end do end if end do ! ! Calculate the new cluster centers. ! cluster_center(1:dim_num,1:cluster_num) = 0.0D+00 do i = 1, point_num j = cluster(i) cluster_center(1:dim_num,j) = cluster_center(1:dim_num,j) & + point(1:dim_num,i) end do do i = 1, cluster_num cluster_center(1:dim_num,i) = cluster_center(1:dim_num,i) & / dble ( max ( cluster_population(i), 1 ) ) end do ! ! Carry out the iteration. ! it = 0 swap_total = 0 do ! ! Determine the energy. ! call energy_raw ( dim_num, point_num, cluster_num, point, & cluster_center, cluster, energy ) if ( .false. ) then write ( *, '(i6,3x,14x,3x,g14.6)' ) it, energy end if if ( it >= it_max ) then exit end if it = it + 1 swap = 0 do i = 1, point_num ci = cluster(i) if ( cluster_population(ci) <= 1 ) then cycle end if do cj = 1, cluster_num if ( cj == ci ) then distsq(cj) = sum ( & ( point(1:dim_num,i) - cluster_center(1:dim_num,cj) )**2 ) & * dble ( cluster_population(cj) ) & / dble ( cluster_population(cj) - 1 ) else if ( cluster_population(cj) == 0 ) then cluster_center(1:dim_num,cj) = point(1:dim_num,i) distsq(cj) = 0.0D+00 else distsq(cj) = sum ( & ( point(1:dim_num,i) - cluster_center(1:dim_num,cj) )**2 ) & * dble ( cluster_population(cj) ) & / dble ( cluster_population(cj) + 1 ) end if end do ! ! Find the index of the minimum value of DISTSQ. ! list = minloc ( distsq(1:cluster_num) ) ! ! If that's not the cluster to which point I now belongs, move it there. ! if ( list(1) == ci ) then cycle end if cj = list(1) cluster_center(1:dim_num,ci) = & ( dble ( cluster_population(ci) ) * cluster_center(1:dim_num,ci) & - point(1:dim_num,i) ) / dble ( cluster_population(ci) - 1 ) cluster_center(1:dim_num,cj) = & ( dble ( cluster_population(cj) ) * cluster_center(1:dim_num,cj) & + point(1:dim_num,i) ) / dble ( cluster_population(cj) + 1 ) cluster_population(ci) = cluster_population(ci) - 1 cluster_population(cj) = cluster_population(cj) + 1 cluster(i) = cj swap = swap + 1 swap_total = swap_total + 1 end do if ( swap == 0 ) then exit end if end do return end subroutine nearest_cluster_normal ( dim_num, cluster_num, cluster_center, & point, nearest ) ! !******************************************************************************* ! !! NEAREST_CLUSTER_NORMAL finds the cluster nearest to a data point. ! ! ! Discussion: ! ! This routine uses the "normalized" data. ! ! An appropriate multiple of the steady state solution has been ! subtracted from each data vector, so that it satisfies the ! zero boundary condition of the problem. ! ! The distance between two solutions is the absolute value of ! the sine of the angle between them. ! ! Modified: ! ! 16 April 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer DIM_NUM, the number of spatial dimensions. ! ! Input, integer CLUSTER_NUM, the number of center points. ! ! Input, double precision CLUSTER_CENTER(DIM_NUM,CLUSTER_NUM), the ! coordinates of the center points. ! ! Input, double precision POINT(DIM_NUM), the data point to be checked. ! ! Output, integer NEAREST, the index of the center point closest to ! the data point. ! implicit none ! integer cluster_num integer dim_num ! double precision, dimension ( dim_num, cluster_num ) :: cluster_center double precision dist double precision dist_new double precision distance_normal_sq integer j integer nearest double precision, dimension ( dim_num ) :: point ! dist = huge ( dist ) nearest = 0 do j = 1, cluster_num dist_new = distance_normal_sq ( dim_num, point, cluster_center(1,j) ) if ( dist_new < dist ) then dist = dist_new nearest = j end if end do return end subroutine nearest_cluster_raw ( dim_num, point_num, cluster_num, & cluster_center, point, cluster, swap_num ) ! !******************************************************************************* ! !! NEAREST_CLUSTER_RAW finds the cluster nearest to a data point. ! ! ! Discussion: ! ! This routine uses the "raw" data. Data is not normalized. ! Distance is the usual Euclidean distance. ! ! Modified: ! ! 16 April 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer DIM_NUM, the number of spatial dimensions. ! ! Input, integer CLUSTER_NUM, the number of center points. ! ! Input, double precision CLUSTER_CENTER(DIM_NUM,CLUSTER_NUM), the ! coordinates of the center points. ! ! Input, double precision POINT(DIM_NUM,POINT_NUM), the data points ! to be checked. ! ! Input/output, integer CLUSTER(POINT_NUM). On input, the cluster to ! which each point was assigned. On output, the cluster to which ! each point has been reassigned. ! ! Output, integer SWAP_NUM, the number of times a point was moved ! from its input cluster to a different cluster. ! implicit none ! integer cluster_num integer dim_num integer point_num ! integer, dimension ( point_num ) :: cluster double precision, dimension ( dim_num, cluster_num ) :: cluster_center double precision dist double precision dist_new integer i integer j integer nearest double precision, dimension ( dim_num, point_num ) :: point integer swap_num ! swap_num = 0 do i = 1, point_num dist = huge ( dist ) nearest = 0 do j = 1, cluster_num dist_new = sum ( ( point(1:dim_num,i) - cluster_center(1:dim_num,j) )**2 ) if ( dist_new < dist ) then dist = dist_new nearest = j end if end do if ( nearest /= cluster(i) ) then swap_num = swap_num + 1 end if cluster(i) = nearest end do return end subroutine nearest_point ( dim_num, cluster_num, center, point, nearest ) ! !******************************************************************************* ! !! NEAREST_POINT finds the center point nearest a data point. ! ! ! Modified: ! ! 09 August 2001 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer DIM_NUM, the number of spatial dimensions. ! ! Input, integer CLUSTER_NUM, the number of center points. ! ! Input, double precision CENTER(DIM_NUM,CLUSTER_NUM), the coordinates of the ! center points. ! ! Input, double precision POINT(DIM_NUM), the data point to be checked. ! ! Output, integer NEAREST, the index of the center point closest to ! the data point. ! implicit none ! integer cluster_num integer dim_num ! double precision, dimension ( dim_num, cluster_num ) :: center double precision dist double precision dist_new integer j integer nearest double precision, dimension ( dim_num ) :: point ! dist = huge ( dist ) nearest = 0 do j = 1, cluster_num dist_new = sum ( ( point(1:dim_num) - center(1:dim_num,j) )**2 ) if ( dist_new < dist ) then dist = dist_new nearest = j end if end do return end subroutine point_generate ( point_dist, dim_num, r_min, r_max, point_num, & point ) ! !******************************************************************************* ! !! POINT_GENERATE generates data points for the problem. ! ! ! Modified: ! ! 09 August 2001 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer POINT_DIST, the point distribution to use. ! 1, use a uniform random distribution. ! 2, use a uniform grid of points. (This hasn't been set up properly ! except for 1D!). ! ! Input, integer DIM_NUM, the number of spatial dimensions. ! ! Input, double precision R_MIN(DIM_NUM), R_MAX(DIM_NUM), the coordinates ! of the minimum and maximum corners of the region. ! ! Input, integer POINT_NUM, the number of points to generate. ! ! Output, double precision POINT(DIM_NUM,POINT_NUM), the coordinates ! of the points. ! implicit none ! integer dim_num integer point_num ! integer i double precision point(dim_num,point_num) integer point_dist double precision r double precision r_max(dim_num) double precision r_min(dim_num) ! if ( point_dist == 1 ) then call random_number ( harvest = point(1:dim_num,1:point_num) ) do i = 1, dim_num point(i,1:point_num) = r_min(i) + & point(i,1:point_num) * ( r_max(i) - r_min(i) ) end do else if ( point_dist == 2 ) then do i = 1, point_num if ( point_num > 1 ) then r = dble ( i - 1 ) / ( point_num - 1 ) else r = 0.5D+00 end if point(1:dim_num,i) = ( 1.0D+00 - r ) * r_min(1:dim_num) & + r * r_max(1:dim_num) end do else write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'POINT_GENERATE - Fatal error!' write ( *, '(a)' ) ' Meaningless input value of point distribution,' write ( *, '(a,i6)' ) ' POINT_DIST = ', point_dist stop end if return end subroutine point_print ( dim_num, point_num, point ) ! !******************************************************************************* ! !! POINT_PRINT prints out the values of the data points. ! ! ! Modified: ! ! 09 August 2001 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer DIM_NUM, the number of spatial dimensions. ! ! Input, integer POINT_NUM, the number of points. ! ! Input, double precision POINT(DIM_NUM,POINT_NUM), the coordinates of ! the points. ! implicit none ! integer dim_num integer point_num ! integer i double precision point(dim_num,point_num) ! write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'Data points:' write ( *, '(a)' ) ' ' do i = 1, point_num write ( *, '(i6)' ) i write ( *, '(8f10.4)' ) point(1:dim_num,i) end do return end subroutine d_random ( rlo, rhi, r ) ! !******************************************************************************* ! !! D_RANDOM returns a random double precision in a given range. ! ! ! Modified: ! ! 06 April 2001 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, double precision RLO, RHI, the minimum and maximum values. ! ! Output, double precision R, the randomly chosen value. ! implicit none ! double precision r double precision rhi double precision rlo double precision t ! ! Pick T, a random number in (0,1). ! call random_number ( harvest = t ) ! ! Set R in ( RLO, RHI ). ! r = ( 1.0D+00 - t ) * rlo + t * rhi return end subroutine random_initialize ( seed ) ! !******************************************************************************* ! !! RANDOM_INITIALIZE initializes the FORTRAN 90 random number seed. ! ! ! Discussion: ! ! If you don't initialize the random number generator, its behavior ! is not specified. If you initialize it simply by: ! ! call random_seed ! ! its behavior is not specified. On the DEC ALPHA, if that's all you ! do, the same random number sequence is returned. In order to actually ! try to scramble up the random number generator a bit, this routine ! goes through the tedious process of getting the size of the random ! number seed, making up values based on the current time, and setting ! the random number seed. ! ! Modified: ! ! 03 April 2001 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input/output, integer SEED. ! If SEED is zero on input, then you're asking this routine to come up ! with a seed value, which is returned as output. ! If SEED is nonzero on input, then you're asking this routine to ! use the input value of SEED to initialize the random number generator. ! implicit none ! integer count integer count_max integer count_rate integer i integer seed integer, allocatable :: seed_vector(:) integer seed_size double precision t ! ! Initialize the random number seed. ! call random_seed ! ! Determine the size of the random number seed. ! call random_seed ( size = seed_size ) ! ! Allocate a seed of the right size. ! allocate ( seed_vector(seed_size) ) if ( seed /= 0 ) then write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'RANDOM_INITIALIZE' write ( *, '(a,i12)' ) ' Initialize RANDOM_NUMBER with user SEED = ', seed else call system_clock ( count, count_rate, count_max ) seed = count write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'RANDOM_INITIALIZE' write ( *, '(a,i12)' ) & ' Initialize RANDOM_NUMBER with arbitrary SEED = ', seed end if ! ! Now set the seed. ! seed_vector(1:seed_size) = seed call random_seed ( put = seed_vector(1:seed_size) ) ! ! Free up the seed space. ! deallocate ( seed_vector ) ! ! Call the random number routine a bunch of times. ! do i = 1, 100 call random_number ( harvest = t ) end do return end function dvec_norm2 ( n, a ) ! !******************************************************************************* ! !! DVEC_NORM2 returns the 2-norm of a vector. ! ! ! Definition: ! ! The vector 2-norm is defined as: ! ! DVEC_NORM2 = Sqrt ( Sum ( 1 <= I <= N ) A(I)**2 ). ! ! Modified: ! ! 18 November 2000 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer N, the number of entries in A. ! ! Input, double precision A(N), the vector whose 2-norm is desired. ! ! Output, double precision DVEC_NORM2, the 2-norm of A. ! implicit none ! integer n ! double precision a(n) double precision dvec_norm2 ! dvec_norm2 = sqrt ( sum ( a(1:n)**2 ) ) return end subroutine dvec_range_input ( string, dim_num, value1, value2, ierror ) ! !******************************************************************************* ! !! DVEC_RANGE_INPUT reads two double precision vectors from the user, representing a range. ! ! ! Modified: ! ! 09 August 2001 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, character ( len = * ) STRING, the prompt string. ! ! Input, integer DIM_NUM, the number of dimensions. ! ! Output, double precision VALUE1(DIM_NUM), VALUE2(DIM_NUM), the values ! entered by the user. ! ! Output, integer IERROR, an error flag, which is zero if no error occurred. ! implicit none ! integer dim_num ! integer ierror character ( len = * ) string double precision value1(dim_num) double precision value2(dim_num) ! ierror = 0 write ( *, '(a)' ) ' ' write ( *, '(a)' ) trim ( string ) read ( *, *, iostat = ierror ) value1(1:dim_num), value2(1:dim_num) return end subroutine dvec_unit_euclidean ( n, a ) ! !******************************************************************************* ! !! DVEC_UNIT_EUCLIDEAN normalizes a N-vector in the Euclidean norm. ! ! ! Discussion; ! ! The euclidean norm is also sometimes called the l2 or ! least squares norm. ! ! Modified: ! ! 20 February 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer N, the dimension of the vector. ! ! Input/output, A(N), the vector to be normalized. ! implicit none ! integer n ! double precision a(n) double precision norm ! norm = sqrt ( sum ( a(1:n)**2 ) ) if ( norm /= 0.0D+00 ) then a(1:n) = a(1:n) / norm end if return end subroutine s_input ( string, value, ierror ) ! !******************************************************************************* ! !! S_INPUT prints a prompt string and reads a string from the user. ! ! ! Discussion: ! ! If the input line starts with a comment character ('#'), or is blank, ! the routine ignores that line, and tries to read the next one. ! ! Modified: ! ! 27 March 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, character ( len = * ) STRING, the prompt string. ! ! Output, character ( len = * ) VALUE, the value input by the user. ! ! Output, integer IERROR, an error flag, which is zero if no error occurred. ! implicit none ! integer ierror character ( len = 80 ) line character ( len = * ) string character ( len = * ) value ! ierror = 0 value = ' ' ! ! Write the prompt. ! write ( *, '(a)' ) ' ' write ( *, '(a)' ) trim ( string ) do read ( *, '(a)', iostat = ierror ) line if ( ierror /= 0 ) then write ( *, '(a)' ) ' ' write ( *, '(a)' ) 'S_INPUT: Fatal error!' write ( *, '(a)' ) ' Input error!' stop end if ! ! If the line begins with a comment character, go back and read the next line. ! if ( line(1:1) == '#' ) then cycle end if if ( len_trim ( line ) == 0 ) then cycle end if value = line exit end do return end function s_of_i ( i ) ! !******************************************************************************* ! !! S_OF_I converts an integer to a left-justified string. ! ! ! Examples: ! ! I S ! ! 1 1 ! -1 -1 ! 0 0 ! 1952 1952 ! 123456 123456 ! ! Modified: ! ! 13 February 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, integer I, an integer to be converted. ! ! Output, character ( len = 11 ) S_OF_I, the representation of the ! integer. The integer will be left-justified. ! implicit none ! character c integer i integer idig integer ihi integer ilo integer ipos integer ival integer j character ( len = 11 ) s character ( len = 11 ) s_of_i ! s = ' ' ilo = 1 ihi = 11 ! ! Make a copy of the integer. ! ival = i ! ! Handle the negative sign. ! if ( ival < 0 ) then if ( ihi <= 1 ) then s(1:1) = '*' return end if ival = - ival s(1:1) = '-' ilo = 2 end if ! ! The absolute value of the integer goes into S(ILO:IHI). ! ipos = ihi ! ! Find the last digit, strip it off, and stick it into the string. ! do idig = mod ( ival, 10 ) ival = ival / 10 if ( ipos < ilo ) then do j = 1, ihi s(j:j) = '*' end do return end if call digit_to_ch ( idig, c ) s(ipos:ipos) = c ipos = ipos - 1 if ( ival == 0 ) then exit end if end do ! ! Shift the string to the left. ! s(ilo:ilo+ihi-ipos-1) = s(ipos+1:ihi) s(ilo+ihi-ipos:ihi) = ' ' s_of_i = s return end subroutine s_rep_ch ( s, c1, c2 ) ! !******************************************************************************* ! !! S_REP_CH replaces all occurrences of one character by another. ! ! ! Modified: ! ! 27 March 2002 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input/output, character ( len = * ) S, the string. ! ! Input, character C1, C2, the character to be replaced, and the ! replacement character. ! implicit none ! character c1 character c2 integer i character ( len = * ) s ! do i = 1, len ( s ) if ( s(i:i) == c1 ) then s(i:i) = c2 end if end do return end subroutine s_to_i ( s, ival, ierror, last ) ! !******************************************************************************* ! !! S_TO_I reads an integer value from a string. ! ! ! Modified: ! ! 28 June 2000 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, character ( len = * ) S, a string to be examined. ! ! Output, integer IVAL, the integer value read from the string. ! If the string is blank, then IVAL will be returned 0. ! ! Output, integer IERROR, an error flag. ! 0, no error. ! 1, an error occurred. ! ! Output, integer LAST, the last character of S used to make IVAL. ! implicit none ! character c integer i integer ierror integer isgn integer istate integer ival integer last character ( len = * ) s ! ierror = 0 istate = 0 isgn = 1 ival = 0 do i = 1, len_trim ( s ) c = s(i:i) ! ! Haven't read anything. ! if ( istate == 0 ) then if ( c == ' ' ) then else if ( c == '-' ) then istate = 1 isgn = -1 else if ( c == '+' ) then istate = 1 isgn = + 1 else if ( lle ( '0', c ) .and. lle ( c, '9' ) ) then istate = 2 ival = ichar ( c ) - ichar ( '0' ) else ierror = 1 return end if ! ! Have read the sign, expecting digits. ! else if ( istate == 1 ) then if ( c == ' ' ) then else if ( lle ( '0', c ) .and. lle ( c, '9' ) ) then istate = 2 ival = ichar ( c ) - ichar ( '0' ) else ierror = 1 return end if ! ! Have read at least one digit, expecting more. ! else if ( istate == 2 ) then if ( lle ( '0', c ) .and. lle ( c, '9' ) ) then ival = 10 * ival + ichar ( c ) - ichar ( '0' ) else ival = isgn * ival last = i - 1 return end if end if end do ! ! If we read all the characters in the string, see if we're OK. ! if ( istate == 2 ) then ival = isgn * ival last = len_trim ( s ) else ierror = 1 last = 0 end if return end subroutine s_to_d ( s, r, ierror, lchar ) ! !******************************************************************************* ! !! S_TO_D reads a double precision number from a string. ! ! ! Discussion: ! ! This routine will read as many characters as possible until it reaches ! the end of the string, or encounters a character which cannot be ! part of the double precision number. ! ! Legal input is: ! ! 1 blanks, ! 2 '+' or '-' sign, ! 2.5 spaces ! 3 integer part, ! 4 decimal point, ! 5 fraction part, ! 6 'E' or 'e' or 'D' or 'd', exponent marker, ! 7 exponent sign, ! 8 exponent integer part, ! 9 exponent decimal point, ! 10 exponent fraction part, ! 11 blanks, ! 12 final comma or semicolon. ! ! with most quantities optional. ! ! Examples: ! ! S R ! ! '1' 1.0 ! ' 1 ' 1.0 ! '1A' 1.0 ! '12,34,56' 12.0 ! ' 34 7' 34.0 ! '-1E2ABCD' -100.0 ! '-1X2ABCD' -1.0 ! ' 2D-1' 0.2 ! '23.45' 23.45 ! '-4.2D+2' -420.0 ! '17d2' 1700.0 ! '-14e-2' -0.14 ! 'e2' 100.0 ! '-12.73e-9.23' -12.73 * 10.0**(-9.23) ! ! Modified: ! ! 12 February 2001 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, character ( len = * ) S, the string containing the ! data to be read. Reading will begin at position 1 and ! terminate at the end of the string, or when no more ! characters can be read to form a legal number. Blanks, ! commas, or other nonnumeric data will, in particular, ! cause the conversion to halt. ! ! Output, double precision R, the value that was read from the string. ! ! Output, integer IERROR, error flag. ! ! 0, no errors occurred. ! ! 1, 2, 6 or 7, the input number was garbled. The ! value of IERROR is the last type of input successfully ! read. For instance, 1 means initial blanks, 2 means ! a plus or minus sign, and so on. ! ! Output, integer LCHAR, the number of characters read from ! the string to form the number, including any terminating ! characters such as a trailing comma or blanks. ! implicit none ! character c logical ch_eqi integer ierror integer ihave integer isgn integer iterm integer jbot integer jsgn integer jtop integer lchar integer nchar integer ndig double precision r double precision rbot double precision rexp double precision rtop character ( len = * ) s character, parameter :: TAB = char ( 9 ) ! nchar = len_trim ( s ) ierror = 0 r = 0.0D+00 lchar = - 1 isgn = 1 rtop = 0.0D+00 rbot = 1.0D+00 jsgn = 1 jtop = 0 jbot = 1 ihave = 1 iterm = 0 do lchar = lchar + 1 c = s(lchar+1:lchar+1) ! ! Blank or TAB character. ! if ( c == ' ' .or. c == TAB ) then if ( ihave == 2 ) then else if ( ihave == 6 .or. ihave == 7 ) then iterm = 1 else if ( ihave > 1 ) then ihave = 11 end if ! ! Comma. ! else if ( c == ',' .or. c == ';' ) then if ( ihave /= 1 ) then iterm = 1 ihave = 12 lchar = lchar + 1 end if ! ! Minus sign. ! else if ( c == '-' ) then if ( ihave == 1 ) then ihave = 2 isgn = - 1 else if ( ihave == 6 ) then ihave = 7 jsgn = - 1 else iterm = 1 end if ! ! Plus sign. ! else if ( c == '+' ) then if ( ihave == 1 ) then ihave = 2 else if ( ihave == 6 ) then ihave = 7 else iterm = 1 end if ! ! Decimal point. ! else if ( c == '.' ) then if ( ihave < 4 ) then ihave = 4 else if ( ihave >= 6 .and. ihave <= 8 ) then ihave = 9 else iterm = 1 end if ! ! Exponent marker. ! else if ( ch_eqi ( c, 'E' ) .or. ch_eqi ( c, 'D' ) ) then if ( ihave < 6 ) then ihave = 6 else iterm = 1 end if ! ! Digit. ! else if ( ihave < 11 .and. lge ( c, '0' ) .and. lle ( c, '9' ) ) then if ( ihave <= 2 ) then ihave = 3 else if ( ihave == 4 ) then ihave = 5 else if ( ihave == 6 .or. ihave == 7 ) then ihave = 8 else if ( ihave == 9 ) then ihave = 10 end if call ch_to_digit ( c, ndig ) if ( ihave == 3 ) then rtop = 10.0D+00 * rtop + dble ( ndig ) else if ( ihave == 5 ) then rtop = 10.0D+00 * rtop + dble ( ndig ) rbot = 10.0D+00 * rbot else if ( ihave == 8 ) then jtop = 10 * jtop + ndig else if ( ihave == 10 ) then jtop = 10 * jtop + ndig jbot = 10 * jbot end if ! ! Anything else is regarded as a terminator. ! else iterm = 1 end if ! ! If we haven't seen a terminator, and we haven't examined the ! entire string, go get the next character. ! if ( iterm == 1 .or. lchar+1 >= nchar ) then exit end if end do ! ! If we haven't seen a terminator, and we have examined the ! entire string, then we're done, and LCHAR is equal to NCHAR. ! if ( iterm /= 1 .and. lchar+1 == nchar ) then lchar = nchar end if ! ! Number seems to have terminated. Have we got a legal number? ! Not if we terminated in states 1, 2, 6 or 7! ! if ( ihave == 1 .or. ihave == 2 .or. ihave == 6 .or. ihave == 7 ) then ierror = ihave return end if ! ! Number seems OK. Form it. ! if ( jtop == 0 ) then rexp = 1.0D+00 else if ( jbot == 1 ) then rexp = 10.0D+00**( jsgn * jtop ) else rexp = jsgn * jtop rexp = rexp / jbot rexp = 10.0D+00**rexp end if end if r = isgn * rexp * rtop / rbot return end subroutine timestamp ( ) ! !******************************************************************************* ! !! TIMESTAMP prints the current YMDHMS date as a time stamp. ! ! ! Example: ! ! May 31 2001 9:45:54.872 AM ! ! Modified: ! ! 31 May 2001 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! None ! implicit none ! character ( len = 8 ) ampm integer d character ( len = 8 ) date integer h integer m integer mm character ( len = 9 ), parameter, dimension(12) :: month = (/ & 'January ', 'February ', 'March ', 'April ', & 'May ', 'June ', 'July ', 'August ', & 'September', 'October ', 'November ', 'December ' /) integer n integer s character ( len = 10 ) time integer values(8) integer y character ( len = 5 ) zone ! call date_and_time ( date, time, zone, values ) y = values(1) m = values(2) d = values(3) h = values(5) n = values(6) s = values(7) mm = values(8) if ( h < 12 ) then ampm = 'AM' else if ( h == 12 ) then if ( n == 0 .and. s == 0 ) then ampm = 'Noon' else ampm = 'PM' end if else h = h - 12 if ( h < 12 ) then ampm = 'PM' else if ( h == 12 ) then if ( n == 0 .and. s == 0 ) then ampm = 'Midnight' else ampm = 'AM' end if end if end if write ( *, '(a,1x,i2,1x,i4,2x,i2,a1,i2.2,a1,i2.2,a1,i3.3,1x,a)' ) & trim ( month(m) ), d, y, h, ':', n, ':', s, '.', mm, trim ( ampm ) return end subroutine word_count ( s, nword ) ! !******************************************************************************* ! !! WORD_COUNT counts the number of "words" in a string. ! ! ! Modified: ! ! 14 April 1999 ! ! Author: ! ! John Burkardt ! ! Parameters: ! ! Input, character ( len = * ) S, the string to be examined. ! ! Output, integer NWORD, the number of "words" in the string. ! Words are presumed to be separated by one or more blanks. ! implicit none ! logical blank integer i integer lens integer nword character ( len = * ) s ! nword = 0 lens = len ( s ) if ( lens <= 0 ) then return end if blank = .true. do i = 1, lens if ( s(i:i) == ' ' ) then blank = .true. else if ( blank ) then nword = nword + 1 blank = .false. end if end do return end