125 lines
		
	
	
		
			5.5 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
			125 lines
		
	
	
		
			5.5 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
| 		Brief Description of chkfft, by K1JT
 | |
| 		------------------------------------
 | |
| 
 | |
| Discrete Fourier transforms (DFTs) are found at the root of most
 | |
| digital signal processing tasks. In WSJT and its sister programs the
 | |
| transforms are done using the FFTW library, and subroutine four2
 | |
| provides a convenient interface to the library.  Program chkfft is a
 | |
| command-line utility offering a convenient way to test FFT execution
 | |
| times under a variety of circumstances.
 | |
| 
 | |
| To compile chkfft in Linux:
 | |
| 
 | |
| $ gfortran -o chkfft chkfft.f90 four2a.f90 f77_wisdom.f90 gran.c -lfftw3f
 | |
| 
 | |
| To compile chkfft in Windows (you may need to customize the hard-coded
 | |
| path shown here for libfftw3f-3.dll):
 | |
| 
 | |
| > gfortran -o chkfft chkfft.f90 four2a.f90 f77_wisdom.f90 gran.c \
 | |
|   /JTSDK-QT/appsupport/runtime/libfftw3f-3.dll
 | |
| 
 | |
| To see a brief usage message, type chkfft at the command prompt:
 | |
|  
 | |
| $ chkfft
 | |
|  Usage: chkfft <nfft | infile> nr nw nc np
 | |
|         nfft:   length of FFT
 | |
|         nfft=0: do lengths 2^n, n=2^4 to 2^23
 | |
|         infile: name of file with nfft values, one per line
 | |
|         nr:     0/1 to not read (or read) wisdom
 | |
|         nw:     0/1 to not write (or write) wisdom
 | |
|         nc:     0/1 for real or complex data
 | |
|         np:     0-4 patience for finding best algorithm
 | |
| 
 | |
| As an example, to measure the speed of a complex DFT of length 131072:
 | |
| 
 | |
| #######################################################################
 | |
| $ chkfft 131072 0 1 1 2
 | |
| 
 | |
| nfft:     131072   nr: 0   nw 1   nc: 1   np: 2
 | |
| 
 | |
|     NFFT     Time        rms      MHz   MFlops  iters  tplan
 | |
| -------------------------------------------------------------
 | |
|   131072  0.0021948  0.00000032  59.72  5076.1     231   2.9
 | |
| #######################################################################
 | |
| 
 | |
| Program output shows that on the test machine the average time for one
 | |
| forward (or inverse) transform of length N=131072 is about 2.2 ms,
 | |
| corresponding to slightly over 5 GFlops computing speed.  The planning
 | |
| time in FFTW was 2.9 s.
 | |
| 
 | |
| Running the command again with parameter nr=1 will use the 
 | |
| "wisdom" already accumulated for complex N=131072 FFTs.  The execution
 | |
| speed will be essentially the same, but no planning time is required:  
 | |
| 
 | |
| #######################################################################
 | |
| $ chkfft 131072 1 1 1 2
 | |
| 
 | |
| nfft:     131072   nr: 1   nw 1   nc: 1   np: 2
 | |
| 
 | |
|     NFFT     Time        rms      MHz   MFlops  iters  tplan
 | |
| -------------------------------------------------------------
 | |
|   131072  0.0021575  0.00000032  60.75  5164.0     235   0.0
 | |
| #######################################################################
 | |
| 
 | |
| Optimized algorithms can compute DFTs much faster for lengths that are
 | |
| the product of small integers.  Length N=131072 = 2^17 is a good
 | |
| example, and FFTs should be very efficient.  For comparison, look at
 | |
| the speed for N=131071, a prime number.  The average time is now about
 | |
| 7 times larger:
 | |
| 
 | |
| #######################################################################
 | |
| C:\JTSDK-QT\src\wsjtx\lib>chkfft 131071 1 1 1 2
 | |
| 
 | |
| nfft:     131071   nr: 1   nw 1   nc: 1   np: 2
 | |
| 
 | |
|     NFFT     Time        rms      MHz   MFlops  iters  tplan
 | |
| -------------------------------------------------------------
 | |
|   131071  0.0153637  0.00000065   8.53   725.2      33   5.6
 | |
| #######################################################################
 | |
| 
 | |
| Here's an example that measures execution times for all integral
 | |
| power-of-2 lengths from 2^4 to 2^23:
 | |
| 
 | |
| #######################################################################
 | |
| $ chkfft 0 1 1 1 2
 | |
| 
 | |
| nfft:          0   nr: 1   nw 1   nc: 1   np: 2
 | |
| 
 | |
|   n   N=2^n    Time        rms      MHz   MFlops  iters  tplan
 | |
| ---------------------------------------------------------------
 | |
|  4      16  0.0000003  0.00000014  58.61  1172.2 1000000   0.0
 | |
|  5      32  0.0000004  0.00000016  89.19  2229.6 1000000   0.0
 | |
|  6      64  0.0000006  0.00000016 109.44  3283.2  866975   0.0
 | |
|  7     128  0.0000009  0.00000021 135.92  4757.1  538369   0.0
 | |
|  8     256  0.0000016  0.00000020 158.40  6335.8  313701   0.0
 | |
|  9     512  0.0000032  0.00000021 162.53  7313.8  160943   0.1
 | |
| 10    1024  0.0000067  0.00000023 152.53  7626.5   75521   0.1
 | |
| 11    2048  0.0000136  0.00000025 150.42  8273.3   37239   0.2
 | |
| 12    4096  0.0000316  0.00000027 129.75  7784.8   16060   0.3
 | |
| 13    8192  0.0000720  0.00000026 113.75  7393.8    7040   0.5
 | |
| 14   16384  0.0001620  0.00000028 101.11  7078.0    3129   0.9
 | |
| 15   32768  0.0003227  0.00000030 101.53  7615.1    1571   1.7
 | |
| 16   65536  0.0010020  0.00000030  65.41  5232.5     506   4.1
 | |
| 17  131072  0.0021575  0.00000032  60.75  5164.0     235   0.0
 | |
| 18  262144  0.0053937  0.00000032  48.60  4374.2      94   3.6
 | |
| 19  524288  0.0190668  0.00000034  27.50  2612.2      27   6.8
 | |
| 20 1048576  0.0468001  0.00000035  22.41  2240.5      11   2.4
 | |
| 21 2097152  0.0936012  0.00000036  22.41  2352.5       6  31.6
 | |
| 22 4194304  0.1949997  0.00000037  21.51  2366.0       3   9.8
 | |
| 23 8388608  0.4212036  0.00000038  19.92  2290.3       2 112.9
 | |
| #######################################################################
 | |
| 
 | |
| Test data for all transforms is gaussian random noise of zero mean and
 | |
| standard deviation 1.  Tabulated values of "rms" are the
 | |
| root-mean-square differences between the original data and the
 | |
| back-transfmred data.
 | |
| 
 | |
| File nfft.dat contains all numbers between 2^3 and 2^23 with no factor
 | |
| greater than 7, followed by their factors.  These numbers are good
 | |
| choices for FFT lengths.  File all_fft.out gives the result on one
 | |
| machine of running the command 
 | |
| 
 | |
| $ chkfft nfft.dat 0 1 1 2
 | |
| 
 | |
| Take note: this task may take as much as 24 hours, or even more!
 | 
