FFT results of c++ vs MATLAB

Pages: 12
I have the following code in c++ with an fft function that has been tested before:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
#include <cmath>
#include<math.h>
#include<stdio.h>
#include "fftw3.h"
#include <sys/resource.h>

#define size 3 // for the cross product function call
using namespace std;
int main(){	
double Lx = 2 * M_PI;
double A = (2 * M_PI)/Lx;

double *ne0;
ne0 = (double*) fftw_malloc((((ny+1)*nx))*sizeof(double)); 
	
fftw_complex *ne0k; 
ne0k = (fftw_complex*) fftw_malloc(((ny+1)*nyk)*sizeof(fftw_complex)); 
memset(ne0k, 42, ((ny+1)*nyk)* sizeof(fftw_complex)); 


double *XX;
XX = (double*) fftw_malloc((nx*(ny+1))*sizeof(double));
double *YY;
YY = (double*) fftw_malloc((nx*(ny+1))*sizeof(double));
for(int i = 0; i< nx+1; i++){
		for(int j = 0; j< ny; j++){
			XX[i + (ny+1)*j] = (j)*2*M_PI/nx; 
			YY[i + (ny+1)*j] = -1. * cos(((i) * M_PI )/ny);	
                        ne0[i + (ny+1)*j] = pow((YY[i + (ny+1)*j]-0.5),2) * sin(A * XX[i + (ny+1)*j]);
		 		
		}		
	}
//Take FFT
r2cfft(ne0, ne0k);

}

where my r2cfft looks like:
1
2
3
4
5
6
7
8
9
10
11
void r2cfft(double rArr[], double cArr[][ncomp]){
	fftw_plan r2c; 

	 r2c = fftw_plan_dft_r2c_2d(nx, ny, &rArr[0], &cArr[0], FFTW_ESTIMATE);

	fftw_execute(r2c);
	fftw_destroy_plan(r2c);
	fftw_cleanup();

	
}

This is however is not matching a simple MATALB code:
1
2
u = (Y-0.5).^2 .* sin( (2*pi / Lx) * X); %Y.^2 .* sin( (2*pi / Lx) * X);
uh = fft(u);

Everything else in code matches (X,Y, and u), except for the fft part.
use <cmath> and drop math.h ... you don't need the C (math.h) one.

is A supposed to be ~1.0? see line 11?

I advise you to print all the variables in both languages for one iteration and see which is different.
As a starting point check that matlab is using radians, not degrees, for trig. I think it can be set either way but c++ is radians only. That seems ok if Y matches... work backwards, does NEO match?

it is in your interest to make the code as identical as possible at first, get it working. Then if you need to condense it for efficiency, you can do that.
got u in matlab, make u in c++.
got Y in matlab, make Y in c++.
got array one liner processing in matlab, use a reference to the index in c++ to hide the indexing logic from the equation.
the closer it looks, the easier it will be to debug and follow.
Last edited on
Try swapping the dimensions nx and ny in your r2cfft function. This will correct issues with a mismatched storage order.

If that doesn't solve the problem, a compileable example & expected output would definitely help.
Do the values returned from FFTW differ from your "expected" by a constant factor?

(it's possible that you need to re-scale the result)
Last edited on
Thanks everyone! I will share the outputs and a compliable example as well for you to look at.
@jonnin
is A supposed to be ~1.0? see line 11?

Yes it is. I printed everything between the two codes and I got X and Y matching. A is matching and ne0[i + (ny+1)*j] is also matching between C++ and MATLAB. Only the FFT output is not.

@mbozzi I will try sharing a folder or a full code here that you can run.

@kigar64551
Unfortunately they differ completely!
Here are the outputs,

FFT in C++:

+3.82e+00  +0.00e+00  -1.80e+00  -4.00e+01  -1.97e+00  -6.30e-01  +4.05e-01  -1.52e+00  +9.34e-01  -6.64e-01  +1.05e+00  +0.00e+00
-1.85e+01  -2.02e+01  +6.62e-01  +2.07e+00  -1.23e+00  +6.41e-02  +2.02e-01  -8.18e-01  +6.36e-01  -4.72e-01  +9.01e-01  -6.76e-02
+6.46e-01  -8.28e-01  -5.02e-02  +1.32e+00  -1.19e+00  -6.35e-02  -9.81e-02  -5.93e-01  +1.47e-01  -4.52e-01  +3.12e-01  -3.33e-01
+6.04e-01  -4.22e-01  -2.48e-01  +9.81e-01  -9.30e-01  -2.42e-01  +4.00e-02  -4.36e-01  +1.82e-01  -2.85e-01  +2.70e-01  -1.70e-01
+5.91e-01  -1.87e-01  -4.27e-01  +7.81e-01  -7.76e-01  -4.23e-01  +1.37e-01  -3.52e-01  +2.15e-01  -1.90e-01  +2.38e+00  +4.92e+00
+5.88e-01  +0.00e+00  -6.02e-01  +6.27e-01  -6.59e-01  -6.19e-01  +2.24e-01  -2.89e-01  +6.16e-01  +5.00e+00  +2.54e-01  +0.00e+00
-1.80e+00  -4.00e+01  -1.97e+00  -6.30e-01  +4.05e-01  -1.52e+00  +9.34e-01  -6.64e-01  +1.05e+00  +0.00e+00  +9.34e-01  +6.64e-01
+6.62e-01  +2.07e+00  -1.23e+00  +6.41e-02  +2.02e-01  -8.18e-01  +6.36e-01  -4.72e-01  +9.01e-01  -6.76e-02  +1.10e+00  +5.46e-01
-5.02e-02  +1.32e+00  -1.19e+00  -6.35e-02  -9.81e-02  -5.93e-01  +1.47e-01  -4.52e-01  +3.12e-01  -3.33e-01  +4.76e-01  -1.93e-01
-2.48e-01  +9.81e-01  -9.30e-01  -2.42e-01  +4.00e-02  -4.36e-01  +1.82e-01  -2.85e-01  +2.70e-01  -1.70e-01  +4.73e+00  +4.76e+00


FFT in MATLAB:

Columns 1 through 6

   0.0000 + 0.0000i   5.1431 + 0.0000i   8.3217 + 0.0000i   8.3217 + 0.0000i   5.1431 + 0.0000i   0.0000 + 0.0000i
   0.0000 + 0.0000i   2.4319 - 2.3526i   3.9350 - 3.8066i   3.9350 - 3.8066i   2.4319 - 2.3526i   0.0000 - 0.0000i
   0.0000 + 0.0000i   0.5249 - 1.0386i   0.8493 - 1.6805i   0.8493 - 1.6805i   0.5249 - 1.0386i   0.0000 - 0.0000i
   0.0000 + 0.0000i   0.5738 - 0.5483i   0.9284 - 0.8872i   0.9284 - 0.8872i   0.5738 - 0.5483i   0.0000 - 0.0000i
   0.0000 + 0.0000i   0.5843 - 0.2843i   0.9454 - 0.4601i   0.9454 - 0.4601i   0.5843 - 0.2843i   0.0000 - 0.0000i
   0.0000 + 0.0000i   0.5874 - 0.0890i   0.9505 - 0.1440i   0.9505 - 0.1440i   0.5874 - 0.0890i   0.0000 - 0.0000i
   0.0000 + 0.0000i   0.5874 + 0.0890i   0.9505 + 0.1440i   0.9505 + 0.1440i   0.5874 + 0.0890i   0.0000 + 0.0000i
   0.0000 + 0.0000i   0.5843 + 0.2843i   0.9454 + 0.4601i   0.9454 + 0.4601i   0.5843 + 0.2843i   0.0000 + 0.0000i
   0.0000 + 0.0000i   0.5738 + 0.5483i   0.9284 + 0.8872i   0.9284 + 0.8872i   0.5738 + 0.5483i   0.0000 + 0.0000i
   0.0000 + 0.0000i   0.5249 + 1.0386i   0.8493 + 1.6805i   0.8493 + 1.6805i   0.5249 + 1.0386i   0.0000 + 0.0000i
   0.0000 + 0.0000i   2.4319 + 2.3526i   3.9350 + 3.8066i   3.9350 + 3.8066i   2.4319 + 2.3526i   0.0000 + 0.0000i

  Columns 7 through 10

  -5.1431 + 0.0000i  -8.3217 + 0.0000i  -8.3217 + 0.0000i  -5.1431 + 0.0000i
  -2.4319 + 2.3526i  -3.9350 + 3.8066i  -3.9350 + 3.8066i  -2.4319 + 2.3526i
  -0.5249 + 1.0386i  -0.8493 + 1.6805i  -0.8493 + 1.6805i  -0.5249 + 1.0386i
  -0.5738 + 0.5483i  -0.9284 + 0.8872i  -0.9284 + 0.8872i  -0.5738 + 0.5483i
  -0.5843 + 0.2843i  -0.9454 + 0.4601i  -0.9454 + 0.4601i  -0.5843 + 0.2843i
  -0.5874 + 0.0890i  -0.9505 + 0.1440i  -0.9505 + 0.1440i  -0.5874 + 0.0890i
  -0.5874 - 0.0890i  -0.9505 - 0.1440i  -0.9505 - 0.1440i  -0.5874 - 0.0890i
  -0.5843 - 0.2843i  -0.9454 - 0.4601i  -0.9454 - 0.4601i  -0.5843 - 0.2843i
  -0.5738 - 0.5483i  -0.9284 - 0.8872i  -0.9284 - 0.8872i  -0.5738 - 0.5483i
  -0.5249 - 1.0386i  -0.8493 - 1.6805i  -0.8493 - 1.6805i  -0.5249 - 1.0386i
  -2.4319 - 2.3526i  -3.9350 - 3.8066i  -3.9350 - 3.8066i  -2.4319 - 2.3526i

This is a full compliable script in C++:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
#define _USE_MATH_DEFINES // for C++
#include <cmath>
#include<math.h>
#include<stdio.h>
#include "fftw3.h"
#include <cstring>
#include <sstream>
#include <string>
#include <sys/resource.h>
#include <iostream> 
#include <vector>

using namespace std;
int main(){			
 
	double Lx = 2 * M_PI;


	double *XX;
	XX = (double*) fftw_malloc((nx*(ny+1))*sizeof(double));
	memset(XX, 42, (nx*(ny+1))* sizeof(double));  
	double *YY;
	YY = (double*) fftw_malloc((nx*(ny+1))*sizeof(double));
	memset(YY, 42, (nx*(ny+1))* sizeof(double)); 

	
	for(int i = 0; i< nx+1; i++){
		for(int j = 0; j< ny; j++){
			XX[i + (ny+1)*j] = (j)*2*M_PI/nx; //(i)*2*M_PI/nx;
			YY[i + (ny+1)*j] = -1. * cos(((i) * M_PI )/ny);	//-1. * cos(((j) * M_PI )/ny);
		
			//printf("%+4.2le  ",XX[i + (ny+1)*j]);	//
			//XX and YY are correct	
		}	
		//printf("\n");	
	}
	
	double *ne0;
	ne0 = (double*) fftw_malloc((((ny+1)*nx))*sizeof(double)); 
	
	fftw_complex *ne0k; //for density perturbations
	ne0k = (fftw_complex*) fftw_malloc(((ny+1)*nyk)*sizeof(fftw_complex)); 
	
	memset(ne0k, 42, ((ny+1)*nyk)* sizeof(fftw_complex)); 
	
    double A = (2 * M_PI)/Lx;
	

	for (int i = 0; i < nx+1; i++){ 
		for (int j = 0; j < ny; j++){	
		  ne0[i + (ny+1)*j] = pow((YY[i + (ny+1)*j]-0.5),2) * sin(A * XX[i + (ny+1)*j]); 
		  //printf("%+4.2le  ",ne0[i + (ny+1)*j]);

			
		}
		//printf("\n");
	}	

	//r2cfft(ne0, ne0k);
	
    fftw_plan r2c; 
	// Define the FFT plan
	r2c = fftw_plan_dft_r2c_2d(nx, ny , &ne0[0], &ne0k[0], FFTW_ESTIMATE);
	fftw_execute(r2c);
	fftw_destroy_plan(r2c);
	fftw_cleanup();

	

	
	 
	
	for (int i = 0; i < nx; i++){
		for (int j = 0; j < nyk; j++){
			for (int k = 0; k < ncomp; k++){
				//std::cout<<  <<" ";
				printf("%+4.2le  ",ne0k[i + nyk*j][k]); //[j + nyk*i][k] fourier. from [i][j]

				}
		}
		printf("\n");
	}

	 
	


	free(XX);
	free(YY);
	free(ne0k);
	free(ne0);
	



 return 0;


 }




This doesn't quite compile for me because I didn't have the definitions of some constants, but I think
int constexpr nx = 10, ny = 10, ncomp = 2, nyk = 6;
and after adding that it compiles.

I ran the inverse (c2r) transform on MATLAB's output to compare it to ne0, and it's totally different. I don't see any programming bugs that would cause this kind of wrong output. So I think the problem's a translation issue.

It might be helpful to post the MATLAB program.
Last edited on
@JamieAl,
Why are you using nx+1 when you should be using nx, ny+1 when you should be using ny?
Why are you confusing i and j?
Why isn’t your (stated) output from the c++ code a set of complex numbers?
What is nyk?
What function are you trying to transform? Your posts don’t actually correspond to your code (especially the y-dependence).

If the array is only 10*10 then you could write a routine to do the dft from first principles. The “fast” bit of the fft isn’t going to help much here, as 10 only has one factor of 2.
I've spent a while looking at @JamieAL's code so I'll try to answer or comment about some of the questions.

Why isn’t your (stated) output from the c++ code a set of complex numbers?
What is nyk?

As a performance optimization, the real-to-complex transform doesn't write the second half of the matrix because it would be symmetric to the first half. So the output that's printed is nyk = 6 = (ny / 2 + 1) columns of poorly-formatted complex numbers.

The first bit of the stated C++ output is +3.82e+00 +0.00e+00 which corresponds to the complex number 3.82 + 0.00i.

Here is a shorter program with better-formatted output:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
#define _USE_MATH_DEFINES // for C++

#include <cmath>
#include <iostream> 
#include <iomanip>
#include <complex>

#include "fftw3.h"

static constexpr double square(double x) noexcept { return x * x; }

int main() 
{
  int constexpr nx = 10, ny = 10, nyk = nx / 2 + 1;

  double* ne0 = fftw_alloc_real((ny + 1) * nx);
  for (int i = 0; i < nx + 1; i++) {
    for (int j = 0; j < ny; j++) {
      double const XXij = j * 2.0 * M_PI / nx;
      double const YYij = -std::cos(i * M_PI / ny);
      ne0[i + (ny + 1) * j] = square(YYij - 0.5) * std::sin(XXij);
    }
  }

  fftw_complex* ne0k = fftw_alloc_complex((ny + 1) * nyk);
  fftw_execute(fftw_plan_dft_r2c_2d(nx, ny, &ne0[0], &ne0k[0], FFTW_ESTIMATE));

  std::cout << std::fixed << std::setprecision(2);
  for (int i = 0; i < nx; i++) {
    for (int j = 0; j < nyk; j++) {
      double* ne0kij = ne0k[i + nyk * j];
      std::cout << std::setw(16) << std::complex<double>{ ne0kij[0], ne0kij[1] } << ", ";
    }
    std::cout << '\n';
  }
  return 0;
}


The output of this one is
     (3.82,0.00),   (-1.80,-40.01),    (-1.97,-0.63),     (0.41,-1.52),     (0.93,-0.66),      (1.05,0.00),
 (-18.55,-20.20),      (0.66,2.07),     (-1.23,0.06),     (0.20,-0.82),     (0.64,-0.47),     (0.90,-0.07),
    (0.65,-0.83),     (-0.05,1.32),    (-1.19,-0.06),    (-0.10,-0.59),     (0.15,-0.45),     (0.31,-0.33),
    (0.60,-0.42),     (-0.25,0.98),    (-0.93,-0.24),     (0.04,-0.44),     (0.18,-0.28),     (0.27,-0.17),
    (0.59,-0.19),     (-0.43,0.78),    (-0.78,-0.42),     (0.14,-0.35),     (0.21,-0.19),      (2.38,4.92),
     (0.59,0.00),     (-0.60,0.63),    (-0.66,-0.62),     (0.22,-0.29),      (0.62,5.00),      (0.25,0.00),
  (-1.80,-40.01),    (-1.97,-0.63),     (0.41,-1.52),     (0.93,-0.66),      (1.05,0.00),      (0.93,0.66),
     (0.66,2.07),     (-1.23,0.06),     (0.20,-0.82),     (0.64,-0.47),     (0.90,-0.07),      (1.10,0.55),
    (-0.05,1.32),    (-1.19,-0.06),    (-0.10,-0.59),     (0.15,-0.45),     (0.31,-0.33),     (0.48,-0.19),
    (-0.25,0.98),    (-0.93,-0.24),     (0.04,-0.44),     (0.18,-0.28),     (0.27,-0.17),      (4.73,4.76),


In addition I did the inverse transform over the desired output to get the desired input, which looks like this when normalized:
 12.69,   2.00,  -4.85,  -0.00,  -1.50,   0.00,  -1.50,   0.00,  -4.85,  -2.00,
 10.99,  -0.15,  -4.20,   0.00,  -1.30,   0.00,  -1.30,  -0.00,  -4.20,   0.15,
  8.90,  -1.57,  -3.40,   0.00,  -1.05,   0.00,  -1.05,  -0.00,  -3.40,   1.57,
  6.50,  -2.10,  -2.48,   0.00,  -0.77,   0.00,  -0.77,  -0.00,  -2.48,   2.10,
  4.34,  -1.75,  -1.66,   0.00,  -0.51,   0.00,  -0.51,  -0.00,  -1.66,   1.75,
  2.84,  -0.83,  -1.08,  -0.00,  -0.34,   0.00,  -0.34,   0.00,  -1.08,   0.83,
  2.13,   0.22,  -0.81,   0.00,  -0.25,   0.00,  -0.25,  -0.00,  -0.81,  -0.22,
  1.98,   1.04,  -0.76,  -0.00,  -0.23,   0.00,  -0.23,   0.00,  -0.76,  -1.04,
  1.96,   1.48,  -0.75,  -0.00,  -0.23,   0.00,  -0.23,   0.00,  -0.75,  -1.48,
  1.54,   1.66,  -0.59,   0.00,  -0.18,   0.00,  -0.18,  -0.00,  -0.59,  -1.66,
Last edited on
Thank-you @mbozzi.

I've coded up a 2-d DFT from first principles (i.e. definition sum) and checked the inverse for various functions. However, I can't make head or tail of @JamieAL's original function.

@JamieAL - you wrote this as your Matlab script. Could you correct it please.
u = (Y-0.5).^2 .* sin( (2*pi / Lx) * X)

Make sure that you explain what X and Y are as well. I suggest that you show the whole of your Matlab code.

You appear to have 10 columns but 11 (ELEVEN) rows in your Matlab FFT output - is that correct?
Last edited on
I missed a row in the Matlab output (assumed it was 10 rows)
This changes the "desired input" a little bit:
 14.53,   0.00,  -5.55,   0.00,  -1.71,   0.00,  -1.71,   0.00,  -5.55,   0.00,
 13.59,   0.00,  -5.19,   0.00,  -1.60,   0.00,  -1.60,   0.00,  -5.19,   0.00,
 11.06,   0.00,  -4.23,   0.00,  -1.31,   0.00,  -1.31,   0.00,  -4.23,   0.00,
  7.64,   0.00,  -2.92,   0.00,  -0.90,   0.00,  -0.90,   0.00,  -2.92,   0.00,
  4.23,   0.00,  -1.61,   0.00,  -0.50,   0.00,  -0.50,   0.00,  -1.61,   0.00,
  1.61,   0.00,  -0.62,   0.00,  -0.19,   0.00,  -0.19,   0.00,  -0.62,   0.00,
  0.24,   0.00,  -0.09,   0.00,  -0.03,   0.00,  -0.03,   0.00,  -0.09,   0.00,
  0.05,   0.00,  -0.02,   0.00,  -0.01,   0.00,  -0.01,   0.00,  -0.02,   0.00,
  0.62,   0.00,  -0.24,   0.00,  -0.07,   0.00,  -0.07,   0.00,  -0.24,   0.00,
  1.31,   0.00,  -0.50,   0.00,  -0.15,   0.00,  -0.15,   0.00,  -0.50,   0.00,
  1.61,   0.00,  -0.62,   0.00,  -0.19,   0.00,  -0.19,   0.00,  -0.62,   0.00,
Last edited on
That "desired input" looks a lot more plausible: it's clearly now of the form
(function of x) * (function of y)

However, there is no way that the function of x could be of the form sin(Ax) because
(a) It's not consistent with that table (all the non-zero columns would have to be same magnitude, alternating sign);
(b) It's a single frequency in x, so it would show up in the FFT as only two non-zero rows (or two non-zero columns, depending on which way round you choose x and y).

So we really need @JamieAL to show us his complete Matlab code.
@JamieAl,
Are you sure that the MatLab output that you gave is really for a TWO-dimensional FFT? It appears still to have a vestigial sin(2.pi.x/L) dependence in it.

It looks as if you are trying to compare a complete TWO-dimensional fft with c++ against a ONE-dimensional fft with Matlab (fft with respect to y).
OK, this finally reproduces @JamieAl's Matlab output. It's using a straight DFT for simplicity, not the nominally faster FFT - however, they will produce the same output and there won't be any difference in timing if N=11 because that is prime and can't make use of the faster recursion.

I think that @JamieAl's Matlab code is flawed. I doubt that the output is really that intended.

(1) This is a ONE-DIMENSIONAL fourier transform in y, NOT a TWO-DIMENSIONAL fourier transform in x and y.

(2) NX and NY are different! NX=10 and NY=11.

(3) To get exactly the same result as @JamieAL there is a deliberate error in evaluating the y-dependence - it uses NX, not NY, which, as mentioned above, are slightly different.

(4) If this was really intended then there is no point in using a 2-d array, flattened or otherwise, because you are simply forming the outer product of 1-d arrays.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
#include <iostream>
#include <iomanip>
#include <complex>
using namespace std;

const double PI = 3.14159265358979323846;
const int NX = 10, NY = 11;                          // <==== NOTE: NX and NY different
const double LX = 2.0 * PI, LY = 2.0 * PI;

using Complex = complex<double>;

//----------------------------------------------------------------------

double funcx( double x )
{
   return sin( x );
}

double funcy( double y )
{
   double yy = -cos( 0.5 * y ) - 0.5;
   return yy * yy;
}

//----------------------------------------------------------------------

void printAsMatlab( Complex fx[NX], Complex fy[NY] )
{
   cout << fixed << setprecision( 4 );
   for ( int row = 0; row < NY; row++ )
   {
      for ( int col = 0; col < NX; col++ ) cout << setw( 16 ) << fy[row] * fx[col] << " ";
      cout << '\n';
   }
}

//----------------------------------------------------------------------

void dft( Complex f[], Complex ftilde[], int N )
{
   Complex w = polar( 1.0, -2.0 * PI / N );
   Complex wk = 1.0;
   for ( int k = 0; k < N; k++ )
   {
      ftilde[k] = 0.0;
      Complex wkm = 1.0;
      for ( int m = 0; m < N; m++ )
      {
         ftilde[k] += wkm * f[m];
         wkm *= wk;
      }
      wk *= w;
   }
}

//----------------------------------------------------------------------

void idft( Complex ftilde[], Complex f[], int N )
{
   Complex * ftildeConjugate = new Complex[N];
   for ( int k = 0; k < N; k++ ) ftildeConjugate[k] = conj( ftilde[k] );

   dft( ftildeConjugate, f, N );

   double factor = 1.0 / N;
   for ( int m = 0; m < N; m++ ) f[m] = conj( f[m] ) * factor;

   delete [] ftildeConjugate;
}

//======================================================================

int main()
{
   // Set original function(s)
   Complex fx[NX], fy[NY];
   for ( int m = 0; m < NX; m++ ) fx[m] = funcx( LX * m / NX );
   for ( int m = 0; m < NY; m++ ) fy[m] = funcy( LY * m / NX );  // <=== DELIBERATE error, replacing NY by NX
     
   // Discrete Fourier transform OF fy ONLY
   Complex fytilde[NY];
   dft( fy, fytilde, NY );

   // Invert as a check
   Complex gy[NY];
   idft( fytilde, gy, NY );

   // Write arrays
   cout << "Original function:\n";
   printAsMatlab( fx, fy );
   cout << "\n\nDiscrete Fourier transform w.r.t. y ONLY:\n";
   printAsMatlab( fx, fytilde );
// cout << "\n\nInverted back as a check:\n";
// printAsMatlab( fx, gy );
}

//====================================================================== 


Original function:
 (0.0000,0.0000)  (1.3225,0.0000)  (2.1399,0.0000)  (2.1399,0.0000)  (1.3225,0.0000)  (0.0000,0.0000) (-1.3225,0.0000) (-2.1399,0.0000) (-2.1399,0.0000) (-1.3225,0.0000) 
 (0.0000,0.0000)  (1.2376,0.0000)  (2.0025,0.0000)  (2.0025,0.0000)  (1.2376,0.0000)  (0.0000,0.0000) (-1.2376,0.0000) (-2.0025,0.0000) (-2.0025,0.0000) (-1.2376,0.0000) 
 (0.0000,0.0000)  (1.0072,0.0000)  (1.6297,0.0000)  (1.6297,0.0000)  (1.0072,0.0000)  (0.0000,0.0000) (-1.0072,0.0000) (-1.6297,0.0000) (-1.6297,0.0000) (-1.0072,0.0000) 
 (0.0000,0.0000)  (0.6955,0.0000)  (1.1254,0.0000)  (1.1254,0.0000)  (0.6955,0.0000)  (0.0000,0.0000) (-0.6955,0.0000) (-1.1254,0.0000) (-1.1254,0.0000) (-0.6955,0.0000) 
 (0.0000,0.0000)  (0.3847,0.0000)  (0.6225,0.0000)  (0.6225,0.0000)  (0.3847,0.0000)  (0.0000,0.0000) (-0.3847,0.0000) (-0.6225,0.0000) (-0.6225,0.0000) (-0.3847,0.0000) 
 (0.0000,0.0000)  (0.1469,0.0000)  (0.2378,0.0000)  (0.2378,0.0000)  (0.1469,0.0000)  (0.0000,0.0000) (-0.1469,0.0000) (-0.2378,0.0000) (-0.2378,0.0000) (-0.1469,0.0000) 
 (0.0000,0.0000)  (0.0214,0.0000)  (0.0347,0.0000)  (0.0347,0.0000)  (0.0214,0.0000)  (0.0000,0.0000) (-0.0214,0.0000) (-0.0347,0.0000) (-0.0347,0.0000) (-0.0214,0.0000) 
 (0.0000,0.0000)  (0.0045,0.0000)  (0.0073,0.0000)  (0.0073,0.0000)  (0.0045,0.0000)  (0.0000,0.0000) (-0.0045,0.0000) (-0.0073,0.0000) (-0.0073,0.0000) (-0.0045,0.0000) 
 (0.0000,0.0000)  (0.0561,0.0000)  (0.0908,0.0000)  (0.0908,0.0000)  (0.0561,0.0000)  (0.0000,0.0000) (-0.0561,0.0000) (-0.0908,0.0000) (-0.0908,0.0000) (-0.0561,0.0000) 
 (0.0000,0.0000)  (0.1196,0.0000)  (0.1935,0.0000)  (0.1935,0.0000)  (0.1196,0.0000)  (0.0000,0.0000) (-0.1196,0.0000) (-0.1935,0.0000) (-0.1935,0.0000) (-0.1196,0.0000) 
 (0.0000,0.0000)  (0.1469,0.0000)  (0.2378,0.0000)  (0.2378,0.0000)  (0.1469,0.0000)  (0.0000,0.0000) (-0.1469,0.0000) (-0.2378,0.0000) (-0.2378,0.0000) (-0.1469,0.0000) 


Discrete Fourier transform w.r.t. y ONLY:
 (0.0000,0.0000)  (5.1431,0.0000)  (8.3217,0.0000)  (8.3217,0.0000)  (5.1431,0.0000)  (0.0000,0.0000) (-5.1431,0.0000) (-8.3217,0.0000) (-8.3217,0.0000) (-5.1431,0.0000) 
 (0.0000,0.0000) (2.4319,-2.3526) (3.9350,-3.8066) (3.9350,-3.8066) (2.4319,-2.3526) (0.0000,-0.0000) (-2.4319,2.3526) (-3.9350,3.8066) (-3.9350,3.8066) (-2.4319,2.3526) 
 (0.0000,0.0000) (0.5249,-1.0386) (0.8493,-1.6805) (0.8493,-1.6805) (0.5249,-1.0386) (0.0000,-0.0000) (-0.5249,1.0386) (-0.8493,1.6805) (-0.8493,1.6805) (-0.5249,1.0386) 
 (0.0000,0.0000) (0.5738,-0.5483) (0.9284,-0.8872) (0.9284,-0.8872) (0.5738,-0.5483) (0.0000,-0.0000) (-0.5738,0.5483) (-0.9284,0.8872) (-0.9284,0.8872) (-0.5738,0.5483) 
 (0.0000,0.0000) (0.5843,-0.2843) (0.9454,-0.4601) (0.9454,-0.4601) (0.5843,-0.2843) (0.0000,-0.0000) (-0.5843,0.2843) (-0.9454,0.4601) (-0.9454,0.4601) (-0.5843,0.2843) 
 (0.0000,0.0000) (0.5874,-0.0890) (0.9505,-0.1440) (0.9505,-0.1440) (0.5874,-0.0890) (0.0000,-0.0000) (-0.5874,0.0890) (-0.9505,0.1440) (-0.9505,0.1440) (-0.5874,0.0890) 
 (0.0000,0.0000)  (0.5874,0.0890)  (0.9505,0.1440)  (0.9505,0.1440)  (0.5874,0.0890)  (0.0000,0.0000) (-0.5874,-0.0890) (-0.9505,-0.1440) (-0.9505,-0.1440) (-0.5874,-0.0890) 
 (0.0000,0.0000)  (0.5843,0.2843)  (0.9454,0.4601)  (0.9454,0.4601)  (0.5843,0.2843)  (0.0000,0.0000) (-0.5843,-0.2843) (-0.9454,-0.4601) (-0.9454,-0.4601) (-0.5843,-0.2843) 
 (0.0000,0.0000)  (0.5738,0.5483)  (0.9284,0.8872)  (0.9284,0.8872)  (0.5738,0.5483)  (0.0000,0.0000) (-0.5738,-0.5483) (-0.9284,-0.8872) (-0.9284,-0.8872) (-0.5738,-0.5483) 
 (0.0000,0.0000)  (0.5249,1.0386)  (0.8493,1.6805)  (0.8493,1.6805)  (0.5249,1.0386)  (0.0000,0.0000) (-0.5249,-1.0386) (-0.8493,-1.6805) (-0.8493,-1.6805) (-0.5249,-1.0386) 
 (0.0000,0.0000)  (2.4319,2.3526)  (3.9350,3.8066)  (3.9350,3.8066)  (2.4319,2.3526)  (0.0000,0.0000) (-2.4319,-2.3526) (-3.9350,-3.8066) (-3.9350,-3.8066) (-2.4319,-2.3526) 


Thanks everyone and apologies for the late responses as unfortunately I have a part time job during weekends. Let me try answer some questions, I guess I might've been confusing the 2D FFT in C++ with MATLAB's 1D FFT fft(u) and also the full MATLAB code is:
1
2
3
4
5
6
7
8
9
10
11
12
13
Nx = 10; 
Ny = 10; 
Lx = 2*pi; 

ygl = -cos(pi*(0:Ny)/Ny)'; %Gauss-Lobatto chebyshev points 
x  = (0:Nx-1)/Nx*2*pi;
%make mesh
[X,Y]   = meshgrid(x,ygl); 
A = 2*pi / Lx;

u = (Y-0.5).^2 .* sin( (2*pi / Lx) * X); %Y.^2 .* sin( (2*pi / Lx) * X);
uh = fft(u); 
 


I will read your codes as well.
@mbozzi

This doesn't quite compile for me because I didn't have the definitions of some constants, but I think
int constexpr nx = 10, ny = 10, ncomp = 2, nyk = 6;
and after adding that it compiles.


You're absolutely right!
@JamieAl,

If you use y indices 0:Ny then that is 11 points.
You then use x indices 0:Nx-1 which is only 10 points.

From your earlier results, your Matlab fft is clearly only doing a 1-d transform. If it had been doing a transform in the x direction as well then for sin(x) you would only get two non-zero columns, as it is single frequency. (You get two components rather than one in the complex version because it picks up exp(ikx) and exp(-ikx)).

Maybe it would be better to state what you want at the end of the day: it's not easy to guess that from code.
I guess what I am really trying to do is be able to perform something that look like in MATLAB:
 
fft(u,[],1) %or fft(u,[],2)

Basically being able to apply the 1D FFT along one specific dimension in C++ (for example along y or x only). But, then I ran into this issue of how my 2D FFT (or 1D FFT) in c++ is not matching that of MATLAB, which granted is my fault for not fully understanding the differences b/w FFTW and FFT in MATLAB.
Actually what you did here is technically giving the correct results of taking 1D FFT along one direction, I just I am trying to reproduce it using FFTW library and sort of failing:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
void dft( Complex f[], Complex ftilde[], int N )
{
   Complex w = polar( 1.0, -2.0 * PI / N );
   Complex wk = 1.0;
   for ( int k = 0; k < N; k++ )
   {
      ftilde[k] = 0.0;
      Complex wkm = 1.0;
      for ( int m = 0; m < N; m++ )
      {
         ftilde[k] += wkm * f[m];
         wkm *= wk;
      }
      wk *= w;
   }
}
Pages: 12