function lsqfit() that performs a least-squares fit

The dataset in the file “linedata.txt” contains two columns and can be described by a linear
function of the form y = ax + b. The x-value is given by the first column and the y-value
by the second column. Write a C++ program containing a function lsqfit() that performs a
least-squares fit to the data and returns the values of a and b with their standard deviations as
well as the correlation coefficient R2. (18 marks)

[Hint: to read the data from the file into two arrays, one representing X[len] and another
Y[len] (with len indicating the length of the array), use the following code fragment:

vector<double> X, Y;
double vecx, vecy;

ifstream infile("linedata.txt");

while (infile >> vecx >> vecy) {
	X.push_back(vecx);
	Y.push_back(vecy);
}

You must include the fstream & vector preprocessor directives before using them.]

Below is the data in linedata.txt:

0 -0.10
1 1.61
2 4.27
3 8.03
4 10.52
5 11.12
6 15.68
7 18.71
8 20.94
9 28.36
10 26.25
11 30.03
12 32.58
13 34.65
14 39.62
15 42.45
16 49.06
17 48.50
18 49.87
19 58.21
20 60.47
21 60.54
22 60.99
23 66.11
24 69.69
25 73.55
26 75.48
27 79.89
28 82.78
29 82.50
30 86.10
31 89.52
32 92.98
33 96.36
34 100.02
35 96.94
36 105.09
37 111.48
38 109.87
39 116.87
40 118.70
41 120.94
42 124.36
43 123.87
44 129.83
45 136.21
46 136.20
47 139.08
48 140.53
49 144.94
50 148.46
51 151.85
52 153.25
53 156.53
54 164.05
55 158.48
56 170.46
57 169.68
58 174.00
59 171.67
60 176.82
61 180.44
62 184.85
63 183.66
64 190.94
65 190.57
66 196.13
67 200.30
68 202.65
69 207.17
70 210.01
71 209.70
72 214.51
73 215.11
74 217.36
75 224.85
76 226.00
77 228.89
78 233.82
79 236.19
80 238.70
81 243.50
82 245.86
83 247.48
84 248.62
85 251.70
86 258.38
87 255.78
88 261.95
89 261.10
90 270.04
91 272.72
92 274.00
93 276.86
94 275.03
95 284.16
96 281.62
97 284.36
98 292.16
99 293.10
100 298.82
And the C++ question is?

As the x values are just the integers 0 to 100 with none missing, only one vector is actually needed here, to store the y values.

#include <fstream>
#include <iostream>
#include <vector>

int main() {
	std::ifstream infile("linedata.txt");

	if (!infile)
		return (std::cout << "Cannot open file\n"), 1;

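	// x is simply the index 0..100, so only the y values are stored.
	std::vector<double> vecy;

	// Each line is "x y": the first extraction reads x into y, the second overwrites it with the actual y.
	for (double y {}; infile >> y >> y; vecy.push_back(y));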

	std::cout << vecy.size() << " points read\n";
}
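
Once the data is read, the fit itself is short. Below is a minimal sketch of an lsqfit() using the usual closed-form least-squares formulas for y = ax + b. The LineFit struct and the two-vector signature are my own assumptions; the assignment doesn't say how a and b should be returned.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type: the assignment does not prescribe how a and b are returned.
struct LineFit {
    double a;  // slope
    double b;  // intercept
};

// Least-squares fit of y = a*x + b.
// Assumes X and Y have equal length, at least two points,
// and x values that are not all identical (otherwise sxx is zero).
LineFit lsqfit(const std::vector<double>& X, const std::vector<double>& Y) {
    const std::size_t n = X.size();
    const double N = static_cast<double>(n);

    // Means of x and y.
    double sumX = 0.0, sumY = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sumX += X[i];
        sumY += Y[i];
    }
    const double meanX = sumX / N;
    const double meanY = sumY / N;

    // Sums of squared deviations and cross-deviations about the means.
    double sxx = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double dx = X[i] - meanX;
        sxx += dx * dx;
        sxy += dx * (Y[i] - meanY);
    }

    const double a = sxy / sxx;          // slope
    const double b = meanY - a * meanX;  // intercept
    return {a, b};
}
```

With the single-vector version above, x is just the index, so you could either fill an X vector with 0, 1, 2, ... or replace X[i] by static_cast<double>(i) inside the loops.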

a = 2.99625
b = -2.02447
R2 = 0.999471

R2 is not itself a correlation coefficient. In general it is one minus the normalised squared error in the approximation, which may or may not be linear. In the linear case it does work out as the square of the correlation coefficient, as @againtry points out below, so you need your lecturer to clarify which is wanted. It's also meaningless to talk about the standard deviations of a and b, since you only have one value of each.
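
That said, if the lecturer actually means the standard errors of the fitted parameters (one common reading of "standard deviations of a and b", and purely an assumption on my part), they can be estimated from the residuals. The sketch below also computes R2 as one minus the residual sum of squares over the total sum of squares. The FitStats struct and the function name are made up for illustration, and the formulas assume independent errors of equal variance.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type, for illustration only.
struct FitStats {
    double a, b;            // slope and intercept
    double sigmaA, sigmaB;  // standard errors of slope and intercept
    double r2;              // coefficient of determination
};

// Assumes equal-length vectors, more than two points,
// and x values (and y values) that are not all identical.
FitStats lsqfitWithErrors(const std::vector<double>& X, const std::vector<double>& Y) {
    const std::size_t n = X.size();
    const double N = static_cast<double>(n);

    double meanX = 0.0, meanY = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        meanX += X[i];
        meanY += Y[i];
    }
    meanX /= N;
    meanY /= N;

    double sxx = 0.0, sxy = 0.0, syy = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double dx = X[i] - meanX;
        const double dy = Y[i] - meanY;
        sxx += dx * dx;
        sxy += dx * dy;
        syy += dy * dy;  // total sum of squares
    }

    const double a = sxy / sxx;
    const double b = meanY - a * meanX;

    // Residual sum of squares about the fitted line.
    double ssRes = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double r = Y[i] - (a * X[i] + b);
        ssRes += r * r;
    }

    const double s2 = ssRes / (N - 2.0);  // residual variance, n - 2 degrees of freedom
    const double sigmaA = std::sqrt(s2 / sxx);
    const double sigmaB = std::sqrt(s2 * (1.0 / N + meanX * meanX / sxx));
    const double r2 = 1.0 - ssRes / syy;

    return {a, b, sigmaA, sigmaB, r2};
}
```

For the three points (0,0), (1,1), (2,1) this gives a = 0.5, b = 1/6 and R2 = 0.75, which is easy to check by hand.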
@jordan
Combining the two previous posts with the following should give you a bit more to go on. The first step is to read the data in and get the means.

The mathsisfun site is helpful too (https://www.mathsisfun.com/data/correlation.html). It has explanations and sample data if you look up mean etc.
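
A minimal sketch of that first step, assuming the two-vector layout from the hint. The names readPoints and mean are just illustrative choices, not anything required by the assignment.

```cpp
#include <cstddef>
#include <istream>
#include <numeric>
#include <vector>

// Reads whitespace-separated "x y" pairs until extraction fails.
// Returns the number of points read.
std::size_t readPoints(std::istream& in, std::vector<double>& X, std::vector<double>& Y) {
    double x = 0.0, y = 0.0;
    while (in >> x >> y) {
        X.push_back(x);
        Y.push_back(y);
    }
    return X.size();
}

// Arithmetic mean; assumes v is not empty.
double mean(const std::vector<double>& v) {
    return std::accumulate(v.begin(), v.end(), 0.0) / static_cast<double>(v.size());
}
```

To use it, open the file with std::ifstream infile("linedata.txt") (which needs <fstream>), check that it opened, then call readPoints(infile, X, Y) followed by mean(X) and mean(Y).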

No. of records: 101
mean_x = 50
mean_y = 147.788
Least squares best fit is y = 2.99625x + -2.02447
 (y is the weekly balance and x is the week number)
Std dev'n x: 29.1548 
Std dev'n y: 87.3781
Correlation: 0.999735 ( see https://www.mathsisfun.com/data/correlation.html 
-  which is R and not R^2, so if required answer is R^2 then square it.) 
Program ended with exit code: 0
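
The standard deviations above look like population values (dividing by n rather than n - 1): for x = 0..100 that gives sqrt(850), which is about 29.1548. Here is a rough sketch of how those two quantities and the correlation can be computed. The function names are my own, and this is not the exact program that produced the output above.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Population standard deviation (divides by n, not n - 1). Assumes v is not empty.
double populationStdDev(const std::vector<double>& v) {
    const double N = static_cast<double>(v.size());
    double m = 0.0;
    for (double value : v) m += value;
    m /= N;

    double ss = 0.0;
    for (double value : v) ss += (value - m) * (value - m);
    return std::sqrt(ss / N);
}

// Pearson correlation coefficient R (not R^2).
// Assumes equal-length, non-empty vectors with non-constant x and y.
double correlation(const std::vector<double>& X, const std::vector<double>& Y) {
    const std::size_t n = X.size();
    const double N = static_cast<double>(n);

    double meanX = 0.0, meanY = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        meanX += X[i];
        meanY += Y[i];
    }
    meanX /= N;
    meanY /= N;

    double sxx = 0.0, syy = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double dx = X[i] - meanX;
        const double dy = Y[i] - meanY;
        sxx += dx * dx;
        syy += dy * dy;
        sxy += dx * dy;
    }
    return sxy / std::sqrt(sxx * syy);
}
```

Squaring the value returned by correlation() gives R^2 for a straight-line fit.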