Reading from a non-blocking socket

In the thread https://cplusplus.com/forum/beginner/285198/ kbw(9472) presented the following function for reading from a socket:

int readSocket( std::string &sockResponseString) {
	// use lambdas as local functions
	auto can_read = [](int s) -> bool {
		fd_set read_set;
		FD_ZERO(&read_set);
		FD_SET(s, &read_set);
		struct timeval timeout {};
		int rc = select(s + 1, &read_set, NULL, NULL, &timeout);
		return (rc == 1) && FD_ISSET(s, &read_set);
	};
	auto do_read = [&sockResponseString](int s) -> bool {
		// don't need non-blocking checks, code works with both blocking
		// and non-blocking sockets as select() says we're good to read
		// char buf[BUFSIZ];
		char buf[8];  // test if reading small chunks functions
		int nbytes = recv(s, buf, sizeof(buf), 0);
		if (nbytes <= 0)
			return false;
		sockResponseString += std::string(buf, static_cast<size_t>(nbytes));
		return true;
	};

	sockResponseString.clear();
	bool done{};
	while (!done) {
		// keep looping until first read
		if (can_read(Master_sfd) && do_read(Master_sfd)) {
			// then return once all the buffered input is read
			while (!done) {
				// stop when nothing is pending or the read fails (e.g. peer closed)
				if (!can_read(Master_sfd) || !do_read(Master_sfd))
					done = true;
			}
		}
	}
	return static_cast<int>(sockResponseString.size());
}

This function seems to be returning exactly what is needed but somewhere there must be an error (or I am doing something wrong).

std::string rts;

cout << "Timestamp rts.c_str():  [" << rts.c_str() << "]" << endl;
bytes_read = readSocket( rts);                                    // 1: Server sends the timestamp
cout << "Timestamp rts.c_str():  [" << rts.c_str() << "]" << endl;
cout << "Timestamp rts        :  [" << rts         << "]" << endl;
printf( "Timestamp rts.c_str(): [%s]\n", rts.c_str());

produces this output:
Timestamp rts.c_str(): []
Timestamp rts.c_str(): [BaseX:15347635952017]
Timestamp rts        : [BaseX:15347635952017
Timestamp rts.c_str(): [BaseX:15347635952017]

The output from line 6, which streams rts directly, is missing the closing ']'.

The results of the reading assignments look good at first sight, but further processing shows that there is an error somewhere.

In the Eclipse debugger, I have tried to display both 'sockResponseString' as used inside the function and 'rts' as an array.
After right-clicking these variables, the following error is displayed:
Multiple errors reported.

1) Failed to execute MI command:
-var-create - * (*((sockResponseString)+0)@1)
Error message from debugger back end:
No symbol "operator+" in current context.

2) Unable to create variable object

3) Failed to execute MI command:
-data-evaluate-expression (*((sockResponseString)+0)@1)
Error message from debugger back end:
No symbol "operator+" in current context.

4) Failed to execute MI command:
-var-create - * (*((sockResponseString)+0)@1)
Error message from debugger back end:
No symbol "operator+" in current context.

I can't find any errors, and I don't know how to deal with these messages.

I tried to change
sockResponseString += std::string(buf, static_cast<size_t>(nbytes));
into
std::string app = std::string(buf, static_cast<size_t>(nbytes));
sockResponseString.append(app);

but that didn't help.

Any idea how I can solve this?

Ben
I don't see any way that printing an std::string and then another character could cause the later character to not appear, but I'll entertain it for the sake of argument. What's rts.size()? It should be 20.
DizzyDon also wondered in https://cplusplus.com/forum/beginner/285273/ what could be the cause of this strange behaviour.
rts.size() returns 21 but I count 20 characters. My guess is that rts[20] is the terminator?
Hmm... See, the thing about the console is that it's supposed to work as if you were printing using an actual paper printer. You can send special characters to stdout to get the device on the other end to do different things, such as overwrite the previous character, but if you send the character ']' to stdout, regardless of what came previously the next character written should be ']'. It would be different if you tried to print "hello\b's bells", but control characters can only ever overwrite previously printed characters, not characters printed later.

This program:
#include <iostream>
#include <string>

int main(){
    std::string s = "hello";
    s += (char)0;
    std::cout << "[" << s << "]" << std::endl;
    return 0;
}
prints "[hello ]". Are you sure you didn't miss anything in the output?
I don't see any way that printing an std::string and then another character could cause the later character to not appear


If the previous char was ESC, it's possible that future chars will be taken to be part of an escape sequence to determine the console behaviour. eg ESC[D might move the cursor back. Might not apply here, but worth knowing about.
I changed
cout << "Timestamp rts.size() : [" << rts.size() << "]" << endl;
cout << "Timestamp rtsc_str() : [" << rts.c_str() << "]" << endl;
cout << "Timestamp rts        : [" << rts         << "]" << endl;
printf( "Timestamp rts.c_str(): [%s]\n", rts.c_str());

to
cout << "Timestamp rts.size() : <<[" << rts.size() << "]>>" << endl;
cout << "Timestamp rtsc_str() : <<[" << rts.c_str() << "]>>" << endl;
cout << "Timestamp rts        : <<[" << rts         << "]>>" << endl;
printf( "Timestamp rts.c_str(): <<[%s]>>\n", rts.c_str());


In the Eclipse console, output changed to
Timestamp rts.size() : <<[20]>>
Timestamp rtsc_str() : <<[BaseX:2105729244235]>>
Timestamp rts        : <<[BaseX:2105729244235
Timestamp rtsc_str() : <<[BaseX:2105729244235]>>


But when I use Ctrl-C/Ctrl-V to copy the output to this reply, only this result is pasted:
Timestamp rts.size() : <<[20]>>
Timestamp rtsc_str() : <<[BaseX:2105729244235]>>
Timestamp rts        : <<[BaseX:2105729244235


Could this mean that some 'hidden' character is inserted after the end of the third line?
Try printing the ASCII codes for each char in the string. Something like:

for (unsigned char c : rts)
    std::cout << int(c) << ' ';


You'll then be able to determine what the final char is in rts.
> Could this mean that some 'hidden' character is inserted after the end of the third line?

Examine the contents of the string. For example, with:

#include <iostream>
#include <string>
#include <iomanip>

std::ostream& debug_dump( const std::string& str, std::ostream& stm = std::cout )
{
    stm << str.size() << " bytes: " << " [ " << std::hex << std::setfill('0') ;
    for( unsigned int byte : str ) stm << std::setw(2) << byte << ' ' ;
    return stm << ']' ;
}

int main()
{
    const std::string rts = "hello world\b\032\001" ;
    std::cout << std::quoted(rts) << '\n' ;
    debug_dump(rts) << '\n' ;
}

bytes_read = readSocket( rts); 
debug_dump(rts) << '\n' ;


21 bytes:  [ 42 61 73 65 58 3a 31 30 39 34 34 35 35 35 36 36 35 30 32 37 00 ]
bytes_read:           21
Realm:timestamp =       [BaseX:10944555665027]

Looks normal to me?

Ben
The 00 at the end should not appear within the std::string.
0 is a valid char within a std::string - unlike a c-style string where it signifies its end. That's why .c_str() shows OK but displaying rts itself doesn't. Somewhere within the code that builds rts you're adding an extra 0 to the end.

NB It looks like recv() receives 0 as the last byte. Before the final return statement in readSocket (L36 of the listing) you might want to remove the trailing 0, e.g.:

while (!sockResponseString.empty() && sockResponseString.back() == '\0')
    sockResponseString.pop_back();
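
The difference between what size() counts and what c_str()-based output shows can be checked directly. The following is only a sketch; the helper names strip_trailing_nuls and has_embedded_nul are made up for illustration and do not come from the thread.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: drop any trailing NUL bytes from a received string.
inline void strip_trailing_nuls(std::string& s) {
    while (!s.empty() && s.back() == '\0')
        s.pop_back();
}

// Hypothetical helper: true when the string holds a NUL byte that
// printf("%s") or operator<< on c_str() would silently cut off.
// strlen() stops at the first NUL, size() counts every byte.
inline bool has_embedded_nul(const std::string& s) {
    return std::strlen(s.c_str()) != s.size();
}
```

Calling has_embedded_nul(rts) right after readSocket(rts) would be one way to spot the extra byte before it causes trouble further on.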

According to the Client/server protocol this convention is used to describe all traffic on the socket:
{...}: utf8 strings or raw data, suffixed with a \00 byte. To avoid confusion with this end-of-string byte, all transferred \00 and \FF bytes are prefixed by an additional \FF byte.
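
The escaping rule quoted above could be sketched roughly as follows. This is an illustration built only from that one sentence of the protocol description, not the BaseX client code; encode_field and decode_field are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// Encode one field: prefix every 0x00 and 0xFF byte with 0xFF,
// then append the terminating 0x00.
std::string encode_field(const std::string& raw) {
    std::string out;
    out.reserve(raw.size() + 1);
    for (unsigned char c : raw) {
        if (c == 0x00 || c == 0xFF)
            out.push_back(static_cast<char>(0xFF));
        out.push_back(static_cast<char>(c));
    }
    out.push_back('\0');
    return out;
}

// Decode one field from 'wire', starting at 'pos'.
// On success, stores the unescaped bytes in 'field', moves 'pos' past the
// terminator and returns true. Returns false (leaving 'pos' untouched)
// when the terminator has not been received yet.
bool decode_field(const std::string& wire, std::size_t& pos, std::string& field) {
    field.clear();
    std::size_t i = pos;
    while (i < wire.size()) {
        unsigned char c = static_cast<unsigned char>(wire[i++]);
        if (c == 0x00) {          // unescaped 0x00 ends the field
            pos = i;
            return true;
        }
        if (c == 0xFF) {          // escape byte: take the next byte literally
            if (i == wire.size())
                return false;     // escaped byte not received yet
            c = static_cast<unsigned char>(wire[i++]);
        }
        field.push_back(static_cast<char>(c));
    }
    return false;
}
```

Because std::string stores its length separately, it can carry the embedded \00 and \FF bytes without converting to a byte array first, as long as the data is never passed through strlen() or c_str()-based output.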

And this is the sequence that has to be sent to the server for successful authentication:
{username} {md5(md5(username:realm:password) + nonce)}

In one of the C-examples this is written as
BaseX Server expects an authentification sequence:
{username}\0{md5(md5(user:realm:password) + timestamp)}\0
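
As a sketch, that byte sequence can be assembled in a std::string. The function name make_auth_message is hypothetical, and the hash is assumed to be computed elsewhere, since the C++ standard library has no MD5 function.

```cpp
#include <string>

// Hypothetical helper. 'hash' is assumed to already hold the hex digest
// md5(md5(user:realm:password) + timestamp), computed by some MD5 library.
std::string make_auth_message(const std::string& username, const std::string& hash) {
    std::string msg;
    msg.reserve(username.size() + hash.size() + 2);
    msg += username;
    msg.push_back('\0');   // terminator after {username}
    msg += hash;
    msg.push_back('\0');   // terminator after the hash
    return msg;
}
```

When sending, the length has to come from msg.size(), for example send(fd, msg.data(), msg.size(), 0). Using strlen(msg.c_str()) would stop at the first \0 and transmit only the username.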

This means that I have to deal with terminating zeros all the time. To make things worse, after successful authentication, sequences that are sent to or received from the server can contain embedded \00 and \FF bytes.
From my experiences in R, I learned that Unix and Windows handle these zeros in different ways. The constant attention I had to pay to the zeros eventually led me to the decision to convert all strings to byte arrays first and then only send byte arrays over the socket. When I see how many problems the zeros cause me during the authentication process alone, I'm seriously considering using this approach in the C++ version of the client as well.

So far I am very happy with everyone else's comments. I have already learned a lot about C++.

Ben
PS. Through the R-developers mailing list I heard that when using non-blocking sockets my approach was doomed to fail from the start. With a non-blocking socket you have to build in a function between sending the authentication data and requesting the status byte that waits until the server prepares the status byte in the socket.
It turned out that this function was already present in the readSocket function.
bool wait(int s) {
  bool done{};
  // Poll until the socket becomes readable. The zero timeout makes select()
  // return immediately, so this loop busy-waits.
  while (!done) {
    fd_set read_set;
    FD_ZERO(&read_set);
    FD_SET(s, &read_set);
    struct timeval timeout {};  // value-initialized to zero, no memset needed
    int rc = select(s + 1, &read_set, NULL, NULL, &timeout);
    done = (rc == 1) && FD_ISSET(s, &read_set);
  }
  return done;
}
@JLBorges
Thanks to your debug_dump function, I was able to inspect the content of the timestamp (rts) that was read. It showed that rts ended with a \00. After removing that zero I have finally been able to authenticate.
Thanks.

@kbw(9472)
The readSocket function does exactly what is needed and for me it was a good introduction to using lambda functions.
I still don't understand why I can't display the contents of sockResponseString as an array. But I'll report that to the appropriate Eclipse forum.
Let me know if you want to be informed of the result.


I still don't understand why I can't display the contents of sockResponseString as an array.


??? sockResponseString is a std::string - not an array. How are you trying to display its contents?
One of the options in the Eclipse CDT debugger view is to display the c_str() as array. Selecting that option (right-click a variable -> Display As Array) results in the errors as shown in the first item in this thread.