Strange behavior when passing values

I ran into a strange problem when running my program.

My process accepts TCP connections and then reads data from the socket. It had worked fine for months on other servers. As a test, I ran it on a new server. At first, everything was OK. Six or seven hours later, it stopped working!
So I attached gdb to look into it. I found that something strange happened when executing the following code:
iRet = NBlockTcpRead(pstClient->iSock,
  (char*)(&pstClient->stPkg) + pstClient->iRecvLen,
  sizeof(CSPKGHEAD) - pstClient->iRecvLen);

and NBlockTcpRead is declared as:
 
int	CZoneCon::NBlockTcpRead(int iSockFd,char *pRecvBuff,int iRecvLen)
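For readers unfamiliar with the pattern, the call above resumes reading a fixed-size header at the offset already received. The body of NBlockTcpRead is not shown in this thread, so the following is only a minimal sketch under assumptions: PkgHead (a 16-byte stand-in for CSPKGHEAD), Client, ReadAvailable, and ContinueHeaderRead are all hypothetical names, and the asserts show one way to catch bad arguments early in a debug build.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical 16-byte header; the real CSPKGHEAD layout is not shown in the thread.
struct PkgHead { unsigned char bytes[16]; };

// Hypothetical per-client state, mirroring the fields used in the post.
struct Client {
    int iSock;
    int iRecvLen;   // header bytes received so far
    PkgHead stPkg;
};

// Hypothetical stand-in for NBlockTcpRead: reads what is available without blocking.
// Returns bytes read, 0 if nothing is available right now, -1 on error or peer close.
int ReadAvailable(int fd, char* buf, int len) {
    assert(fd >= 0);      // a negative descriptor here would point at a corrupted argument
    assert(len >= 0);
    if (len == 0) return 0;
    ssize_t n = recv(fd, buf, static_cast<size_t>(len), MSG_DONTWAIT);
    if (n > 0) return static_cast<int>(n);
    if (n == 0) return -1;  // peer closed the connection
    if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR) return 0;
    return -1;
}

// Returns 1 once the full header has arrived, 0 if more data is needed, -1 on error.
int ContinueHeaderRead(Client& c) {
    const int headLen = static_cast<int>(sizeof(PkgHead));
    assert(c.iRecvLen >= 0 && c.iRecvLen <= headLen);
    int n = ReadAvailable(c.iSock,
                          reinterpret_cast<char*>(&c.stPkg) + c.iRecvLen,
                          headLen - c.iRecvLen);
    if (n < 0) return -1;
    c.iRecvLen += n;
    return c.iRecvLen == headLen ? 1 : 0;
}
```

If asserts like these fire only on one machine while the same binary runs cleanly elsewhere, that supports the idea that the arguments are being corrupted outside the program's own logic.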


I checked pstClient->iSock before executing NBlockTcpRead: it showed pstClient->iSock = 4, and sizeof(CSPKGHEAD) - pstClient->iRecvLen = 16.
But when I stepped into NBlockTcpRead, gdb showed that iSockFd was some negative number and iRecvLen = 0. After the function had executed, I checked pstClient again, and pstClient->iSock was still 4!

A coworker also told me that when he ran a process on this server, he took a pointer to an object and saved it in a variable, but when he later used the variable, its value was different from the pointer he had assigned. We checked the code together and were sure the variable had never been reassigned between the assignment and the use. That process also runs correctly on the other servers.

Why? Has anybody ever run into problems like this?
I'd blame hardware instability. If the exact same code* has been running for months on various servers, but fails after a few hours on a new server, it's likely the memory or the CPU are either faulty or overheating, and generating wrong results.

*"exact same code"=="the same binary file that hasn't been rebuilt during those months"
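One rough way to look for this kind of fault from user space is to write a known pattern into a large buffer, read it back, and count mismatches. The sketch below is only an illustration: CountPatternMismatches is a hypothetical helper, and a short run like this mostly exercises the cache and a small slice of RAM, so a clean result proves little. A dedicated tool such as memtest86+ run for several hours is far more reliable.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: fills a buffer with a position-dependent pattern,
// reads it back, and returns the number of words that do not match.
// On healthy hardware the result should be 0.
std::size_t CountPatternMismatches(std::size_t words, std::uint64_t pattern) {
    std::vector<std::uint64_t> buf(words);
    // volatile keeps the compiler from optimizing the stores and loads away.
    volatile std::uint64_t* p = buf.data();
    for (std::size_t i = 0; i < words; ++i) {
        p[i] = pattern ^ static_cast<std::uint64_t>(i);
    }
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < words; ++i) {
        if (p[i] != (pattern ^ static_cast<std::uint64_t>(i))) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Running it in a loop with several patterns (for example 0xAAAAAAAAAAAAAAAA and 0x5555555555555555) while the machine is under load gives a slightly better chance of catching heat-related errors.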
Thanks for your reply.
The binary is rebuilt for every product version, but I can confirm that neither the code nor the files related to this process have changed in a long time.
Also, we have run this process (exact same code) on the other four servers for 2 days, and it works well.

I suspect hardware instability too, and I'm glad to hear you reached the same conclusion. So I will ask the operations engineer to reboot the server. He would be angry if I asked him to do something that turned out to be useless ^_^

Hope this is useful!

I plan to reboot the server when