WAV files can be mono or stereo (in fact, they can have many channels). In the stereo case, the samples are interleaved like this: { Left channel at time 0, Right channel at time 0, Left channel at time 1, Right channel at time 1, Left channel at time 2, Right channel at time 2, etc. }
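As a minimal sketch of that ordering, assuming the samples have already been decoded into a flat Python list of integers (the `deinterleave` helper is a made-up name for illustration, not part of any library):

```python
def deinterleave(samples, num_channels=2):
    """Split an interleaved sample list into one list per channel."""
    # Channel c owns every num_channels-th value, starting at offset c.
    return [samples[c::num_channels] for c in range(num_channels)]

# Interleaved stereo data: L0, R0, L1, R1, L2, R2
interleaved = [10, -10, 20, -20, 30, -30]
left, right = deinterleave(interleaved)
print(left)   # [10, 20, 30]
print(right)  # [-10, -20, -30]
```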
does that mean there are two samples in the sample2? 
Yes, one for each channel. Note that the terminology varies. By "sample", the link you mentioned means a point in time. In other words, in stereo there are two channels (left and right) at every point in time, and one value for each channel.
In your example,
0x24 0x17 is the 16-bit value of the left channel at time 2 ("sample 2")
and 0x1e 0xf3 is the 16-bit value of the right channel at time 2.
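To turn those bytes into numbers, here is a small sketch using Python's standard `struct` module. It assumes the usual PCM WAV layout of signed 16-bit little-endian values, which your question doesn't state explicitly, so check the format fields in your file's header. The variable names are just for illustration.

```python
import struct

# The four bytes of one stereo frame ("sample 2"): left channel first, then right.
frame = bytes([0x24, 0x17, 0x1E, 0xF3])

# "<hh" = two signed 16-bit integers, little-endian (assumed, as is usual for PCM WAV).
left, right = struct.unpack("<hh", frame)
print(left, right)  # 5924 -3298
```

Under that assumption the low byte comes first, so 0x24 0x17 is read as 0x1724.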
Hope that makes sense, let me know if it doesn't.
Is it possible to get volume/time information from the sample? 
This requires some calculation.
For pure sine waves, the volume is easy. It's just the amplitude of the sine wave, the A in A*sin(t). But for the real signals you'll encounter, you need to do what's called "loudness metering". One simple method is to define a window that collects samples (say, a few hundred milliseconds' worth) and then take the maximum value in that window.
You slide this window along the signal so that it covers k consecutive samples at a time, i.e. the range [N, N + k - 1].
For example, if your signal is {0, 3, 4, 9, 4, 5, 7, 5, 4, 3, 2, 1, 0 },
and the window size is 3 (the * marks the center of the window),
{0, 3, 4, 9, 4, 5, 7, 5, 4, 3, 2, 1, 0 }
[.  *  .] > window centered at time 1: max = 4. Volume is 4.
{0, 3, 4, 9, 4, 5, 7, 5, 4, 3, 2, 1, 0 }
      [.  *  .] > window centered at time 3: max = 9. Volume is 9.

(Note: for samples that go negative, take the absolute value first.)
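A rough sketch of this sliding-window peak method, assuming the signal is a plain list of numbers for a single channel. The function name `sliding_peak` is my own, and the window here advances one sample at a time; in practice you may want to hop by a larger step.

```python
def sliding_peak(signal, window_size):
    """Return the peak absolute value of each window of `window_size` samples."""
    peaks = []
    # Slide the window one sample at a time over every full window position.
    for start in range(len(signal) - window_size + 1):
        window = signal[start:start + window_size]
        # Absolute value so that negative swings count as loudness too.
        peaks.append(max(abs(x) for x in window))
    return peaks

signal = [0, 3, 4, 9, 4, 5, 7, 5, 4, 3, 2, 1, 0]
print(sliding_peak(signal, 3))
# [4, 9, 9, 9, 7, 7, 7, 5, 4, 3, 2]
```

Entry i of the result is the volume of the window that starts at sample i, which is the window centered at time i + 1 when the window size is 3.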
The Stack Overflow link below gives alternatives to the above method that can be more accurate. Look at where the answerer talks about RMS (root mean square).
The two StackExchange links might explain it better than I can:
https://dsp.stackexchange.com/questions/46147/howtogetthevolumelevelfrompcmaudiodata
https://stackoverflow.com/questions/8282394/findingthevolumeofawavatagiventime
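For comparison with the peak method, here is a minimal sketch of a windowed RMS. This is only the general idea of root mean square (square each sample, average, take the square root), not necessarily the exact procedure in the linked answers, and `window_rms` is a name I made up.

```python
import math

def window_rms(signal, window_size):
    """Return the RMS value of each window of `window_size` samples."""
    result = []
    for start in range(len(signal) - window_size + 1):
        window = signal[start:start + window_size]
        # Squaring removes the sign, so no separate absolute value is needed.
        mean_square = sum(x * x for x in window) / window_size
        result.append(math.sqrt(mean_square))
    return result

print(window_rms([2, -2, 2, -2], 2))  # [2.0, 2.0, 2.0]
```

Unlike the peak, the RMS averages over the whole window, so a single spike has less influence on the result.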
The time of a sample follows directly from its index in your array: divide the index by the sampling rate.
If you're sampling at 1000 Hz and the 1st sample is at 0 seconds, then the 2nd sample is at 1/1000 seconds, the 3rd sample is at 2/1000 seconds, etc.
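In code, that calculation might look like the sketch below. Both helper names are hypothetical. The second one assumes an interleaved array like the stereo layout described at the top, where each point in time occupies `num_channels` consecutive entries.

```python
def sample_time(index, sample_rate_hz):
    """Time in seconds of the sample at `index` (0-based) in a single-channel array."""
    return index / sample_rate_hz

def interleaved_time(array_index, sample_rate_hz, num_channels=2):
    """Time in seconds of an entry in an interleaved multi-channel array."""
    # All channels of one point in time share the same frame index.
    frame_index = array_index // num_channels
    return frame_index / sample_rate_hz

print(sample_time(0, 1000))       # 0.0
print(sample_time(2, 1000))       # 0.002
print(interleaved_time(5, 1000))  # 0.002 (right channel of frame 2)
```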
Edit: Fixed mistakes.