I’m trying to read a Morningstar web page so I can screen-scrape the dividend and capital gains distributions:
http://quotes.morningstar.com/fund/fbiox/f?t=FBIOX
When the page is displayed in a browser, I can hit Ctrl-A, Ctrl-C, paste into Notepad, and the data I want is there. But I want to screen-scrape it!
The problem is that the data I want is not part of the base HTML file. When I right-click and view the page source, the section I want looks like this:
<!-- distribution start-->
<div id="iddistribution">
<div class="gr_section_b2">
<h2><div class="gr_row_b1 mt30" >
<span class="gr_text_subhead" id="Dividend">Dividend and
Capital Gains Distributions</span>
<span class="gr_text_subhead"> <span>FBIOX</span></span>
</div>
</h2>
<div id="DividendAndCaptical" class="gr_section_b1">
</div>
</div>
</div>
In the browser, the data is displayed inside <div id="DividendAndCaptical" class="gr_section_b1">, but in the page source that <div> is empty.
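To confirm this from code rather than by eye, here is a minimal sketch that pulls out whatever sits between the opening tag of that <div> and the next </div>, and checks whether it is blank. ExtractDivBody and IsBlank are hypothetical helper names of my own, not part of WinINet, and the search is deliberately naive: it assumes the target <div> contains no nested <div> elements, which holds for the empty placeholder shown above.

```cpp
#include <cstddef>
#include <string>

// Copies the text between the opening tag of the <div> carrying the given id
// and the next "</div>" into body. Naive on purpose: it assumes the div holds
// no nested <div> elements. Returns false if the id or a closing tag is missing.
bool ExtractDivBody(const std::string& html, const std::string& id,
                    std::string& body)
{
    const std::string marker = "id=\"" + id + "\"";
    const std::size_t pos = html.find(marker);
    if (pos == std::string::npos) return false;

    // End of the opening tag that contains the id attribute.
    const std::size_t openEnd = html.find('>', pos);
    if (openEnd == std::string::npos) return false;

    // First closing tag after the opening tag.
    const std::size_t close = html.find("</div>", openEnd + 1);
    if (close == std::string::npos) return false;

    body = html.substr(openEnd + 1, close - openEnd - 1);
    return true;
}

// True when the string contains nothing but whitespace.
bool IsBlank(const std::string& s)
{
    return s.find_first_not_of(" \t\r\n") == std::string::npos;
}
```

Calling ExtractDivBody(WebPage, "DividendAndCaptical", body) on the downloaded page and then IsBlank(body) should tell me whether the text I received has the distribution data in it or only the empty placeholder.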
Here is the C++ code I’m using to read in the page and, for tracing only, print it to the console:
string WebPage = "";
sprintf(URL, "http://quotes.morningstar.com/fund/fbiox/f?t=FBIOX");
OpenAddress = InternetOpenUrlA(hWeb, URL, NULL, 0,
    INTERNET_FLAG_RELOAD | INTERNET_FLAG_DONT_CACHE, 0);
//INTERNET_FLAG_PRAGMA_NOCACHE|INTERNET_FLAG_KEEP_CONNECTION , 0 );
if (OpenAddress != NULL)   // InternetOpenUrlA returns NULL on failure
{
    // Read the response in 4096-byte chunks until no more bytes arrive.
    while (InternetReadFile(OpenAddress, DataReceived,
               4096, &NumberOfBytesRead) && NumberOfBytesRead)
    {
        // Append exactly the bytes read. Writing a terminator at
        // DataReceived[NumberOfBytesRead] would overrun a 4096-byte buffer
        // whenever a full chunk is returned.
        WebPage.append(DataReceived, NumberOfBytesRead);
    }
    InternetCloseHandle(OpenAddress);
}
printf("%s\n", WebPage.c_str());
1) Can InternetOpenUrlA open the page with the contents of that <div> filled in?
2) If so, what flags do I need to use?
3) If not, is there any other way to read in the data I want?