Controlling Internet Explorer

Nov 24, 2011 at 3:32pm
Hello Everyone!

I've tried many ways to download a webpage (wininet, IWebBrowser2, curl), but all of them failed.

I know IWebBrowser2 somehow controls IE. But how can I, using C++, save the source code of a webpage to disk?

Instead of grabbing the mouse and doing it by hand, I'm looking for a program that does the following:

1. Open IE
2. Choose file
3. Choose save as
4. Choose Web Archive (html only)
5. Choose OK

I'm sure this can be done somehow, because I have at my disposal a program that automatically configures the proxy address in the Internet Options tab in IE, so there must be a way to do that.

Any help is appreciated!

(The C++ community always asks to see some effort. Well, you can track the last 5 or so topics of mine!)
Nov 24, 2011 at 3:45pm
Use URLDownloadToFile: http://msdn.microsoft.com/en-us/library/ms775123(VS.85).aspx

It's by far the easiest way. If you want to automate IE, you can, but it requires that you know COM. Do you know COM?
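A minimal sketch of calling URLDownloadToFile might look like the following. The helper name `save_page` and the empty-argument check are assumptions made for illustration, and the non-Windows branch is only a stub so the snippet compiles elsewhere.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>
#include <urlmon.h>
#pragma comment(lib, "urlmon.lib")  // MSVC only; other toolchains must link urlmon themselves
#endif

// Hypothetical helper: saves the resource at `url` to the file `path`.
// Returns true only if URLDownloadToFile reports success.
bool save_page(const std::string& url, const std::string& path) {
    if (url.empty() || path.empty()) {
        return false;  // nothing sensible to request or write
    }
#ifdef _WIN32
    // ANSI variant: no calling COM object, reserved argument 0, no progress callback.
    HRESULT hr = URLDownloadToFileA(nullptr, url.c_str(), path.c_str(), 0, nullptr);
    return SUCCEEDED(hr);
#else
    return false;  // urlmon is a Windows API; this sketch has no fallback
#endif
}
```

On Windows, something like `save_page("http://www.foo.com/home.html", "home.html")` would write the fetched HTML next to the executable, assuming the server answers with the page itself.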
Nov 24, 2011 at 5:56pm
Given your needs, I'd go with webJose's suggestion. It solves the requirements you laid out exactly (assuming step 2 should read something like "navigate to web page").

Automating IE would be total overkill for this purpose. (Though you could always try it for "fun"...)
Last edited on Nov 24, 2011 at 5:56pm
Nov 25, 2011 at 6:24am
I thought I could download a page in html if I gave a URL downloader the link, e.g. www.foo.com or www.foo.com\home.html

I was totally wrong, it didn't work with wininet which I was using to download a file.
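One possible reason for the failure (an assumption, since the failing code isn't shown) is the form of the address: the WinINet and urlmon download functions expect an absolute URL with a scheme such as http:// and forward slashes in the path. A small sketch of cleaning up an address before passing it on, using a hypothetical helper named `normalize_url`:

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: turns a loosely typed address into an absolute URL.
std::string normalize_url(std::string address) {
    // URL paths use forward slashes, so convert any backslashes.
    std::replace(address.begin(), address.end(), '\\', '/');

    // If no scheme is present, assume plain HTTP (an assumption of this sketch).
    if (address.find("://") == std::string::npos) {
        address = "http://" + address;
    }
    return address;
}
```

With this, `www.foo.com\home.html` becomes `http://www.foo.com/home.html`, while an address that already carries a scheme is left alone apart from the slash conversion.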

I switched to looking for a way to download the page, or its source. I used wininet for that too, but I was redirected somewhere else because of JavaScript (I presume; I'm no expert).

So the only solution was to take control of IE, because, for example, when in IE, I can view the source code of any webpage.


What's COM? (No, I don't know COM)
If that's the only solution, how long would it take a student to learn it, given a full load of other courses?

Thanks!
Nov 25, 2011 at 6:46am
COM
http://en.wikipedia.org/wiki/Component_Object_Model

What is COM?
http://www.microsoft.com/com/default.mspx

WebBrowser Control Overviews and Tutorials
http://msdn.microsoft.com/en-us/library/aa752041%28v=vs.85%29.aspx

And try googling "automating internet explorer c++"

Andy
Nov 25, 2011 at 2:04pm
Try the function I gave you to see if it works. If it doesn't because of the redirections, you have to code more yourself in order to detect these redirections. There is no magic function that works for all of us. If you don't know how to detect the redirections, I guess you'll have to search for it (I don't know either as I don't mess with the HTTP protocol like this).
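As a rough illustration of what detecting a redirection could involve at the HTTP level: a redirect response carries a 3xx status code and a Location header naming the new address. The sketch below parses a raw header block for both. The names `RedirectInfo` and `find_redirect` are made up for this example, and it does not cover redirects done by script inside the page, which would need the page content inspected instead.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Result of inspecting a raw HTTP response header block.
struct RedirectInfo {
    bool is_redirect = false;
    std::string location;  // target address, set only when is_redirect is true
};

// Hypothetical helper: reports whether the headers describe an HTTP redirect.
RedirectInfo find_redirect(const std::string& raw_headers) {
    RedirectInfo info;
    std::istringstream in(raw_headers);
    std::string line;

    // The status line looks like "HTTP/1.1 302 Found".
    if (!std::getline(in, line)) return info;
    std::istringstream status(line);
    std::string version;
    int code = 0;
    status >> version >> code;
    const bool is_3xx = (code >= 300 && code < 400);

    std::string location;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) break;  // a blank line ends the header block

        const auto colon = line.find(':');
        if (colon == std::string::npos) continue;

        // Header names are case-insensitive, so compare in lower case.
        std::string name = line.substr(0, colon);
        for (char& c : name) {
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        }
        if (name == "location") {
            const auto start = line.find_first_not_of(" \t", colon + 1);
            if (start != std::string::npos) location = line.substr(start);
        }
    }

    // Treat the response as a redirect only if both parts are present.
    if (is_3xx && !location.empty()) {
        info.is_redirect = true;
        info.location = location;
    }
    return info;
}
```

When `is_redirect` comes back true, the download can be retried against `location`.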

COM, as Andy pointed out, is the Component Object Model, a technology that came to be naturally after OLE and OLE2 (Object Linking and Embedding) and that allows reusing components in binary form. It is probably the most popular way of object sharing and component development, but it is no piece of cake. It takes entire books to understand it.

But for your simple process, you might be able to get this done after reading the basics plus an example or two. Good luck!
Nov 26, 2011 at 12:10pm
Thank you both!
Nov 26, 2011 at 12:56pm
PS Actually, COM came into being "just before" OLE2. OLE2 was implemented using COM, as a replacement for OLE. But the basis for the architecture of COM came from (was inspired by?) OLE, which evolved from (was built on top of?) DDE (Dynamic Data Exchange).

Andy

Object Linking and Embedding
http://en.wikipedia.org/wiki/Object_linking_and_embedding

Dynamic Data Exchange
http://en.wikipedia.org/wiki/Dynamic_data_exchange





Last edited on Nov 26, 2011 at 12:56pm