Extract all images from browser

Posted: Wed Nov 08, 2017 6:48 am
by RaelB
Hello,

Do you have any suggestions on how to extract all the images that have been loaded in the browser? I mean the image content, not just the image names.

Thanks
Rael

Re: Extract all images from browser

Posted: Wed Nov 08, 2017 9:08 am
by salvadordf
Hi,

I've never tried it, but I guess you could do this:
  • Leave GlobalCEFApp.cache blank to use the "in-memory" cache.
  • Use the TChromium.OnBeforeResourceLoad event while the web page is loading and store all the image URLs. Check that the request.ResourceType property is RT_IMAGE.
  • Use the TChromium.StartDownload function to download each stored image URL. The download should be nearly instantaneous because TChromium will already have saved a copy in the memory cache.
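
The steps above could look roughly like the following. This is an untested sketch, not a verified implementation. It assumes a form TForm1 with a TChromium component named Chromium1; the fields FImageURLs and FLock and the method DownloadStoredImages are hypothetical names made up for the example. The event signature is the one used by CEF4Delphi around the time of this thread, and newer releases may declare the callback parameter differently.

```pascal
// Assumed to be created elsewhere (for example in FormCreate):
//   FImageURLs : TStringList;       // Sorted := True; Duplicates := dupIgnore
//   FLock      : TCriticalSection;  // from the SyncObjs unit

// Step 2: remember the URL of every request whose type is RT_IMAGE.
// CEF raises this event outside the main thread, so the list is locked.
procedure TForm1.Chromium1BeforeResourceLoad(Sender: TObject;
  const browser: ICefBrowser; const frame: ICefFrame;
  const request: ICefRequest; const callback: ICefRequestCallback;
  out Result: TCefReturnValue);
begin
  Result := RV_CONTINUE;  // never block the request, only observe it

  if (request <> nil) and (request.ResourceType = RT_IMAGE) then
  begin
    FLock.Acquire;
    try
      // dupIgnore on a sorted list silently drops repeated URLs
      FImageURLs.Add(request.Url);
    finally
      FLock.Release;
    end;
  end;
end;

// Step 3: call this from the main thread once the page has finished loading.
procedure TForm1.DownloadStoredImages;
var
  i    : Integer;
  URLs : TStringList;
begin
  URLs := TStringList.Create;
  try
    // Work on a copy so the lock is not held while downloads are started
    FLock.Acquire;
    try
      URLs.Assign(FImageURLs);
    finally
      FLock.Release;
    end;

    for i := 0 to URLs.Count - 1 do
      Chromium1.StartDownload(URLs[i]);
  finally
    URLs.Free;
  end;
end;
```

Each started download still needs a destination path, which is normally supplied in the TChromium.OnBeforeDownload event. That part is left out here.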
You will miss all of these:
  • Images drawn in a canvas.
  • Images drawn by piling up DIVs with a background color or stretched pixels.
  • Images drawn using unicode characters with fonts loaded from the Internet.


Remember that the SimpleOSRBrowser demo has a "snapshot" button to save the whole web page as an image.

Re: Extract all images from browser

Posted: Thu Nov 09, 2017 2:35 am
by RaelB
Thanks. Interesting strategy. It's not instantaneous if there are a lot of images, but it still works well.

Thanks
Rael