Crawls through URLs, extracting unique email addresses as it goes.
This version only follows href attributes, and does
limited relative-path to full-path conversion,
i.e. it does not properly follow relative links containing ../
I will post a new version soon that extracts all
URLs and has better relative path support.
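To illustrate the relative-path handling that this version lacks, here is a minimal sketch in Python (not the language of the posted code). It assumes the standard library's urljoin is an acceptable way to resolve links containing ../; the helper name resolve_links, the regular expression, and the sample URLs are hypothetical and not taken from the post.

```python
import re
from urllib.parse import urljoin

# Hypothetical pattern: captures the value of quoted href attributes only.
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)

def resolve_links(page_url, html):
    """Return an absolute URL for every quoted href found in html."""
    # urljoin resolves ../ segments against the page's own URL.
    return [urljoin(page_url, href) for href in HREF_RE.findall(html)]

# Made-up sample page with one ../ link and one plain relative link.
sample_html = '<a href="../about.html">About</a> <a href="contact.html">Contact</a>'
links = resolve_links("http://example.com/docs/guide/index.html", sample_html)
print(links)
```

A regular expression is used here only because the post describes a regex-based approach; a real HTML parser would cope better with unquoted or malformed attributes.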
Code is heavily commented and shows how to use:
Original Author: RegX
Just give it a starting URL and press Go.
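The overall flow (start from one URL, follow href links, collect unique addresses) can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the posted code: the function crawl, the fetch callback, both regular expressions, and the in-memory PAGES dictionary are all hypothetical. The 5000 figure is the cap on cached URLs mentioned in this post.

```python
import re
from collections import deque
from urllib.parse import urljoin

MAX_CACHED_URLS = 5000  # cap on remembered URLs, as mentioned in the post

# Hypothetical patterns; the posted code's actual expressions are not shown here.
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)
EMAIL_RE = re.compile(r'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}')

def crawl(start_url, fetch, max_urls=MAX_CACHED_URLS):
    """Breadth-first crawl; fetch(url) returns HTML text or None."""
    seen = {start_url}
    queue = deque([start_url])
    emails = set()
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # page could not be retrieved
        # Lower-case addresses so duplicates differing only in case collapse.
        emails.update(m.lower() for m in EMAIL_RE.findall(html))
        for href in HREF_RE.findall(html):
            link = urljoin(url, href)
            # Once the cap is reached, newly found links are simply dropped.
            if link.startswith("http") and link not in seen and len(seen) < max_urls:
                seen.add(link)
                queue.append(link)
    return sorted(emails)

# Made-up in-memory "site" standing in for real HTTP requests.
PAGES = {
    "http://example.com/": '<a href="team.html">Team</a> info@example.com',
    "http://example.com/team.html": '<a href="/">Home</a> Ann@Example.com info@example.com',
}
found = crawl("http://example.com/", PAGES.get)
print(found)
```

Passing the fetch function in as a parameter keeps the sketch runnable without network access; a real run would supply a function that downloads the page.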
Must have a reference to
Regular Expressions and the Scripting Runtime.
I recommend Regular Expressions 5.5; you will find the download link in the global declarations.
A list of email addresses that can then be saved/appended to a text file
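As a rough illustration of the save/append step, the Python sketch below appends only addresses that are not already in the output file. The function name append_emails, the file name, and the sample addresses are hypothetical; the posted code may handle this differently.

```python
import os
import tempfile

def append_emails(path, emails):
    """Append addresses not already in the file; return how many were added."""
    existing = set()
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            existing = {line.strip().lower() for line in f if line.strip()}
    # dict.fromkeys removes duplicates within this batch while keeping order.
    new = [e for e in dict.fromkeys(a.lower() for a in emails) if e not in existing]
    with open(path, "a", encoding="utf-8") as f:
        for e in new:
            f.write(e + "\n")
    return len(new)

# Made-up demo using a temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "emails.txt")
first = append_emails(out_path, ["info@example.com", "ann@example.com"])
second = append_emails(out_path, ["ann@example.com", "bob@example.com"])
with open(out_path, encoding="utf-8") as f:
    saved = f.read().split()
print(first, second, saved)
```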
Although I limit cached URLs to 5000, this script can consume quite a bit of memory.
It would be much better to use a DB to cache URLs
and emails (less memory), and this would also allow
the cache to exist between program invocations, but for this simple demo I used listboxes.
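A database-backed cache along the lines suggested above might look like the following Python sketch. It assumes SQLite as the DB, which the post does not specify; the table layout and the names open_cache, remember_url, and remember_email are hypothetical.

```python
import os
import sqlite3
import tempfile

def open_cache(db_path):
    """Open (or create) the cache; PRIMARY KEY enforces uniqueness."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS urls (url TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE IF NOT EXISTS emails (email TEXT PRIMARY KEY)")
    return conn

def remember_url(conn, url):
    """Store url; return True only if it had not been seen before."""
    cur = conn.execute("INSERT OR IGNORE INTO urls (url) VALUES (?)", (url,))
    conn.commit()
    return cur.rowcount == 1

def remember_email(conn, email):
    """Store email (lower-cased); return True only if it is new."""
    cur = conn.execute("INSERT OR IGNORE INTO emails (email) VALUES (?)", (email.lower(),))
    conn.commit()
    return cur.rowcount == 1

# Made-up demo: the second "run" reopens the same file and still knows the URL.
db_path = os.path.join(tempfile.mkdtemp(), "crawl_cache.db")
conn = open_cache(db_path)
first_seen = remember_url(conn, "http://example.com/")
seen_again = remember_url(conn, "http://example.com/")
email_new = remember_email(conn, "Info@Example.com")
conn.close()

conn = open_cache(db_path)  # simulates a later program invocation
after_restart = remember_url(conn, "http://example.com/")
url_count = conn.execute("SELECT COUNT(*) FROM urls").fetchone()[0]
conn.close()
print(first_seen, seen_again, email_new, after_restart, url_count)
```

Because the data lives on disk, only the rows being checked need to be in memory, and a restarted crawl can skip URLs it has already visited.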
About this post
Viewed: 70 times
Posted: 9/3/2020 3:45:00 PM
Size: 5,528 bytes