dea.albums
Class ChkSite

java.lang.Object
  extended by javax.servlet.GenericServlet
      extended by javax.servlet.http.HttpServlet
          extended by dea.albums.BaseServlet
              extended by dea.albums.ChkSite
All Implemented Interfaces:
IDebugLevels, java.io.Serializable, Servlet, ServletConfig

public class ChkSite
extends BaseServlet

See Also:
Serialized Form

Field Summary
 
Fields inherited from class dea.albums.BaseServlet
display
 
Fields inherited from interface dea.common.IDebugLevels
BASIC, DEBUG_ALL, ERROR, METH_DETAIL, METH_ENTER, METH_EXIT, METH_EXT, METH_GET, METH_SET, METH_VARS, SHOW_PASS
 
Constructor Summary
ChkSite()
           
 
Method Summary
 java.lang.String chkLine(java.lang.String tag, java.lang.String line, java.net.URL thisConn)
          If line contains the tag then returns a string of the URL pointed to.
 java.lang.String getBase(java.net.URL thisConn)
          Returns a String representing the parent directory of the URL.
 java.lang.String getConn(java.net.URL thisConn)
          Returns a String representing the protocol, host and port of the URL.
 int getDownloaded()
           
 int getHad()
           
 java.lang.String getLine(java.io.BufferedReader reader)
          Wait up to 100 * maxRetries for data to be available
 int getLinks()
           
 int getOffSite()
           
 int getScanned()
           
 void getSite()
          This method loops through the queue until it has processed every URL encountered.
 void logBad(LinkInfo li, java.lang.String msg)
          write bad links to log and
static void main(java.lang.String[] args)
           
 boolean mkdirs(java.lang.String s)
          Creates subdirectories as needed to complete path.
 void parse(java.lang.String[] args)
          parses the command line arguments.
 java.lang.String performTask(HttpServletRequest request, HttpServletResponse response)
          This servlet loads a site and makes note of which links are bad.
 void process(java.lang.String[] args)
           
 boolean scanSite(java.lang.String url, int curLvl)
          Save the file pointed to by url to disk.
 void setup()
          Used to setup vars, files and directories.
 
Methods inherited from class dea.albums.BaseServlet
callPage, emptyToNull, getParameter, init, service
 
Methods inherited from class javax.servlet.http.HttpServlet
doDelete, doGet, doHead, doOptions, doPost, doPut, doTrace, getLastModified, service
 
Methods inherited from class javax.servlet.GenericServlet
destroy, getInitParameter, getInitParameterNames, getServletConfig, getServletContext, getServletInfo, getServletName, init, log, log
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ChkSite

public ChkSite()
Method Detail

getLinks

public int getLinks()

getScanned

public int getScanned()

getDownloaded

public int getDownloaded()

getHad

public int getHad()

getOffSite

public int getOffSite()

process

public void process(java.lang.String[] args)

performTask

public java.lang.String performTask(HttpServletRequest request,
                                    HttpServletResponse response)
                             throws java.io.IOException,
                                    ServletException,
                                    java.lang.Exception
This servlet loads a site and makes note of which links are bad.

Parameters:
request - HttpServletRequest
response - HttpServletResponse
Throws:
java.io.IOException
ServletException
java.lang.Exception

mkdirs

public boolean mkdirs(java.lang.String s)
               throws java.io.IOException
Creates subdirectories as needed to complete path.

Parameters:
s - String representing the directory path that needs to be created.
Returns:
success / failure
Throws:
java.io.IOException - if unable to create directory
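A minimal sketch of how such a method could behave, assuming it is built on java.io.File.mkdirs(). The class and method names (MkdirsSketch, ensureDirs) are hypothetical, and reading the contract as "return true on success, throw IOException on failure" is an assumption, not something this page confirms.

```java
import java.io.File;
import java.io.IOException;

class MkdirsSketch {
    /**
     * Creates the directory named by path, including any missing parents.
     * Returns true when the directory exists afterwards and throws
     * IOException when it cannot be created (assumed contract).
     */
    static boolean ensureDirs(String path) throws IOException {
        File dir = new File(path);
        if (dir.isDirectory()) {
            return true; // already present, nothing to do
        }
        // File.mkdirs() returns false on failure and also when the directory
        // was created concurrently, so re-check before reporting an error.
        if (!dir.mkdirs() && !dir.isDirectory()) {
            throw new IOException("unable to create directory: " + path);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        String tmp = System.getProperty("java.io.tmpdir");
        File nested = new File(new File(tmp, "chksite-demo"), "a" + File.separator + "b");
        System.out.println(ensureDirs(nested.getPath()));
    }
}
```

Calling the method a second time with the same path returns true without touching the file system.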

parse

public void parse(java.lang.String[] args)
           throws java.lang.Exception
parses the command line arguments.

Parameters:
args - A string array of the command line arguments.
Throws:
java.lang.Exception

logBad

public void logBad(LinkInfo li,
                   java.lang.String msg)
write bad links to log and


setup

public void setup()
           throws java.lang.Exception
Used to setup vars, files and directories.

Throws:
java.lang.Exception

getLine

public java.lang.String getLine(java.io.BufferedReader reader)
Wait up to 100 * maxRetries for data to be available

Returns:
string read from Reader
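The wait-and-retry behaviour described above might look roughly like the following sketch. It assumes the "100" is a pause in milliseconds and that maxRetries is a plain int; neither is stated on this page. The names GetLineSketch, readLineWithRetries and pauseMillis are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class GetLineSketch {
    /**
     * Polls the reader until data is available, pausing between polls.
     * Returns the next line, or null if nothing arrived within
     * maxRetries polls, the stream is at its end, or an error occurred.
     */
    static String readLineWithRetries(BufferedReader reader, int maxRetries, long pauseMillis) {
        try {
            for (int attempt = 0; attempt < maxRetries; attempt++) {
                // ready() only promises that some data can be read without
                // blocking; readLine() may still wait for the line terminator.
                if (reader.ready()) {
                    return reader.readLine();
                }
                Thread.sleep(pauseMillis);
            }
        } catch (IOException e) {
            return null; // treated as "no data" in this sketch
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return null;
        }
        return null;
    }

    public static void main(String[] args) {
        BufferedReader reader = new BufferedReader(new StringReader("HTTP/1.0 200 OK\n"));
        System.out.println(readLineWithRetries(reader, 5, 100));
    }
}
```

The total wait is bounded by maxRetries * pauseMillis, which matches the "100 * maxRetries" wording if the pause is 100 ms.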

getSite

public void getSite()
This method loops through the queue until it has processed every URL encountered. It retries (up to RETRIES times) on NoRouteToHostException, UnknownHostException, FileNotFoundException and SocketException. If it reaches RETRIES attempts or gets any other type of exception, it logs the URL as unreachable.
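The queue-and-retry loop can be sketched as below. This is an illustration under assumptions, not the ChkSite source: the Fetcher interface stands in for the per-URL work, the queue holds plain strings, and unreachable URLs are collected in a list instead of being written to a log. All names in the block are hypothetical.

```java
import java.io.FileNotFoundException;
import java.net.SocketException;
import java.net.UnknownHostException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class QueueRetrySketch {
    /** Hypothetical stand-in for the per-URL download and scan step. */
    interface Fetcher {
        void fetch(String url) throws Exception;
    }

    /** Processes every queued URL and returns the ones given up on. */
    static List<String> drain(Deque<String> queue, Fetcher fetcher, int retries) {
        List<String> unreachable = new ArrayList<>();
        while (!queue.isEmpty()) {
            String url = queue.removeFirst();
            boolean done = false;
            for (int attempt = 1; attempt <= retries && !done; attempt++) {
                try {
                    fetcher.fetch(url);
                    done = true;
                } catch (SocketException | UnknownHostException | FileNotFoundException e) {
                    // Retryable: fall through to the next attempt.
                    // NoRouteToHostException is a subclass of SocketException,
                    // so it is covered by this clause as well.
                } catch (Exception e) {
                    break; // any other exception: give up immediately
                }
            }
            if (!done) {
                unreachable.add(url);
            }
        }
        return unreachable;
    }

    public static void main(String[] args) {
        Deque<String> queue = new ArrayDeque<>(
                Arrays.asList("http://ok.example/", "http://down.example/"));
        List<String> bad = drain(queue, url -> {
            if (url.contains("down")) {
                throw new UnknownHostException(url);
            }
        }, 3);
        System.out.println("unreachable: " + bad);
    }
}
```

In a real crawl the fetch step would also push newly found links onto the same queue, so the loop ends only once no unvisited URLs remain.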


scanSite

public boolean scanSite(java.lang.String url,
                        int curLvl)
                 throws java.lang.Exception
Save the file pointed to by url to disk.
If the URL is of type text/html, also scans the text for links and pushes those links onto the queue.

Parameters:
url - String representing the URL to get / scan
curLvl - int of how many levels deep we are currently at.
Returns:
true if url downloads OK
Throws:
java.lang.Exception - any exception encountered, to be handled by the caller
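The save-then-scan flow might be sketched as follows. To stay runnable without a network, the sketch takes an already opened InputStream and a content type instead of a URL, ignores curLvl, and finds links with a simple regular expression for quoted href values. ScanSketch, saveAndScan and the ISO-8859-1 decoding are assumptions made for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScanSketch {
    // Matches href="..." in any letter case, with optional spaces around '='.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    /** Saves body to target; for text/html also queues every quoted href value. */
    static boolean saveAndScan(InputStream body, String contentType,
                               Path target, Deque<String> queue) throws IOException {
        Files.copy(body, target, StandardCopyOption.REPLACE_EXISTING);
        if (contentType != null && contentType.toLowerCase().startsWith("text/html")) {
            String html = new String(Files.readAllBytes(target), StandardCharsets.ISO_8859_1);
            Matcher m = HREF.matcher(html);
            while (m.find()) {
                queue.addLast(m.group(1)); // raw link text, not yet made absolute
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        byte[] page = "<a href=\"next.html\">next</a>".getBytes(StandardCharsets.ISO_8859_1);
        Path target = Files.createTempFile("chksite-demo", ".html");
        Deque<String> queue = new ArrayDeque<>();
        saveAndScan(new ByteArrayInputStream(page), "text/html", target, queue);
        System.out.println(queue);
    }
}
```

Non-HTML content such as images is still saved, but nothing is added to the queue.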

getBase

public java.lang.String getBase(java.net.URL thisConn)
Returns a String representing the parent directory of the URL.

Parameters:
thisConn - URL of the page this link came from.
Returns:
String representing the parent dir of the URL

getConn

public java.lang.String getConn(java.net.URL thisConn)
Returns a String representing the protocol, host and port of the URL.

Parameters:
thisConn - URL of the page this link came from.
Returns:
String representing the protocol, host and port of the URL
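To illustrate what getConn and getBase plausibly return, here is a sketch built on the standard java.net.URL accessors. The exact string format is an assumption (for instance, whether the port is omitted when the URL does not specify one, or whether the base ends with a slash), and the names UrlPartsSketch, connOf, baseOf and parse are hypothetical.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

class UrlPartsSketch {
    /** Protocol, host and (if explicitly given) port, e.g. http://host:8080 */
    static String connOf(URL url) {
        String conn = url.getProtocol() + "://" + url.getHost();
        if (url.getPort() != -1) { // getPort() is -1 when no port was specified
            conn += ":" + url.getPort();
        }
        return conn;
    }

    /** connOf(url) plus the path up to and including the last slash. */
    static String baseOf(URL url) {
        String path = url.getPath();
        int slash = path.lastIndexOf('/');
        String dir = slash >= 0 ? path.substring(0, slash + 1) : "/";
        return connOf(url) + dir;
    }

    /** Convenience parser that turns the checked exception into an unchecked one. */
    static URL parse(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URL page = parse("http://host:8080/parent.path/page.html");
        System.out.println(connOf(page));
        System.out.println(baseOf(page));
    }
}
```

Relative links found on a page can then be appended to the base, while links starting with a slash are appended to the connection string.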

chkLine

public java.lang.String chkLine(java.lang.String tag,
                                java.lang.String line,
                                java.net.URL thisConn)
                         throws java.io.NotActiveException
If line contains the tag then returns a string of the URL pointed to. For example, searching for href in the line <A HREF="dir/file.name"> returns http://host/parent.path/dir/file.name
Note: assumes the data is in the form <tag parm= "val" param =val>. For now, links to /, .. and # are ignored since they are generally back buttons and internal page links.

Parameters:
tag - tag / parm string to search for, as in `A HREF` or `src`. tag is converted to lower case before the comparison.
line - string to search in. line is converted to lower case before the comparison.
thisConn - URL of the page this link came from.
Throws:
java.io.NotActiveException - if it can not resolve a URL starting with ../
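The lookup described above can be approximated with the following sketch. It is not the ChkSite implementation: it takes a java.net.URI for the page, resolves links with URI.resolve (which handles ../ itself, so the NotActiveException case is not reproduced), and treats exactly "/", exactly ".." and anything starting with "#" as ignored, which is one possible reading of the note. The page name index.html and the names ChkLineSketch, extractLink and indexOfIgnoreCase are hypothetical.

```java
import java.net.URI;

class ChkLineSketch {
    /** Case-insensitive indexOf that keeps indexes valid for the original text. */
    static int indexOfIgnoreCase(String text, String needle) {
        for (int i = 0; i + needle.length() <= text.length(); i++) {
            if (text.regionMatches(true, i, needle, 0, needle.length())) {
                return i;
            }
        }
        return -1;
    }

    /** Returns the absolute URL named by attr in line, or null if absent or ignored. */
    static String extractLink(String attr, String line, URI page) {
        int at = indexOfIgnoreCase(line, attr);
        if (at < 0) return null;
        int eq = line.indexOf('=', at + attr.length());
        if (eq < 0) return null;
        int start = eq + 1;
        while (start < line.length() && Character.isWhitespace(line.charAt(start))) start++;
        if (start >= line.length()) return null;
        int end;
        char first = line.charAt(start);
        if (first == '"' || first == '\'') {
            start++; // quoted value: read up to the matching quote
            end = line.indexOf(first, start);
            if (end < 0) return null;
        } else {
            end = start; // bare value: read up to whitespace or '>'
            while (end < line.length()
                    && !Character.isWhitespace(line.charAt(end))
                    && line.charAt(end) != '>') {
                end++;
            }
        }
        String value = line.substring(start, end);
        if (value.isEmpty() || value.equals("/") || value.equals("..") || value.startsWith("#")) {
            return null; // back links and in-page anchors are skipped
        }
        // URI.resolve throws IllegalArgumentException for malformed values.
        return page.resolve(value).toString();
    }

    public static void main(String[] args) {
        URI page = URI.create("http://host/parent.path/index.html");
        System.out.println(extractLink("href", "<A HREF=\"dir/file.name\">", page));
    }
}
```

With the page URI shown in main, the example line from the description resolves to http://host/parent.path/dir/file.name.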

main

public static void main(java.lang.String[] args)


Copyright © 2001-2005 Round Mountain Rescue Ranch. All Rights Reserved.