Class ConnectionManager

java.lang.Object
org.htmlparser.http.ConnectionManager

public class ConnectionManager extends Object
Handles proxies, password protected URLs and request properties including cookies.
  • Field Details

    • mDefaultRequestProperties

      protected static Hashtable mDefaultRequestProperties
      Default request header fields. So far these are just "User-Agent" and "Accept-Encoding".
    • mRequestProperties

      protected Hashtable mRequestProperties
      Request header fields.
    • mProxyHost

      protected String mProxyHost
      The proxy server name.
    • mProxyPort

      protected int mProxyPort
      The proxy port number.
    • mProxyUser

      protected String mProxyUser
      The proxy user name.
    • mProxyPassword

      protected String mProxyPassword
      The proxy user password.
    • mUser

      protected String mUser
      The user name for accessing the URL.
    • mPassword

      protected String mPassword
      The user password for accessing the URL.
    • mCookieJar

      protected Hashtable mCookieJar
      Cookie storage, a hashtable (by site or host) of vectors of Cookies. This will be null if cookie processing is disabled (default).
    • mMonitor

      protected ConnectionMonitor mMonitor
      The object to be notified prior to and after each connection.
    • mRedirectionProcessingEnabled

      protected boolean mRedirectionProcessingEnabled
      Flag determining if redirection processing is being handled manually.
    • mFormat

      protected static SimpleDateFormat mFormat
      Cookie expiry date format for parsing.
  • Constructor Details

    • ConnectionManager

      public ConnectionManager()
      Create a connection manager.
    • ConnectionManager

      public ConnectionManager(Hashtable properties)
      Create a connection manager with the given connection properties.
      Parameters:
      properties - Name/value pairs to be added to the HTTP request.
  • Method Details

    • getDefaultRequestProperties

      public static Hashtable getDefaultRequestProperties()
      Get the current default request header properties. A String-to-String map of header keys and values. These fields are set by the parser when creating a connection.
      Returns:
      The default set of request header properties that will currently be used.
    • setDefaultRequestProperties

      public static void setDefaultRequestProperties(Hashtable properties)
      Set the default request header properties. A String-to-String map of header keys and values. These fields are set by the parser when creating a connection. Some of these can be set directly on a URLConnection, e.g. If-Modified-Since is set with setIfModifiedSince(long), but since the parser transparently opens the connection on behalf of the developer, these properties are not available before the connection is fetched. Setting these request header fields affects all subsequent connections opened by the parser. For more direct control, create a URLConnection, massage it the way you want, and then set it on the parser.

      From RFC 2616 Hypertext Transfer Protocol -- HTTP/1.1:

       5.3 Request Header Fields
      
          The request-header fields allow the client to pass additional
          information about the request, and about the client itself, to the
          server. These fields act as request modifiers, with semantics
          equivalent to the parameters on a programming language method
          invocation.
      
              request-header = Accept                   ; Section 14.1
                             | Accept-Charset           ; Section 14.2
                             | Accept-Encoding          ; Section 14.3
                             | Accept-Language          ; Section 14.4
                             | Authorization            ; Section 14.8
                             | Expect                   ; Section 14.20
                             | From                     ; Section 14.22
                             | Host                     ; Section 14.23
                             | If-Match                 ; Section 14.24
                             | If-Modified-Since        ; Section 14.25
                             | If-None-Match            ; Section 14.26
                             | If-Range                 ; Section 14.27
                             | If-Unmodified-Since      ; Section 14.28
                             | Max-Forwards             ; Section 14.31
                             | Proxy-Authorization      ; Section 14.34
                             | Range                    ; Section 14.35
                             | Referer                  ; Section 14.36
                             | TE                       ; Section 14.39
                             | User-Agent               ; Section 14.43
      
          Request-header field names can be extended reliably only in
          combination with a change in the protocol version. However, new or
          experimental header fields MAY be given the semantics of request-
          header fields if all parties in the communication recognize them to
          be request-header fields. Unrecognized header fields are treated as
          entity-header fields.
       
      Parameters:
      properties - The new set of default request header properties to use. This affects all subsequently created connections.
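      As a rough illustration of the String-to-String map described above, the sketch below builds such a table and copies it onto a plain JDK URLConnection before it is connected. It uses only JDK classes; the class name DefaultHeadersSketch, its methods and the "ExampleCrawler/1.0" value are placeholders invented for this example, not part of the library.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;
import java.util.Hashtable;
import java.util.Map;

class DefaultHeadersSketch {
    /** Builds a String-to-String header table like the one this setter accepts. */
    static Hashtable<String, String> sampleDefaults() {
        Hashtable<String, String> properties = new Hashtable<>();
        properties.put("User-Agent", "ExampleCrawler/1.0"); // placeholder value
        properties.put("Accept-Encoding", "gzip, deflate");
        return properties;
    }

    /** Copies every header onto a connection that has not been connected yet. */
    static URLConnection apply(String address, Hashtable<String, String> properties) {
        try {
            // openConnection() only creates the object; no network traffic happens here.
            URLConnection connection = URI.create(address).toURL().openConnection();
            for (Map.Entry<String, String> header : properties.entrySet()) {
                connection.setRequestProperty(header.getKey(), header.getValue());
            }
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

      With the library itself, a table built this way would presumably be passed to setDefaultRequestProperties(Hashtable) so that every later connection picks it up.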
    • getRequestProperties

      public Hashtable getRequestProperties()
      Get the current request header properties. A String-to-String map of header keys and values, excluding proxy items, cookies and URL authorization.
      Returns:
      The request header properties for this connection manager.
    • setRequestProperties

      public void setRequestProperties(Hashtable properties)
      Set the current request properties. Replaces the current set of fixed request properties with the given set. This does not replace the Proxy-Authorization property, which is constructed from the values given to setProxyUser(java.lang.String) and setProxyPassword(java.lang.String), or the Authorization property, which is constructed from the values given to setUser(java.lang.String) and setPassword(java.lang.String). Nor does it replace the Cookie property, which is constructed from the current cookie jar.
      Parameters:
      properties - The new fixed properties.
    • getProxyHost

      public String getProxyHost()
      Get the proxy host name, if any.
      Returns:
      Returns the proxy host.
    • setProxyHost

      public void setProxyHost(String host)
      Set the proxy host to use.
      Parameters:
      host - The host to use for proxy access. Note: You must also set the proxy port.
    • getProxyPort

      public int getProxyPort()
      Get the proxy port number.
      Returns:
      Returns the proxy port.
    • setProxyPort

      public void setProxyPort(int port)
      Set the proxy port number.
      Parameters:
      port - The proxy port. Note: You must also set the proxy host.
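      The host and port only make sense as a pair. How this class applies them internally is not described here; as a point of comparison, the JDK expresses the same pairing with java.net.Proxy. The sketch below is an assumption-laden illustration of that JDK mechanism, and the class name ProxySketch and the host "proxy.example.com" are made up for the example.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;

class ProxySketch {
    /** Pairs a proxy host and port the way the JDK models an HTTP proxy. */
    static Proxy httpProxy(String host, int port) {
        // createUnresolved avoids a DNS lookup until the proxy is actually used.
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }
}
```

      A proxy built like this can be handed to URL.openConnection(Proxy) when working with the JDK directly.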
    • getProxyUser

      public String getProxyUser()
      Get the user name for proxy authorization, if any.
      Returns:
      Returns the proxy user, or null if no proxy authorization is required.
    • setProxyUser

      public void setProxyUser(String user)
      Set the user name for proxy authorization.
      Parameters:
      user - The proxy user name. Note: You must also set the proxy password.
    • getProxyPassword

      public String getProxyPassword()
      Get the proxy user's password.
      Returns:
      Returns the proxy password.
    • setProxyPassword

      public void setProxyPassword(String password)
      Set the proxy user's password.
      Parameters:
      password - The password for the proxy user. Note: You must also set the proxy user.
    • getUser

      public String getUser()
      Get the user name to access the URL.
      Returns:
      Returns the username that will be used to access the URL, or null if no authorization is required.
    • setUser

      public void setUser(String user)
      Set the user name to access the URL.
      Parameters:
      user - The user name for accessing the URL. Note: You must also set the password.
    • getPassword

      public String getPassword()
      Get the URL user's password.
      Returns:
      Returns the URL password.
    • setPassword

      public void setPassword(String password)
      Set the URL user's password.
      Parameters:
      password - The password for the URL.
    • getCookieProcessingEnabled

      public boolean getCookieProcessingEnabled()
      Predicate to determine if cookie processing is currently enabled.
      Returns:
      true if cookies are being processed.
    • setCookieProcessingEnabled

      public void setCookieProcessingEnabled(boolean enable)
      Enables or disables cookie processing.
      Parameters:
      enable - if true cookie processing will occur, else cookie processing will be turned off.
    • setCookie

      public void setCookie(Cookie cookie, String domain)
      Adds a cookie to the cookie jar.
      Parameters:
      cookie - The cookie to add.
      domain - The domain to use in case the cookie has no domain attribute.
    • getMonitor

      public ConnectionMonitor getMonitor()
      Get the monitoring object, if any.
      Returns:
      Returns the monitor, or null if none has been assigned.
    • setMonitor

      public void setMonitor(ConnectionMonitor monitor)
      Set the monitoring object.
      Parameters:
      monitor - The monitor to set.
    • getRedirectionProcessingEnabled

      public boolean getRedirectionProcessingEnabled()
      Predicate to determine if url redirection processing is currently enabled.
      Returns:
      true if redirection is being processed manually.
    • setRedirectionProcessingEnabled

      public void setRedirectionProcessingEnabled(boolean enabled)
      Enables or disables manual redirection handling. Normally the HttpURLConnection follows redirections (HTTP response code 3xx) automatically if the followRedirects property is true. With this flag set, the ConnectionManager performs the redirection processing instead, the advantage being that cookies (if enabled) are passed in subsequent requests.
      Parameters:
      enabled - The new state of the redirectionProcessingEnabled property.
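      Manual redirection handling comes down to two steps: recognising a redirect status and resolving the Location header against the URL that was just requested. The sketch below shows those two steps with JDK classes only. The class name RedirectSketch and the exact set of status codes treated as redirects are assumptions for illustration; the library's own choice of codes is not documented here.

```java
import java.net.URI;

class RedirectSketch {
    /** True for the 3xx codes this sketch chooses to follow (an assumed set). */
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    /** Resolves a possibly relative Location value against the requested URL. */
    static String resolve(String requested, String location) {
        return URI.create(requested).resolve(location).toString();
    }
}
```

      For example, resolve("http://example.com/a/b.html", "/login") yields "http://example.com/login", which would be the URL of the follow-up request.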
    • getLocation

      protected String getLocation(HttpURLConnection http)
      Get the Location field if any.
      Parameters:
      http - The connection to get the location from.
    • openConnection

      public URLConnection openConnection(URL url) throws ParserException
      Opens a connection using the given url.
      Parameters:
      url - The url to open.
      Returns:
      The connection.
      Throws:
      ParserException - if an i/o exception occurs accessing the url.
    • encode

      public static final String encode(byte[] array)
      Encodes a byte array into BASE64 in accordance with RFC 2045.
      Parameters:
      array - The bytes to convert.
      Returns:
      A BASE64 encoded string.
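      One likely use of such an encoder is building the Authorization value from a user name and password. The sketch below assumes the HTTP Basic scheme, which this page does not name explicitly, and uses the JDK's java.util.Base64 as a stand-in for encode(byte[]). The class name BasicAuthSketch is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class BasicAuthSketch {
    /** Builds a Basic credentials string from a user name and password. */
    static String basicCredentials(String user, String password) {
        byte[] raw = (user + ":" + password).getBytes(StandardCharsets.ISO_8859_1);
        // Base64.getEncoder() emits no line breaks; for short inputs like this
        // the result matches the RFC 2045 (MIME) encoding.
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }
}
```

      basicCredentials("Aladdin", "open sesame") returns "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==".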
    • fixSpaces

      public String fixSpaces(String url)
      Turn spaces into %20. ToDo: make this more generic (see RFE #1010593 provide URL encoding/decoding utilities).
      Parameters:
      url - The url containing spaces.
      Returns:
      The URL with spaces as %20 sequences.
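      The behaviour described above can be sketched in a single line. This is an illustration of the documented effect, not the library's source, and the class name FixSpacesSketch is made up for the example.

```java
class FixSpacesSketch {
    /** Replaces every space character with the escape sequence %20. */
    static String fixSpaces(String url) {
        return url.replace(" ", "%20");
    }
}
```

      Only spaces are handled; other characters that need escaping in a URL pass through unchanged, which is the limitation the ToDo note refers to.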
    • openConnection

      public URLConnection openConnection(String string) throws ParserException
      Opens a connection based on a given string. The string is either a file, in which case file://localhost is prepended to a canonical path derived from the string, or a URL that begins with one of the known protocol strings, e.g. http://. Embedded spaces are silently converted to %20 sequences.
      Parameters:
      string - The name of a file or a url.
      Returns:
      The connection.
      Throws:
      ParserException - if the string is not a valid url or file.
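      The file-or-URL decision described above might look roughly like the following. This is a sketch under stated assumptions: the list of "known protocol strings" is guessed, the class name LocationSketch is hypothetical, and the real method opens a connection rather than returning a string.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Locale;

class LocationSketch {
    // Assumed list; the protocols the library actually recognises are not given here.
    private static final String[] PROTOCOLS = {"http://", "https://", "ftp://", "file:"};

    /** Turns a file name or URL string into a URL string with spaces escaped. */
    static String toUrlString(String string) {
        String lower = string.toLowerCase(Locale.ROOT);
        for (String protocol : PROTOCOLS) {
            if (lower.startsWith(protocol)) {
                return string.replace(" ", "%20");
            }
        }
        try {
            // Not a known protocol: treat it as a file and use its canonical path.
            String path = new File(string).getCanonicalPath()
                    .replace(File.separatorChar, '/');
            if (!path.startsWith("/")) {
                path = "/" + path; // e.g. Windows drive letters
            }
            return ("file://localhost" + path).replace(" ", "%20");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```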
    • addCookies

      public void addCookies(URLConnection connection)
      Generate an HTTP cookie header value string from the cookie jar.
      The syntax for the header is:
      
          cookie          =       "Cookie:" cookie-version
                                  1*((";" | ",") cookie-value)
          cookie-value    =       NAME "=" VALUE [";" path] [";" domain]
          cookie-version  =       "$Version" "=" value
          NAME            =       attr
          VALUE           =       value
          path            =       "$Path" "=" value
          domain          =       "$Domain" "=" value
      
       
      Parameters:
      connection - The connection being accessed.
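      To make the grammar above concrete, the sketch below assembles a header value from a list of cookies. SimpleCookie is a hypothetical holder invented for this example, not the library's Cookie class, and quoting the $Version, $Path and $Domain values is an assumption about formatting rather than a statement of what addCookies emits.

```java
import java.util.List;

class CookieHeaderSketch {
    /** Hypothetical minimal cookie holder used only by this sketch. */
    static final class SimpleCookie {
        final String name;
        final String value;
        final String path;   // may be null
        final String domain; // may be null

        SimpleCookie(String name, String value, String path, String domain) {
            this.name = name;
            this.value = value;
            this.path = path;
            this.domain = domain;
        }
    }

    /** Builds: cookie-version 1*(";" cookie-value), following the grammar above. */
    static String headerValue(int version, List<SimpleCookie> cookies) {
        StringBuilder header = new StringBuilder();
        header.append("$Version=\"").append(version).append('"');
        for (SimpleCookie cookie : cookies) {
            header.append("; ").append(cookie.name).append('=').append(cookie.value);
            if (cookie.path != null) {
                header.append("; $Path=\"").append(cookie.path).append('"');
            }
            if (cookie.domain != null) {
                header.append("; $Domain=\"").append(cookie.domain).append('"');
            }
        }
        return header.toString();
    }
}
```

      The resulting string is what would follow "Cookie:" in the request, for instance via URLConnection.setRequestProperty("Cookie", value).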
    • addCookies

      protected Vector addCookies(Vector cookies, String path, Vector list)
      Add qualified cookies from cookies into list.
      Parameters:
      cookies - The list of cookies to check (may be null).
      path - The path being accessed.
      list - The list of qualified cookies.
      Returns:
      The list of qualified cookies.
    • getDomain

      protected String getDomain(String host)
      Get the domain from a host.
      Parameters:
      host - The supposed host name.
      Returns:
      The domain (with the leading dot), or null if the domain cannot be determined.
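      One plausible reading of this contract is sketched below: drop the first label of a host name and keep the rest with its leading dot, returning null when no domain part can be identified. The rules for when the domain "cannot be determined" (fewer than three labels, or a numeric address) are assumptions made for illustration, and the class name DomainSketch is hypothetical.

```java
class DomainSketch {
    /** Returns the domain with a leading dot, or null if none can be derived. */
    static String domainOf(String host) {
        if (host == null) {
            return null;
        }
        int firstDot = host.indexOf('.');
        // Assumed rule: need at least three labels, e.g. www.example.com.
        if (firstDot <= 0 || host.indexOf('.', firstDot + 1) < 0) {
            return null;
        }
        // Assumed rule: purely numeric hosts are IP addresses and have no domain.
        if (host.matches("[0-9.]+")) {
            return null;
        }
        return host.substring(firstDot);
    }
}
```

      Under these assumptions "www.example.com" maps to ".example.com", while "localhost" and "192.168.0.1" both map to null.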
    • generateCookieProperty

      protected String generateCookieProperty(Vector cookies)
      Creates the cookie request property value from the list of valid cookies for the domain.
      Parameters:
      cookies - The list of valid cookies to be encoded in the request.
      Returns:
      A string suitable for inclusion as the value of the "Cookie:" request property.
    • parseCookies

      public void parseCookies(URLConnection connection)
      Check the connection for cookies and parse them into the cookie jar.
      Parameters:
      connection - The connection to extract cookie information from.
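      For readers unfamiliar with what parsing a cookie involves, the JDK's java.net.HttpCookie can split a Set-Cookie header line into its parts. The sketch below uses that JDK class as a stand-in; it is not the library's Cookie class, and the class name SetCookieSketch and the sample header are made up for the example.

```java
import java.net.HttpCookie;
import java.util.List;

class SetCookieSketch {
    /** Parses a Set-Cookie header line and returns its first cookie, or null. */
    static HttpCookie first(String headerLine) {
        List<HttpCookie> cookies = HttpCookie.parse(headerLine);
        return cookies.isEmpty() ? null : cookies.get(0);
    }
}
```

      Calling first("Set-Cookie: session=abc123; Path=/docs") gives a cookie whose name is "session", value is "abc123" and path is "/docs".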
    • saveCookies

      protected void saveCookies(Vector list, URLConnection connection)
      Save the cookies received in the response header.
      Parameters:
      list - The list of cookies extracted from the response header.
      connection - The connection (used when a cookie has no domain).