Sunday, May 15, 2011

Tomcat Architecture (4.x) Detailed


Tomcat is a feature-complete servlet container in which Java servlets and JSP pages can run. In many cases Tomcat is used alongside the Apache web server to serve dynamic content to a web page.

J2EE
Java 2 Platform, Enterprise Edition (J2EE) defines a group of Java-based code libraries (called APIs in the Java world) that are suited to creating web applications for the enterprise (i.e. a large company). J2EE is designed to cope with the large, complex configurations found in large companies. It is used in distributed environments where specific servers perform specific tasks. There are many J2EE APIs; here is a list of some key ones:
Enterprise JavaBeans (EJB) EJB technology provides a simple mechanism for creating distributed business logic components.
Java Message Service (JMS) Provides asynchronous messaging capability to J2EE applications
Java Naming and Directory Interface (JNDI) Enables J2EE applications to communicate with registries and directory services, e.g. LDAP
Servlets Servlets work with special servers called servlet containers to process HTTP requests and send HTTP responses
JavaServer Pages (JSP) JSP is an alternative, HTML-like interface for creating servlets. At runtime, the servlet container converts a JSP into a servlet
Tomcat
Tomcat is a servlet container, and as such it is only required to implement the servlet and JSP APIs; it is therefore not considered a full J2EE application server. The reason to use Tomcat is that it is a cut-down version of the full J2EE environment, suitable when the other APIs are not required.
JSP and Servlets
Servlets are portions of logic written in Java that have a defined form and are invoked to generate content dynamically. A servlet has a defined lifecycle; servlets were created by Sun Microsystems to address the problems with CGI.
JSP is a technology for providing dynamic content. The programmer inserts code that can personalise and secure a web page, and the Java code is executed on the server rather than in the browser. JSP pages are compiled into servlets, which are then kept in memory or on the filesystem indefinitely, until either the memory is needed or the server is restarted. This improves response times because the document has already been parsed and compiled; the result works like a CGI program. The difference between JSP pages and servlets is that servlets are held within the private area of the server, while JSP pages are held within the public area. Jasper translates JSP pages into servlet source code, which a Java compiler such as javac then compiles.
JSP pages are often used like templates, where many JSP pages make up a single web page - see the figure below.


JSP tag extensions encapsulate entire functions which makes the code more readable. Each tag has a corresponding Java class that contains the code that would otherwise appear on the page.
Web application architecture
The set of all the servlets, JSP pages and other files that are logically related composes a web application. The servlet specification defines a standard directory hierarchy in which all of these files must be placed.
/ All publicly accessible files are placed in this directory, e.g. HTML, JSP and GIF files
/WEB-INF Files in this directory are private. A single file, web.xml, called the deployment descriptor, contains configuration options for the web application.
/WEB-INF/classes Web application classes are placed here
/WEB-INF/lib Class files can be archived into a single file, called a JAR file and placed into this directory.
Tomcat installation
Follow the link for the installation guide
Tomcat Architecture
The server is Tomcat itself, an instance of the web application server. It is possible to run two Tomcat servers on the same machine using two different port numbers.
A service groups a container (usually of type engine) with that container's connectors and is the top-level component.
Connectors connect the applications to clients. There are a number of connectors: HTTP, SSL, AJP, WARP, etc.
Engine is a request-processing component that represents the Catalina servlet engine. By examining the HTTP headers it determines which host and context to pass the request to.
Realms reside inside the engine and manage authentication and authorisation. A realm applies across the entire engine, so each engine will have its own realm.
Valves are components that intercept requests and pre-process them. Valves are commonly used to enable single sign-on for all hosts as well as to log request patterns, client IP addresses and server usage patterns.
Listeners listen for significant events in the component they are configured in; for example, a JavaBean could send an email when an event requiring administration is recorded. In other words, when a particular event occurs a certain action can be taken.
Loggers report on the internal state of a component. The default logger can be overridden, giving you separate log files.
Hosts mimic the popular Apache virtual host functionality.
Context is the web application itself.
Using the above information and the diagram below you should have an understanding of the Tomcat architecture.
Fig 1

Main files that make up the tomcat server
server.xml This is the main config file and is read at startup.
server-noexamples.xml This file contains a blank template of server.xml; it is an ideal starting point for your own main config file
tomcat-users.xml This file contains user authentication and role mapping information for setting up a memory realm
web.xml This file is the default deployment descriptor for any web applications that are running on the Tomcat server instance.
catalina.policy Java 2 has a fine-grained security model that enables the administrator to control access to system resources in detail; this file holds the security policy for Tomcat.
How server.xml and web.xml work together
The URL below is parsed by the various components of tomcat.
https://www.datadisk.co.uk/bookstore/buybook/proApache
1. https:// - in this case the SSL connector is used and the request is passed to the engine. (processed by server.xml)

2. www.datadisk.co.uk - is parsed by the engine and one of its hosts is selected. (processed by server.xml)

3. /bookstore - is matched against all the contexts that the host contains and the bookstore web application is selected. (processed by server.xml)

4. /buybook - is matched against the BookPurchase servlet. (processed by web.xml)
5. /proApache - is processed by the servlet. (processed by the servlet)
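The five steps above can be sketched in plain Java. This is only an illustration of how the URL breaks down, not how Tomcat does it internally: the class name RequestUrlBreakdown is made up for this example, and the context path and servlet mapping are passed in by hand, whereas a real container reads them from server.xml and web.xml.

```java
import java.net.URI;

// Illustrative sketch only: splits a request URL into the pieces that the
// connector, host, context and servlet each look at. Not Tomcat code.
class RequestUrlBreakdown {

    // contextPath and servletPath are assumptions supplied by the caller;
    // Tomcat would take them from server.xml and web.xml respectively.
    static String[] breakdown(String url, String contextPath, String servletPath) {
        URI uri = URI.create(url);
        String path = uri.getPath();               // e.g. /bookstore/buybook/proApache
        String prefix = contextPath + servletPath; // e.g. /bookstore/buybook
        if (!path.startsWith(prefix)) {
            throw new IllegalArgumentException("URL does not match the given mappings");
        }
        String pathInfo = path.substring(prefix.length()); // left for the servlet itself
        return new String[] { uri.getScheme(), uri.getHost(), contextPath, servletPath, pathInfo };
    }

    public static void main(String[] args) {
        String[] parts = breakdown(
                "https://www.datadisk.co.uk/bookstore/buybook/proApache",
                "/bookstore", "/buybook");
        String[] labels = { "connector (scheme)", "host", "context", "servlet mapping", "path info" };
        for (int i = 0; i < parts.length; i++) {
            System.out.println(labels[i] + ": " + parts[i]);
        }
    }
}
```

Inside a real servlet the last three pieces correspond to what the servlet API reports as the context path, the servlet path and the path info of the request.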
Advanced Tomcat features
The list below contains some of the more commonly used advanced features of the Tomcat server
  • Access log administration
  • Request Filtering
  • Single sign-on across web applications
  • JNDI and JDBC
  • Realms
Valves are used to intercept client requests (see fig 1); these components are placed inside the <Engine>, <Host> or <Context> elements.
The standard valves are:
Access logging Enable the logging of requests
Single sign-on Enhances the user experience by asking for the password only once.
Request Filtering Enables selective filtering of incoming requests based on IP address or hostname
Request Dump Prints the headers and cookies of incoming requests and outgoing responses to a log
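To make the idea of valves more concrete, here is a small sketch of a chain of interceptors in plain Java. It is a simplified model and not Tomcat's own Valve API: the SimpleValve interface, the ValvePipelineSketch class and the IP addresses are all made up for the example. It shows an access-logging step and a request-filtering step running before the request would reach the web application.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of a valve pipeline; NOT Tomcat's Valve API.
class ValvePipelineSketch {

    /** A valve inspects the request and returns false to stop further processing. */
    interface SimpleValve {
        boolean invoke(String clientIp, String uri, List<String> log);
    }

    /** Access logging: record every request and let it continue. */
    static final SimpleValve ACCESS_LOG = (ip, uri, log) -> {
        log.add(ip + " " + uri);
        return true;
    };

    /** Request filtering: reject requests coming from one denied address. */
    static SimpleValve denyAddress(String deniedIp) {
        return (ip, uri, log) -> !ip.equals(deniedIp);
    }

    /** Runs the valves in order; the request goes through only if every valve accepts it. */
    static boolean process(List<SimpleValve> valves, String clientIp, String uri, List<String> log) {
        for (SimpleValve valve : valves) {
            if (!valve.invoke(clientIp, uri, log)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<SimpleValve> valves = Arrays.asList(ACCESS_LOG, denyAddress("10.0.0.9"));
        System.out.println(process(valves, "192.168.1.5", "/bookstore", log));
        System.out.println(process(valves, "10.0.0.9", "/bookstore", log));
        System.out.println(log);
    }
}
```

Because the logging valve sits first in the list, even rejected requests are recorded, which is one reason the order of valves matters.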
JNDI and JDBC are APIs used for looking up information from a directory service or a database, respectively.
Realms help web application developers implement and enforce specific security policies.
Class loaders
When a Java class is instantiated as an object, that class must be loaded into memory. There are three distinct class loaders:
bootstrap This class loader is used by the JVM to load those Java classes that are necessary for the JVM to function. It is responsible for loading all Java core classes.
extension Is responsible for loading all the classes in one or more extension directories. The extension directory on Sun's JVM is /jdk/jre/lib/ext.
system loader This loader locates its classes in those directories and JAR files specified in the CLASSPATH environment variable.
When a class is required to be loaded it is normally delegated to the parent to fulfil the request. So the bootstrap loader will try to fulfil first and then the extension loader and lastly the system loader.
None of the above class loaders preloads all the classes in its search path. Instead they perform what is known as lazy loading: a class is loaded only when it is requested, which allows for better performance, efficiency and flexibility.
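The parent chain described above can be inspected from any Java program. The sketch below (the class name LoaderChain is made up for this example) walks from the system loader up through its parents. The bootstrap loader is represented by null in the Java API, so the code adds a "bootstrap" label itself. The loader names printed depend on the JVM; on recent JVMs the extension loader has been replaced by a platform loader.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: lists the class loader hierarchy of the running JVM.
class LoaderChain {

    /** Returns the class names of the loaders from the given one up to the bootstrap loader. */
    static List<String> chain(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        while (loader != null) {
            names.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        names.add("bootstrap"); // the bootstrap loader has no Java object; it shows up as null
        return names;
    }

    public static void main(String[] args) {
        // Core classes such as String are loaded by the bootstrap loader, reported as null.
        System.out.println(String.class.getClassLoader());
        System.out.println(chain(ClassLoader.getSystemClassLoader()));
    }
}
```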
Tomcat Class loaders
Tomcat builds on the Java class loaders by adding its own class loaders.

Common class loader Loads classes that are used by Tomcat and are also publicly available to all web applications
Locations:
$CATALINA_HOME/common/lib
$CATALINA_HOME/common/classes
Catalina class loader Loads all Tomcat classes that are specific to Tomcat only (that is, not publicly defined APIs and so on)
Locations:
$CATALINA_HOME/server/lib
$CATALINA_HOME/server/classes
Shared class loader This is where you place your own public classes
Locations:
$CATALINA_HOME/shared/lib
$CATALINA_HOME/shared/classes
Web App class loader This is where each web application loads its own classes
Locations:
$CATALINA_HOME/webapps/<webapp>/WEB-INF/lib
$CATALINA_HOME/webapps/<webapp>/WEB-INF/classes
An important note is that the web application class loader does not use the delegation process that class loaders are encouraged to use. Instead, it tries to load classes itself first, before delegating. This allows a web application to override classes in the shared and common class loaders on a per-web-application basis.
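The "load locally first, then delegate" behaviour can be sketched with a small class loader. This is a simplified illustration and not Tomcat's actual web application class loader: the name ChildFirstClassLoader is made up, and the rule that java.* classes always go to the parent is an assumption added here so that core classes cannot be overridden.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Simplified sketch of a "local first" class loader; not Tomcat's implementation.
class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        // Assumption for this sketch: core Java classes are never overridden locally.
        if (name.startsWith("java.")) {
            return super.loadClass(name, resolve);
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                try {
                    // Look in this loader's own URLs first (e.g. WEB-INF/classes, WEB-INF/lib).
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not found locally: fall back to the normal parent delegation.
                    c = super.loadClass(name, false);
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

A loader like this would be created with the URLs of one web application's WEB-INF/classes directory and WEB-INF/lib JAR files, with the shared loader as its parent.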
HTTP connectors
The HTTP connectors are Java classes that implement the HTTP protocol. The connector class has code to parse the HTTP request and take the required action of either serving up static content or passing the request through the Tomcat servlet engine.
The default HTTP/1.1 Coyote connector below shows the class used and the port on which Tomcat listens for client requests. minProcessors and maxProcessors specify the number of request-processing threads: Tomcat starts with 5 and will run at most 75. enableLookups resolves client IP addresses to hostnames via DNS (turn this off for an increase in performance). Requests that require SSL transport, for example because of a security constraint in the web application, are redirected to port 8443. The rest of the options are largely self-explanatory.
<Connector className="org.apache.coyote.tomcat4.CoyoteConnector"
                    port="8080"
                    minProcessors="5"
                    maxProcessors="75"
                    enableLookups="true"
                    redirectPort="8443"
                    acceptCount="100"
                    debug="0"
                    connectionTimeout="20000"
                    useURIValidationHack="false"
                    disableUploadTimeout="true" />

AJP Connector
The Apache JServ Protocol (AJP) is a packet-oriented, TCP/IP-based protocol. It provides communication between the Apache web server process and the running Tomcat instances. There are various versions of the AJP protocol: 1.2, 1.3 (commonly used) and 1.4.
Some noticeable features of AJP 1.3 are:
  • Good Performance
  • Support for SSL
  • The facility to get information about encryption and client certificates.
The mod_jk module is the component that will redirect all client requests to the Tomcat server.
The Workers
A worker is a Tomcat instance that is running to serve JSP/servlet requests coming from another web server. There can be many workers, which can be used to load balance the requests, especially on a heavily loaded site.
The file workers.properties consists of entries that tell the web server plug-in about the available workers. For load balancing, a worker is created that uses a weighted round-robin algorithm with support for seamless sessions. The load-balancing worker redirects traffic to the other available workers. This has the added bonus that if one worker goes offline, the remaining workers still serve the JSP/servlet requests.
The workers.properties file below creates two workers and a load-balancing worker that uses them to balance the load. The lbfactor can be adjusted to distribute the load proportionally; this may be used when one server is more powerful than the other, or when one server is more heavily loaded than the other.

# Tomcat Worker 1
worker.tomcat1.port=8009
worker.tomcat1.host=localhost
worker.tomcat1.type=ajp13
worker.tomcat1.lbfactor=10
worker.tomcat1.cachesize=5
# Tomcat Worker 2
worker.tomcat2.port=8010
worker.tomcat2.host=localhost
worker.tomcat2.type=ajp13
worker.tomcat2.lbfactor=10
worker.tomcat2.cachesize=5
# Load Balancer Worker
worker.loadbalancer.type=lb
worker.loadbalancer.balanced_workers=tomcat1, tomcat2
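To show how an lbfactor can translate into a share of the traffic, here is a small weighted round-robin sketch in Java. It is not the algorithm mod_jk actually uses; it is one common way to do weighted round-robin (sometimes called "smooth" weighted round-robin), and the class and method names are made up for the example. Removing a worker mimics one Tomcat instance going offline.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch of weighted round-robin selection; NOT mod_jk's implementation.
class WeightedRoundRobin {
    private final Map<String, Integer> weights = new LinkedHashMap<>(); // worker -> lbfactor
    private final Map<String, Integer> current = new LinkedHashMap<>(); // worker -> running score
    private int totalWeight = 0;

    void addWorker(String name, int lbfactor) {
        weights.put(name, lbfactor);
        current.put(name, 0);
        totalWeight += lbfactor;
    }

    /** Takes a worker out of rotation, e.g. because it went offline. */
    void removeWorker(String name) {
        Integer weight = weights.remove(name);
        if (weight != null) {
            current.remove(name);
            totalWeight -= weight;
        }
    }

    /** Picks the next worker: raise every score by its weight, take the highest, then penalise it. */
    String next() {
        String best = null;
        for (Map.Entry<String, Integer> e : weights.entrySet()) {
            int score = current.get(e.getKey()) + e.getValue();
            current.put(e.getKey(), score);
            if (best == null || score > current.get(best)) {
                best = e.getKey();
            }
        }
        if (best == null) {
            throw new IllegalStateException("no workers available");
        }
        current.put(best, current.get(best) - totalWeight);
        return best;
    }

    public static void main(String[] args) {
        WeightedRoundRobin lb = new WeightedRoundRobin();
        lb.addWorker("tomcat1", 10);
        lb.addWorker("tomcat2", 10);
        for (int i = 0; i < 4; i++) {
            System.out.println(lb.next());
        }
    }
}
```

With equal factors, as in the file above, the two workers simply take turns. Giving tomcat1 a factor of 20 and tomcat2 a factor of 10 would send roughly two requests in every three to tomcat1.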
