Creating a new RecordSet Type.

In this tutorial the process of creating a completly new type of RecordSet will be explained. This tutorial will be written while actually implementing the JDBCRecordSet. Hopefully this will result in a document that addresses the questions that come about when acutally implementing estensions to DocHelper.


Step 1.

Define what resource you want to access.

Define what resource your record set should access and how the resource should separate items into Records.

For this RecordSet we want to be able to access jdbc compliant databases and access a specific table to read individual Rows(Records). Each Record is a row in the table. For each row on the table we want to render a line that may or may not show each of the fields. This may seem like a futile exercise, given that we could simply execute a select query that outputs a string and is saved into a file. Its application integrated into a larger documentation process however could prove valuable. Regardless the purpose of the tutorial is to address the processes of extending DocHelper.

The main goals is integrating data in a database table, to documentation that is created using resources outside of the database. It can also prove useful once the rendering mechanism is extracted and made generic. This will allow the user to extract info from a DB to create HTML, a PDF, a set of XML files or a format specified by a custom renderer (like the file renderer)

Step 2.

Create a new Class that implements the RecordSet interface.

This class is in charge of establishing the connection to the DB we will need and of extracting the records when needed. It will also need be in charge of turning rows into records at the time of its choosing. The election on whether to create Records from rows during initialization or during execution should be based on the expected number of rows. For this example we will choose to fetch and turn rows into records at initialization time.

Lets look at the begining of the class, before however, we should be aware that the need to access non jdbc tables later on, should be addressed:

                    /*
* AbstractTableRecordSet.java
*/
package dochelper;

import ...

/**
* The AbstractTableRecordSet is the base class for defining
* Database table record sets. It implements common methodology but lets
* subclasses define connection creation and record retrieval.
*
* @author gchristen
*/
public abstract class AbstractTableRecordSet implements RecordSet {

The class we have made is an abstract class. It is foreseeable that RecordSets may vary depending on the Database type and vendor. Since some of the functionality required by DocHelper will remain identical, this code should be made available to all Table Record Sets. For this exercise we will connect to a MySql table. To connect to our database we need to figure out what is needed: a connection string, a db driver... etc.

Step 3.

Write AbstractTableRecordSet.

To create any new RecordSet, the dochelper.RecordSet interface needs to be implemented.
The methods to implement are:


public void clearPatterns();
public void addPattern(DocPattern pattern);
public void setName(String name);
public ResultNode execute(ResultNode doc) throws ResourceUnavailableException, Exception;


The RecordSet will also need paramters to be set in order for it to work properly. Parameters can be set, checked and the containing object initialized for any object that implements the Configurable interface. We can see the RecordSet interface extends the Configurable interface. Thus, the following methods need to be implemented as well:


public void setParam(String paramName, Object value) throws IllegalParamterTypeException;
public void checkParams() throws InvalidValueException, MissingArgumentException, ResourceUnavailableException;
public void init() throws InitializationException, ResourceUnavailableException;

All of these methods are intended to indicate how to setup and execute a RecordSet within the DocHelper architecture. DocHelper assures that the execution flow follows the order : first you set params, then you add patterns, then you check the params, then the class is initialized and finally its executed.

Each record set should implement how to add and check parameters, add patterns and initialize and execute a RecordSet. For the vast majority of cases adding parameters and patterns should be very similar between record sets. Because of this a params variable is defined within the interface. This provides a public final variable for the implementors to use.

So lets take a look at a possible implementation for our intial methods: 1. setParam(String paramName, Object value)



public void setParam(String paramName, Object value) throws IllegalParamterTypeException {

if(value instanceof String || value instanceof DocumentSection)
params.put(paramName, value);
else
throw new IllegalParamterTypeException(paramName);
}

Parameters in the context of DocHelper, refer to values specified in the <params> section of the Document Definition markup (XML) for this specific Record Set. Here we will want to set the values that we need to establish a connection to the DB, and the values needed to create the query which will fetch the records.

Setting parameters is straight forward. Each <param name="[[name]]">[[value]]<param> in the markup will be guaranteed to result in a call to the setParam method: setParam([[name]],[[value]]);. We want to keep these in a class scope variable to use them later on. The same should apply for AddPattern. To look at the implementation for this method please look at the AbstractTableRecordSet.java file.

2. init()



public void init() throws InitializationException, ResourceUnavailableException {
//call the concrete method
try {
initImpl();
}
catch(SQLException sqe) {
throw new InitializationException(sqe.getMessage());
}
}

The initialization method of a Record Set is meant to prepare it for execution. In this case it is possible that different DB provders and connection mechanisms will require different behaviour. Because of this, this method should delegate the call to inheriting classes. The code above also catches a new type of exception that initImpl methods may throw: SQLException. Exceptions during initialization should be digested and thrown as InitializationExceptions.
3. execute(String doc)

It is debatable wether the execute method should be left as an abstract method in order for inheriting methods to implement. It could be argued that each concrete class could wish to perform different actions during execution. I think that the execution path is well established for most record sets and should remain as it is. I have placed no specific restrictions through architecture or scope because I think this is still subject to debate, a class may validaly wish to override the execution mechanism. For this class the best option is to implement an overridable method in case concrete classes wish to re-write the implementation.


public ResultNode execute(ResultNode doc) throws ResourceUnavailableException {

List<Record> records = this.getRecords();

//Make the RecordSet Node
ResultNode result = new ResultNode(null, params, "RecordSet");

if (records != null) {
//Loop through all the records for this record set
for (Record record : records) {
//add the subtree created by executing this record
result.addChild(record.render(patterns));
}
}

return result;
}

4. getRecords() and checkParams()


The check params method will vary according to implementation. Different connection mechanisms will require deferent parameters. Thus, checking the validity of the passed parameters should be left up to concrete classes.



protected abstract List getRecords() throws ResourceUnavailableException, SQLException;
public void checkParams() throws InvalidValueException, MissingArgumentException, ResourceUnavailableException {
checkParamsImpl();
}



One final method discussed here is getRecords(). This method will fetch the records for the record set for internal use. This method was considered as an addition to the RecordSet interface, however was not included because it is purely for internal use. In DocHelper there is no real need to fetch the record list outside the scope of the RecordSet. Thus adding this method signature to the interface, rather than place an effective restriction, adds a suggestion that is debatable. A RecordSet may perform record fetching in its execute method and then dispose of this list. The records temselves for a particular recordset are not an element that should be reachable from outside of the RecordSet class.<

For the implementation of the AbstractRecordSet however, it makes sense to place the restriction discussed above: make getRecords an abstract method. The AbstractRecordSet class implements its own execute method that uses a call to getRecords to fetch the record list. Each inheriting class may wish to fetch the records in its own way, however initially, all records will be rendered in a similar manner. Regardless of how records are fetched, inheriting sub-classes are required to imlement this method.

Step 4.

Write TableRecord class.

One of the requirements we have placed on inheriting classes of AbstractTableRecordSet is that they be able to fetch Records and convert them into dochelper.Records. From this requirement it is clear that a TableRecord class that implements dochelper.Record interface is needed. This class is very straight forward and is thus presented in full. Its purpose is simply to deliver each item in the resource (in this case a database table) in a helpful manner.



/*
* TableRecord.java
*
* Created on March 16, 2009, 08:32 AM
*
* Copyright 2009 Online Nederland Breedband B.V.
* See the COPYRIGHT file for redistribution and use restrictions.
* Authors: Guillermo Christen
*/
package dochelper;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;

/**
*
* @author gchristen
*/
public class TableRecord implements Record {

/**
* HashMap containing each field within this Record (Row) and its value.
*/
protected HashMap _fieldValues;

/** HashMap containing each field within this Record (Row) and its value. */
protected HashMap _fullFieldValues;

/** HashMap containing each field within this Record (Row) and its value.*/
protected String _name;


/** Creates a new instance of TableRecord
* @param fieldValues
*/
public TableRecord(HashMap fieldValues) {
this._fieldValues = fieldValues;
}

/**
* Creates a New Instance of a Table Record,
* given a set of values and a Document Section
*
* @param fieldValues
* hashMap containing field names and field values for Searched Fields
*
* @param fullFieldValues
* HashMap containing all field names and its values
* @param name
*/
public TableRecord(HashMap fieldValues, HashMap fullFieldValues, String name) {
this(fieldValues);
this._fullFieldValues = fullFieldValues;
this._name = name;
}


/**
* Renders a record using the specified Pattern list.
*
* This method will iterate through all patterns passed in if any, and
* execute the pattern passing the contents of the row this record represents.
*
* In this case the value of the record will not be fetched using getValue.
* It will be fetched directly from the fieldValues has map.
*
* @return
* String containing the contents of the pattern rendering and the
* encapsulation.
**/
public ResultNode execute(ArrayList patterns) {

ResultNode root = new ResultNode(null, _fullFieldValues, _name);

//match all the patterns
if(patterns != null && _fieldValues != null && _fieldValues.size() > 0) {
for (DocPattern pattern : patterns) {
//look for pattern in each of the fields
for(Iterator iter = _fieldValues.keySet().iterator(); iter.hasNext();) {
String value = _fieldValues.get((String)iter.next()).toString();
root.addChild(pattern.execute(value));
}
}
}

return root;

}

/**
* Return the string value of this record. For table record sets this value
* is not used as the string to search against. Each field in a row is checked
* indiidually.
*
* @return String value of the entire record.
*/
public String getValue() {
//loop through all of the keys and values: key:value
return "";
}

}



The TableRecord will look for the given patterns within each of the specified fields. If found, each of the matches will be transformed according to the available TransformDefinition. This means that if within one row (record) two fields have matches, both will be considered as results. For this exercise we will not create a new TransformDefinition in order to handle this in a significant way. This is left as an exercise to the reader.

DocumentSections used with TableRecords have been given access to all of the fields for a specific record, this allows a record to be identified in a useful way. Please look at the Execute step to see an example.

Step 5.

Write JdbcTableRecordSet as concrete implementation.

Now that we have an abstract class to use in varying environments, and that we have a Table Record class to contain record data we can write a concrete class to provide DocHelper with access to jdbc compliant DBs. This method should impement the following methods:

                
/** Subclass should implement this method */
protected abstract void initImpl() throws SQLException, ResourceUnavailableException;

/** Subclass should implement this method */
protected abstract void checkParamsImpl() throws MissingArgumentException, InvalidValueException, ResourceUnavailableException;

/** Subclass should implement this method */
protected abstract List getRecords() throws ResourceUnavailableException, SQLException;


The concrete class should: extablish the connection to the DB, check all required params are present and are valid, and fetch the Rows and convert them into Records. For this we can already see a set of parameters that will be required for jdbc DBs: user, pass, probably url, dirver and components that make up the query. Ideally we would like to search for a pattern within specific fields of a specific table.

The following might be a possible start to our class:


/**
*
* @author gchristen
*/
public class JdbcTableRecordSet extends AbstractTableRecordSet {

/** Param Keys: USER. */
protected String USER = "user";

/** Param Keys: PASS. */
protected String PASS = "pass";

/** Param Keys: URL. */
protected String URL = "url";

/** Param Keys: DRIVER. */
protected String DRIVER = "driver";

/** Param Keys: TABLE. */
protected String TABLE = "tablename";

/** Param Keys: FIELDS. */
protected String FIELDS = "fields";

/** Param Keys: WHERE. */
protected String FILTER = "whereclause";

/** ArrayList containing Records. */
private ArrayList<Record> records = new ArrayList();


Here we declare the parameter names which will hold our info. The <params> section of the markup for our record set should contain <param>s for most of these value. (The where clause is optional). The DB may also allow non authenticated access, so pw may be optional.

Since we now know which params are required we can implement the checkParamsImpl method:


protected void checkParamsImpl() throws MissingArgumentException, InvalidValueException, ResourceUnavailableException {

//Validate the user is set
if(!params.containsKey(USER))
throw new MissingArgumentException(USER);

//Validate the url is set
if(!params.containsKey(URL))
throw new MissingArgumentException(URL);

//Validate the dirver is set
if(!params.containsKey(DRIVER))
throw new MissingArgumentException(DRIVER);

//Validate the table is set
if(!params.containsKey(TABLE))
throw new MissingArgumentException(TABLE);

//Validate the table is set
if(!params.containsKey(FIELDS))
throw new MissingArgumentException(FIELDS);

//check the dirver is accessible
try {
System.out.println("Trying to load class:" + (String)params.get(DRIVER));
Class.forName((String)params.get(DRIVER));
}
catch (ClassNotFoundException cnfe) {
String error = "The driver class specified was not found.";
throw new ResourceUnavailableException(error + cnfe.getMessage());
}

}


In the last section we also want to make sure the driver specified to connect to this DB is loadable. The Class loader may not have access to this driver if it has not been added to the classpath properly.

Once the parameters are checked the Record Set should be initialized. To initialize the class, we need to implement the initImpl method. Here the connection should be established. Also, here we will fetch the records. As mentioned before this action could be taken either upon initialization or when fetching the records. For this example I have opted to fetch the records upon initialization. The initImpl method thus looks like this:



protected void initImpl() throws SQLException, ResourceUnavailableException {

String url = (String)params.get(URL);
String pass = (String)params.get(PASS);
String user = (String)params.get(USER);

Connection conn = DriverManager.getConnection (url, user, pass);

//fetch the records
fetchRecords(conn);

}



The first thing to do is to estabish the connection. The parameters we need are available in the params hashmap. They were placed there when the Document Definition xml markup was processed. Using the user,pass and url params a new connection is created. Next the fetch records method is called:



protected void fetchRecords(Connection conn) throws SQLException {

//Create the query
Statement s = conn.createStatement();
String query = "SELECT * FROM " + (String)params.get(TABLE);

String where = (String)params.get(FILTER);
if (where != null)
query += " WHERE " + where;

System.out.println("Query: " + query);

//execute the query
s.executeQuery(query);


ResultSet rs = s.getResultSet();
String[] fields = ((String)params.get(FIELDS)).split(",");

//make each Record by fetching the fields within it
while (rs.next()) {
//For each row, create a field,value hashmap
HashMap fieldValues = new HashMap();
HashMap fullFieldValues = new HashMap();

ResultSetMetaData rsMeta = rs.getMetaData();

//Create Searched field map
for (int i = 0;i < fields.length;i++) {
fieldValues.put(fields[i], rs.getObject(fields[i]));
}

//create the full fields map
for (int i = 1;i < rsMeta.getColumnCount();i++) {
Object value = rs.getObject(i);
if (value == null) value = new String("null");
fullFieldValues.put(rsMeta.getColumnName(i), value);
}

//make the records
TableRecord record = new TableRecord(fieldValues, fullFieldValues, this.getName() + "Record");

records.add(record);
}

//Close Record Set
rs.close();
s.close();

}



The fetchRecords private method creates the query that will fetch the rows and then proceeds to create each TableRecord, making sure each TableRecord gets a name. For each table record two maps are provided, one with all values, and one with fields to be searched.

At this point the only method we have not yet implemented is the getRecords method. This needs only return the records ArrayList, since the actual fetching happens at initialization time. This is left as an exercise to the reader.

Step 6.

Compile and build.

We now have a JDBC Record Set available for use. Compile the DocHelper by running the _compile ant task. This should just attempt to compile our changes. Once this is done, make sure to place the appropriate jar file in the lib directory that will make the driver you are planning to use available to DocHelper. Run the build_exec_jar ant task.

Step 7.

Use JDBC Record Set.

Lets look at example definition markup to use the new Record Set. Let's say that we wish to look through the following table in a Mysql database called dochelper:


+------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------------+--------------+------+-----+---------+-------+
| id | int(11) | NO | PRI | 0 | |
| fname | char(50) | YES | | NULL | |
| lname | char(50) | YES | | NULL | |
| years_exp | int(11) | YES | | NULL | |
| currentTask | int(11) | YES | | NULL | |
| currentBook | varchar(255) | YES | | NULL | |
| currentBookTheme | varchar(855) | YES | | NULL | |
+------------------+--------------+------+-----+---------+-------+


In the documentation we wish to group book themes for Developers who have more than 4 years of experience. Here we should distiguish doc helper's role as a tool to separate and format content and not a way to filter info in a DB. (We can and should use the DB to do this)

Our filter (more than 4 years of experience) should be given in the where clause. An example document definition markup could look like this:



<?xml version="1.0" encoding="UTF-8"?>
<document>

<!-- Document Definition -->
<documentdef>
<recordsetref name="developerBooks" >
<patternref name="mathSearch" />
<patternref name="socialScienceSearch" />
</recordsetref>

<renderef name="Default" />
</documentdef>

<!-- pattern: Display developers who are working on mathematics and algebra themes -->
<pattern name="mathSearch">
<expression>[Aa]lgebra|[Mm]athematics</expression>
<nodedef class="RenderDefinitionImpl"> Specific Theme: $0\n</nodedef>
</pattern>

<!-- pattern: Display developers who are working on Social Sciences -->
<pattern name="socialScienceSearch">
<expression>[Hh]istory|[Ll]iterature</expression>
<nodedef class="RenderDefinitionImpl"> Specific Theme: $0\n</nodedef>
</pattern>


<!-- Analyze DocHelper DB, table developers [for example use only] -->
<recordset class="JdbcTableRecordSet" name="developerBooks">
<!-- Params for Record Set -->
<params>
<param name="user">testuser</param>
<param name="pass">testpass</param>
<param name="url">jdbc:mysql://localhost/DocHelper</param>
<param name="driver">com.mysql.jdbc.Driver</param>
<param name="tablename">developers</param>
<param name="fields">currentBookTheme</param>
<param name="whereclause">years_exp > 4</param>
</params>
</recordset>

<!-- Use File Renderer -->
<renderer class="FileRenderer" name="Default">

<params>
<param name="outputfile" >/home/gchristen/Temp/out.txt</param>
</params>
<docsections>
<!-- The template to use to Create Output Document -->
<docsection ref="developerBooks">
<start> Developer are working on themes: \n</start>
<end> </end>
</docsection>
<docsection ref="developerBooksRecord">
<start> Developer (current book): $fname, $lname - $currentBook \n</start>
<end></end>
</docsection>
<docsection name="mathSearch">
<start> - Math: \n</start>
<end></end>
</docsection>
<docsection name="socialScienceSearch">
<start> - Social Sciences: \n</start>
<end></end>
</docsection>
</docsections>

</renderer>

</document>