<?xml version="1.0" encoding="utf-8"?>
			
			<rss version="2.0">
			<channel>
			<title>Wayne Graham&apos;s Blog - ColdFusion</title>
			<link>http://swem.wm.edu/blogs/waynegraham/index.cfm</link>
			<description>ColdFusion Development for Academic Libraries</description>
			<language>en-us</language>
			<pubDate>Sun, 22 Nov 2009 17:48:21 -0500</pubDate>
			<lastBuildDate>Tue, 18 Mar 2008 08:59:00 -0500</lastBuildDate>
			<generator>BlogCFC</generator>
			<docs>http://blogs.law.harvard.edu/tech/rss</docs>
			<managingEditor>wsgrah@wm.edu</managingEditor>
			<webMaster>wsgrah@wm.edu</webMaster>
			
			
			
			
			
			<item>
				<title>Facebook Developer API</title>
				<link>http://swem.wm.edu/blogs/waynegraham/index.cfm/2008/3/18/Facebook-Developer-API</link>
				<description>
				
				It&apos;s been a long while since I posted anything here. Things have been a bit hectic with the launching of a new institutional repository service (using DSpace), taking a computer graphics course (linear algebra, ray tracing, scientific visualization, etc.), and finishing up my book on the &lt;a href=&quot;http://www.facebook.com/pages/Facebook-API-Developers-Guide/8672611380&quot;&gt;Facebook API&lt;/a&gt;. 

&lt;a href=&quot;http://www.amazon.com/gp/product/1430209690/&quot;&gt;
&lt;img src=&quot;http://ecx.images-amazon.com/images/I/51xMIn9ccGL._AA240_.jpg&quot; align=&quot;left&quot; border=&quot;0&quot;&gt;&lt;/a&gt;

This is the first book I&apos;ve written, and I have to admit it was far more of a task than I first thought it would be. The subject matter wasn&apos;t very dense, but at least four major changes in the API each required an almost total rewrite of the code base for the examples. A couple of sections even had to be cut because Facebook &quot;fixed&quot; their code in response to security and user concerns (plus a lot of folks &quot;abused&quot; some of the things you could do with the API).

There are some great resources for building Facebook applications in ColdFusion. The &quot;unofficial&quot; ColdFusion client library for Facebook apps is at &lt;a href=&quot;http://www.nearpersonal.com/code/&quot;&gt;nearpersonal&lt;/a&gt;. There is also a RIAForge starter project for FBML (a tag-based language for Facebook) named &lt;a href=&quot;http://fbmlstarter.riaforge.org/&quot;&gt;Facebook FBML Starter Kit&lt;/a&gt;.

So, if you want a short book that highlights some of the more common elements of developing Facebook applications, be sure to check out my book!
				
				</description>
						
				
				<category>Facebook</category>				
				
				<category>ColdFusion</category>				
				
				<pubDate>Tue, 18 Mar 2008 08:59:00 -0500</pubDate>
				<guid>http://swem.wm.edu/blogs/waynegraham/index.cfm/2008/3/18/Facebook-Developer-API</guid>
				
			</item>
			
		 	
			
			
			<item>
				<title>SolColdfusion Update</title>
				<link>http://swem.wm.edu/blogs/waynegraham/index.cfm/2007/10/31/SolColdfusion-Update</link>
				<description>
				
				I&apos;ve been knocked out by a really bad cold the last couple of weeks and I&apos;m just starting to get things back to normal. I did want to send a quick post on the status of the SolColdfusion project. Seeing Ray&apos;s Seeker project reminded me that I hadn&apos;t set up a project at RIAForge yet, so I took care of that last night. You can now access the official project at &lt;a href=&quot;http://solcoldfusion.riaforge.org/&quot;&gt;http://solcoldfusion.riaforge.org/&lt;/a&gt;.

Since the project site is up-and-running, I also submitted it to the Solr project &lt;a href=&quot;http://issues.apache.org/jira/browse/SOLR-404&quot;&gt;SOLR-404&lt;/a&gt; (I know it&apos;s a coincidence that it&apos;s number 404, but it makes me wonder if it&apos;s some type of bad omen ;) ...). Anyway, if you think the client should get included in the project, be sure to vote for it!

I&apos;ve also been working on a generic interface, much like Erik Hatcher&apos;s &lt;a href=&quot;http://wiki.apache.org/solr/Flare&quot;&gt;Solr Flare&lt;/a&gt;. It&apos;s been a bit slow coming as I&apos;ve not had a lot of time to work on these projects, but hopefully things will settle down shortly so I can devote a bit more time to them.
				
				</description>
						
				
				<category>Solr</category>				
				
				<category>ColdFusion</category>				
				
				<pubDate>Wed, 31 Oct 2007 13:01:00 -0500</pubDate>
				<guid>http://swem.wm.edu/blogs/waynegraham/index.cfm/2007/10/31/SolColdfusion-Update</guid>
				
			</item>
			
		 	
			
			
			<item>
				<title>Solr Schema</title>
				<link>http://swem.wm.edu/blogs/waynegraham/index.cfm/2007/10/5/Solr-Schema</link>
				<description>
				
				If you&apos;ve ever worked on a project that involved ColdFusion&apos;s bundled version of Verity, you&apos;ve no doubt run into the problem of fitting your fields into the structure that Verity imposes, where custom fields are precious. About 6 months ago, I ran into an issue with a search project where I had about 125,000 documents to index. Since we also wanted to use the indexes for some other projects, I was nervous about committing almost the entire allotment of indexable objects to one collection. This launched me into writing a custom search engine and indexer using Lucene and wrapping ColdFusion around the responses to do the things that Verity did. It does cool stuff like searching across multiple collections, context highlighting, relevancy calculations, term vector calculations, &quot;did you mean&quot;, etc.: essentially everything I think a good search engine needs to be able to do. However, once the projects were complete, I never really got around to making it easy to use. What the system lacked was an easy way to define the fields you wanted indexed, and making changes required a knowledge of Java. 

The ability to create any number of fields to index in different ways (along with faceting) is a real strong point of Solr. Not only can you add fields and choose how that data is analyzed, you can create your own field types that process the information in your index the way you want them to be. 

This is done in the $SOLR_HOME/conf/schema.xml file. The first section (&amp;lt;types&gt;) defines the types of fields that you will be using, and how Solr should process them with Lucene. If you look at some of the fieldtypes, you&apos;ll get an idea of what&apos;s possible. For instance, the fieldtype for &quot;string&quot; is an untokenized field that omits norms (length normalization and index-time boosts) and sorts documents missing the field last. 

&lt;code&gt;
&lt;fieldtype name=&quot;string&quot; class=&quot;solr.StrField&quot; sortMissingLast=&quot;true&quot; omitNorms=&quot;true&quot;/&gt;
&lt;/code&gt;

However, if you need a more robust fieldtype, look at the fieldtype for &quot;text&quot;. This uses a whitespace tokenizer (splits words with whitespace) and uses the stopwords defined in the stopwords.txt file. It does some other processing (filters words, converts them to lowercase, runs a porter stemmer, and then removes duplicates). This fieldtype also defines what to do when a query is passed to it (uses the same filters). This is slightly different than the defined &quot;textTight&quot; which does not perform any further analysis on the text when being queried. You&apos;ll probably find that most of these work for most instances, but if you need to, you can build your own fieldtype that has very specific indexing and query filters.
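
As a rough sketch of what such a definition looks like, here is a trimmed-down fieldtype modeled on the &quot;text&quot; type in the example schema that ships with Solr 1.2. The name &quot;textSketch&quot; is made up for illustration and the filter list is abbreviated, so check the schema.xml in your own distribution for the exact filters and attributes.

&lt;code&gt;
&lt;fieldtype name=&quot;textSketch&quot; class=&quot;solr.TextField&quot; positionIncrementGap=&quot;100&quot;&gt;
  &lt;!-- analysis applied when documents are indexed --&gt;
  &lt;analyzer type=&quot;index&quot;&gt;
    &lt;tokenizer class=&quot;solr.WhitespaceTokenizerFactory&quot;/&gt;
    &lt;filter class=&quot;solr.StopFilterFactory&quot; ignoreCase=&quot;true&quot; words=&quot;stopwords.txt&quot;/&gt;
    &lt;filter class=&quot;solr.LowerCaseFilterFactory&quot;/&gt;
    &lt;filter class=&quot;solr.EnglishPorterFilterFactory&quot;/&gt;
    &lt;filter class=&quot;solr.RemoveDuplicatesTokenFilterFactory&quot;/&gt;
  &lt;/analyzer&gt;
  &lt;!-- analysis applied to incoming query text (same chain here) --&gt;
  &lt;analyzer type=&quot;query&quot;&gt;
    &lt;tokenizer class=&quot;solr.WhitespaceTokenizerFactory&quot;/&gt;
    &lt;filter class=&quot;solr.StopFilterFactory&quot; ignoreCase=&quot;true&quot; words=&quot;stopwords.txt&quot;/&gt;
    &lt;filter class=&quot;solr.LowerCaseFilterFactory&quot;/&gt;
    &lt;filter class=&quot;solr.EnglishPorterFilterFactory&quot;/&gt;
    &lt;filter class=&quot;solr.RemoveDuplicatesTokenFilterFactory&quot;/&gt;
  &lt;/analyzer&gt;
&lt;/fieldtype&gt;
&lt;/code&gt;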

The next section contains the actual fields you want to use in the aptly named &quot;fields&quot; element. This is where you actually define the fields that will be in your index, the type of analysis to perform on the field, if it should be indexed, stored, have term vectors, or be multivalued (have multiple instances of the same field in the index). 

Let&apos;s say you want to develop an indexing schema for books (hey, I work in a library). At a very basic level, you&apos;d want fields for an id, title, author, reviews, and a set of topics (or tags). Your fields element would contain something along the lines of:

&lt;code&gt;
&lt;field name=&quot;id&quot; type=&quot;string&quot; indexed=&quot;true&quot; stored=&quot;true&quot;/&gt;
&lt;field name=&quot;title&quot; type=&quot;text&quot; indexed=&quot;true&quot; stored=&quot;true&quot; termVectors=&quot;true&quot; /&gt;
&lt;field name=&quot;titleStr&quot; type=&quot;string&quot; indexed=&quot;true&quot; stored=&quot;false&quot; multiValued=&quot;true&quot;/&gt;
&lt;field name=&quot;author&quot; type=&quot;text&quot; indexed=&quot;true&quot; stored=&quot;true&quot; termVectors=&quot;true&quot; /&gt;
&lt;field name=&quot;authorStr&quot; type=&quot;string&quot; indexed=&quot;true&quot; stored=&quot;false&quot; multiValued=&quot;true&quot;/&gt;
&lt;field name=&quot;review&quot; type=&quot;text&quot; indexed=&quot;true&quot; stored=&quot;true&quot; multiValued=&quot;true&quot;/&gt;
&lt;field name=&quot;topic&quot; type=&quot;text&quot; indexed=&quot;true&quot; stored=&quot;true&quot; multiValued=&quot;true&quot; termVectors=&quot;true&quot;/&gt;
&lt;field name=&quot;topicStr&quot; type=&quot;string&quot; indexed=&quot;true&quot; stored=&quot;false&quot; multiValued=&quot;true&quot;/&gt;
&lt;/code&gt;

You&apos;ll notice that I have a couple of extra fields for title, author, and topic. These are for the faceting info and are just untokenized fields to make the calculations for facets a little more efficient.

Now, we&apos;re almost done with creating the schema. We just need to declare a unique key, default search field, and default search operator. 

&lt;code&gt;
&lt;uniqueKey&gt;id&lt;/uniqueKey&gt;
&lt;defaultSearchField&gt;title&lt;/defaultSearchField&gt;
&lt;solrQueryParser defaultOperator=&quot;OR&quot;/&gt;
&lt;/code&gt;

Remember when I made the fields with the &quot;Str&quot; suffix? We can use a really cool feature of Solr called a &quot;copyField&quot; that literally copies the information from one field to another.

&lt;code&gt;
&lt;copyField source=&quot;author&quot; dest=&quot;authorStr&quot;/&gt;
&lt;copyField source=&quot;title&quot; dest=&quot;titleStr&quot;/&gt;
&lt;copyField source=&quot;topic&quot; dest=&quot;topicStr&quot;/&gt;
&lt;/code&gt;

It&apos;s worth mentioning here that Solr indexes are not databases! While there are some similarities in the way that Solr allows you to add, update, select, store, and delete information from the index, Solr isn&apos;t an RDBMS. I&apos;ve seen a few discussions where there is some confusion as to why Solr can&apos;t do the equivalent of a stored procedure, or some other function of a database. 

Now, your index server is ready to receive documents to search against. The server, with the above example as its schema, will expect information to be in the following format:

&lt;code&gt;
&lt;doc&gt;
	&lt;field name=&quot;id&quot;&gt;1&lt;/field&gt;
	&lt;field name=&quot;title&quot;&gt;Solr Rocks!&lt;/field&gt;
	&lt;field name=&quot;author&quot;&gt;Barr, Foo&lt;/field&gt;
	&lt;field name=&quot;review&quot;&gt;This book rocks!&lt;/field&gt;
	&lt;field name=&quot;review&quot;&gt;This book is horrible!&lt;/field&gt;
	&lt;field name=&quot;topic&quot;&gt;information retrieval systems&lt;/field&gt;
	&lt;field name=&quot;topic&quot;&gt;xml&lt;/field&gt;
	&lt;field name=&quot;topic&quot;&gt;search&lt;/field&gt;
	&lt;field name=&quot;topic&quot;&gt;apache foundation&lt;/field&gt;
&lt;/doc&gt;
&lt;/code&gt;
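
With the SolColdfusion client from my earlier post, sending a document like this could look something like the following. This is only a sketch: it assumes a solr object created as in that post, and that add() accepts a single doc element the way it did in the earlier example.

&lt;code&gt;
&lt;!--- build the document as an XML object ---&gt;
&lt;cfxml variable=&quot;book&quot;&gt;
&lt;doc&gt;
	&lt;field name=&quot;id&quot;&gt;1&lt;/field&gt;
	&lt;field name=&quot;title&quot;&gt;Solr Rocks!&lt;/field&gt;
	&lt;field name=&quot;author&quot;&gt;Barr, Foo&lt;/field&gt;
	&lt;field name=&quot;topic&quot;&gt;search&lt;/field&gt;
&lt;/doc&gt;
&lt;/cfxml&gt;

&lt;!--- send the document, then commit so it becomes searchable ---&gt;
&lt;cfset solr.add(book) /&gt;
&lt;cfset solr.commit() /&gt;
&lt;/code&gt;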

Next week when I get some time, I&apos;ll write about creating facet queries...
				
				</description>
						
				
				<category>Apache</category>				
				
				<category>Solr</category>				
				
				<category>ColdFusion</category>				
				
				<category>XML</category>				
				
				<pubDate>Fri, 05 Oct 2007 13:27:00 -0500</pubDate>
				<guid>http://swem.wm.edu/blogs/waynegraham/index.cfm/2007/10/5/Solr-Schema</guid>
				
			</item>
			
		 	
			
			
			<item>
				<title>Solr and Coldfusion -- Setting Up</title>
				<link>http://swem.wm.edu/blogs/waynegraham/index.cfm/2007/10/5/Solr-and-Coldfusion--Setting-Up</link>
				<description>
				
				To get up and running with Solr, you&apos;ll need some type of Servlet container. Typically when folks start talking about servlet containers, they&apos;re talking about &lt;a href=&quot;http://tomcat.apache.org/&quot;&gt;Tomcat&lt;/a&gt; or &lt;a href=&quot;http://jetty.mortbay.org/&quot;&gt;Jetty&lt;/a&gt;. In fact, Solr comes with Jetty 6.1.3 (they haven&apos;t upgraded to 6.1.5 yet in the distribution). You may also hear about &lt;a href=&quot;http://www.caucho.com/&quot;&gt;Resin&lt;/a&gt;, but in my experience, it runs a bit slower than Jetty and Tomcat. As a small note, servlet containers are different than J2EE application servers like JRun, Geronimo, GlassFish, and JBoss (which use servlet containers like Tomcat and Jetty, but also have EJB containers and can handle other types of logic). If you have a J2EE application server running, you can easily use Solr, and if not, consider using Jetty or Tomcat as your container server.

Since your environment can be as varied as there are IT departments, I won&apos;t try to cover everything. Essentially you need to have at least the Java 1.5 JRE. However, I would &lt;strong&gt;strongly&lt;/strong&gt; suggest the most current &lt;a href=&quot;http://java.sun.com&quot;&gt;Java JDK (and not the JRE)&lt;/a&gt; as it has performance enhancements to run in server mode (with -server). If you don&apos;t already have this Java version installed on your server (assuming this is the same server running CF), don&apos;t worry, ColdFusion will still work if you install the required Java runtime. 

Essentially, the process for deploying Solr, once you have a servlet container up and running, is to drop the solr.war file into the webapps directory on the server. It won&apos;t do anything at this point, as you still need to set up the configuration files for Solr. The easiest way to do this is to copy the files from example/solr into a new directory (which I will refer to from now on as solr_home). 

You can tell Solr where the home directory is by setting the solr.solr.home system property (-Dsolr.solr.home), by setting the JNDI lookup (&quot;java:comp/env/solr/home&quot;), or by just putting it in the JVM&apos;s working directory (the default path is ./solr). Now you just need to make sure everything is running. Point your browser to http://&amp;lt;server&gt;:&amp;lt;port&gt;/solr/admin. You should then see the administration interface (you may need to restart your servlet container to get everything working properly), but it&apos;s not an administrative interface like you get in CFAdmin. This is more of an informational panel. You can make sure everything is running, confirm that your index is set up properly and has documents in it, and check out the schema, configuration files, and thread information. Really the only thing you can administer here is the log level. 

For some more specific notes on installing Solr in &lt;a href=&quot;http://wiki.apache.org/solr/SolrTomcat&quot;&gt;Tomcat&lt;/a&gt; and &lt;a href=&quot;http://wiki.apache.org/solr/SolrJetty&quot;&gt;Jetty&lt;/a&gt;, check out &lt;a href=&quot;http://wiki.apache.org/solr/SolrInstall&quot;&gt;Solr&apos;s wiki&lt;/a&gt;. In particular, if you need to run multiple instances of Solr, pay attention to the sections on multiple Solr apps on those wiki pages.
				
				</description>
						
				
				<category>Apache</category>				
				
				<category>Solr</category>				
				
				<category>ColdFusion</category>				
				
				<pubDate>Fri, 05 Oct 2007 12:22:00 -0500</pubDate>
				<guid>http://swem.wm.edu/blogs/waynegraham/index.cfm/2007/10/5/Solr-and-Coldfusion--Setting-Up</guid>
				
			</item>
			
		 	
			
			
			<item>
				<title>Coldfusion Solr Client - SolColdfusion</title>
				<link>http://swem.wm.edu/blogs/waynegraham/index.cfm/2007/10/4/Coldfusion-Solr-Client--SolColdfusion</link>
				<description>
				
				As I hinted at yesterday, I was close to having some code in the pipeline to abstract using Solr. I&apos;ve finished the initial code, which covers updating, deleting, and searching. Here&apos;s a brief setup guide to start playing with the code.

First, you&apos;re going to need to grab the latest &lt;a href=&quot;http://www.apache.org/dyn/closer.cgi/lucene/solr/&quot;&gt;release version of Solr (currently 1.2)&lt;/a&gt;. The only real requirement to run this software is that you have a JRE of 1.5 or higher. Untar/unzip the file somewhere convenient and open a command prompt. Get to the example directory in the apache-solr.1.2.x folder (cd &amp;lt;path_to_solr&gt;/example). To start up the sample server running Jetty, just issue the following command:

&lt;code&gt;
java -jar start.jar
&lt;/code&gt;

This will start a new instance of the Solr server on your computer on port 8983. You can make sure this is running by navigating to &lt;a href=&quot;http://localhost:8983/solr&quot;&gt;http://localhost:8983/solr&lt;/a&gt; (NOTE: this is a link to your computer. If you get an error, it&apos;s because your computer isn&apos;t running an instance of Solr on port 8983). 
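
If you&apos;d rather check from ColdFusion than from a browser, a quick cfhttp request does the same job. This is a sketch that assumes the default example server on localhost:8983; the variable name ping is arbitrary.

&lt;code&gt;
&lt;!--- request the Solr admin page; a status starting with 200 means the server is up ---&gt;
&lt;cfhttp url=&quot;http://localhost:8983/solr/admin/&quot; method=&quot;get&quot; timeout=&quot;5&quot; result=&quot;ping&quot; /&gt;
&lt;cfoutput&gt;Solr responded with: #ping.statusCode#&lt;/cfoutput&gt;
&lt;/code&gt;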

At this point, it&apos;s probably good to send you over to the Solr website to take a look at &lt;a href=&quot;http://lucene.apache.org/solr/tutorial.html&quot;&gt;their tutorial&lt;/a&gt;. Go ahead. I&apos;ll wait...

...

Great, you&apos;re back. 

You&apos;ve seen some basic inserting, deleting, and querying of Solr index data. You may have also noticed that there are clients for PHP, Ruby, Python, and Java...no ColdFusion. I want to do a little more testing on this before I submit the patch, but I&apos;ve added the initial code as an enclosure here to do updating, deleting, and searching in ColdFusion.

The CFC SolColdfusion should be in the path org/apache/solr/client (at least that&apos;s where I&apos;m putting it for the purposes of this initial demonstration). The initialization takes one required parameter (the Solr host) and then has two optional parameters (port and path). 

To set this up, create an instance with

&lt;code&gt;
&lt;cfset solr = createObject(&quot;component&quot;, &quot;org.apache.solr.client.SolColdfusion&quot;).init(&quot;http://localhost&quot;, &quot;8983&quot;, &quot;/solr&quot;) /&gt;
&lt;/code&gt;

Now, there are a lot of different parameters you can send to Solr to perform different queries. And, since some of these key names can repeat, I chose to implement sending these parameters as an array. So, let&apos;s set this up.

&lt;code&gt;
&lt;!--- each row is a [name, value] pair, so this needs to be a two-dimensional array ---&gt;
&lt;cfset params = arrayNew(2) /&gt;

&lt;cfset params[1][1] = &quot;indent&quot; /&gt;
&lt;cfset params[1][2] = &quot;on&quot; /&gt;
&lt;cfset params[2][1] = &quot;wt&quot; /&gt;
&lt;cfset params[2][2] = &quot;standard&quot; /&gt;
&lt;cfset params[3][1] = &quot;fl&quot; /&gt;
&lt;cfset params[3][2] = &quot;*,score&quot; /&gt;
&lt;cfset params[4][1] = &quot;qt&quot; /&gt;
&lt;cfset params[4][2] = &quot;standard&quot; /&gt;
&lt;/code&gt;

These parameters are basically the defaults for a standard Solr query. If you want highlighting, you would need to add two additional rows with &apos;hl = on&apos; and &apos;hl.fl = &amp;lt;comma_separated_fields_to_highlight&gt;&apos;. 

Searching is straightforward, taking a query, the start row, the number of rows to return, and the array of parameters:
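
For example, highlighting could be switched on by appending two more rows to the array. This is a sketch; the field names name and features come from the Solr example documents and are an assumption about your schema.

&lt;code&gt;
&lt;!--- find the first free row after the existing parameters ---&gt;
&lt;cfset next = arrayLen(params) + 1 /&gt;
&lt;cfset params[next][1] = &quot;hl&quot; /&gt;
&lt;cfset params[next][2] = &quot;on&quot; /&gt;
&lt;cfset params[next + 1][1] = &quot;hl.fl&quot; /&gt;
&lt;cfset params[next + 1][2] = &quot;name,features&quot; /&gt;
&lt;/code&gt;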

&lt;code&gt;
&lt;cfset results = solr.search(&quot;*:*&quot;, 0, 10, params) /&gt;
&lt;/code&gt;

This searches all fields and all content and returns an XML document with the search results in it.

&lt;code&gt;
&lt;cfdump var=&quot;#results#&quot; /&gt;
&lt;/code&gt;

In the result node, you&apos;ll see that Solr returns an xmlAttribute of &lt;code&gt;numFound&lt;/code&gt; of 0 (assuming you don&apos;t have anything in the index). Let&apos;s add an example document from the documents that come with Solr.

&lt;code&gt;
&lt;!--- Create a new sample document ---&gt;
&lt;cfxml variable=&quot;sample&quot;&gt;
&lt;doc&gt;
  &lt;field name=&quot;id&quot;&gt;F8V7067-APL-KIT&lt;/field&gt;
  &lt;field name=&quot;name&quot;&gt;Belkin Mobile Power Cord for iPod w/ Dock&lt;/field&gt;
  &lt;field name=&quot;manu&quot;&gt;Belkin&lt;/field&gt;
  &lt;field name=&quot;cat&quot;&gt;electronics&lt;/field&gt;
  &lt;field name=&quot;cat&quot;&gt;connector&lt;/field&gt;
  &lt;field name=&quot;features&quot;&gt;car power adapter, white&lt;/field&gt;
  &lt;field name=&quot;weight&quot;&gt;4&lt;/field&gt;
  &lt;field name=&quot;price&quot;&gt;19.95&lt;/field&gt;
  &lt;field name=&quot;popularity&quot;&gt;1&lt;/field&gt;
  &lt;field name=&quot;inStock&quot;&gt;false&lt;/field&gt;
&lt;/doc&gt;
&lt;/cfxml&gt;

&lt;!--- add this document to the index ---&gt;
&lt;cfset solr.add(sample) /&gt;
&lt;cfset solr.commit() /&gt;
&lt;cfset solr.optimize() /&gt;

&lt;!--- search for the newly added document ---&gt;
&lt;cfset results = solr.search(&quot;id:F8V7067-APL-KIT&quot;, 0, 10, params) /&gt;

&lt;cfdump var=&quot;#xmlParse(results)#&quot; /&gt;
&lt;/code&gt;

You&apos;ll notice I used a commit and an optimize statement. Neither of these statements is necessary every time you add a document, but be aware that Solr caches documents and won&apos;t flush the new documents to disk unless you either commit the documents or the mergefactor setting you used in your solrconfig.xml file has been reached. 

Now, let&apos;s delete this document...

&lt;code&gt;
&lt;cfset solr.deleteById(&quot;F8V7067-APL-KIT&quot;) /&gt;
&lt;cfset solr.commit() /&gt;
&lt;/code&gt;

Don&apos;t forget to commit deletions to the index!
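
To confirm the delete took effect, you can repeat the search and read numFound from the response. This is a sketch, assuming search() returns the raw XML string (as the xmlParse() call above suggests) and that the response has the standard layout with a result element under response. Once the commit has gone through, it should report no matching documents.

&lt;code&gt;
&lt;cfset results = solr.search(&quot;id:F8V7067-APL-KIT&quot;, 0, 10, params) /&gt;
&lt;cfset xml = xmlParse(results) /&gt;

&lt;!--- numFound is an attribute of the result element ---&gt;
&lt;cfoutput&gt;#xml.response.result.xmlAttributes.numFound# document(s) found&lt;/cfoutput&gt;
&lt;/code&gt;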

There&apos;ll be more soon (adding multiple documents, deleting by query). In the meantime, try it out. If you have any comments, questions, concerns, whatever, let me know.
				
				</description>
						
				
				<category>Apache</category>				
				
				<category>Solr</category>				
				
				<category>Lucene</category>				
				
				<category>ColdFusion</category>				
				
				<pubDate>Thu, 04 Oct 2007 15:19:00 -0500</pubDate>
				<guid>http://swem.wm.edu/blogs/waynegraham/index.cfm/2007/10/4/Coldfusion-Solr-Client--SolColdfusion</guid>
				
				<enclosure url="http://swem.wm.edu/blogs/waynegraham/enclosures/solColdfusion.tar.gz" length="4698" type="application/x-gzip"/>
				
			</item>
			
		 	
			
			
			<item>
				<title>ColdFusion and Solr</title>
				<link>http://swem.wm.edu/blogs/waynegraham/index.cfm/2007/10/3/ColdFusion-and-Solr</link>
				<description>
				
				I&apos;ve spent the last few months working on some projects that didn&apos;t really have anything to do with ColdFusion (lots of Java and PHP). One of the projects I&apos;ve been working with (&lt;a href=&quot;http://www.vufind.org&quot;&gt;Vufind.org&lt;/a&gt;) uses &lt;a href=&quot;http://lucene.apache.org/solr/&quot;&gt;Solr&lt;/a&gt; as its indexing/search engine. Solr is starting to get picked up by some pretty big companies (Netflix just relaunched their search using Solr this week). 

I&apos;ve been working with Solr in Java for a bit now, and I wanted to start building an interface for using it as a search engine in ColdFusion (my Lucene code is stuck in open source limbo). One of the cool things about Solr is that it returns results over HTTP (in XML, JSON, or Ruby). 

As soon as I get the code finished, I&apos;ll post it as a patch in Solr.
				
				</description>
						
				
				<category>Lucene</category>				
				
				<category>Solr</category>				
				
				<category>ColdFusion</category>				
				
				<pubDate>Wed, 03 Oct 2007 15:38:00 -0500</pubDate>
				<guid>http://swem.wm.edu/blogs/waynegraham/index.cfm/2007/10/3/ColdFusion-and-Solr</guid>
				
			</item>
			
		 	
			
			
			<item>
				<title>ColdFusion and Lucene</title>
				<link>http://swem.wm.edu/blogs/waynegraham/index.cfm/2007/5/8/ColdFusion-and-Lucene</link>
				<description>
				
				It seems that every couple of years someone has need for some aspect of an information retrieval (IR) system with features that ColdFusion&apos;s bundled Verity IR doesn&apos;t have (see &lt;a href=&quot;http://www.cflucene.org/cflucene/index.cfm?event=showContent&amp;ident=faq&quot;&gt;cflucene&lt;/a&gt; and &lt;a href=&quot;http://cephas.net/blog/index.php?s=lucene&quot;&gt;Aaron Johnson&apos;s blog&lt;/a&gt;). I too ran into a situation that called for investigating alternatives to Verity.

Our Special Collections Research Center has used 3x5 index cards to catalog their archives and manuscript collections. There are myriad problems with a hard-copy index catalog (that&apos;s why we use computers right?), so we started a process of scanning the cards and running them through OCR software (we&apos;re using ABBYY). My original thought was just to dump these all in a location and index them with Verity. All was going well until I realized we had a little more than 110,000 of these individual files which was pushing the 150,000 document limit for our version of ColdFusion. I also knew that there were some other projects to digitize back-issues of student newspapers that would push this document count higher. 


We had a few choices: upgrade our current license to the Enterprise Edition, which has a 250,000 document limit; purchase a Google Search Appliance; or find a relatively easy-to-implement IR engine. Each had its own pros and cons, and we initially leaned toward purchasing the Enterprise license. I also have a forthcoming project (hopefully) that will take traditional structured data and pair it with unstructured data reports. For that, I need something that is capable of indexing pretty much anything. However, as funding in academic libraries is challenging at times, I started poking around with Lucene. 

I remembered that a few years ago, Macromedia had released some code called lindex in its DRK 3. I found the CD and tried it out. I noticed that it was built on the 1.2 release of Lucene and used an old version of PDFBox to extract text from PDFs. Since there have been some significant improvements to Lucene since version 1.2 (the most current version is 2.1), I thought I would try replacing the lucene-core jar file with the more recent one, but that just led to a heap of problems. I kept looking around, but most of the projects out there haven&apos;t kept their code up to date with Lucene, so I figured I&apos;d start playing around with it.

I&apos;ll be doing a series of posts on indexing, implementing different parts of not only &lt;a href=&quot;http://lucene.apache.org/&quot;&gt;Lucene&lt;/a&gt; and some of its contributed modules, but some third party software to help create thematic categories in the search results (http://demo.carrot2.org/demo-stable/main) and index management using ColdFusion.

For the time being, &lt;a href=&quot;http://swem.wm.edu/beta/flathat/&quot;&gt;here&apos;s a demo of the search engine&lt;/a&gt; I&apos;m working on. It is an index of William and Mary&apos;s student newspaper, The Flat Hat, from 1939 to 1950.
				
				</description>
						
				
				<category>Lucene</category>				
				
				<category>Java</category>				
				
				<category>ColdFusion</category>				
				
				<pubDate>Tue, 08 May 2007 11:27:00 -0500</pubDate>
				<guid>http://swem.wm.edu/blogs/waynegraham/index.cfm/2007/5/8/ColdFusion-and-Lucene</guid>
				
			</item>
			
		 	
			
			
			<item>
				<title>Her Royal Majesty</title>
				<link>http://swem.wm.edu/blogs/waynegraham/index.cfm/2007/4/30/Her-Royal-Majesty</link>
				<description>
				
				Ok, it&apos;s been a while. I&apos;ve been working on some non-ColdFusion projects so I&apos;ve been a bit remiss in keeping things up-to-date here. But big news (at least for William and Mary): Her Royal Majesty, Queen Elizabeth II will be visiting William and Mary this Friday as part of Jamestown&apos;s 400th anniversary. 

Normally this wouldn&apos;t be related to ColdFusion at all, but last week it was announced that she would be coming here and thus ensued a mad rush to get an &lt;a href=&quot;http://swem.wm.edu/exhibits/queen/&quot;&gt;online exhibit of the Queen&apos;s 1957 visit up&lt;/a&gt; that would make available pretty much everything we have stored in our Special Collections from her previous visit. I got to use Model-Glue after a long time of not looking at it. I had almost forgotten how painless Model-Glue makes putting projects like this together.

I just wanted to give a big thanks to Joe Rinehart and Doug Hughes for bringing the &quot;rapid&quot; back into rapid application development!
				
				</description>
						
				
				<category>ColdFusion</category>				
				
				<category>modelglue</category>				
				
				<pubDate>Mon, 30 Apr 2007 08:30:00 -0500</pubDate>
				<guid>http://swem.wm.edu/blogs/waynegraham/index.cfm/2007/4/30/Her-Royal-Majesty</guid>
				
			</item>
			
		 	
			
			
			<item>
				<title>Adding COinS for BlogCFC</title>
				<link>http://swem.wm.edu/blogs/waynegraham/index.cfm/2007/3/7/Adding-COinS-for-BlogCFC</link>
				<description>
				
				A few months ago, I was up at George Mason&apos;s &lt;a href=&quot;http://chnm.gmu.edu/&quot;&gt;Center for History and New Media&lt;/a&gt; talking to them about how they were approaching their online projects. One of the things they introduced me to was &lt;a href=&quot;http://www.zotero.org/&quot;&gt;Zotero&lt;/a&gt;, a Firefox extension to &quot;collect, manage, and cite your research sources.&quot; While it was mostly developed to help researchers manage citations online, I started using it to organize blog entries, adding my own notes and tags and relating them to other blog entries. 

Unfortunately, I&apos;ve been adding these pages manually. But with about two lines of code, you can add metadata that opens your BlogCFC entries up not only to Zotero but to other harvesters as well. 

First, a quick note about COinS.
 

&lt;a href=&quot;http://ocoins.info/&quot;&gt;COinS&lt;/a&gt; (an acronym for ContextObjects in Spans) is basically a microformat for embedding bibliographic information in HTML using the &lt;a href=&quot;http://www.niso.org/standards/standard_detail.cfm?std_id=783&quot;&gt;NISO OpenURL Standard Z39.88-2004&lt;/a&gt;. What makes this really nice is that you no longer have to hunt down how to cite information for your specific format (Chicago, APA, etc.); the information is embedded in the HTML, and software then properly formats the citations for you. 

A blog entry is a reasonably straightforward citation. You want the URL, the title of the entry, the author, and the date. So, to add this to BlogCFC, open the index.cfm page and, where the loop for the categories is located (around line 82), build up a string of the categories:

&lt;code&gt;
	&lt;!--- added line: start each entry with an empty string so the append below has something to build on ---&gt;
	&lt;cfset cats = &quot;&quot; /&gt;
	&lt;cfloop item=&quot;catid&quot; collection=&quot;#categories#&quot;&gt;
		&lt;!--- added line: append one rft.subject key per category ---&gt;
		&lt;cfset cats = cats &amp; &quot;&amp;amp;rft.subject=&quot; &amp; categories[currentRow][catid] /&gt;
		
		&lt;cfoutput&gt;&lt;a href=&quot;#application.blog.makeCategoryLink(catid)#&quot;&gt;#categories[currentRow][catid]#&lt;/a&gt;&lt;cfif catid is not lastid&gt;,&lt;/cfif&gt;&lt;/cfoutput&gt;
	&lt;/cfloop&gt;
&lt;/code&gt;

Now, you just need to add the span. I added mine directly under the close of the above loop. 

&lt;code&gt;
&lt;cfoutput&gt;
	&lt;span class=&quot;Z3988&quot; title=&quot;ctx_ver=Z39.88-2004&amp;amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;amp;rft.title=#replace(title, &quot; &quot;, &quot;+&quot;, &quot;ALL&quot;)##cats#&amp;amp;rft.creator=#application.blog.getProperty(&quot;owneremail&quot;)#&amp;amp;rft.date=#dateFormat(posted, &quot;yyyy-mm-dd&quot;)#&amp;amp;rft.type=blogPost&amp;amp;rft.format=text&amp;amp;rft.identifier=#application.blog.makeLink(id)#&amp;amp;rft.source=#application.blog.getProperty(&quot;blogDescription&quot;)#&amp;amp;rft.language=English&quot;&gt;&lt;/span&gt;
&lt;/cfoutput&gt;
&lt;/code&gt;
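
One thing to watch: every value in the title attribute is part of a URL-style key/value string, so spaces, ampersands, slashes and the like need to be URL-encoded. Here is a minimal sketch of the same idea built with URLEncodedFormat(). It assumes title, posted, id and cats are in scope exactly as in the snippets above, and the coins variable name is only for illustration.

&lt;code&gt;
&lt;!--- sketch only: assumes title, posted, id and cats are in scope as above ---&gt;
&lt;cfset coins = &quot;ctx_ver=Z39.88-2004&quot;
	&amp; &quot;&amp;amp;rft_val_fmt=&quot; &amp; urlEncodedFormat(&quot;info:ofi/fmt:kev:mtx:dc&quot;)
	&amp; &quot;&amp;amp;rft.title=&quot; &amp; urlEncodedFormat(title)
	&amp; cats
	&amp; &quot;&amp;amp;rft.date=&quot; &amp; dateFormat(posted, &quot;yyyy-mm-dd&quot;)
	&amp; &quot;&amp;amp;rft.identifier=&quot; &amp; urlEncodedFormat(application.blog.makeLink(id)) /&gt;
&lt;!--- the span stays empty; only the title attribute carries the citation ---&gt;
&lt;cfoutput&gt;&lt;span class=&quot;Z3988&quot; title=&quot;#coins#&quot;&gt;&lt;/span&gt;&lt;/cfoutput&gt;
&lt;/code&gt;

Building the string in a variable first also makes it easier to dump and check before it goes into the page.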

You&apos;ll notice there&apos;s nothing in the span, so it doesn&apos;t actually display on the screen, but it is there for OpenURL-aware programs like Zotero. If you refresh your cache, you&apos;ll then see a button in the address bar (assuming Zotero is installed) that will add the blog entry to your collection.

Not using BlogCFC? There are &lt;a href=&quot;http://dev.zotero.org/docs/wordpress?s=wordpress&quot;&gt;a few WordPress plugins&lt;/a&gt; too, though I&apos;ve not seen anything for Blogger yet.
				
				</description>
						
				
				<category>Blog</category>				
				
				<category>Web</category>				
				
				<category>ColdFusion</category>				
				
				<pubDate>Wed, 07 Mar 2007 08:24:00 -0500</pubDate>
				<guid>http://swem.wm.edu/blogs/waynegraham/index.cfm/2007/3/7/Adding-COinS-for-BlogCFC</guid>
				
			</item>
			
		 	
			
			
			<item>
				<title>MS Access via JDBC</title>
				<link>http://swem.wm.edu/blogs/waynegraham/index.cfm/2007/2/2/MS-Access-via-JDBC</link>
				<description>
				
				We recently made the move from an IIS Windows web server to an Apache *nix-based web server as part of our efforts to consolidate our library&apos;s server infrastructure. And for reasons I won&apos;t expound upon, we had one MS Access DSN that didn&apos;t get migrated to MSSQL and that still needed to be used. Since ColdFusion uses a Windows-only driver for MS Access, I needed to figure out a way around this. I found a couple of JDBC drivers for Access (Easysoft&apos;s &lt;a href=&quot;http://www.easysoft.com/products/data_access/jdbc_odbc_bridge/index.html&quot;&gt;JDBC-ODBC Bridge&lt;/a&gt; and HXTT&apos;s &lt;a href=&quot;http://www.hxtt.com/access.html&quot;&gt;Access Pure Java JDBC Drivers&lt;/a&gt;), but these seemed to be a bit on the expensive side for the short amount of time that I&apos;d need to keep Access in production. 

I did notice on Easysoft&apos;s website that they were using the JdbcOdbc bridge, so after a little more digging, I found the syntax to configure ColdFusion to use MS Access through the JdbcOdbc Bridge. The JDBC URL is

&lt;code&gt;
	jdbc:odbc:Driver={Microsoft Access Driver (*.mdb)};DBQ=/path/to/datasource.mdb;DriverID=22;
&lt;/code&gt;

and the Driver Class

&lt;code&gt;
	sun.jdbc.odbc.JdbcOdbcDriver
&lt;/code&gt;

For the very basic inserting of data from a seldom-used web form into a single table, this band-aid fix has been doing pretty well!
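
Once the data source is registered with that URL and driver class, the form handler can use it like any other data source. A minimal sketch, assuming a hypothetical data source named accessBridge and a hypothetical feedback table with name and email columns:

&lt;code&gt;
&lt;!--- &quot;accessBridge&quot;, the feedback table and its columns are made-up names for illustration ---&gt;
&lt;cfquery name=&quot;addFeedback&quot; datasource=&quot;accessBridge&quot;&gt;
	INSERT INTO feedback (name, email)
	VALUES (
		&lt;cfqueryparam value=&quot;#form.name#&quot; cfsqltype=&quot;cf_sql_varchar&quot; /&gt;,
		&lt;cfqueryparam value=&quot;#form.email#&quot; cfsqltype=&quot;cf_sql_varchar&quot; /&gt;
	)
&lt;/cfquery&gt;
&lt;/code&gt;

I would expect cfqueryparam to behave over the bridge as it does elsewhere, but that is worth confirming against your own setup.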
				
				</description>
						
				
				<category>Java</category>				
				
				<category>Server</category>				
				
				<category>Linux</category>				
				
				<category>ColdFusion</category>				
				
				<pubDate>Fri, 02 Feb 2007 15:23:00 -0500</pubDate>
				<guid>http://swem.wm.edu/blogs/waynegraham/index.cfm/2007/2/2/MS-Access-via-JDBC</guid>
				
			</item>
			
		 	
			
			
			<item>
				<title>Form Validation</title>
				<link>http://swem.wm.edu/blogs/waynegraham/index.cfm/2006/9/21/Form-Validation</link>
				<description>
				
				I&apos;ve been playing with some form validation stuff for CF. I had been using &amp;lt;cfform&amp;gt;, but I wanted the HTML interface to act a bit more like the Flash interface without actually using Flash. I&apos;ve also been doing a lot more work with some of the DHTML libraries that AJAX has made popular, so I figured there had to be a relatively elegant way to do form validation with something like &lt;a href=&quot;http://prototype.conio.net/&quot;&gt;prototype&lt;/a&gt;.

I remembered seeing something on &lt;a href=&quot;http://ajaxian.com/archives/really-easy-field-validation-with-prototype&quot;&gt;Ajaxian&lt;/a&gt; about easy form validation and decided to give it a try. The article on &lt;a href=&quot;http://tetlaw.id.au/view/blog/really-easy-field-validation-with-prototype/&quot;&gt;Dexagogo&lt;/a&gt; shows how they created a library to handle form validation that doesn&apos;t require any work beyond creating a form. This was just what I was looking for!

Basically, you just need the latest files from &lt;a href=&quot;http://script.aculo.us/&quot;&gt;script.aculo.us&lt;/a&gt; with the latest prototype version (the 1.5 release candidate is included in the latest script.aculo.us lib folder), and the validation library. Conveniently, they&apos;re all included &lt;a href=&quot;http://tetlaw.id.au/upload/dev/validation/validation1.5.3.zip&quot;&gt;in the demo file on the site&lt;/a&gt;.

&lt;more /&gt;

To use this, you really only need prototype and the validation library (script.aculo.us adds a nice effect &amp;ndash; much like the Flash format in cfform). For me, I made these calls:

&lt;code&gt;
&lt;head&gt;
	&lt;script type=&quot;text/javascript&quot; src=&quot;/scripts/scriptaculous/lib/prototype.js&quot;&gt;&lt;/script&gt;
	&lt;script type=&quot;text/javascript&quot; src=&quot;/scripts/validator.js&quot;&gt;&lt;/script&gt;
	&lt;script type=&quot;text/javascript&quot; src=&quot;/scripts/scriptaculous/scriptaculous.js?load=effects&quot;&gt;&lt;/script&gt;
&lt;/head&gt;
&lt;/code&gt;

This is slightly different from the example on their page: they load the effects.js file directly, while I&apos;m calling the library via script.aculo.us with the load parameter. This isn&apos;t really a big deal for one library, but it is convenient when you want to use several, but not all, of the libraries (e.g. scriptaculous.js?load=effects,dragdrop,slider).

Anyway, to actually use this, you need to create a form with an id attribute:

&lt;code&gt;
&lt;form name=&quot;feedback&quot; id=&quot;feedback&quot; action=&quot;#cgi.script_name#&quot; method=&quot;post&quot;&gt;
...
&lt;/form&gt;
&lt;/code&gt;

Now, we add some fields and use the class attribute to call the validator:

&lt;code&gt;
Name: &lt;input type=&quot;text&quot; name=&quot;name&quot; id=&quot;name&quot; class=&quot;required&quot; /&gt;&lt;br/&gt;
Email: &lt;input type=&quot;text&quot; name=&quot;email&quot; id=&quot;email&quot; class=&quot;required validate-email&quot; /&gt;
&lt;input type=&quot;submit&quot; /&gt;
&lt;/code&gt;

There are 11 options for use in the validation library (this is directly off their page):

&lt;ul&gt;
	&lt;li&gt;required (not blank)&lt;/li&gt;
	&lt;li&gt;validate-number (a valid number)&lt;/li&gt;
	&lt;li&gt;validate-digits (digits only)&lt;/li&gt;
	&lt;li&gt;validate-alpha (letters only)&lt;/li&gt;
	&lt;li&gt;validate-alphanum (only letters and numbers)&lt;/li&gt;
	&lt;li&gt;validate-date (a valid date value)&lt;/li&gt;
	&lt;li&gt;validate-email (a valid email address)&lt;/li&gt;
	&lt;li&gt;validate-url (a valid URL)&lt;/li&gt;
	&lt;li&gt;validate-date-au (a date formatted as; dd/mm/yyyy)&lt;/li&gt;
	&lt;li&gt;validate-currency-dollar (a valid dollar value)&lt;/li&gt;
	&lt;li&gt;validate-one-required (At least one textbox/radio element must be selected in a group)&lt;/li&gt;
&lt;/ul&gt;

This is really nice, because if you want to allow an optional field, but validate it, you can do:

&lt;code&gt;
&lt;input type=&quot;text&quot; name=&quot;email&quot; class=&quot;validate-email&quot; /&gt;
&lt;/code&gt;

There&apos;s one more piece of the pie...to call the validation library. At the bottom of your page add:

&lt;code&gt;
&lt;script type=&quot;text/javascript&quot;&gt;
	new Validation(&apos;feedback&apos;, {immediate:true});
&lt;/script&gt;
&lt;/code&gt;

The first argument is the id attribute of the form you want to validate. The second tells the Validation object what to do. This particular example enables validation on each field as you leave it (which I find useful). Some of the other options are:

&lt;ul&gt;
	&lt;li&gt;stopOnFirst (boolean): Stop on the first validation failure; default: false&lt;/li&gt;
	&lt;li&gt;onSubmit (boolean): Override the default behavior of adding an event listener to the onsubmit event (set to false if you want to make sure your onsubmit method gets called no matter what); default: true&lt;/li&gt;
	&lt;li&gt;immediate (boolean): validate when the cursor leaves the field; default: false&lt;/li&gt;
	&lt;li&gt;focusOnError (boolean): place the focus on the first field with an error; default: true&lt;/li&gt;
	&lt;li&gt;useTitles (boolean): make field validators use form element title attributes as error advice message; default: false&lt;/li&gt;
	&lt;li&gt;onFormValidate (string function): call a function when the form is validated&lt;/li&gt;
	&lt;li&gt;onElementValidate (string function): call a function when an element is validated&lt;/li&gt;
&lt;/ul&gt;

What I thought was really cool was the ability to add custom validation types via an API. Say you only want folks to use capital letters for their names, you simply add a new validation type like:

&lt;code&gt;
&lt;script type=&quot;text/javascript&quot;&gt;
	Validation.add(&apos;validate-ucase&apos;, &apos;Please only use upper-case letters (A-Z) in this field.&apos;, function(v){
		return Validation.get(&apos;IsEmpty&apos;).test(v) || /^[A-Z]+$/.test(v);
	});
&lt;/script&gt;
&lt;/code&gt;

Want to add several? You can do that too:

&lt;code&gt;
&lt;script type=&quot;text/javascript&quot;&gt;
	Validation.addAllThese([
		[&apos;validate-lcase&apos;, &apos;Please only use lower-case (a-z) letters in this field&apos;, function(v){
			return Validation.get(&apos;IsEmpty&apos;).test(v) || /^[a-z]+$/.test(v);
		}],
		[&apos;validate-zip&apos;, &apos;Please check your zip code&apos;, function(v){
			return Validation.get(&apos;IsEmpty&apos;).test(v) || /^(\d{5})(( |-)?(\d{4}))?$/.test(v);
		}],
		[&apos;validate-phone&apos;, &apos;Please check your phone number&apos;, function(v){
			return Validation.get(&apos;IsEmpty&apos;).test(v) || /^(([0-9]{3}-)|\([0-9]{3}\) ?)?[0-9]{3}-[0-9]{4}$/.test(v);
		}],
		[&apos;validate-ssn&apos;, &apos;Please check the Social Security Number. It should follow the format 999-99-9999&apos;, function(v){
			return Validation.get(&apos;IsEmpty&apos;).test(v) || /^([0-9]{3}(-?)[0-9]{2}(-?)[0-9]{4})$/.test(v);
		}],
		[&apos;validate-ip&apos;, &apos;Please check the IP address&apos;, function(v){
			return Validation.get(&apos;IsEmpty&apos;).test(v) || /^(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])$/.test(v);
		}], 
		[&apos;validate-uuid&apos;, &apos;Please check the UUID&apos;, function(v){
			return Validation.get(&apos;IsEmpty&apos;).test(v) || /^[0-9A-F]{8}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{16}$/.test(v);
		}],
		[&apos;validate-guid&apos;, &apos;Please check the GUID&apos;, function(v){
			return Validation.get(&apos;IsEmpty&apos;).test(v) || /^[0-9a-f]{8,8}-[0-9a-f]{4,4}-[0-9a-f]{4,4}-[0-9a-f]{4,4}-[0-9a-f]{12,12}$/.test(v);
		}],
		[&apos;validate-float&apos;, &apos;Please only use floating point numbers in this field&apos;, function(v){
			return Validation.get(&apos;IsEmpty&apos;).test(v) || /^(\b[0-9]+\.([0-9]+\b)?|\.[0-9]+\b)$/.test(v);
		}],
		[&apos;validate-visa&apos;, &apos;Please check your credit card number&apos;, function(v){
			return Validation.get(&apos;IsEmpty&apos;).test(v) || /^4\d{3}-?\d{4}-?\d{4}-?\d{4}$/.test(v);
		}],
		[&apos;validate-mastercard&apos;, &apos;Please check your credit card number&apos;, function(v){
			return Validation.get(&apos;IsEmpty&apos;).test(v) || /^5[1-5]\d{2}-?\d{4}-?\d{4}-?\d{4}$/.test(v);
		}],
		[&apos;validate-discovery&apos;, &apos;Please check your credit card number&apos;, function(v){
			return Validation.get(&apos;IsEmpty&apos;).test(v) || /^6011-?\d{4}-?\d{4}-?\d{4}$/.test(v);
		}],
		[&apos;validate-amex&apos;, &apos;Please check your credit card number&apos;, function(v){
			return Validation.get(&apos;IsEmpty&apos;).test(v) || /^3[47]\d{13}$/.test(v);
		}],
		[&apos;validate-diners&apos;, &apos;Please check your credit card number&apos;, function(v){
			return Validation.get(&apos;IsEmpty&apos;).test(v) || /^3[068]\d{12}$/.test(v);
		}],
		[&apos;validate-time&apos;, &apos;Please only use time in this field&apos;, function(v){
			return Validation.get(&apos;IsEmpty&apos;).test(v) || /^\d{1,2}[:]\d{2}([:]\d{2})?( [aApP][mM]?)?$/.test(v);
		}]
]);
&lt;/script&gt;
&lt;/code&gt;

This should be all of the normal items included in &amp;lt;cfform&amp;gt; (plus a couple extra for good measure). Now the only thing left is to make it look pretty. One of the nice things about the Flash format in &amp;lt;cfform&amp;gt; is that it color codes required fields with the different halo effects. To obtain a similar effect in the forms, we&apos;ll use style sheets instead.

This is a rather light stylesheet, but it&apos;ll give you something to start with (based on the default haloGreen skin):
 
 &lt;code&gt;
 input.required, textarea.required {
	border: 1px solid #ffbf2b;
 }
 input.validation-failed, textarea.validation-failed{
	border: 1px solid #ff3300;
	color: #ff3300;
 }
 input.validation-passed, textarea.validation-passed{
	border: 1px solid #00cc00;
	color: #000;
 }
 .validation-advice {
	margin: 5px 0;
	padding: 5px;
	background-color: #FF3300;
	color: #fff;
	font-weight: bold;
}
.custom-advice {
	margin: 5px 0;
	padding: 5px;
	background-color: #c8aa00;
	color: #fff;
	font-weight:bold;
}
&lt;/code&gt;
 
I made a short example of some of the validations at &lt;a href=&quot;http://swem.wm.edu/blogs/waynegraham/examples/validation/&quot;&gt;http://swem.wm.edu/blogs/waynegraham/examples/validation/&lt;/a&gt;. I have to say that I&apos;ve found this to be a somewhat better solution (at least for my needs) than using &amp;lt;cfform&amp;gt;!
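
One caveat: JavaScript validation is easy to bypass, so the same rules are worth re-checking on the server. Since the feedback form above posts back to the same page, a minimal ColdFusion sketch of that check might look like this. It assumes ColdFusion MX 7 or later for isValid(), and the errors array is just an illustrative name.

&lt;code&gt;
&lt;!--- sketch only: re-check the name and email fields from the feedback form ---&gt;
&lt;cfset errors = arrayNew(1) /&gt;
&lt;cfif structKeyExists(form, &quot;name&quot;) and structKeyExists(form, &quot;email&quot;)&gt;
	&lt;!--- same rule as class=&quot;required&quot; ---&gt;
	&lt;cfif not len(trim(form.name))&gt;
		&lt;cfset arrayAppend(errors, &quot;Name is required.&quot;) /&gt;
	&lt;/cfif&gt;
	&lt;!--- same rule as class=&quot;required validate-email&quot; ---&gt;
	&lt;cfif not isValid(&quot;email&quot;, form.email)&gt;
		&lt;cfset arrayAppend(errors, &quot;Please enter a valid email address.&quot;) /&gt;
	&lt;/cfif&gt;
&lt;/cfif&gt;
&lt;/code&gt;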
				
				</description>
						
				
				<category>JavaScript</category>				
				
				<category>Web</category>				
				
				<category>ColdFusion</category>				
				
				<category>AJAX</category>				
				
				<pubDate>Thu, 21 Sep 2006 09:17:00 -0500</pubDate>
				<guid>http://swem.wm.edu/blogs/waynegraham/index.cfm/2006/9/21/Form-Validation</guid>
				
			</item>
			
		 	
			
			
			<item>
				<title>Real Life XSLT 2.0 transformations</title>
				<link>http://swem.wm.edu/blogs/waynegraham/index.cfm/2006/8/23/Real-Life-XSLT-20-transformations</link>
				<description>
				
				I ran into a bit of a situation that was really blowing my mind. I have a rather large XML file (around 20,000+ lines) marked up in TEI that I wanted to do some transformations on (a day book and ledger from the 1850s). Essentially the code follows the format

&lt;code&gt;
...
&lt;figure&gt;
	&lt;head&gt;Page 12&lt;/head&gt;
	&lt;graphic url=&quot;0023_p12&quot;/&gt;
&lt;/figure&gt;

&lt;fw type=&quot;header&quot; place=&quot;top-center&quot;&gt;
	&lt;name type=&quot;place&quot; key=&quot;7022220&quot;&gt;Williamsburg&lt;/name&gt;,
	&lt;date value=&quot;1850&quot;&gt;1850&lt;/date&gt;,
&lt;/fw&gt;

&lt;table&gt;
	&lt;row&gt;
		&lt;cell&gt;
			&lt;date value=&quot;1850-10-03&quot;&gt;&lt;choice&gt;&lt;abbr&gt;Oct&lt;hi rend=&quot;sup;underline&quot;&gt;r&lt;/hi&gt;&lt;/abbr&gt;&lt;expan&gt;October&lt;/expan&gt;&lt;/choice&gt; 3&lt;hi rend=&quot;sup&quot;&gt;th&lt;/hi&gt; 1850&lt;/date&gt;
		&lt;/cell&gt;
		&lt;cell&gt;
			&lt;name type=&quot;person&quot; key=&quot;griffss01&quot;&gt;Doct&lt;hi rend=&quot;sup;underline&quot;&gt;r&lt;/hi&gt; S S Griffin&lt;/name&gt;
		 &lt;/cell&gt;
		 &lt;cell&gt;&amp;nbsp;&lt;/cell&gt;
	&lt;/row&gt;
	...
&lt;/table&gt;
&lt;pb/&gt;
...
&lt;/code&gt;

What I wanted to accomplish was to group each page&apos;s content together in its own div for HTML output (OK, I actually need to write each page to its own file, but that is pretty much just one more step). 

I just could not find a way to group this info this way using XSLT 1 without wrapping each page within its own div structure. I didn&apos;t really want to go back and do this, so I asked the TEI-L list. David Sewell pinged me back with some XQuery code that recursively recalls the document structure for a given node. 

He also mentioned that it would be pretty easy to write an XSLT 2 transformation that groups these nodes together. I did a little bit of digging and came up with

&lt;code&gt;
&lt;xsl:template match=&quot;tei:div&quot;&gt;
	&lt;xsl:for-each-group select=&quot;*&quot; group-ending-with=&quot;tei:pb&quot;&gt;
		&lt;div class=&quot;page&quot;&gt;
			 &lt;xsl:apply-templates select=&quot;current-group()&quot; /&gt;
		&lt;/div&gt;
	&lt;/xsl:for-each-group&gt;
&lt;/xsl:template&gt;
&lt;/code&gt;

This transformed the pages into what I wanted:

&lt;code&gt;
&lt;div class=&quot;page&quot;&gt;
	&lt;img src=&quot;0023_12.png&quot; alt=&quot;Page 12&quot; /&gt;
	
	&lt;h1 class=&quot;fw&quot;&gt;Williamsburg, 1850,&lt;/h1&gt;
	
	&lt;table&gt;
		&lt;tr&gt;
			&lt;td&gt;
				&lt;span class=&quot;abbr&quot;&gt;Oct&lt;sup&gt;&lt;u&gt;r&lt;/u&gt;&lt;/sup&gt;&lt;/span&gt;&lt;span class=&quot;expan&quot;&gt;October&lt;/span&gt; 3&lt;sup&gt;th&lt;/sup&gt; 1850
			&lt;/td&gt;
			&lt;td&gt;
				&lt;a href=&quot;javascript:getName(&apos;griffss01&apos;);&quot;&gt;Doct&lt;sup&gt;&lt;u&gt;r&lt;/u&gt;&lt;/sup&gt; S S Griffin&lt;/a&gt;
			&lt;/td&gt;
			&lt;td&gt;&amp;nbsp;&lt;/td&gt;
		&lt;/tr&gt;
		...
&lt;/table&gt;
&lt;/div&gt;

&lt;div class=&quot;page&quot;&gt;
	...
&lt;/div&gt;
&lt;/code&gt;

The XSLT processor for ColdFusion doesn&apos;t support XSLT 2.0 (it&apos;s still a draft spec). However, Saxon does (specifically Saxon 8). For more on doing XSLT transformations, see &lt;a href=&quot;http://swem.wm.edu/blogs/waynegraham/index.cfm/2005/11/21/XSLT-20-in-ColdFusion&quot;&gt;XSLT 2.0 in ColdFusion&lt;/a&gt;.
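
For a rough idea of what that looks like, here is a minimal sketch that calls Saxon from ColdFusion through the standard Java transformation API (JAXP). It assumes the Saxon 8 jar is on ColdFusion&apos;s classpath; the file names daybook.xml and pages.xsl are hypothetical.

&lt;code&gt;
&lt;!--- sketch only: daybook.xml and pages.xsl are made-up file names ---&gt;
&lt;cfset xmlFile = createObject(&quot;java&quot;, &quot;java.io.File&quot;).init(expandPath(&quot;daybook.xml&quot;)) /&gt;
&lt;cfset xslFile = createObject(&quot;java&quot;, &quot;java.io.File&quot;).init(expandPath(&quot;pages.xsl&quot;)) /&gt;
&lt;cfset xmlSource = createObject(&quot;java&quot;, &quot;javax.xml.transform.stream.StreamSource&quot;).init(xmlFile) /&gt;
&lt;cfset xslSource = createObject(&quot;java&quot;, &quot;javax.xml.transform.stream.StreamSource&quot;).init(xslFile) /&gt;
&lt;!--- instantiate Saxon&apos;s factory directly instead of the JVM default processor ---&gt;
&lt;cfset factory = createObject(&quot;java&quot;, &quot;net.sf.saxon.TransformerFactoryImpl&quot;).init() /&gt;
&lt;cfset transformer = factory.newTransformer(xslSource) /&gt;
&lt;!--- collect the output in memory as a string ---&gt;
&lt;cfset writer = createObject(&quot;java&quot;, &quot;java.io.StringWriter&quot;).init() /&gt;
&lt;cfset result = createObject(&quot;java&quot;, &quot;javax.xml.transform.stream.StreamResult&quot;).init(writer) /&gt;
&lt;cfset transformer.transform(xmlSource, result) /&gt;
&lt;cfoutput&gt;#writer.toString()#&lt;/cfoutput&gt;
&lt;/code&gt;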
				
				</description>
						
				
				<category>XSLT</category>				
				
				<category>Web</category>				
				
				<category>ColdFusion</category>				
				
				<category>XML</category>				
				
				<pubDate>Wed, 23 Aug 2006 11:56:00 -0500</pubDate>
				<guid>http://swem.wm.edu/blogs/waynegraham/index.cfm/2006/8/23/Real-Life-XSLT-20-transformations</guid>
				
			</item>
			
		 	
			
			
			<item>
				<title>Getting XML from MSSQL Server</title>
				<link>http://swem.wm.edu/blogs/waynegraham/index.cfm/2006/8/21/Getting-XML-from-MSSQL-Server</link>
				<description>
				
				I&apos;ve been playing with &lt;a href=&quot;http://openrico.org/&quot;&gt;all&lt;/a&gt; &lt;a href=&quot;http://script.aculo.us/&quot;&gt;the&lt;/a&gt; &lt;a href=&quot;http://labs.adobe.com/technologies/spry/&quot;&gt;AJAX&lt;/a&gt; &lt;a href=&quot;http://dojotoolkit.org/&quot;&gt;stuff&lt;/a&gt; &lt;a href=&quot;http://developer.yahoo.com/yui/&quot;&gt;that&apos;s&lt;/a&gt; &lt;a href=&quot;http://code.google.com/webtoolkit/&quot;&gt;been&lt;/a&gt; &lt;a href=&quot;http://mochikit.com/&quot;&gt;coming&lt;/a&gt; &lt;a href=&quot;http://www.aflax.org/&quot;&gt;out&lt;/a&gt; &lt;a href=&quot;http://jquery.com/&quot;&gt;lately&lt;/a&gt;. I suppose that like a lot of folks, I was creating a query, then having a generic function that created the XML in a proxy file for the JavaScript (&lt;a href=&quot;http://ray.camdenfamily.com/index.cfm/2006/7/13/ToXML-Update&quot;&gt;Ray Camden has a really nice function for transforming a query to XML&lt;/a&gt;). 

Last week I was doing some research to find a way to do some XML searching and stumbled upon the &lt;a href=&quot;http://msdn2.microsoft.com/en-us/library/ms190922.aspx&quot;&gt;FOR XML&lt;/a&gt; statement. I knew that most RDBMSs were capable of dealing with XML record sets, but it&apos;s been years since I&apos;ve even looked at any of the XML stuff for MSSQL. 

The FOR XML clause goes at the end of a SELECT statement and returns the query result with its rows transformed into XML elements. It can take one of three modes:

&lt;ul&gt;
	&lt;li&gt;&lt;a href=&quot;http://msdn2.microsoft.com/en-us/library/ms175140.aspx&quot;&gt;RAW&lt;/a&gt;: Transforms each row into an element with a generic identifier (&amp;lt;row/&amp;gt;) as the element tag.&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;http://msdn2.microsoft.com/en-us/library/ms188273.aspx&quot;&gt;AUTO&lt;/a&gt;: Returns the results in a simple nested XML tree&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;http://msdn2.microsoft.com/en-us/library/ms175140.aspx&quot;&gt;EXPLICIT&lt;/a&gt;: Allows you to define the XML tree returned&lt;/li&gt;
&lt;/ul&gt;
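
From ColdFusion, the clause simply goes at the end of the query. A minimal sketch, assuming a hypothetical data source named myDsn with an entries table:

&lt;code&gt;
&lt;!--- &quot;myDsn&quot;, the entries table and its columns are made-up names for illustration ---&gt;
&lt;cfquery name=&quot;entriesXml&quot; datasource=&quot;myDsn&quot;&gt;
	SELECT id, title
	FROM entries
	FOR XML AUTO
&lt;/cfquery&gt;
&lt;!--- dump the result to see how the driver hands the XML back ---&gt;
&lt;cfdump var=&quot;#entriesXml#&quot; /&gt;
&lt;/code&gt;

How the XML comes back (the generated column name, and whether long results are split across several rows) depends on the driver, so dumping the result is a reasonable first step.
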
				 [More]
				</description>
						
				
				<category>ColdFusion</category>				
				
				<category>AJAX</category>				
				
				<category>XML</category>				
				
				<pubDate>Mon, 21 Aug 2006 11:49:00 -0500</pubDate>
				<guid>http://swem.wm.edu/blogs/waynegraham/index.cfm/2006/8/21/Getting-XML-from-MSSQL-Server</guid>
				
			</item>
			
		 	
			
			
			<item>
				<title>MG Unity</title>
				<link>http://swem.wm.edu/blogs/waynegraham/index.cfm/2006/6/23/MG-Unity</link>
				<description>
				
				It&apos;s been a while since I&apos;ve written on this blog...I&apos;m attempting to keep this one on CF. Anyway, MG Unity has come out and I have to admit that I&apos;m REALLY glad it has. 

I&apos;ve been working on a project with five academics from four different institutions looking at the vernacular architecture of the Colonial Chesapeake to 1720 for an article in the William and Mary Quarterly for the 2007 celebration of Jamestown. The objective was to write a scholarly article that looks at everything that has been excavated and see if the arguments of past scholars still hold. Not only were the scholars looking at individual sites as an aggregate, but they also wanted to track changes to the structures over time (additions, fireplaces, cellars, etc.). 

I knew this was going to be a challenging project from the beginning, so I attempted to set the expectations early for the application development cycle. I thought it a fluke that I actually got them all to agree on a set of important fields and tables before I started coding anything (a first for me). However, as I got into the project a bit more, the requests to add fields here and move data to that table there, all while supporting constant input into the application, got to be a bit more than was really feasible for a &quot;spare-time&quot; project. 

I had set out to use the best coding practices I could. Each table had its own DAO, gateway, TOs, validators, etc. However, each change to these fields kept me mucking around in these files and in the forms calling the objects. After a while of making constant changes, I fell back on some old &quot;bad&quot; practices and kind of strong-armed some of the solution with spaghetti code...and I hated myself for it because I knew that I would have to come back at a later date to fix it.

About this time Joe started doing some work with Arf!, and Doug Hughes started work on Reactor. I continued to code in my bad style, but since I knew what I had done, I just kept doing it.

After the project members presented their paper, I set the project down for a while since my daughter had just been born. Since Joe brought out Unity, I decided now was as good a time as any to pick up the refactoring of the project. 

All I can say is that what I&apos;ve done in half a day with Unity would have taken me three to four times longer my old way. The scaffolds (once I figured out how they were working) have been an absolute godsend for the rather complex relationships between phases of construction and the overall archaeological and architectural record. The Reactor syntax is so easy (especially since ColdSpring separates all the configuration) that everything just clicks. 

The entire framework is just so intuitive (at least compared to my previous experiences with frameworks). Not having to worry any more about building the basic web pages, forms, CRUD, and displays makes Unity a pleasure. Also, changes will be a breeze compared to what they entailed a couple of months ago, which will allow me to do some of the cool stuff I had planned with Google Maps and Google Earth to map out the locations of these archaeological/historical sites!
				
				</description>
						
				
				<category>Web</category>				
				
				<category>ColdFusion</category>				
				
				<category>modelglue</category>				
				
				<pubDate>Fri, 23 Jun 2006 14:27:00 -0500</pubDate>
				<guid>http://swem.wm.edu/blogs/waynegraham/index.cfm/2006/6/23/MG-Unity</guid>
				
			</item>
			
		 	
			
			
			<item>
				<title>Copyright and Intellectual Property</title>
				<link>http://swem.wm.edu/blogs/waynegraham/index.cfm/2006/1/17/Copyright-and-Intellectual-Property</link>
				<description>
				
				Recently there has been a lot of noise about Ray Camden&apos;s BlogCFC being re-branded and redistributed without attribution (really the only thing he actually asks for in his documentation). This has been an issue I&apos;ve been wrestling with on a couple of projects here at the College as we continue to move toward digitizing different collections.

The first issue to deal with is the idea of copyright. There are specific criteria a work must show to qualify for copyright status (taken from &lt;a href=&quot;http://fairuse.stanford.edu/Copyright_and_Fair_Use_Overview/chapter0/0-a.html#1&quot;&gt;http://fairuse.stanford.edu/Copyright_and_Fair_Use_Overview/chapter0/0-a.html#1&lt;/a&gt;).

&lt;ol&gt;
&lt;li&gt;The work must be &quot;fixed in a tangible medium of expression&quot;&lt;/li&gt;
&lt;li&gt;The work must be original (e.g. independently created by the author). On this point, &quot;it doesn&apos;t matter if an author&apos;s creation is similar to existing works, or even if it is arguably lacking in quality, ingenuity or aesthetic merit&quot; &lt;/li&gt;
&lt;li&gt;The work must be the result of at least some creative effort on the part of its author.&lt;/li&gt;
&lt;/ol&gt;

An additional point to make is that since 1989, a copyright notice has not been necessary to hold the copyright on pretty much anything you create (a result of the &lt;a href=&quot;http://en.wikipedia.org/wiki/Berne_Convention_for_the_Protection_of_Literary_and_Artistic_Works&quot;&gt;Berne copyright convention&lt;/a&gt;). While you hold copyright as soon as the work is created, you must register the copyright in order to bring suit (&lt;a href=&quot;http://www.fplc.edu/tfield/copynet.htm#aut&quot;&gt;http://www.fplc.edu/tfield/copynet.htm#aut&lt;/a&gt;). In registering copyright for most websites and software, you would actually register them as &quot;literary works&quot; as they are compilations of ideas (see &lt;a href=&quot;http://www.copyright.gov/circs/circ1.html#wwp&quot;&gt;http://www.copyright.gov/circs/circ1.html#wwp&lt;/a&gt;).  

I really like how Brad Templeton summarizes copyright issues in &lt;a href=&quot;http://www.templetons.com/brad/copymyths.html&quot;&gt;10 Big Myths about copyright explained&lt;/a&gt; (though there are 11); it&apos;s definitely worth the read. Templeton also points out that the Digital Millennium Copyright Act (DMCA) gave some real teeth to copyright law.

Stanford University Libraries has a great guide to &lt;a href=&quot;http://fairuse.stanford.edu/index.html&quot;&gt;Copyright and Fair Use&lt;/a&gt; with a lot of great resources, including a very interesting point about &lt;a href=&quot;http://fairuse.stanford.edu/Copyright_and_Fair_Use_Overview/chapter0/0-f.html&quot;&gt;automatic copyright&lt;/a&gt;. Once someone has copied your work, your copyright is no longer automatic. It&apos;s then up to you to convince a Federal Judge that you are the copyright owner.

While the offending party has removed his bundle of software, I think it important to remember what copyrighted material is, and how not to use it when given a reasonably open license to do what you want with the software. So, if you want to show Ray (Camden) some love, check out his &lt;a href=&quot;http://www.amazon.com/o/registry/2TCL1D08EZEYE&quot;&gt;Amazon Wish List&lt;/a&gt;.
				
				</description>
						
				
				<category>ColdFusion</category>				
				
				<category>Copyright</category>				
				
				<pubDate>Tue, 17 Jan 2006 12:52:00 -0500</pubDate>
				<guid>http://swem.wm.edu/blogs/waynegraham/index.cfm/2006/1/17/Copyright-and-Intellectual-Property</guid>
				
			</item>
			
		 	
			</channel></rss>