= acts_as_background_solr Rails plugin
This plugin extends the acts_as_solr plugin with a disconnected
background job that synchronizes data with Solr in batches.
acts_as_solr works by sending changes to Solr for each object
immediately following any change. While this is nice because changes
are immediately viewable, it has a few drawbacks:

  * Invoking commit on Solr requires a new searcher to be opened, which
    is slow

  * There is no way to keep track of an object that was saved when
    notification to Solr failed (or when the database transaction
    rolled back)

Acts as background solr extends the acts_as_solr plugin to focus on
background processing.

There is one other major change. acts_as_solr calls Model.find for
each result in the result set. acts_as_background_solr instead
reconstitutes your objects from the attributes stored in Solr,
avoiding any database hits when searching against Solr. This requires
that you modify schema.xml to store the fields you are indexing.
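As an illustration of the idea only, here is a minimal sketch of rebuilding an object from stored field values without touching the database. It assumes the Solr document arrives as a Hash of field names to values; SketchUser and reconstitute are hypothetical names and do not reflect the plugin's actual internals.

```ruby
# Hypothetical stand-in for a model class; a real model would be an
# ActiveRecord class rather than a Struct.
SketchUser = Struct.new(:id, :first_name, :last_name)

# Builds an instance of klass from a Hash of stored field values.
# Fields the class has no writer for are silently skipped.
def reconstitute(klass, doc)
  obj = klass.new
  doc.each do |field, value|
    setter = "#{field}="
    obj.public_send(setter, value) if obj.respond_to?(setter)
  end
  obj
end

user = reconstitute(SketchUser, "id" => 7, "first_name" => "Ada", "score" => 1.5)
```

Only fields marked as stored in schema.xml can be restored this way, which is why the schema change is required.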

== Installation
Use this in place of acts_as_solr in your models, e.g.:

  acts_as_background_solr

The options :background, :if and :auto_commit will be automatically
overridden.

Each model can track changes in one of two ways:

  * Default: Explicitly log changes from the model using listeners

  * Database triggers: You can instead use db triggers on your model
    tables to track changes. If you use this method, set the option
    :db_triggers => true

Example:

  class User < ActiveRecord::Base
    acts_as_background_solr :additional_fields => [:first_name, :last_name],
                            :exclude_fields => ['encrypted_password'],
                            :db_triggers => true
  end

This plugin depends on the following table structure (this is written
for postgresql):

  create sequence solr_sync_records_seq start with 1;
  create table solr_sync_records (
    id                integer constraint solr_sync_records_id_pk primary key default nextval('solr_sync_records_seq'),
    model             varchar(50) not null,
    model_id          integer not null,
    created_at        timestamp default now() not null
  );

  -- common access path
  create index solr_sync_records_model_id_idx on solr_sync_records(model, model_id);
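With the default listener-based tracking, each saved change would add a row to this table. The following is a minimal sketch in plain Ruby, assuming an in-memory array in place of the table. SYNC_LOG, TracksSolrChanges, log_solr_change and SketchAccount are hypothetical names used for illustration; they are not part of the plugin.

```ruby
# Hypothetical in-memory stand-in for the solr_sync_records table.
SYNC_LOG = []

module TracksSolrChanges
  # Queues this record for the next batch run by logging its model
  # name and id, mirroring the columns of solr_sync_records.
  def log_solr_change
    SYNC_LOG << { :model => self.class.name, :model_id => id, :created_at => Time.now }
  end
end

class SketchAccount
  include TracksSolrChanges
  attr_reader :id

  def initialize(id)
    @id = id
  end

  # Stand-in for a save: the persistence step is omitted here, and the
  # change is logged afterwards, as a listener would do.
  def save
    log_solr_change
    true
  end
end

SketchAccount.new(42).save
```

Because only the model name and id are logged, the background job can look up the current state of each record when it runs rather than relying on a snapshot taken at save time.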

To update the actual data stored in Solr, you need to invoke the
following method:

  SolrBatch.process_all

We're using openwfe to schedule this job to run every few minutes, but
any scheduler should work.

This method updates records in bulk, issuing a single commit when
records are updated. Each call to SolrBatch.process_all uses a default
batch size of 500 and attempts up to 10 iterations, with each
iteration updating up to 500 records. A single call therefore updates
at most 5,000 records per model.
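The batching arithmetic can be sketched as follows. This is not the plugin's implementation: process_pending, BATCH_SIZE and MAX_ITERATIONS are hypothetical names, and a plain Array stands in for one model's pending rows in solr_sync_records.

```ruby
BATCH_SIZE = 500      # default batch size described above
MAX_ITERATIONS = 10   # iterations attempted per call, as described above

# Removes up to max_iterations batches from the front of pending,
# yielding each batch to the caller (where the documents would be sent
# to Solr). Returns the number of records processed.
def process_pending(pending, batch_size = BATCH_SIZE, max_iterations = MAX_ITERATIONS)
  processed = 0
  max_iterations.times do
    batch = pending.shift(batch_size)
    break if batch.empty?
    yield batch if block_given?
    processed += batch.size
  end
  processed
end

queue = (1..6000).to_a
count = process_pending(queue) { |batch| batch.size }
# A single commit would be issued here if count > 0.
```

With 6,000 queued ids, one call processes 5,000 and leaves 1,000 for the next scheduled run.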

If you want to change the batch size, provide your own implementation
of SolrBatch.process_all.

== Authors
Michael Bryzek
mbryzek<at>alum.mit.edu

== Release Information
Released under the MIT license.
