Introducing acts_as_fulltext_indexed

View blog reactions Written on May 1, 2007 by Chris Heald

My first Rails plugin is working! I won’t quite say finished, as it includes no formal tests or migrations, but I’ll get that packaged up in due time.

I recently found myself wanting to move my MySQL database from MyISAM tables to InnoDB tables. The primary reasons for this are:

  1. InnoDB supports transactions
  2. InnoDB supports row-level locking, as opposed to table-level locking
  3. InnoDB supports foreign key constraints

All very, very good things to have. Unfortunately, InnoDB does not support MyISAM’s FULLTEXT indexes. For those of you unfamiliar with MySQL’s fulltext searching, it’s a really slick little piece of work, especially the boolean mode fulltext searches. I wanted to have this functionality available across my database, but it seemed I couldn’t have my transactional, row-level constrained cake and search it, too.

Enter acts_as_fulltext_indexed.

The idea is this: You have a model, and you specify which fields you want indexed. after_save and before_destroy hooks are installed on the model which creates and indexable string and inserts it into a MyISAM table, and then helper methods are provided that let you perform searches for matched content.

Before I go any further, let me specify that this breaks Rails’ database-agnostic approach - it’s a solution specifically tailored for MySQL. Don’t try using this with non-MySQL databases. It just ain’t gonna work. Also, I’ve licensed it under an MIT license, so use, extend, and distribute it at will.

First, install the plugin:

script/plugin install http://svn.digitalsentience.com/svn/rails/acts_as_fulltext_indexed/

I don’t have a migration for it yet, but you’ll need a table like so:

CREATE TABLE `fulltext_indices` (
  `id` int(11) NOT NULL auto_increment,
  `indexable_type` varchar(60) NOT NULL,
  `indexable_id` int(11) NOT NULL,
  `tokens` text NOT NULL,
  PRIMARY KEY  (`id`),
  KEY `indexable_type` (`indexable_type`,`indexable_id`),
  FULLTEXT KEY `tokens` (`tokens`)
) ENGINE=MyISAM;

Then, install acts_as_fulltext_indexed on your model:

class Post < ActiveRecord::Base
    # This causes the :title and :body fields on this model to automatically be indexed
    acts_as_fulltext_indexed [:title, :body]
end

Now you can perform the following operations:

# Will return all posts that have words matching "foobar*"
Post.search("foobar")

# Will return all posts that have words matching "foo* AND bar*"
Post.search("foo bar")

If you need more complex indexing - say you want to be able to index association data on the parent - then you need to override build_index_string. In this example, I’m setting the indexed string for the thread to the concatenation of all the bodies of the posts on the thread, as well as the thread title.

class Thread < ActiveRecord::Base
    has_many :posts
    acts_as_fulltext_indexed

    def build_index_string
        self.title + posts.collect {|post| post.body}.join(" ")
    end
end

Now we can perform:

Thread.search("foo bar")

Which will return all Threads that contain posts whose bodies contain “foo* AND bar*“, or whose title matches “foo* AND bar*“.

Search also takes options such as :order, :limit, and :conditions

Thread.search("foo bar", {:include => [:posts], :order => "threads.created_at asc", :limit => 10})

Finally, search() takes a third parameter. A boolean “transform” specifies whether or not to transform your search string into MySQL boolean search syntax. It’s true by default, but you can turn it off if you want to specify a mode complex search.

# Finds all Threads that contain "foo*" but do NOT contain "bar"
Thread.search("+foo* -bar", {:include => [:posts]}, false)

Happy searching!

17 Responses to “Introducing acts_as_fulltext_indexed”

  1. Hendrik said:

    Nice! Would you say it’s at least somewhat ready-ish for use in a production application?

  2. Chris Heald said:

    Hendrik: I think so, but it’s going to need to be benchmarked against an index table with several million rows to say for sure. I think it’s going to be pretty solid, but I don’t have empirical evidence of it.

  3. Roman Mackovcak said:

    I was excited by the MySQL fulltext index too, but then I observed some difficulties. I did describe them in my article: http://blog.zmok.net/articles/2006/08/14/full-text-search-in-ruby-on-rails. It is almost one year old, not sure if the limitations are still there.

  4. Pau said:

    Nicely done! Unfortunately, I’m having trouble running my tests. When the test db gets built, it fails trying to create an index on the token column. Is anybody else having the same problem?

  5. Nick Poulden said:

    Yep, I’m getting the test database problem too :-/

  6. Nick Poulden said:

    Here’s the fix: put

    config.activerecord.schemaformat = :sql

    in your environment.rb file. Ruby migrations don’t support fulltext indexing

  7. Mika Tuupola said:

    Plugin is missing default code for remove_indexes method. Something like following should work.

    def removeindexes index = FulltextIndex.findbyindexabletypeandindexableid(self.class.tos, self.id) index.destroy if index end

  8. Mika Tuupola said:

    It seems <code> blocks did not get formatted. See code here:

    http://pastie.caboo.se/117867

  9. Rodrigo Etcheto said:

    great job, plugin works very well

  10. muriel said:

    Cool plugin, it only took me two minutes to make it work! Just one question, how can I get the relevance of a result? As in

    SELECT entryID, title, MATCH (title, entry) AGAINST ('foo bar') AS relevance FROM blog_entries WHERE MATCH (E.title, E.entry) AGAINST ('foo bar');

    Thanks, Muriel

  11. The RasterJam Blog » Blog Archive » acts_as_fulltext_indexed said:

    […] text searching, but it only works on MyISAM tables. But by default, Rails uses InnoDB. What to do? Actsasfulltext_indexed to the rescue. In exceedingly few lines of code, it makes easy work of keeping a separate shadow […]

  12. mark said:

    Nice plugin. I love its simplicity (far less code than actsasfulltextable). I wrote a brief blog entry on it. I hit a couple issues and would be happy to provide a patch for your review.

  13. Russ Jones said:

    Any change of modifying this to support single table inheritance (STI)?

  14. mark said:

    I found a pretty bad bug if you use actsasfulltext_indexed in more than one model at a time. Luckily the fix is simple. Details in this blog entry.

  15. Chris Heald said:

    Thanks for the heads up, Mark.

  16. The RasterJam Blog » Blog Archive » Acts_as_fulltext_indexed multi-model bug & fix said:

    […] hit one more issue with the actsasfulltext_indexed plug-in. It results in a subtle data-loss bug so I think it’s worth calling out. It strikes […]

  17. Russ Jones said:

    @Chris Heald

    1. could you put this on github?
    2. could you also post a version history for this plugin?
    3. do you have any plans to add support for STI?
    4. did you already address the issue mark pointed out above?
    5. I love your plugin :)

Leave a Reply

You may use these HTML tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <code> <em> <i> <strike> <strong>