Posts Tagged “Rails”

I just discovered the paper “The End of an Architectural Era (It’s Time for a Complete Rewrite),” about redesigning database software from the ground up. It contains some unsurprising predictions (“the next decade will bring domination by shared-nothing computer systems, often called grid computing”) and some interesting ideas:

  • Any database smaller than 1 TB will fit entirely in main memory, distributed across multiple machines.
  • We should scrap SQL in favor of “modifying little languages [Ruby, Python, etc.] to include clean embeddings of DBMS access.”  (CouchDB is a good example of this.)
  • A database shouldn’t require an expert to tune and optimize it; instead it should all be automated to “produce a system with no visible knobs.”

In their implementation, H-Store, they claim to run over 70,000 transactions per second on a standard benchmark on a modest server, compared to 850 per second from a commercial DB tuned by an expert. That is a difference of more than 80x. They also plan to “move from C++ to Ruby on Rails as our stored procedure language.” (!)


I finally found out how to call named routes from the Rails console, thanks to the Rails Routing shortcut by David Black. In the Rails console, do this:

include ActionController::UrlWriter
default_url_options[:host] = 'whatever'

Then you can call your named route methods directly from the console.


I am happy to report that AltLaw.org’s switch to Solr has worked very well. Solr is a RESTful search engine, built on Lucene. The setup was more complicated than just using a search library, but the rewards were worth it.

Before, I was using Ferret, which I still like. It’s a great library, and Dave Balmain has done incredible work producing a fast search engine that integrates well with Ruby. I still use it on other sites. But Solr was a better fit for AltLaw.

With Ferret, I was trying to shoe-horn large, unstructured documents into a system (ActiveRecord, MySQL, and acts_as_ferret) that is better suited to small, structured records. Now I use Solr as both a search engine and a document store, eliminating MySQL from the picture. That, combined with Solr’s built-in caching, has dramatically decreased the server’s load average (from around 2.00 to under 0.30) while visibly improving search performance.

Also, I think it helps that Solr is not integrated with Ruby. The solr-ruby gem is not well documented, but it is easy to figure out, as it’s just a thin wrapper over the Solr APIs. Having the search engine in a separate process made it easier to separate the indexing and searching code from the web application. As a result, the Rails code shrank to one-fourth its former size.


Answer: you’re using the latest version of Rails (1.2.3), which slightly changes the syntax of its SQL statements. cached_model relies on a regular expression to match that SQL statement.

To fix: Dive into the source of the cached_model gem, find the file lib/cached_model.rb, and change the first line after def self.find_by_sql to this:

    return super unless args.first =~ /^SELECT \* FROM #{table_name} WHERE \(#{table_name}\.`?#{primary_key}`? = '?(\d+)'?\) *(?:LIMIT 1)?/

And it works!
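To see what the patched pattern accepts, here is a minimal standalone sketch that runs outside Rails. The helper name cached_model_id, the "users" table, and the sample SQL strings are assumptions for illustration; they are not the exact statements that Rails 1.2.3 or cached_model produce.

```ruby
# Hypothetical helper: returns the primary-key value if the SQL looks like a
# simple find-by-id query, or nil if the pattern does not match.
def cached_model_id(sql, table_name, primary_key)
  pattern = /^SELECT \* FROM #{table_name} WHERE \(#{table_name}\.`?#{primary_key}`? = '?(\d+)'?\) *(?:LIMIT 1)?/
  match = pattern.match(sql)
  match && match[1].to_i
end

# Assumed sample statements: quoted and unquoted column names and values,
# with and without a trailing LIMIT clause.
samples = [
  "SELECT * FROM users WHERE (users.id = '42') LIMIT 1",
  "SELECT * FROM users WHERE (users.`id` = 42) ",
  "SELECT * FROM users WHERE (users.name = 'bob')"
]

samples.each do |sql|
  puts "#{sql.inspect} => #{cached_model_id(sql, 'users', 'id').inspect}"
end
```

The optional backticks and quotes in the pattern are what let it tolerate small variations in the generated SQL; anything that is not a plain lookup by primary key falls through to the normal find_by_sql.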


I decided to run AltLaw.org under a “/v1” URL prefix. It’s still beta, and the URL structure will likely change in the future. I don’t want to break 160,000 links when that day comes. Fortunately, Mongrel makes this pretty easy with the --prefix option to mongrel_rails.

I added --prefix '/v1' to my mongrel_rails command line. After removing absolute URLs (those not using {:controller=>...}) from my views, everything worked great.

The only problem is, this technique prevents Apache from serving static files without hitting the Mongrel server. A typical mod_rewrite directive like this:

RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f
RewriteRule (.*) $1 [L]

doesn’t work, because the REQUEST_URIs now all start with “/v1” and the files in “MyRailsApp/public” don’t.

Here’s a simple workaround: make a symlink like this:

cd /path/to/MyRailsApp/public
ln -s . v1

That’s right, “v1” is a symlink that points back to the “public” directory. I was afraid this might cause an infinite loop, but it doesn’t. Keeping the same mod_rewrite rule above, Apache will serve the static files under “public” even if the request is prefixed with “/v1”. Pretty cool.
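The effect of the symlink on the file-existence check can be reproduced in a throwaway directory. This is only a sketch of the idea: static_file? is a hypothetical stand-in for Apache’s “-f” test on DOCUMENT_ROOT plus REQUEST_URI, and the file name is made up. It assumes a Unix-like system where symlinks are available.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical stand-in for the RewriteCond "-f" test:
# does DOCUMENT_ROOT + REQUEST_URI name a regular file?
def static_file?(document_root, request_uri)
  File.file?(File.join(document_root, request_uri))
end

def prefix_symlink_demo
  Dir.mktmpdir do |public_dir|
    # A made-up static asset under "public".
    FileUtils.mkdir_p(File.join(public_dir, "images"))
    File.write(File.join(public_dir, "images", "logo.png"), "fake image data")

    before = static_file?(public_dir, "/v1/images/logo.png")

    # Equivalent of: cd public && ln -s . v1
    File.symlink(".", File.join(public_dir, "v1"))

    after  = static_file?(public_dir, "/v1/images/logo.png")
    nested = static_file?(public_dir, "/v1/v1/images/logo.png")

    { before: before, after: after, nested: nested }
  end
end

p prefix_symlink_demo
```

The nested check shows why there is no infinite loop: the link only adds more valid paths to the same files, and each lookup resolves a finite path.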


I’ve used Darcs as my only version-control system for a while now. When I got into Rails, I naturally wanted to use Capistrano. Unfortunately, Darcs and Capistrano don’t get along too well. Darcs’ file-based repositories don’t mesh well with Capistrano’s assumption that the repository is accessed through a server, a la Subversion.

I ran into problems with cap deploy:update_code because Darcs couldn’t find the repository. After a discussion on the Capistrano mailing list I decided to go my own route.

I have two Darcs repositories, one on my development machine and one on the server. I darcs push patches from development to the server. Then I have my own custom Capistrano tasks to get the latest code from the server’s copy of the repository. This is adapted from the Capistrano sources:

namespace :deploy do
  task :update_code, :except => { :no_release => true } do
    on_rollback { run "rm -rf #{release_path}; true" }

    run("darcs get --partial /home/myapp/repo --repo-name=#{release_path}")
    run("rm -rf #{release_path}/_darcs")

    finalize_update
  end
end

I also had to write a custom finalize_update task to add the links to the shared Rails directories:

namespace :deploy do
  task :finalize_update, :except => { :no_release => true } do
    run <<-CMD
      ln -s #{shared_path}/log   #{latest_release}/log &&
      ln -s #{shared_path}/tmp   #{latest_release}/tmp &&
      ln -s #{shared_path}/data  #{latest_release}/data &&
      ln -s #{shared_path}/index #{latest_release}/index &&
      ln -s #{shared_path}/config/database.yml #{latest_release}/config/database.yml &&
      ln -s #{shared_path}/public/system #{latest_release}/public/system
     CMD
  end
end
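If the list of shared paths grows, the chain of ln commands could be generated rather than written by hand. The following is a sketch in plain Ruby, not part of Capistrano: SHARED_PATHS and symlink_command are hypothetical names, and the directory values in the usage line are placeholders.

```ruby
# Paths (relative to the app root) that live in the shared directory.
# This list mirrors the links in the task above.
SHARED_PATHS = %w[log tmp data index config/database.yml public/system]

# Hypothetical helper: builds one shell command that links each shared path
# into the release directory, stopping at the first failure because of "&&".
def symlink_command(shared_path, latest_release, paths = SHARED_PATHS)
  paths.map { |path| "ln -s #{shared_path}/#{path} #{latest_release}/#{path}" }
       .join(" && ")
end

puts symlink_command("/home/myapp/shared", "/home/myapp/releases/current")
```

Inside the task, the resulting string would be passed to run in place of the heredoc.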


I was trying to set up Apache2 as a proxy for Mongrel on my new Ruby on Rails server, following the instructions on the Mongrel site, but I kept getting “403 Forbidden” errors on every request. I found a comment that seemed to describe the same situation. Sure enough, editing /etc/apache2/mods-enabled/proxy.conf to change “Deny” to “Allow” fixed the problem. I kept ProxyRequests turned off, which is supposed to prevent my site from becoming an open proxy, but I’m still nervous about removing that “Deny” directive. I wish there were a more authoritative source of information on this.


Here’s a question that’s been bugging me for a while: what’s the best way to store information that is a mixture of highly- and loosely-structured data? For example, a collection of documents like Project Posner. Certain attributes of each document like the title, date, and citation fit easily into a normalized relational database model. But the body can only be described with some kind of markup.

I could just use HTML, except for one problem: my documents have to handle footnotes, for which HTML does not provide a tag. (As an aside, footnotes are a pain whether you’re doing web design or typesetting.)

On Project Posner, I compromised: everything is stored in a MySQL database, and the documents table has a “body” column that contains my own made-up XML syntax.
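To illustrate the kind of processing such a “body” column needs, here is a minimal sketch that turns inline footnotes into numbered links and collects the note text. The footnote element name and the HTML it emits are assumptions made up for this example; they are not the actual Project Posner markup, and a real implementation would probably use an XML parser rather than a regular expression.

```ruby
# Replaces each <footnote>...</footnote> with a numbered superscript link
# and returns the rewritten body along with the collected footnote texts.
def render_footnotes(body)
  notes = []
  html = body.gsub(%r{<footnote>(.*?)</footnote>}m) do
    notes << Regexp.last_match(1)
    number = notes.size
    %(<sup><a href="#fn#{number}">#{number}</a></sup>)
  end
  [html, notes]
end

body = "The court disagreed.<footnote>See the dissent.</footnote> " \
       "The ruling stood.<footnote>Affirmed on appeal.</footnote>"

html, notes = render_footnotes(body)
puts html
notes.each_with_index { |note, i| puts "#{i + 1}. #{note}" }
```

The structured attributes (title, date, citation) stay in ordinary columns; only the body needs this extra rendering step.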

I could, in theory, normalize everything, even individual paragraphs. But that would be a nightmare to code and deadly slow. I could also store everything as XML documents. But then I’d have to reinvent all the facilities that MySQL (and ActiveRecord) provide, like transaction handling, auto-incrementing IDs, and so on.

For another project, I’m trying to create a pseudo-database that stores everything as XML files and uses Ferret for searching. I was going to use Ferret for full-text search anyway, so my original thought was to save overhead by not bothering with MySQL indexes. It works, but looking over it I realize that most of the data could be normalized to fit into the standard relational model. I’d still need a blob of XML data somewhere, but it could be in the database as easily as a file. What have I really gained, besides an impressively large and complex pile of code?
