Posts Tagged “Ruby”

I was at LispNYC last night listening to Anton van Straaten discuss his work on R6RS, the new Scheme standard. One surprising change from R5RS is that eval is defined in a library.

Eval, in a library? Holy scopes! The Common Lispers in the audience were aghast. Even the Schemers were a tad confused. Anton explained. The goal of Scheme, he said, has always been to incorporate as much dynamic behavior as possible without sacrificing efficient compilation. Towards this end, the R6RS eval is more limited than Common Lisp’s eval.

As I understand it, eval was central to McCarthy’s original design for LISP. Eval is LISP, and LISP is eval. Of course, as others reminded me later, LISP’s definition of eval with dynamic scope led to the 30-year “funarg” bug. Eval is also a thorn in the side of anyone trying to generate a standalone Common Lisp program — the possibility of a call to eval means the compiler has to include the entire Common Lisp runtime (up to 20 MB, depending on the implementation) in the final executable.

This got me thinking about Ruby, too. While Common Lisp and Scheme actually discourage the use of eval, Ruby is pretty casual about it. In a purely interpreted, dynamic language, that’s not a problem. But it would make it tough to implement a static Ruby compiler.
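To make "casual" concrete, here is a minimal sketch of the string-eval style. The Record class and its field names are hypothetical, made up for illustration. The method bodies exist only as text until the program runs, which is exactly what a static compiler cannot see through.

```ruby
# Hypothetical class for illustration; the field names are made up.
class Record
  def initialize(fields)
    @fields = fields
  end

  # String eval: each accessor is assembled as source text at run time,
  # so the full set of methods is unknown until this loop has executed.
  %w[title date].each do |name|
    class_eval "def #{name}; @fields[:#{name}]; end"
  end
end

record = Record.new(title: "Sample title", date: "sample date")
puts record.title

# Kernel#eval accepts any string of Ruby source.
puts eval("1 + 2 * 3")
```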

Comments 1 Comment »

Here’s a question that’s been bugging me for a while: what’s the best way to store information that is a mixture of highly- and loosely-structured data? For example, a collection of documents like Project Posner. Certain attributes of each document like the title, date, and citation fit easily into a normalized relational database model. But the body can only be described with some kind of markup.

I could just use HTML, except for one problem: my documents have to handle footnotes, for which HTML does not provide a tag. (As an aside, footnotes are a pain whether you’re doing web design or typesetting.)

On Project Posner, I compromised: everything is stored in a MySQL database, and the documents table has a “body” column that contains my own made-up XML syntax.

I could, in theory, normalize everything, even individual paragraphs. But that would be a nightmare to code and deadly slow. I could also store everything as XML documents. But then I’d have to reinvent all the facilities that MySQL (and ActiveRecord) provide, like transaction handling, auto-incrementing IDs, and so on.

For another project, I’m trying to create a pseudo-database that stores everything as XML files and uses Ferret for searching. I was going to use Ferret for full-text search anyway, so my original thought was to save overhead by not bothering with MySQL indexes. It works, but looking over it I realize that most of the data could be normalized to fit into the standard relational model. I’d still need a blob of XML data somewhere, but it could be in the database as easily as a file. What have I really gained, besides an impressively large and complex pile of code?

Comments 3 Comments »

I used to be a big fan of Perl. It was the first programming language I really liked. I felt like it didn’t get in my way. CPAN was and still is the best collection of open-source libraries ever assembled.

Then I got into Ruby, and was very happy with the way it cleaned up Perl’s syntax but still didn’t get in my way. There aren’t nearly as many Gems as CPAN modules, but it’s a solid foundation.

Recently I went back to Perl for a small project. Aside from the predictable mistakes (I kept trying to put colons before my hash keys to make them like Ruby symbols), I realized how nice Ruby really is. Perl is definitely powerful, but once you get beyond the text munging it was designed for, it can get irritating, especially the ass-backwards way it does OOP. And I found myself missing Ruby’s code blocks. Perl has code blocks too, but understanding them takes a whole book.

I also realized why Ruby is fairly unique among semi-mainstream programming languages: unlike Perl, Python, JavaScript, or VB, it is based around a small set of core concepts: objects, methods, and code blocks. They’re used everywhere, even for control structures and fancy meta-programming. This is good, because it reduces the number of concepts one has to think about to use the language and, paradoxically, makes it easier to add new features. Lisp started the same way with its core concepts: S-expressions and macros.
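A small sketch of what that looks like in practice. The with_retries method is a hypothetical name of my own, not part of Ruby. A loop is a method call on an Integer that takes a block, and a home-made control structure is just another method that yields to one.

```ruby
# Iteration is a method call on an object (an Integer) that takes a block.
squares = []
3.times { |i| squares << i * i }
puts squares.inspect

# A home-made control structure is an ordinary method that yields.
# with_retries is a hypothetical helper, not a built-in.
def with_retries(attempts)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue StandardError
    retry if tries < attempts
    raise
  end
end

result = with_retries(3) do |try|
  raise "not yet" if try < 3
  "succeeded on try #{try}"
end
puts result
```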

Comments 2 Comments »

I’ve just dived into Rails and Ruby in the past couple of months, but I’ve already benefited from it, so here’s my entry in the How has Ruby on Rails made you a better programmer contest.

1. I finally get Model-View-Controller

I’ve seen MVC before, once long ago in the Microsoft Foundation Classes (MFC) for C++, later for the web in Perl’s Maypole and Catalyst frameworks. But I never quite saw the point. Sure, it’s a nice idea, and I tried to separate my data from my views, but to make everything work I had to tie the models and views so tightly together that they could never be separated. The “controller” didn’t seem to have any role to play. It was just a shadowy background figure, perhaps a GUI engine or a web server, not something that I as a humble application programmer would ever implement. Code examples often omitted it entirely. Maybe those were bad examples, but they were what I had at the time.

I wrote my first real controller classes in Rails, where they suddenly make sense. The direct mapping of a URL to a method makes it obvious how the controller, not the view, is the real public interface to my whole application. It defines areas of operation and what actions the user is allowed to carry out in each area. So I think about what URLs I want to handle, and that tells me what my action methods should be. This leads me to think about my application in terms of its API, about what some other programmer accessing the application via HTTP might need.
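The idea can be sketched in plain Ruby without Rails. Everything here (DocumentsController, the CONTROLLERS table, the dispatch method, and the /controller/action/id path shape) is a hypothetical stand-in for what the framework does, not its actual routing code.

```ruby
# Hypothetical controller: each public method is one action.
class DocumentsController
  def index
    "list of all documents"
  end

  def show(id)
    "document #{id}"
  end
end

# Hypothetical lookup table from the first path segment to a controller.
CONTROLLERS = { "documents" => DocumentsController }.freeze

# Map a path shaped like /controller/action/id onto a method call.
def dispatch(path)
  name, action, id = path.split("/").reject(&:empty?)
  controller = CONTROLLERS.fetch(name).new
  action ||= "index"
  # Expose only the methods the controller class itself defines publicly.
  unless controller.class.public_instance_methods(false).include?(action.to_sym)
    raise ArgumentError, "no action #{action}"
  end
  id ? controller.public_send(action, id) : controller.public_send(action)
end

puts dispatch("/documents/show/42")
puts dispatch("/documents")
```

Listing the URLs first and then writing one method per URL is the whole design exercise: the controller’s public methods are the application’s API.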

2. I finally get Object-Oriented Programming

No, I’m not kidding. Rails taught me Object-Oriented Programming. I first learned OOP in C++, not the greatest introduction. After spending one whole summer on a string-handling class, I hated it. I knew there was more out there: I played around with CLOS and read about Smalltalk.

But it was with Ruby that I said, “Oh, this is how OOP is supposed to work!” Being able to add new methods to built-in classes like Integer or String made me feel for the first time like OOP was helping me rather than getting in the way. Rails’ heavy use of this technique really opened my eyes to the possibility of thinking about numbers not as dumb literals but instead as intelligent entities that can tell me things, like the date 5.days.ago.
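Here is a rough approximation of the trick in plain Ruby. SketchDuration is a hypothetical class of my own, and Rails’ real implementation is more elaborate; the point is only that reopening Integer takes a few lines.

```ruby
# Hypothetical stand-in for a duration; not the Rails class.
class SketchDuration
  attr_reader :seconds

  def initialize(seconds)
    @seconds = seconds
  end

  # Subtract the duration from a reference time (defaults to now).
  def ago(now = Time.now)
    now - seconds
  end
end

# Reopen the built-in Integer class and teach every integer a new method.
class Integer
  def days
    SketchDuration.new(self * 24 * 60 * 60)
  end
end

reference = Time.at(1_000_000)
puts 5.days.ago(reference)
puts 5.days.ago
```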

3. I write code for maximum legibility

I have read “source code is for people, not computers” often enough, but I didn’t follow the advice. I tried to think of the best way to represent a problem abstractly, in the domain of the computer, rather than the best way to represent it syntactically, for a human reader. Even worse, in the spirit of protecting my own carpals, I abbreviated everything. As a result, my code was unreadable even to me an hour after I wrote it. The problem is, I was thinking about the abstraction behind the code rather than the surface of what it actually said. If I couldn’t remember what I was thinking at the time I wrote a piece of code, it would be gibberish when I went back to it. Rails has shown me good examples of “writing on the surface.” Even simple practices like giving plural names to plural variables (arrays, tables) and using plain English names go a long way to bringing my abstract thinking closer to the surface of what I write.

My new goal is to write for maximum legibility, to write code that any competent programmer could read through once and immediately understand. I need to rewrite things a lot to achieve that, and I’m sure I fall short of the goal, but the extra effort is worth it for just being able to read my own code.

4. I learned how to deal with a database properly

Migrations were truly a revelation. They would have saved me a lot of headaches on past jobs, and now I would never attempt anything database-related without them.

5. My good habits are encouraged

One thing Rails didn’t have to sell me on is automated tests — I was already sold. But having them built into the framework is great validation of that belief. Now I feel guilty when I don’t have enough tests instead of guilty for spending valuable time writing them.

I’ve also learned the value of conventions, both having and following them. Rails’ conventions, particularly for naming, are flexible enough that I don’t feel a perverse desire to be different. So my code actually looks and behaves a lot like the documented examples!

Conclusion

I still have a lot to learn, both about programming and about Ruby on Rails, but I’m learning new things that make me a better programmer without taking the fun out of it. I’m enjoying programming again, the way I did when I wrote my first real GUI, my first Perl script, or even my first BASIC on a Timex Sinclair 1000.

Comments No Comments »

I like Lisp’s prefix syntax. It’s consistent, has natural structure, and makes code-manipulation macros possible. But it’s not always the easiest to read or write. For example, I often want to apply several successive transformations to the same chunk of text. In Perl, I could use the default variable $_ and then just write a bunch of regular expressions:

s/this/that/g;
s/old/new/g;
s/foo/bar/g;

Very succinct, but a tad cryptic. But the equivalent in Common Lisp, using the CL-PPCRE regular expression library, is much worse:

(regex-replace-all "foo"
                   (regex-replace-all "old"
                                      (regex-replace-all "this" string "that")
                                      "new")
                   "bar")

CL-PPCRE’s regex-replace-all function puts the original string in between the regex and replacement string in its argument list, which makes the syntax awkward. I usually avoid writing nested expressions like the one above and instead factor each replacement out into a separate function:

(defun replace-foo (string)
  (regex-replace-all "foo" string "bar"))

(defun replace-old ...)

(defun replace-this ...)

(replace-foo (replace-old (replace-this string)))

But who wants to define three extra functions just for one expression?

Now I’m exploring Ruby, and was pleased to find how easy it is to write this:

string.gsub('this','that').gsub('old','new').gsub('foo','bar')

This is succinct and reads easily from left to right. This sort of procedure is where the classic object.method(arguments) syntax really shines. At least for me, it makes sense because it’s how I tend to think about a problem: “Take this object, do this to it, then do something else to it, then give me back the result.”
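The same left-to-right pipeline can also be driven by data, which helps once the list of substitutions grows. A small sketch, where the sample string and the replacements array are my own illustration:

```ruby
string = "this old foo"

# The chained form from above.
chained = string.gsub('this', 'that').gsub('old', 'new').gsub('foo', 'bar')

# The same pipeline as data: each pair is applied in order, left to right.
replacements = [['this', 'that'], ['old', 'new'], ['foo', 'bar']]
reduced = replacements.reduce(string) { |text, (from, to)| text.gsub(from, to) }

puts chained
puts reduced
```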

The trouble I have with prefix syntax is that it feels backwards. To read Lisp code, even my own, I have to dig through the parentheses to find the innermost expression, then work my way back out again. Of course, that’s basically what a Lisp interpreter or compiler does.

I like to think it would be possible to combine the flexibility of Lisp’s S-expressions with the left-to-right readability of object.method, but I don’t know what that would be. I have little experience with Forth-style postfix syntax, but it seems even less readable. But I think this just goes to show that syntax does matter.

Comments 5 Comments »

Well, a new year, and (finally) a new post. In the past two weeks I have undertaken a complete rewrite of Project Posner from Common Lisp to Ruby on Rails. Now, before the Lispniks descend upon me with their sharp parenthetical barbs, allow me to explain. The Common Lisp version was never anything more than a cheap hack: a few hundred lines of code that crawls through a few tens of megabytes of plain-text documents and spits out about the same amount of HTML. It’s completely off-line, static, Web 0.5 stuff. For a search engine I used ht://Dig, whose last release was in 2004. All that being said, Common Lisp was a great language for doing it, and definitely made the process easier and more fun than it would have been in any other language.

But I want to move on with a more sophisticated search, more dynamic features (highlighting search terms and personal search histories to name just two), and, of course, AJAX! I could do all that in Common Lisp. Several people have successfully done so. But many hundreds more have done so with Ruby. With Rails, half the work is already done for me. I don’t have to think about how to connect to a database or even how to name my files. Someone else has already done that work. When I had a problem with Rails dropping MySQL connections on Ubuntu, Google delivered a one-command solution on the first try. Compare that with the endless speculation and one-upsmanship that might accompany such a query on comp.lang.lisp.

So any distaste I may have for Ruby’s syntax is completely overcome by my delight at Rails’ helpfulness. I’d still rather be working in Lisp, but Ruby is good enough, and Rails is better. It is not the path of least resistance — would that be PHP? — but it is the path of least work. As someone wrote, if Lisp’s audience had been harried sysadmins rather than AI researchers, it’d rule the world by now.

Comments No Comments »

Amazon has a beta up of an interesting little app called UnSpun. It’s a way to create and vote on “best of” lists for any subject. It’s a little like Reddit, but less news-oriented. Ruby currently leads Best Programming Language by a 7-to-1 margin, not surprising given that the site’s built on Rails. I’m glad to see that Lisp made it to number 6, but why is it right below APL??

Comments No Comments »

Perhaps I was premature worrying about how slow Ruby is. John Wiseman was benchmarking Montezuma, his Common Lisp port of Ferret/Lucene, and found out in the process that Ferret is 10 times faster than Java Lucene! As he says, Ferret gets help from about 65,000 lines of C code.

I’ve heard this before, perhaps not often enough to make a generalization, but at least enough to identify a trend: if you want performance from Ruby code, rewrite it in C. (The same is sometimes said of Python, or really any interpreted language.) The basic approach seems to be to extract the most performance-critical parts of your dynamic, interpreted language program and rewrite them in a static, compiled language, thus retaining most of the benefits of both.

It’s an interesting contrast to what I see as the Common Lisp approach to optimization, which is to keep everything in Lisp but add compiler declarations in hopes of speeding it up. Trouble is, unless you’re an expert on the inner workings of your compiler (or can read the disassembled code) it’s hard to know exactly what effects a particular declaration will have.

Eventually, I think manual optimization will become unnecessary. Experimental compilers like Stalin have been shown to produce faster machine code than hand-coded C. Stalin compiles a subset of Scheme down to a subset of C, making heavy use of type-inferencing and static analysis. If it can be done with Scheme, surely it can be done with Python, Ruby, or any other dynamic language.

Comments No Comments »

I continue to sweat (see previous entry) over the question of language choice for future iterations of Project Posner (and some as-yet-unnamed similar projects). Ruby on Rails is the obvious mainstream choice, mainstream at least compared to Lisp. But a part of me really wants to do it in Common Lisp, just to prove I can.

One concern I do have is speed. Ruby is pooh-poohed for being slow, which, it’s true, is not really fair for a 1.x version scripting language, but the Programming Language Shootout does support the accusation. I tried comparing Ruby and SBCL on the Shootout. As I expected, SBCL is up to several hundred times faster than Ruby, but I did not expect that Ruby would use two to five times less memory.

Maybe Ruby’s data structures are very close to their C analogs, lacking the extra padding that Lisp needs for type identification? But no, Ruby is dynamically typed, too, so surely it needs just as many tag bits. Ah, I know: The test must be counting the large size of the SBCL runtime (over 20MB, I recall reading somewhere) compared to Ruby’s (less than 2MB). For a limited-duration algorithmic test, this would certainly dominate the results.

I wonder, though: over longer run times, which language would use less memory for actual data storage? I suspect that carefully optimized Lisp arrays would win, but Ruby’s arrays, the standard way to represent lists in Ruby, might fit in less space than a linked list structure, the standard way to represent lists in Lisp.

Comments 1 Comment »

Perl was the first programming language I really liked, the first language that made programming fun.

Perl has three basic types: “scalars” for atomic values, arrays for ordered sets, and hash tables for unordered sets. (Yes, there are others, but those are the popular ones.) I quickly discovered that these three types can be combined to produce most any data structure you might need. Need an ordered list of records? Use an array of hashes. Need a tree of named elements with attributes (e.g. XML)? Use nested arrays with hashes in them.

These basic types can also be conveniently mapped to external data. A CSV file can be represented as an array of hashes. A database table can be an array of arrays, an array of hashes, or a hash of hashes, whichever you prefer.
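Ruby has the same building blocks, so here is a minimal sketch of the CSV case in Ruby rather than Perl. The sample data is made up, and the naive split on commas is an assumption that would break on quoted fields; a real CSV parser handles those.

```ruby
# Made-up sample data for illustration.
csv_text = <<~CSV
  title,year
  First Opinion,1982
  Second Opinion,1995
CSV

# Naive parsing: assumes no quoted fields or embedded commas.
header, *rows = csv_text.lines.map { |line| line.chomp.split(",") }

# An array of hashes: one hash per row, keyed by column name.
records = rows.map { |row| header.zip(row).to_h }

# The same rows as a hash of hashes, indexed by title.
by_title = records.to_h { |record| [record["title"], record] }

puts records.first["title"]
puts by_title["Second Opinion"]["year"]
```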

Python and Ruby both followed in Perl’s footsteps here. (Python calls them “lists” and “dictionaries.”) Lisp, predating all these new-fangled “scripting” languages, includes lists, arrays, hash tables, plus a whole raft of other built-in types. This is one of those areas that makes Common Lisp difficult for the beginner to grasp. When should you use a plist, an alist, or a hash table? When should you use arrays and when should you use lists? The answers to these questions delve into details of how the various structures are implemented. The only obvious criterion for choice is speed at handling a given data set, something a beginning programmer doesn’t want to worry about when designing a new piece of code.

At least one hacker has implemented generic get/set functions for all of Common Lisp’s data types, but to my knowledge no one has implemented abstract ordered/unordered set types that don’t care about their implementation. CL-Containers is a good foundation, but it further complicates the issue by adding a bunch of new data types.

What I want is a general-purpose “collection” class, of which instances can be declared ordered or unordered, numerically-indexed or key-indexed. Something like this:


(define-collection my-set
  :ordered t
  :index string)

Then, based on the data that I feed in to that class and the operations I perform on it, the compiler decides what sort of data structure to use for maximum efficiency. Or, if that’s too much magic to ask, at least let me change the underlying implementation without affecting any of the code that uses the collection:

(implement-collection my-set
  (array :resizable
         :elements (cons string object)))
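The less magical half of that wish, swapping the implementation without touching the calling code, can be sketched in Ruby rather than Lisp. All the class names here are hypothetical, and this covers only a key-indexed, insertion-ordered collection.

```ruby
# Hypothetical collection: callers see only []=, [] and keys.
class SketchCollection
  def initialize(backend = HashBackend.new)
    @backend = backend
  end

  def []=(key, value)
    @backend.store(key, value)
  end

  def [](key)
    @backend.fetch(key)
  end

  def keys
    @backend.keys
  end
end

# Backend 1: a Hash, which keeps insertion order in Ruby.
class HashBackend
  def initialize
    @data = {}
  end

  def store(key, value)
    @data[key] = value
  end

  def fetch(key)
    @data[key]
  end

  def keys
    @data.keys
  end
end

# Backend 2: an array of [key, value] pairs, in the spirit of the
# (cons string object) elements above.
class PairArrayBackend
  def initialize
    @pairs = []
  end

  def store(key, value)
    pair = @pairs.assoc(key)
    if pair
      pair[1] = value
    else
      @pairs << [key, value]
    end
  end

  def fetch(key)
    pair = @pairs.assoc(key)
    pair && pair[1]
  end

  def keys
    @pairs.map(&:first)
  end
end

# The calling code is identical for both implementations.
[HashBackend.new, PairArrayBackend.new].each do |backend|
  set = SketchCollection.new(backend)
  set["b"] = 2
  set["a"] = 1
  puts "#{backend.class}: #{set.keys.inspect} a=#{set['a']}"
end
```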

Comments 4 Comments »