Google recently released its Protocol Buffers as open source. About a year ago, Facebook released a similar product called Thrift. I’ve been comparing them; here’s what I’ve found:
Feature | Thrift | Protocol Buffers |
---|---|---|
Backers | Facebook, Apache (accepted for incubation) | Google |
Bindings | C++, Java, Python, PHP, XSD, Ruby, C#, Perl, Objective C, Erlang, Smalltalk, OCaml, and Haskell | C++, Java, Python (Perl, Ruby, and C# under discussion) |
Output Formats | Binary, JSON | Binary |
Primitive Types | bool, byte, 16/32/64-bit integers, double | bool, 32/64-bit integers; “repeated” properties act like lists |
Enumerations | Yes | Yes |
Constants | Yes | No |
Composite Type | struct | message |
Exception Type | Yes | No |
Documentation | So-so | Good |
License | Apache | BSD-style |
Compiler Language | C++ | C++ |
RPC Interfaces | Yes | Yes |
RPC Implementation | Yes | No |
Composite Type Extensions | No | Yes |
Overall, I think Thrift wins on features and Protocol Buffers win on
documentation. Implementation-wise, they’re quite similar. Both use
integer tags to identify fields, so you can add and remove fields
without breaking existing code. Protocol Buffers support
variable-width encoding of integers, which saves a few bytes. (Thrift
has an experimental output format with variable-width ints.)
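To make the two ideas above concrete (integer field tags and variable-width integers), here is a minimal Python sketch. It is a simplification, not the wire format of either library: real Protocol Buffers pack a field number and a wire type into each key, and Thrift's binary protocol writes a type byte and a 16-bit field id. The function names are invented for this example, and only non-negative integer fields are handled.

```python
from typing import Dict, Set, Tuple


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer in base 128, seven bits per byte."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)  # high bit clear: last byte
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (value, next_position) for the varint starting at buf[pos]."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_fields(fields: Dict[int, int]) -> bytes:
    """Write each integer field as a (tag, value) pair of varints."""
    out = bytearray()
    for tag, value in fields.items():
        out += encode_varint(tag)
        out += encode_varint(value)
    return bytes(out)


def decode_fields(buf: bytes, known_tags: Set[int]) -> Dict[int, int]:
    """Read (tag, value) pairs, skipping tags this reader does not know."""
    pos = 0
    result = {}
    while pos < len(buf):
        tag, pos = decode_varint(buf, pos)
        value, pos = decode_varint(buf, pos)
        if tag in known_tags:
            result[tag] = value
    return result


# A newer writer adds field 3; an older reader that knows only 1 and 2 still works.
message = encode_fields({1: 150, 2: 7, 3: 99})
old_view = decode_fields(message, known_tags={1, 2})
```

Because every value is preceded by its tag, the older reader can step over field 3 without understanding it. Small values also take a single byte here instead of a fixed four or eight, which is where the byte savings come from.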
The major difference is that Thrift provides a full client/server RPC
implementation, whereas Protocol Buffers only generate stubs to use in
your own RPC system.
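The difference can be pictured with a small Python sketch. Everything in it is hypothetical: `Channel`, `EchoServiceStub`, and `InProcessChannel` are names made up for illustration, not code generated by either compiler. In the stub-only model you receive something like the stub class and must supply the channel yourself; in the full-implementation model a working transport ships with the library.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict


class Channel(ABC):
    """The transport layer (hypothetical interface for this sketch)."""

    @abstractmethod
    def call(self, method: str, request: bytes) -> bytes:
        ...


class EchoServiceStub:
    """Stands in for generated code: it names the methods but sends nothing itself."""

    def __init__(self, channel: Channel) -> None:
        self._channel = channel

    def echo(self, payload: bytes) -> bytes:
        # The stub only forwards the call; delivery is the channel's job.
        return self._channel.call("echo", payload)


class InProcessChannel(Channel):
    """Toy transport: dispatches to local handlers instead of using a socket."""

    def __init__(self, handlers: Dict[str, Callable[[bytes], bytes]]) -> None:
        self._handlers = handlers

    def call(self, method: str, request: bytes) -> bytes:
        return self._handlers[method](request)


stub = EchoServiceStub(InProcessChannel({"echo": lambda data: data.upper()}))
reply = stub.echo(b"hello")
```

A real channel would handle sockets, framing, and errors, which is the part you have to write or borrow when only stubs are generated.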
Update July 12, 2008: I haven’t tested for speed, but from a cursory examination it seems that, at the binary level, Thrift and Protocol Buffers are very similar. I think Thrift will develop a more coherent community now that it’s under Apache incubation. It just moved to a new web site and mailing list, and the issue tracker is active.
Thrift was written by engineers at Facebook who’d worked at Google and missed using Protocol Buffers, hence the expanded feature set.
Helpful comparison!
I was trying to get more information on Protocol Buffers, and Wikipedia linked me to Thrift.
I found Protocol Buffers pretty well documented, whereas Thrift looks like an alpha project from a documentation point of view. It is a pity, because Thrift seems to have quite a few interesting features. Considering that Thrift was published about a year ago, I wonder whether there is really a community backing it and any enthusiasm around it.
It is quite an important point, because if you start using either of these libraries to communicate between different servers/services, you will probably have to stick with it for a few years.
In addition, it would be interesting to see whether these libraries could be used to store hierarchical information in a database, and how their performance and limitations would compare with storing XML or JSON. They should work better for a basic load/save strategy, but what if you want only a subset of the data? Do you still need to read the whole message? The same applies if you want to route a message based on one particular field without reading the whole message.
I have not compared them in detail myself, but if you believe Google’s description of Protocol Buffers, its primary feature is one you have not included in your consideration. They claim that their binary format is (1) especially small and (2) especially quick to serialize and deserialize. I am not particularly familiar with Thrift, but I never had the impression that these particular optimizations were of great importance to that project. The Google version is optimized for the case where you are dealing with truly massive amounts of data (the RARE case where efficiency trumps readability).
— Michael Chermside
Nice and short summary.
I like that a lot, thanks for the effort!
How do they compare in speed?
Density: What makes you say that Thrift is directly a Protocol Buffers clone? According to the Thrift whitepaper (see the acknowledgments section), it is the successor to a project called Pillar, written by a Facebook employee while in college.
Link to paper:
http://developers.facebook.com/thrift/thrift-20070401.pdf
Mark Slee interned at Google, and the paper specifically cites Protocol Buffers after Pillar.
http://www.cricketschirping.com/weblog/?p=1210
mattrepl: The Facebook employee in question was an intern at Google while in college ;)
That’s not to say Thrift is a copy of Protocol Buffers, but given the similarities it was likely inspired by them.
There are currently two Haskell implementations of Protocol Buffers. One can be found here: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/protocol-buffers-0.0.5 and the other is under development.
A few quick comments:
1/ Data size and serialization performance are definitely of great importance to Thrift. Huge data sets are definitely one case where this matters, but don’t forget about high-throughput low-latency services (at Facebook, like Google, every millisecond counts). Thrift is much quicker than typical XML/RESTful service implementations, even with relatively small data sizes. This is one of the primary use cases. When you’re dealing with millions of users and thousands of servers, efficiency really starts to matter.
2/ We do have to admit Thrift isn’t yet as fully-featured in the documentation department as it could (and probably should) be, but we do have an active and enthusiastic community. Thrift is currently being used and contributed to by Powerset, Rapleaf, iMeem, AmieStreet, the reCaptcha project, as well as a number of independent developers.
3/ Both Thrift and Protocol Buffers are great candidates for serializing data into databases — both are more compact and quicker to read/write than XML/JSON. Another common persistent use case is the storage of replayable logfiles.
4/ Last point, though Thrift currently only has implementations for binary/JSON, it’s designed so that the encoding format is extensible. Thrift could easily support XML or human-readable ASCII — so the trade-off of efficiency vs. readability is left up to the developer.
[…] July 12, 2008 at 10:52 pm · Filed under Programming [From Thrift vs. Protocol Buffers – Digital Digressions by Stuart Sierra] […]
> We do have to admit Thrift isn’t yet as fully-featured in the documentation department as it could
Please don’t release code without decent documentation. If you are making effort then do it right. Code of this complexity needs good doc or it’s just a complete waste of time and totally frustrating.
[…] Thrift vs. Protocol Buffers (tags: api code network programming reference rpc) […]
[…] Until then, Stuart has provided a nice overview of Thrift’s features by comparing it to Google’s recently released Protocol Buffers. If you’re interested in either, read Stuart’s article. […]
[…] Thrift vs. Protocol Buffers Overall, I think Thrift wins on features and Protocol Buffers win on documentation. Implementation-wise, they’re quite similar. (tags: protocolbuffers google facebook thrift programming api) […]
[…] Some interesting links which might be worth checking in more detail: open source projects on facebook wiki, the portal for developers on Facebook code (interesting!), Project Cassandra: Facebook’s Open Source Alternative to Google BigTable, the fact Google recently released its Protocol Buffers as open source, Facebook did it much earlier with Thri…. […]
[…] recently became an Apache project. Nice stuff. Stuart Sierra has a nice comparison on his blog, Thrift vs. Protocol Buffers. Another worthy contender but not a big enough advantage to stop the internal momentum protobuf […]
Less mature perhaps than Protocol Buffers, but seems superior in important respects:
http://eigenclass.org/R2/writings/extprot-extensible-protocols-intro
HTH?
I would offer that the Thrift community is lacking an official site where a newcomer (to Thrift) can download a running Thrift compiler for their system.
I very much want to try Thrift. I’ve been looking for a ready-to-run binary of the compiler and C# generators (on Windows/XP) but cannot find them anywhere. The ‘make’ scripts (as documented) are pretty useless in the Windows world.
I dare say that the barrier to entry is having to build the compiler on one’s own machine before anything else can be done. I bet most people give up and go elsewhere.
-David
David: Unfortunately, since Thrift has not made an Apache release yet and is in incubation at Apache, the project members are unable to provide official binaries that get distributed. If you had said something on the mailing list, I could have sent you binaries if you’re having difficulty getting up and running. Otherwise, http://wiki.apache.org/thrift/ThriftInstallationWin32 does a decent job of describing how to get going, but could probably be improved.
The license information is swapped in the table. Thrift is Apache while protobuffers are BSD.
“The license information is swapped in the table.”
Fixed. Thanks, Jacob.
I have compared Thrift and protobuf serialization performance in Java; Thrift is 1.2 to 2 times faster than protobuf.
http://timyang.net/programming/thrift-protocol-buffers-performance-java/
[…] of this table was originally compiled by Stuart Sierra but has been edited to include additional information relevant to my own […]
Check this out: http://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking
[…] There is another Thrift vs. Protocol Buffers compare non-performance […]
[…] If Thrift is still very new and lacks documentation – why bother learning it? Especially when there are more mature alternatives like Google’s Protocol Buffer and SOAP. The answer depends on your requirements. I write web-based financial applications. I use PHP as my server-side scripting language, and C++ for the heavy-lifting. As of this writing, Protocol Buffer does not support PHP, and the SOAP protocol has too much overhead for the high-speed finance world. Furthermore, I think Thrift has more potential than both alternatives and will gain in popularity as more people learn it. Also apparently Thrift was developed by Google engineers working at Facebook who missed using Protocol Buffer, so they re-built it and expanded its feature set. You can find more information on the differences between Thrift and Protocol Buffer at: http://stuartsierra.com/2008/07/10/thrift-vs-protocol-buffers. […]
It looks like a great library, but not much good if your dev environment is Visual Studio. You’d probably need something more POSIX-like and cross-platform (perhaps MinGW). The library relies heavily on POSIX threads, which are not available in Win32 unless you’re prepared to incorporate a GPL or LGPL library, and I’m not. The Protocol Buffers library is cross-platform, but then again it doesn’t include a transport mechanism and is therefore less powerful.
For the sake of completeness, I wanted to point folks to my attempts at improving the Thrift documentation:
http://diwakergupta.github.com/thrift-missing-guide/
[…] http://stuartsierra.com/2008/07/10/thrift-vs-protocol-buffers http://en.wikipedia.org/wiki/Thrift_(protocol) […]