Thursday, June 28, 2007

Discovering a world of Resources on Rails

Here is a very good presentation by David Heinemeier Hansson about RESTful design in Rails.


Sunday, June 10, 2007

Installing RMagick on Windows

I tried out the RMagick demo script after installing RMagick on Windows (why Windows? My project needs this platform). On the first try, the demo script demo.rb simply didn't work. It gave an error like this:

Can't convert String into Integer (TypeError)
I went through a few rounds of uninstalling and reinstalling Ruby and RubyGems. I finally found that the cause is an incompatibility with RubyGems from version 0.9.3 onwards. So, for now, all I can do is keep my RubyGems version at 0.9.2 (Ruby 1.8.6 is fine).

Thursday, June 07, 2007

I'm Ruby on Rails, I'm Java

I really love this video from railsenvy.com. The idea comes from Apple's Get a Mac TV ads.

Some interesting comparison between PHP, Rails and Java

During the International PHP Conference 2006, Tim Bray, who among many other things co-edited the XML 1.0 and XML namespace definitions, gave a keynote on "How to combine PHP technology with Java based on Enterprise Systems". In it, he presented some very interesting comparisons between the popular development "frameworks" PHP, Ruby on Rails (RoR, Rails) and Java:


  • Scaling : Load balancing, Shared-nothing, CPU , DBMS, File I/O, Observability
  • Development Speed : Compilation step, Deployment step, Code Size, Configuration process
  • Development Tools : IDE, Templating, How many tools, O/R Mapping, Performance, Documentation
  • Maintainability : MVC, OO, Readability, Language count, Code size
It seems Java doesn't get the upper hand, even in the eyes of Sun people.

Download the presentation file

Why I love RoR?

I agree with most of that analysis and comparison. I just want to point out that several good IDEs are coming out, e.g. Aptana IDE (RadRails has been integrated into Aptana).

Pros

  • An MVC framework for Ruby helps separate presentation from business logic
  • Unit tests are built right in.
  • Don’t repeat yourself (DRY) principle makes for less painful development
  • Agile development with no compile step
  • Convention over configuration and meta-programming do away with all the configuration you have in a Java framework like Struts.
  • Active Record Object Relational Mapper needs very little mapping configuration.
  • Heavy Ajax support; Prototype and Scriptaculous helpers are included in Rails
  • Built-in XML Web Services
  • The Ruby language is pretty easy to learn. Everything is an object. It's clean and easy to read.
  • Easy model validations
  • Very quick to develop applications from the ground up when you have control over the database schema
  • Shared-nothing architecture. Session data is stored on disk or in the database so it can be shared among many nodes
  • Ruby is dynamically typed, which allows for easier development in my opinion since one can just check to see if an object has a method instead of worrying about what type it actually is.
  • Classes are never closed so you can easily add methods to any class, even those in other APIs. This allows Active Record to dynamically create a method for each database column at run time. There is no need to update a mapping schema when you add a column to the database; the method is “magically” there.
  • Closures allow for dynamic behavior that would take much more code in Java.
  • It protects developers from common mistakes. It follows best practices which should be adhered to in new development. Also, since it scales by process you don’t have to worry about developers trying to create threads in the application container which could bring down the container.
  • Mongrel Server gives a Rails application its own container and speeds production requests greatly.
  • Ruby has been around for some time, first developed in 1993. Ruby is a language in itself and was not created to be web specific.
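A few of the language points above (duck typing, open classes, closures) can be sketched in plain Ruby. This is only a minimal illustration; the names describe, shout and make_counter are made up for this example and are not part of Rails or any library.

```ruby
# Duck typing: ask whether the object responds to a method,
# instead of checking what class it is.
def describe(thing)
  thing.respond_to?(:each) ? "collection of #{thing.count}" : "single value"
end

# Open classes: reopen an existing core class and add a method at run time.
# (shout is a hypothetical helper, used only for illustration.)
class String
  def shout
    upcase + "!"
  end
end

# Closures: the lambda keeps access to `count` from the enclosing scope,
# so each call continues from the previous value.
def make_counter
  count = 0
  lambda { count += 1 }
end

puts describe([1, 2, 3])
puts describe(42)
puts "rails".shout

counter = make_counter
counter.call
puts counter.call
```

Active Record's per-column methods rely on the same open-class idea, although its real implementation is considerably more involved than this sketch.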
Cons
  • Limited legacy schema support. You pay in added code for someone else using composite primary keys in the database schema. Not a show stopper, but you lose the quick development features of the ActiveRecord database model each time you have to account for this.
  • Freely available API documentation is seriously lacking. Books are fairly decent but they are having a hard time keeping up with the rapidly evolving framework. It makes it hard to know about deprecated practices.
  • If you cannot follow the best practices for some reason you are forced to solve the issues by hand.
  • Generated scaffold code is pretty useless when working with a database that doesn’t follow the Rails specs. You have to customize a great deal. Then again, the word is that we are not supposed to be using scaffolds for real development, which was one of the main selling points of Rails.
  • Scaling by process can eat a lot of memory on the server. Memory is cheap though.
  • No built in internationalization support.
  • No plan to include better legacy database support.
  • SQL Server (all versions) support is very lacking and has the same problems as in PHP. I would not use it for production against SQL Server.
  • Rails is a fairly new framework compared to PHP and Java, so there are not as many developers, but many are eager to learn.
  • Limited IDE environments. TextMate is the best but it only runs on a Mac.

Wednesday, June 06, 2007

Acts_As_Ferret Tutorial

Here is a list of good tutorials on acts_as_ferret:

  • Gregg Pollack covers all the important features of Ferret/ActsAsFerret, from simple searches to custom fields to match highlighting. Click Here
  • Roman Mackovcak gives a nice introduction to acts_as_ferret, including info on how to do paging across search results. Click Here

Tuesday, June 05, 2007

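acts_as_ferret: A Quick Start to Full-Text Search in Rails (with CJK Support)

Many Rails fans have probably heard of Ferret, a full-text search engine modeled on Lucene. It gets even better with the acts_as_ferret plug-in: add the kind of magic one-liner that fans of lightweight frameworks love to any ActiveRecord model, and it instantly gains full-text search.

Ferret is written in C, and its terminology and basic concepts match Lucene's, so anyone who knows Lucene should pick it up easily. acts_as_ferret is convenient, but O’Reilly's Ferret book is still worth reading. Its final part shows how to combine other plug-ins to index metadata such as PDF and JPEG EXIF, nearly enough to write a small Mac OS X Spotlight. Its coverage of Ferret's architecture, internals, and performance tuning is also quite practical (and you do not need to learn Lucene first; I am still working through that part myself...).

Below is the RegExpAnalyzer I use. It simply splits European-language text into words, splits out numbers, and indexes CJK text character by character. This simple CJK tokenizing is generally usable where search precision is not critical. For better results, or for homophone search or Simplified/Traditional Chinese search, you would of course need a more sophisticated Analyzer.

Put the following regex and Analyzer somewhere in your application: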

GENERIC_ANALYSIS_REGEX = /([a-zA-Z]|[\xc0-\xdf][\x80-\xbf])+|[0-9]+|[\xe0-\xef][\x80-\xbf][\x80-\xbf]/
GENERIC_ANALYZER = Ferret::Analysis::RegExpAnalyzer.new(GENERIC_ANALYSIS_REGEX, true)

Then add this to the model you want to make searchable:

acts_as_ferret({:fields => [ FIELDS_YOU_WANT_TO_INDEX ] }, { :analyzer => GENERIC_ANALYZER })

GENERIC_ANALYZER is kept outside the acts_as_ferret call partly for reusability, and partly to avoid the bus error / segmentation fault that can occur under Mongrel + Rails development mode (cause unknown).

Either way, once that is done, you can run:

Model.find_by_contents("hola")

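acts_as_ferret is smart: on first use it reads through the entire table behind the data model and builds the required full-text index. After that, as long as all CRUD operations go through the model, acts_as_ferret takes care of all the necessary Ferret indexing.

Some Chinese-language forums have suggested using /./ to split Chinese text into characters (calling this "word segmentation" would be misleading). Although Ruby's regex engine can scan Unicode characters correctly with /./ when $KCODE is set to utf-8, this approach has a problem: English words also get split one character at a time. And using only [a-zA-Z] ignores other European languages, which is not enough.

Unfortunately Ruby's Unicode support is only half done. Unlike Perl, it cannot express a Unicode range as /\x{80}-\x{7ff}/, so we have to borrow the UTF-8 handling regex from jcode.rb (that is, exploit the byte structure of UTF-8) to find the characters that are actually U+80 to U+7FF and U+800 to U+FFFF. Characters above U+FFFF are not handled here, and the approach is admittedly oversimplified.

Still, it is a workable method. To test or improve the regex, you can use the following method, listed on page 65 of the Ferret book: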

def test_token_stream(token_stream)
  puts "Start | End | PosInc | Text"
  while t = token_stream.next
    puts "%5d |%4d |%5d | %s" % [t.start, t.end, t.pos_inc, t.text]
  end
end

Then in irb:

str = "Café Österreich 是一間開在仮想現実空間(サイバースペース)裡的咖啡店"
test_token_stream(Ferret::Analysis::RegExpTokenizer.new(str, GENERIC_ANALYSIS_REGEX))

and you can see what RegExpTokenizer does.
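The UTF-8 byte-range idea behind the regex can also be tried without Ferret. The following is a minimal sketch under a few assumptions: it targets Ruby 2.0 or later (not the 1.8 setup used in this post), it matches against a binary copy of the string, and it uses a non-capturing group so that String#scan returns whole tokens. TOKEN_BYTES and tokenize are names made up for this example and are not part of Ferret.

```ruby
# Byte-level version of the token pattern (a sketch, not Ferret's API).
# The /n flag makes the pattern binary, so \xNN escapes match raw bytes:
#   [\xc0-\xdf][\x80-\xbf]              two-byte sequences, U+80 to U+7FF
#   [\xe0-\xef][\x80-\xbf][\x80-\xbf]   three-byte sequences, U+800 to U+FFFF
TOKEN_BYTES = /(?:[a-zA-Z]|[\xc0-\xdf][\x80-\xbf])+|[0-9]+|[\xe0-\xef][\x80-\xbf][\x80-\xbf]/n

# Hypothetical helper: returns the tokens the pattern produces.
def tokenize(text)
  # text.b is a binary copy, so the byte pattern applies directly;
  # each match is then re-tagged as UTF-8 for display and comparison.
  text.b.scan(TOKEN_BYTES).map { |token| token.force_encoding("UTF-8") }
end

p tokenize("Caf\u00e9 2007 \u662f\u4e00\u9593")
```

European words (including accented letters) stay together, digits form their own tokens, and each CJK character becomes a single token. Four-byte sequences (characters above U+FFFF) match none of the alternatives and are silently dropped, which is the limitation mentioned above.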

* Update: lingr.com has released their multilingual analyzer, click here for details

Unicode issue in MySQL

If you want to create a database with Unicode support, use the following command to create the database:

create database [database-name]
character set utf8 collate utf8_unicode_ci;
After that, in your Rails application's database.yml, add:
encoding: utf8

Sunday, June 03, 2007

1 minute Ruby on Rails lecture

You have 60 seconds to read it. Tic...Tac...Tic...Tac.... Time's Up. Got it!


Ruby on Rails
Originally uploaded by WanCW.