Monday, November 25, 2002

Perl vs. Python vs. Ruby

I’m evaluating Python and Ruby as replacements for Perl. I’ve been using Perl for several years and am very comfortable with it, although I’m definitely not an expert. Perl is a powerful language, but I think it’s ugly and encourages writing bad code, so I want to get rid of it. Python and Ruby both come with Mac OS X 10.2, both have BBEdit language modules, and both promise a cleaner approach to scripting. Over the past few weeks I read the Python Tutorial and the non-reference parts of Programming Ruby; however, as of this afternoon I had not yet written any Python or Ruby code.

Here’s a toy problem I wanted to solve. eSellerate gives me a tab-delimited file containing information about the people who bought my shareware. I wanted a script to extract from this file the e-mail addresses of people who asked to be contacted when I release the new versions of the products.

I decided to solve this problem in each language and then compare the resulting programs. The algorithm I chose was just the first one that came to mind. I coded it first in Ruby, and then ported the code to Python and Perl, changing it as little as possible. Thus, the style is perhaps not canonical Python or Perl, although since I’m new to Ruby it’s probably not canonical Ruby either. If I were just writing this in Perl, I might have tried to avoid Perl’s messy syntax for nested arrays and instead used an array of strings.

Here’s the basic algorithm:

  1. Read each line of standard input and break it into fields at each tab.
  2. Each field is wrapped in quotation marks, so remove them. Assume that there are no quotation marks in the interior of the field.
  3. Store the fields in an array called record.
  4. Create another array, records, and fill it with all the records.
  5. Make a new array, contactRecords, that contains arrays of just the fields we care about: SKUTITLE, CONTACTME, EMAIL.
  6. Sort contactRecords by SKUTITLE.
  7. Remove the elements of contactRecords where CONTACTME is not 1.
  8. Print contactRecords to standard output, with the fields separated by tabs and the records separated by newlines.

And here’s the code:

Perl

#!/usr/bin/perl -w

use strict;

my @records = ();

foreach my $line ( <> )
{
    my @record = map {s/"//g; $_} split("\t", $line);
    push(@records, \@record);
}

my $EMAIL = 17;
my $CONTACTME = 27;
my $SKUTITLE = 34;

my @contactRecords = ();
foreach my $r ( @records )
{
    push(@contactRecords, [$$r[$SKUTITLE], $$r[$CONTACTME], $$r[$EMAIL]]);
}

@contactRecords = sort {$$a[0] cmp $$b[0]} @contactRecords;
@contactRecords = grep($$_[1] eq "1", @contactRecords);

foreach my $r ( @contactRecords )
{
    print join("\t", @$r), "\n";
}

The punctuation and my’s make this harder to read than it should be.

Python

#!/usr/bin/python

import fileinput

records = []

for line in fileinput.input():
    record = [field.replace('"', '') for field in line.split("\t")]
    records.append(record)

EMAIL = 17
CONTACTME = 27
SKUTITLE = 34

contactRecords = [[r[SKUTITLE], r[CONTACTME], r[EMAIL]] for r in records]
contactRecords.sort() # default sort will group by sku title
contactRecords = filter(lambda r: r[1] == "1", contactRecords)

for r in contactRecords:
    print "\t".join(r)

I think the Python version is generally the cleanest to read—that is, it’s the most English-like. I had to look up how join and filter worked, because they weren’t methods of list as I had guessed.

Ruby

#!/usr/bin/ruby

records = []

while gets
    record = $_.split("\t").collect! {|field| field.gsub('"', '') }
    records << record
end

EMAIL = 17
CONTACTME = 27
SKUTITLE = 34

contactRecords = records.collect {|r| [r[SKUTITLE], r[CONTACTME], r[EMAIL]] }
contactRecords.sort! # default sort will group by sku title
contactRecords.reject! {|a| a[1] != "1"}

contactRecords.each {|r|
    print r.join("\t"), "\n"
}

This is actually the shortest version, and I think it’s the easiest to read if you aren’t put off by the block syntax. I like how the sequence of operations in the first line of the while isn’t “backwards” as it is in the Perl and Python versions. Also, I correctly guessed which classes “owned” the methods and whether they were mutators.

158 Comments

Have some sample input?

The value in any of the three languages is in examples that use the idioms pervasive to that language. Looking at the python [a language I'm intimately familiar with], I can think of a number of ways to do this a bit differently that would come more naturally. Likewise, I would like to see an example from someone deeply versed in Ruby as I find that language to be quite interesting.

(I have a lot of deep experience with Perl and suspect most solutions in Perl will still look like line noise...)

Here’s some sample input. I’d be interested to know how an experienced Python (or Ruby) programmer would rewrite the above code.

I would implement it in Python as something like the following. It could be done in fewer lines and exhibits a couple of idiosyncrasies in my style of Python programming; namely, I can't stand the [... for x in y] style of creating arrays and I tend to use dictionaries instead of arrays because it allows me to store extra stuff and only pull that which I need. In this case, it means that records contains all information on each record and, as such, it would be trivial to augment the output to generate different kinds of reports from that single array of dictionaries.

#!/usr/bin/python 

import sys 
import string 

records = [] 

def splitAndStrip(aLine): 
    return map(lambda aField: aField.strip().replace('"', ''), aLine.split('\t')) 

keys = splitAndStrip(sys.stdin.readline()) 

for aLine in sys.stdin: 
    records.append( dict( zip( keys, splitAndStrip( aLine ))) ) 

records = filter( lambda aRecord: aRecord['CONTACT_ME'] == "1", records ) 
records.sort( lambda x,y: cmp(x['SKU_TITLE'], y['SKU_TITLE']) ) 

for aRecord in records: 
    print "%s\t%s\t%s" % (aRecord['SKU_TITLE'], aRecord['CONTACT_ME'], aRecord['EMAIL'])

I'm not an expert in python, but I guess, bbum's example could be clearer using python's string expansion:

print '\n'.join(['%(SKU_TITLE)s\t%(CONTACT_ME)s\t%(EMAIL)s'
    % r for r in records])

Actually it's already clear, but typing aRecord[''] can be quite a turnoff.

But he doesn't like list comprehensions. :-)

Ok, here's a revision then :)

for r in records:
    print '%(SKU_TITLE)s\t%(CONTACT_ME)s\t%(EMAIL)s' % r

Sweet :)
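
For readers on a current interpreter, the keyword-based string formatting discussed above can be sketched in Python 3 as follows. This is only an illustration: the function names are hypothetical, and the sample record is made up using the field names from the comments above.

```python
# Hypothetical sample record; the keys follow the field names used above.
record = {"SKU_TITLE": "Product1", "CONTACT_ME": "1", "EMAIL": "foo0@bar.com"}


def format_record(r):
    """Build one tab-separated line with named %-placeholders.

    Keys in r that the template does not mention are simply ignored.
    """
    return "%(SKU_TITLE)s\t%(CONTACT_ME)s\t%(EMAIL)s" % r


def format_record_new_style(r):
    """Build the same line with str.format_map (Python 3.2 and later)."""
    return "{SKU_TITLE}\t{CONTACT_ME}\t{EMAIL}".format_map(r)


print(format_record(record))
```

Both helpers look fields up by name, so the record can carry extra keys without affecting the output.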

This is what I'd do: head over to the Vaults of Parnassus and see what I can find by searching for 'CSV'. I'll take list comprehensions over lambda, map, and filter just about all the time.

#!/usr/bin/env python

from DSV import DSV
data = open('2002-11-25-sample-input.txt').readlines()
records = DSV.importDSV(data, delimiter='\t')
sku = records[0].index('SKU_TITLE')
opt_in = records[0].index('CONTACT_ME')
email = records[0].index('EMAIL')

contacts = [[r[sku], r[opt_in], r[email]] for r in records[1:] if r[opt_in] == '1']
contacts.sort()

for contact in contacts:
    print '\t'.join(contact)

Reading the perl version,

I was wondering why you kept all the records ?

Since obviously you're just looking for the email addresses printed on screen.

Or did I miss something ??

How about something like that:

use strict;
my $EMAIL = 17;
my $CONTACTME = 27;
my $SKUTITLE = 34;

while( <> )
{
    my @record= split("\t");
    next if($record[$CONTACTME] ne '"1"');
    $record[$EMAIL] =~ s/"//g;
    $record[$SKUTITLE] =~ s/"//g;
    print $record[$SKUTITLE]."\t";
    print $record[$EMAIL]."\n";
}

You can even try something like s/^"(.+)"$/$1/ to try to remove only the external quotes (but then I guess potential quote inside the quotes need to be escaped)

$ perl foo.pl 2002-11-25-sample-input.txt
Product1 foo0@bar.com
Product3 foo2@bar.com
Product2 foo3@bar.com
Product3 foo7@bar.com
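
The difference between removing every quotation mark and removing only the enclosing pair, as suggested above, can be sketched in Python 3. The function names are hypothetical, and the pattern mirrors the s/^"(.+)"$/$1/ idea rather than any code from the original post.

```python
import re


def strip_all_quotes(field):
    """Remove every double quote, including ones inside the value."""
    return field.replace('"', '')


def strip_outer_quotes(field):
    """Remove only one enclosing pair of double quotes, if present.

    Like the Perl pattern it mirrors, this requires at least one
    character between the quotes, so an empty quoted field ("") is
    left unchanged.
    """
    return re.sub(r'^"(.+)"$', r'\1', field)


sample = '"say "hi""'
print(strip_all_quotes(sample))
print(strip_outer_quotes(sample))
```

As the comment notes, keeping interior quotes only makes sense if the file has some convention for escaping them; the sketch does not attempt to handle that.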

I still don't like list comprehensions [is that what they are called?] -- but, as I said, that's just me... I just don't find them terribly intuitive and am much more comfortable with lambda/map/filter.

I keep forgetting about the string composition using keywords. I like that *much* better.

jm: I wanted to try out arrays, sorting, and filtering in the various languages. The whole thing could probably be done with one s///ge in Perl, but that wasn't what I was going for.

Perhaps your time would be better spent becoming a better perl programmer than to learn a new language? :)

Seriously, it's unfair to compare bad perl code that doesn't use common idioms. If you write bad ugly code you've got nobody to blame but yourself.

If it was something I was writing for myself I would use something like this. It's something I probably wouldn't use in production but it's a fairly decent example. The advantage over your example is that the sort comes after the grep, thus avoiding sorting records you are going to reject anyway, the same with quotes.

my @recs =  sort { $a->[0] cmp $b->[0] } grep { $_->[1] eq '1' } 
            map { my @r = split /\t/; [map { s/"//g; $_ } @r[34, 27, 17]] } <>;
print join("\t", @$_), "\n" for @recs;
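
The point about ordering (filter first, then sort) carries over to the other languages. Here is a minimal Python 3 sketch, assuming records are already lists of unquoted strings; the function names and the small three-column sample are hypothetical, with column indexes passed as parameters so the example stays short.

```python
def filter_then_sort(records, sku=34, contact=27, email=17):
    """Drop records that did not opt in, then sort only what remains."""
    wanted = [[r[sku], r[contact], r[email]] for r in records if r[contact] == "1"]
    wanted.sort()
    return wanted


def sort_then_filter(records, sku=34, contact=27, email=17):
    """Sort everything first, then discard rejected records (more work)."""
    projected = sorted([r[sku], r[contact], r[email]] for r in records)
    return [r for r in projected if r[1] == "1"]


# Made-up rows with only three columns: sku title, contact flag, e-mail.
rows = [
    ["Product3", "1", "foo2@bar.com"],
    ["Product1", "0", "nobody@bar.com"],
    ["Product1", "1", "foo0@bar.com"],
]
for row in filter_then_sort(rows, sku=0, contact=1, email=2):
    print("\t".join(row))
```

Both functions return the same list; the first one simply hands fewer items to the sort.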

Gavin: I definitely should have used the array slice idiom rather than writing out [$$r[$SKUTITLE], $$r[$CONTACTME], $$r[$EMAIL]]. I just neglected to make that simplification when porting the code. And maybe using more maps or statement modifiers would make the code more Perlish and closer to the other examples. That was unfair.

On the other hand, your example is both idiomatic and ugly (to me). Why not go all the way and make it a one-liner? :-)

As to learning Perl better, I dislike looking at Perl code, especially when it’s written by Perl experts and is highly idiomatic. I just don’t want to write that kind of code. But I agree that it’s a good idea to use the idioms of the language you are writing in, so I think the solution is for me to learn a language that has a more appealing style. When I look at other people’s code in these three languages, I find it easier and more enjoyable to read Python and Ruby.

I was intending my example to be extreme, it's idiomatic but very functional. That's why I like Perl, it supports a variety of styles of thinking. I think the real key is to find a language that fits in with how you think, whether that's Perl or not isn't an issue for me, I just wanted the comparison to be a fair one.

Here's something which is simpler but still idiomatic, and in my opinion, easier on the eye than your example.

use constant EMAIL => 17;
use constant CONTACTME => 27;
use constant SKUTITLE => 34;

my @recs = ();

while (<>) {
    s/"//g;
    my @rec = split /\t/;
    push @recs, [@rec[SKUTITLE, CONTACTME, EMAIL]] if $rec[CONTACTME];
}

for (sort { $a->[0] cmp $b->[0] } @recs) {
    print join("\t", @$_), "\n" 
}

Fair enough. I like this last version a lot, though a lot of the cleanness comes from using $_ and avoiding temporary variables. That's a real win for short scripts like this one. My example used a two-phase approach, imagining that it would be part of a larger system. The if $rec[CONTACTME] in the while loop goes against that, but could easily be moved into the for loop.

I find it interesting that, particularly with the python and ruby solutions, nobody chose to use classes. I realize that this particular example rather lends itself to a simple procedural approach, but as someone who uses Python and Java regularly, I would have done something very similar to bbum's solution except I would use a class rather than a dictionary or an array of arrays. But really, I would count bbum's solution as pretty normal looking python. Lambda expressions and composition functions like map and filter seem to be more common than list expressions, but most of the python code I've read was written by Java developers, so that may just be a foreign dialect intruding.

#!/usr/bin/python

import sys
import string

def clean(quotedString):
    return quotedString.strip()[1:-1]  #get rid of whitespace and slice the quotes off the end.

def parseLine(line):
    fieldList = line.split('\t')
    return Customer(int(clean(fieldList[27])), clean(fieldList[34]), clean(fieldList[17]))

class Customer:
    def __init__(self, contactMe, skuTitle, emailAddress):
        self.contactMe = contactMe
        self.skuTitle = skuTitle
        self.emailAddress = emailAddress

    def __repr__(self):
        return self.skuTitle + "\t" + str(self.contactMe) + "\t" + self.emailAddress

if __name__ == '__main__':
    customers = []
    lines = sys.stdin.readlines()[1:]  # the slice gets rid of the field names.
    for line in lines:
        customer = parseLine(line)
        if customer.contactMe:
            customers.append( customer)
    customers.sort( lambda x,y: cmp(x.skuTitle, y.skuTitle) )
    for customer in customers:
        print customer

Kerry: I avoided classes because I lacked experience with Python and Ruby (those were my first programs) and because I'm used to languages where there is a higher overhead (in LOC) for introducing a new class.

Alun ap Rhisiart

Both Python and Ruby are very enjoyable languages to write in. Apart from style (I prefer ruby, it is a cleaner design to my view, like Smalltalk, but with cleaned-up Perl things like integrated Regex), there are other things that may sway you for a particular project. Python has more available for it than Ruby at this time, and having a library to do what you want can make a big difference. Both are working towards integration with Cocoa, but again Python is further along. Finally, you can't ship compiled code with Ruby, but with Python you can.

As for classes, I like the fact that both languages make creating classes easy, but don't force it. For your example, creating a class is needless overhead unless this is embedded in a larger context. Java forces you to do it all the time, but it isn't necessary even in pure OO languages (eg Ruby and Smalltalk).

All of the "my"s in the original Perl script are unnecessary.

Since this is a plain pipe&filter operation, the way to write it in ruby is IMO using filter ops.
There is an example below (not tested). I replaced your regexp replacement with a plain string extract, which supports double quotes inside values, and is faster. Also note that I moved your filter (implemented using reject! in your code; I turned it) as early as possible in the stream.

#!/usr/bin/env ruby -w

EMAIL = 17
CONTACTME = 27
SKUTITLE = 34

readlines.collect { |line|
    # "\t" needs double quotes: in Ruby, '\t' is a backslash followed by a t
    line.chomp.split("\t").collect { |field| field[1..-2] }
}.find_all { |r|
    r[CONTACTME] == "1"
}.collect { |r|
    [r[SKUTITLE], r[CONTACTME], r[EMAIL]]
}.sort.each { |r|
    puts r.join("\t")
}

This is an interesting document you have made. Ruby is quite something and I expect people will start using Ruby for scripting. I'm not fond of the ugly perl syntax myself. Ruby made it easy to go from Perl, it has the variable syntax to lure perl programmers over. Anyway, end of comment

In regards to the various comments about the 'my' keywords in the perl versions.

It's true that you could drop the "use strict;" pragma and then avoid having to use "my", but IMHO Perl 5's variable scoping options are one of its greatest features. Especially Perl > 5.6 with the 'our' scoping option.

The beauty with any good agile language (like these three) is that while you can get started quite easily, the deeper you understand it the more productive you get (and the richer programs you can easily create).

I would recommend that you pick one and use it forever. Once you are expert at any one of them, there would probably be little reason to ever switch to another.

Another point of view in choosing languages and environments is that "it's all about the API". In this case, Perl's CPAN with mature building blocks such as POE and mod_perl wins me over every time.

Of course if you're building a large collaborative website, choose your environment first (like my personal favourite OpenACS or perhaps Zope). Learning the language behind it (tcl or python) is going to be a lot easier than writing the hundreds of thousands of lines of existing code yourself.

Someone already mentioned how you should be sorting after you've removed the unwanted elements from the array. I'd like to dwell on that for a moment. Perl derived some of its functionality from Lisp, and since Ruby is Perl-inspired, I'd expect it has a similar chained mechanism.

In Perl, you can say:

@list = 
  sort { $a->[0] cmp $b->[0] }
  grep { $_->[1] == 1 }
  something_that_builds_a_list();

In Ruby, I'd expect you can do:

contactRecords = records
  .collect {...}
  .reject {|a| a[1] != "1"}
  .sort
One-lining things is not so much a display of guruism, but rather a comprehension of how to streamline your code. People familiar with Unix know how pipelines work, and the same idea works here. You send input to A, it sends its output to B, which sends to C, and you only store the end result, not the intermediate results.
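
The same pipeline idea can be sketched in Python 3 with generators, where each stage pulls items from the previous one and only the final sorted list is stored. This is an illustration under stated assumptions: the stage names (parse, opted_in, project, pipeline) are hypothetical, and the sample lines are made up with three columns instead of the real file's layout.

```python
def parse(lines):
    """Stage A: split each line at tabs and strip the enclosing quotes."""
    for line in lines:
        yield [field.strip('"') for field in line.rstrip("\n").split("\t")]


def opted_in(records, contact):
    """Stage B: pass through only records whose contact flag is "1"."""
    return (r for r in records if r[contact] == "1")


def project(records, columns):
    """Stage C: keep just the requested columns, in the given order."""
    return ([r[i] for i in columns] for r in records)


def pipeline(lines, sku=34, contact=27, email=17):
    """Chain the stages; sorted() is the only step that stores a full list."""
    stages = project(opted_in(parse(lines), contact), (sku, contact, email))
    return sorted(stages)


# Made-up input with three columns: sku title, contact flag, e-mail.
sample_lines = [
    '"Product3"\t"1"\t"foo2@bar.com"\n',
    '"Product1"\t"0"\t"nobody@bar.com"\n',
    '"Product1"\t"1"\t"foo0@bar.com"\n',
]
for row in pipeline(sample_lines, sku=0, contact=1, email=2):
    print("\t".join(row))
```

Sorting necessarily has to see every surviving record, so the list it builds is the one intermediate result that cannot be avoided.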

I don't know for the rest, but re Perl:

#!/usr/bin/perl -w

use strict;

Incidentally, (nowadays) better

use warnings

instead.

my @records = ();

Incidentally,

my @records;

is just as fine.

foreach my $line ( <> )

Ouch! It may not be actually a problem to do so, but it is generally recommended not to do this, as it will slurp all of your file at once, which is not needed. It is recommended to do things

while (<>)

instead.

{
    my @record = map {s/"//g; $_} split("\t", $line);
    push(@records, @record);
}

I really do not see the need for the intermediate

@record

variable, and from the description above it seems to me that you may really want

push(@records, \@record);

instead (in which case it's still not really needed either).

Also, bear in mind that

split

really wants a regex as a first argument, with one major -useful- exception. Passing it a string and relying on the automatic conversion, while it actually works, is not something most Perl programmers would regard as good practice. I also suppose you really want "\t" rather than "\t", as in the latter case it wouldn't work at all (wrt your description).

my $EMAIL = 17;
my $CONTACTME = 27;
my $SKUTITLE = 34;

You may find

constant.pm

useful in cases like this. Otherwise, and more similarly to what you did,

my ($EMAIL, $CONTACTME, $SKUTITLE) = (17, 27, 34);

I'm not commenting the rest of your code because it seems more awkward perl code. Here's one of the possible (simplified) ways I would do it: it should comply with your requirements as described above. Of course it reflects my personal coding preferences:

#!/usr/bin/perl

use strict;
use warnings;

my @records;

push @records, [ (/"(.*?)"/g)[17,27,34] ] while <>;

($\,$")=($/,"\t");

print "@$_" for
sort { $a->[0] cmp $b->[0] }
  grep $_->[1] eq '1', @records;

__END__

Blazar, while this is perfectly reasonable perl code for a seasoned veteran like you or me, it demonstrates perfectly why perl is such a horrible language for writing intuitive and maintainable code.

You cannot look at this code snippet and just "know" what it's meant to do without reading every single line and mentally parsing/executing it. Meaningful syntax and well named intermediate variables are essential to writing code that is "intuitive". You shouldn't have to comment every second line to make it clear why you are doing something.

You can write easily readable and maintainable perl code, but it's something the language tries hard to prevent you from doing in its simplest form, as you have demonstrated.

[...] Two (1,2) interesting personal comparisons of Perl vs Ruby vs Python (both evaluating Ruby and Python after thinking about replacing Perl — not suited for Perl addicts ;-) [...]

"Specialization is for insects."

-Robert A. Heinlein

Learn as many languages as you can. The way to improve your programming is not by learning the intricacies of a particular language, but by learning many different paradigms. When I learned Java, my Basic programs became more structured. When I learned Haskell, my Java programs became more interface driven. When I learned Python, all my programs in Java and C# became cleaner (though annoyingly verbose).

Perl, Python, and Ruby all have their strengths. If you know them all, you can

1. Decide which language is best for a particular task.
2. Use what you learned from the other languages to better design something in the one.

i realize this thread is a bit dead, but here's a perl solution that reduces 'line noise', being rather maintainable while still making good use of perlisms (just because something can be expressed on one line doesn't mean you should slurp it onto one line):

#!/usr/bin/perl

use strict;
use warnings;

my @records = ();

while ( my $line = <> ) {
    chomp $line;
    my @record = split /\t/, $line;
    s/"//g foreach @record;
    push @records, \@record;
}

my $EMAIL     = 17;
my $CONTACTME = 27;
my $SKUTITLE  = 34;

my @contactRecords 
    = map  { [ @$_[ $SKUTITLE, $CONTACTME, $EMAIL ] ] }
      sort { $a->[$SKUTITLE] cmp $b->[$SKUTITLE] }
      grep { $_->[$CONTACTME] eq '1' } 
      @records;

foreach my $r (@contactRecords) {
    print join( "\t", @$r ), "\n";
}
pavel kudinov
#!/usr/bin/perl
$,="\t";
$\="\n";
print @$_ foreach
	grep { $_->[1] eq "1"      }
	sort { $a->[0] cmp $b->[0] }
	map  { [ @$_[17,27,34] ]   }
	grep { s/"//g for @$_      }
	map  { [ split /\t/ ]      }
	<>;

intermediate variables it's a rubbish! use variables only for stable phase objects!

we have thousands of perl scripts in our projects, and really like to maintain it!

p.s. sorry for my english

Hi,

Thanks for the comparison.

I decided to redo the Ruby version since that is my language of choice, and have made this much shorter (slightly more than 50% fewer LOC) and quite a bit more efficient (by rejecting unwanted records as they are read, rather than going through all records again later and also by not sorting before rejecting), and made a small change to the output---which was to remove the CONTACTME field, since I would expect that it would not be required any further.

#!/usr/bin/env ruby

EMAIL, CONTACTME, SKUTITLE = 17, 27, 34
records = []
while gets
   record = $_.split("\t").collect! {|field| field.gsub('"', '') }
   records << [record[SKUTITLE], record[EMAIL]] if record[CONTACTME] == '1'
end
records.sort!.each {|r| print(r.join("\t"), "\n") }

Did you eventually make a choice of language? I took a look at your list of posts and couldn't tell if you had.

Thoran: I chose Python. It feels more natural to me, and also I like its Unicode support and PyObjC.

In retrospect, it was probably a mistake to base the Perl and Python versions on the Ruby version that was the simplest thing that could possibly work with my limited Ruby knowledge.

I think Ruby is probably the best language because it has a cool name and my birthstone is the ruby. Perl used to be my favorite because I could say, "I'm programming in Perl. Yes, that's the language I'm using... Perl." My friends would comment on how Pearl was a cool name for a language. I would then correct them and with a smug little dismissive laugh say, "No, no. You don't understand. It's PERL. It's an acronym." But now the more I think about it, Programming Python, by O'Reilly has a cool-ass snake on its cover. While I'm on the subway reading it on the way to work, I hope people will notice the book, the snake and think, "Now THAT guy is learning something cool." I also like the Monty connection. ;-)

-C

Started 20 years ago learning basic
tried my hand at pascal
.... got out of it (programming) ...
learned Perl in 1995
.... got really into Perl
2003 I started using PHP
until recently when a friend turned me on to Ruby on Rails.

Honestly, I don't think anything compares --or even comes close!

-Mark McDonald

For something that comes better than "close" to Ruby on Rails, have a look at Django. This is *deservedly* now getting a lot of attention, and it actually powers the Washington Post website. Can someone point me to a well known high traffic website (say 100 hits per second) that is using Rails?

Ben,

I was told Django powers *parts* of the Washington Post website, not the whole thing. Mostly small projects.

That being said, I've used both Django and Rails and both are great frameworks. I prefer Python for several reasons, but either are good.

Hi

Here is my 2c, optimized Python version, less lines, less memory, I hope it is still readable despite list comprehensions.

#!/usr/bin/python
import sys

EMAIL,CONTACTME, SKUTITLE = 17, 27, 34
FIELDS = [SKUTITLE, CONTACTME, EMAIL]

records=[]
for record in [line.split('\t') for line in sys.stdin]:
   contacts = [record[field].strip('"') for field in FIELDS]
   if contacts[1] == '1':
       records.append(contacts)
records.sort()
print '\n'.join(['\t'.join(r) for r in records])
Jerry Spickelmire

Bartek's version "fits my brain". I can see why the list comprehensions look "backwards", but these are short and clear. Long ones can indeed get out of hand to the point of head explosion. Well, except maybe for LISP coders.

Anyone who hasn't learned C won't take to Python's C-like (%s) string replacement. Those are the least "Pythonic" of Python's legacy features I know of. Thanks for NOT using them, Bartek!

Need I mention regular expressions? Sure, once you go through the insane struggle to memorize random / arbitrary "meanings" tied to unrelated typewriter keyboard symbols, you can parse the things, but why?! Thanks again, Bartek, for leaving out the uglies, and demonstrating that they aren't needed anyhow.

As Jamie Z. said, "A programmer sees a problem, and thinks, 'I know, I'll use regular expressions!'. Now he has two problems."

Walter Kempf (ZA)

Admittedly I haven't written anything in Python and my Ruby experience is very limited. I have much more experience in Perl though (OO Perl... *shudder*).

I've read the Python Tutorial for the first time today in the hopes of discovering some glimmer of justification for the Python hype I see all around me. That's the reason I Googled "python vs ruby" and found this page. :P

To my surprise I found that Python was rather "stuck together" by convention more than anything else. The annoying (although useful) indentation and (seeming) dynamically typed variables is made up for only by the support and development done by the Python community. This is (IMO) still not a good enough reason to invest more than the introductory tutorial's time in Python.

Before I go on, I'd like to respond to Jerry Spickelmire's post above regarding regular expressions. The integrated support for regex in Perl is the number one reason I used Perl for all the scripts/projects I did. It is _the_ most powerful tool you can hope for in any project involving text processing. I, too, have laughed at the quote Jerry quoted above because of the apparent relevance, but if anyone lives by it, it only means that he/she doesn't know how to use it (regex) well enough. If a problem is too complex for one regex, it is usually possible to break it up into more manageable sub-patterns.

Like I said: regex is the reason I used Perl. Note that I used the past tense, because I now use only Ruby for _anything_ I would've done in Perl. Not only does Ruby have integrated regex support, but it's easier to code, read and understand. Even if you read the code months after it was written. Compact without sacrificing readability. Yes, the code blocks do seem to be a bit strange in comparison to other languages (especially the ones I know), but once you know what they do and how to use them, the power/flexibility becomes clear. It is by no means a con in evaluating Ruby. I can only hope that the Ruby Gems collection will one day rival Perl's CPAN collection - that's all Ruby still needs.

I have to agree with Q above: Perl may be a perfectly good and usable language, but it wants you to write bad/poorly readable code. One of my friends is a Perl GURU that can write whole applications in

Walter Kempf (ZA)

(previous post continued)

really want to dump Perl, but still need that bit of obscurity in your new scripting language. :P

Having said all that, I would still rather use a simple bash one-liner:
cat input-file.txt | awk -F'\t' '$28 ~ /^"1"$/ { print $35 "\t" $28 "\t" $18; }' | sed 's#"##g' | sort

Never underestimate the power of the console. :P

Everyone! Quickly! Port a CPAN module to Ruby today! :P

I happened on this page while searching for info to help me decide what to learn next. I've no experience with any of the three.

My conclusion from reading the examples is that Perl code is likely to be a maintenance nightmare because there are so many ways of doing the same thing, most of them cryptic. Seems to be a bit like what I've heard of PL/1, that two programmers could solve the same problem using disjoint subsets of the language.

By contrast both the Python and Ruby examples were readable even to someone who has zero clue in either of the languages (although prior experience with 12+ other languages certainly helped).

Comparing Python or Ruby with Perl is like comparing Toyota or Chevy with BMW. If you don't have class you just can't get it.

Better I'll say Python = BMW 90 Germany, Ruby = Toyota 95 Japan, Perl = Chevy 87 USA but in particular this object is getting oxidized.

Ben:

Penny Arcade (www.penny-arcade.com) runs RoR. I heard somewhere that it gets around 1 million hits a day. Sounds like a pretty high load page to me.

I'm considering learning RoR myself for web scripting because PHP just feels so old to me after learning Python.

Productive Mr. Occam

This is accomplished much simpler with Korn shell, and in much clearer terms. I would also wager that it is faster, although this toy task is also a trivial and very straightforward C application. Please, consider the following:
- /bin/ksh is ATT ksh93, an advanced programming language present in modern Mac OS X (as well as other professional unices). It is worthy of your attention if you use perl, python or ruby.
- C programming is not as difficult as you might think, and for a systems type application such as this (ie, no GUI, operates on file input) it is in fact much less sloppy than your prototype comparisons, although not as clean as ksh.

Unless you are writing classes that fit into larger applications, perl|python|ruby are not worth the obfuscation of simple overall logic that is sacrificed. Worth it for very large app, not worth it for

ruby:

puts ARGF.map { |line|
line.split("\t").values_at(17, 27, 34).map { |val| val[1..-2] }
}.find_all { |email, contactme, skutitle| contactme == '1' }.map { |*arr| arr.join "\t" }

i believe ALL ruby versions should be CLEAR on first read - and

NOT short because you CAN write it short:)

Baishampayan Ghose

I am assuming the data to be of the form:

"SKUTITLE" "CONTACTME" "EMAIL"
"Product1" "0" "foo@bar.com"
"Product2" "1" "foo1@bar.com"
"Product3" "1" "foo2@bar.com"

since the sample data is giving a 404.

So my solution is this:

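from csv import reader
from sys import stdin

data = stdin.readlines()

# The default quoting mode strips the double quotes around each field,
# so the comparison with '1' below can succeed (QUOTE_NONE would keep them).
reader = reader(data, delimiter='\t')

for row in reader:
    if row[1] == '1':
        print row[0], row[2]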

The above code can be modified easily to read any form of data.

Perl, Python and Ruby are all pretty cool languages and we have seen some very good solutions to the problem in all three. I like the solutions proposed by bartek (python), thoran (ruby) and brainspun (perl). Why not simply learn all three? They each have an area where they are clearly ahead of the game:
- perl for CPAN, the built-in regex and the fun-factor
- python for things such as Pynum, Django, Turbo Gears, slightly better speed and ease of FFI through ctypes, not to mention good unicode support
- ruby for code blocks, case expressions, built-in regex and ROR.

IMHO, learning Ruby from a Perl background can only be beneficial in really grokking the language. I really like the code blocks in Ruby and the =~ operator in both Ruby and Perl (although I think the latter is the absolute king in regex support).

Python on the other hand, feels less cluttered due to no automatic variables ($_, $., $@), c++/java like use of the dot operator in class.object.method()/property and (I don't really like using '::' and =>).

The point I am trying to make is that all languages have their strengths and weaknesses. I like and use all three, although I am currently more proficient in Python...

Oh, I don't mind jj's solution at all but you need to grok functional programming to get it... Also in my previous post I forgot to mention list comprehensions for Python as a really cool thing and -> came out => (which is also used in Ruby) but for building hashes. Ruby drives more like a Lexus than a Toyota to me, although I still like taking out the BMW and the Chevy for a drive!!! ;-)

List comprehensions are by now very idiomatic in Python. While Ruby has a number of nice features, its lack of list comprehensions always frustrates me a bit. Here is my solution to this problem in python:

#!/usr/bin/env python
import fileinput
EMAIL, CONTACT, SKU = 17, 27, 34

records = [[f[1:-1] for f in line.split("\t")] for line in fileinput.input()]
contacts = [[r[i] for i in SKU, CONTACT, EMAIL] for r in records if r[CONTACT]=="1"]
contacts.sort()

for r in contacts:
    print "\t".join(r)

print '\t'.join(sorted(contacts))

will save you two more lines but will it make the code better??

I think list comprehensions are good when used in moderation, but when overdosed they make code look very cryptic, like Perl maybe... trolling ;)

Bartek

Heh exactly speaking:

print '\n'.join(['\t'.join(r) for r in sorted(contacts)]),

Both bartek's and Baishampayan's versions read the whole file into the memory before running.

In Bartek's solution, changing
for record in [line.split('\t') for line in sys.stdin]:
Into:
for record in (line.split('\t') for line in sys.stdin):
Would change the full list comprehension into a generator, meaning it will only read the next line when the loop iterates.
You also copy the whole list with the second list comprehension, which can be transformed into a for loop after the in-place sort.
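The difference between the two forms can be observed directly. This is a small sketch rather than anyone's actual solution: the counted() helper and the three-line SAMPLE list are hypothetical stand-ins for sys.stdin, used only to record when each line is consumed.

```python
def counted(lines, log):
    """Pass lines through unchanged, noting each one as it is consumed."""
    for line in lines:
        log.append(line)
        yield line

# Hypothetical stand-in for sys.stdin.
SAMPLE = ['a\t1\n', 'b\t0\n', 'c\t1\n']

# List comprehension: every line is read before the first record is used.
eager_log = []
eager = [line.split('\t') for line in counted(SAMPLE, eager_log)]
eager_consumed = len(eager_log)

# Generator expression: nothing is read until a record is requested.
lazy_log = []
lazy = (line.split('\t') for line in counted(SAMPLE, lazy_log))
consumed_before = len(lazy_log)
first = next(lazy)
consumed_after = len(lazy_log)

print(eager_consumed, consumed_before, consumed_after)
```

Note that the original task sorts its output, so every record that survives the filter has to be held in memory at some point anyway; the generator mainly avoids keeping the rejected lines around.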

Hi,

Interesting Thread, ..
I am a perl lover that just learnt Ruby and indeed I like Ruby a lot, partly because it is close to perl in some sense with a lot of built in stuff and a cpan like network with gem ( though less libraries and a lot of beta/alpha code).
The one thing I find really really really annoying in Ruby (besides the end all over the place ) is the non autovivifying arrays and hashes.

Where in perl you can write :
my %blah;
$blah{dum}[2][7]{verydum}[0]="tralala";
which creates all the arrays and hashes along the way, it's more convoluted in Ruby.
- unless somebody has a nice way to do this ? -

I am doing a Python / Ruby comparison, trying to decide which language to choose for a big project that I have before me.

The most important thing, by far, for me is maintainability. I want someone (not necessarily someone conversant in the language!) to be able to go back to the code and quickly grok what it is doing. Readability, English-ness etc. Perl is way too cryptic for my purposes.

Kerry's Python example is the only one that really "fits my brain". I am very very comfortable with the real-life-nature of OOP so the Customer class just fits.

I hate the absence of type declarations in Python, but it appears that PyChecker can help out in this regard.

I have not seen anything in Ruby that compares. I am a RoR fan but that is secondary compared to the issue of maintainability of code.

Mark McDonald

#Where in perl you can write :
#my %blah;
#$blah{dum}[2][7]{verydum}[0]="tralala";
#which creates all arrays and hashes along the line #its more convoluted in Ruby.
#- unless somebody has a nice way to do this ? -

I had the same problem (as I also came from perl to ruby and couldn't figure this one out right away). Somewhere I found a post that did something like this:

data = Hash.new()
HashFactory = lambda { Hash.new {|h,k| h[k] = HashFactory.call} }
data = HashFactory.call

data[:x][:y]['z'][1] = 2121

puts data.inspect

#{:x=>{:y=>{"z"=>{1=>2121}}}}

hope this helps!

Mark
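For readers comparing with Python, roughly the same trick is available there through collections.defaultdict. This is only a sketch; the names autoviv and to_plain are invented here for illustration.

```python
from collections import defaultdict

def autoviv():
    """Return a dict whose missing keys spring into existence as nested dicts."""
    return defaultdict(autoviv)

def to_plain(d):
    """Convert nested defaultdicts to ordinary dicts, for readable printing."""
    return {k: to_plain(v) if isinstance(v, defaultdict) else v
            for k, v in d.items()}

data = autoviv()
data['x']['y']['z'][1] = 2121

print(to_plain(data))
```

Like the Ruby version, this only ever creates hashes; Perl's autovivification also creates arrays where the subscript is numeric, which neither sketch reproduces.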

This requirement is most efficiently done in standard Bourne shell...

% cat x
"SKUTITLE" "CONTACTME" "EMAIL"
"Product1" "0" "foo@bar.com"
"Product2" "1" "foo1@bar.com"
"Product3" "1" "foo2@bar.com"

==

% cat y
#!/bin/sh

cat x | # open file and process
tr -d "\"" | # remove quotes
grep "^.* 1 "| # regex to find proper rows (tabs inside quotes)
cut -d" " -f 1,3 | # cut fields 1 & 3 (tabs inside quotes)
sort # sort and display to stdout

==

% sh y
Product2 foo1@bar.com
Product3 foo2@bar.com
%

Interesting discussion though!

Troy.

Some more optimisation.

from csv import reader
from sys import stdin

# Default quoting strips the quotation marks so that row[1] == '1' can match.
for row in reader(stdin.readlines(), delimiter='\t'):
    print (row[1] == '1' and (row[0], row[2]) or '\r'),

I know this is outside the scope of your examples, but possibly informative... I did a similar comparison between C vs Perl and Perl vs Python - with the intent of comparing OpenGL performance. I plan to do the same for Ruby shortly. http://graphcomp.com/opengl/benchmarks

Perl provides OpenGL performance comparable to C, and much faster than Python; I suspect that Ruby will also be faster than Python, and close to Perl. Updates will be posted on the POGL site.

This is a really useful post. I've learned a lot today, so my conclusion, as far as Perl is concerned, is this ONE print:

#!/usr/bin/perl
print join("\t", @$_), "\n" for
sort { $a->[0] cmp $b->[0] }
grep { $_->[1] eq '1' }
map { [@$_[34,27,17]] }
map { s/"//g for @$_; [@$_] }
map { [split /[\t\r\n]/] }
<>;

The perl examples look like jokes... Anyone reading that sort of code at work is probably going to have nightmares with gigantic (^[a-Z])'s...

I find it interesting the number of programmers who assume that shorter code == better code.

Having programmed for a quarter century and professionally for 15 years, having gone through literally dozens of languages, operating systems, and development environments, I've come to a few conclusions:

1. Good programmers are rarely flashy.
2. Good code is rarely impressive; rather, it's clean and functional.
2a. Good code is easy to read.
2b. Good code is easy to understand (variant of 2a.)
2c. Good code is easy to maintain (result of 2a. and 2b.)
3. Good programmers produce good code.
3a. Good code is OPTIMIZED LATE. Early optimization (in other words, optimizing before you know the scope of the problem) is a telltale giveaway of a programmer who thinks he knows more than he does.
4. Good programmers rarely brag about it, rather they just know that it's what they're paid to do, and they do it instinctively.
5. Good programmers are rarely impressed with the newest hype or fad, and good programmers can write good code in any language.
6. The language usually doesn't make anywhere near as much of a difference as the programmer's skill does. I've seen beautiful, clean, maintainable, performant Perl code, and horrible, ugly, slow Python/Ruby code. It's all in the skill of the coder. NB: I'm not bragging; I've written code that falls into both the former and the latter categories.
6a. API/portability differences usually make much more of an impact than syntax. The poster who recommended choosing your featureset/API first, then learning the language required, was spot on. It almost always takes much longer to roll your own than it does to learn a new system. No matter how much better you *think* you can do it.

So while this is an interesting read (esp. wrt syntax differences between the three scripting languages), I don't think it's terribly edifying.

This post and its comments have confirmed for me that:

(1) Python code is far easier to read than Ruby or Perl code.

(2) List comprehensions are much clearer to read than map/filter/lambda

(3) the .join string method is a blight upon the otherwise clear Python syntax.

A word on why these excellent languages have not become the mainstream in the last decade.

I was expecting Ruby code to be more easy to read, but definitely Python looks easier in this comparison.

I would rather have readable, maintainable code than one-line programs when developing in a group. If I'm creating a script only for myself, and I won't need to read it again and modify it a month later, a super-optimized, obfuscated one-line program makes me feel very happy and proud (but I don't share that code; otherwise I would always have to add comments, like the shell version).

I think that readable code is useful in a development team whose members have different programming skills. Actually, I think this is the reason Java is so successful. It provides a strict syntax that allows the compiler to remove THOUSANDS of trivial errors, and gives novice programmers a more readable, higher-level alternative to C++, plus the portability.

Still, I've seen HUNDREDS of THOUSANDS of java.lang.NullPointerException in the last 8 years.

The reason I think none of these excellent programming languages has become mainstream, and none will for at least another 10 years (and this is 2007), is the existence of novice programmers. I know that sounds strange, but if everybody were a good programmer, everybody would have used these dynamic languages in the last decade. But as there is a huge base of young people learning to do things, something as strict as Java is needed.

That was also the reason why Smalltalk, and even things like Objective-C, did not gain the acceptance that C, C++ and Java had.

These languages (Python, Ruby, Perl) have remained confined to a small number of experienced programmers. Probably that is also the reason why M$ "languages" (are they programming languages yet? Oh, yes, after .NET they are) have that wider acceptance as well. Still below Java, of course.

What do you think? Do you believe that compile-time vs. runtime error checking makes a difference for novice programmers, that is, for the 80% of programmers out there? The kind of programmer who only learns enough to earn some money and hold a job, but does not care about forums like this one, or about learning anything new?

Nice thread.

Do you think you could put the example text back up? I was interested in your comparison but it is hard to run the scripts without the example stuff.

Thanks.

BKB: Sorry, but it looks like that file’s been missing for a long time, and I don’t have an archive handy.

Who can draw a conclusion then?

There cannot be a perfect language.

The beauty of Perl is what really grabbed me, and that was after I was a professional developer for 10+ years. The $ on variables is actually a good thing. Also, if you were inside a function, the my and our definitions would relate to you the scoping the author was trying to achieve.

I disagree that Perl is hard to maintain on larger projects. If you think that, then you aren't using Modules and other Object Oriented techniques. Perl is easy to maintain, scale, and debug. I'm not saying it is the best, but I reach for Perl as my secret weapon over and over again. Perl really is the swiss army knife from hell.

Also, when I write .NET code for work (luckily .NET is only 30% of the work), I yearn for the freedom of Perl, Bash, and C on Linux.

I will point out that all of the above code examples are flawed in that if you want clear, readable code, you need to add many comments. With proper commenting, all 3 language examples would be ideal. Some employers have required me to have a comment on every line. This is not uncommon. So, looking at code and understanding it clearly without comments is only good for you, and not another programmer that comes along.

On another topic. Interpreted languages are definitely better in that they don't require a re-compile if you switch operating systems. That alone should make all 3 of these languages a wise choice. I'm not saying there is anything wrong with compiled code, but in many cases, especially with the rise of web applications, this holds true.

I've restored a recreation of the sample data.

Wow. This is literature. I have never programmed a single line in my entire life, but I read this post and its comments with great pleasure. It made me think of rappers freestyling.

Hi,

This is one of those interesting articles where experienced developers share their views on Perl, Python, and Ruby without many flames.

I am new to these three languages; I previously had C/C++ experience (relatively little, of course). For a few projects in our organization, we are looking for a suitable language among these three. It seems every org. has some share of each of the three.

Python does not have braces for blocks. Braces are useful for trapping errors. If you put one statement in the wrong place, of course we have to debug.

Ruby and Python use too much English, and one poster rightly pointed out that organizations ask for comments. With comments, it is just a mix of English everywhere; just imagine you write a comment in near-code syntax and miss the "#" or "//" or whatever. It is another nightmare.

Some of the Perl examples given above are not readable, but some are. So it is not a problem with the language Perl.

Posters like "Tom C" and "Mark affuleck" made valuable comments. Thanks for that. This helped me to choose a language based on existing, mature capabilities, rather than anticipating that x will reach state y IF modules are ported, etc., which is not guaranteed.

I prefer to go with perl.

I will agree with Tom C that it is more of a matter of style since any language allows you to write in a lengthy or short way. And as he pointed out, the 'right' style is the one that appeals most to readability. For example, I quote the original algorithm written in plain english.

Read each line of standard input and break it into fields at each tab.
Each field is wrapped in quotation marks, so remove them. Assume that there are no quotation marks in the interior of the field.
Store the fields in an array called record.
Create another array, records and fill it with all the records.
Make a new array, contactRecords, that contains arrays of just the fields we care about: SKUTITLE, CONTACTME, EMAIL.
Sort contactRecords by SKUTITLE.
Remove the elements of contactRecords where CONTACTME is not 1.
Print contactRecords to standard output, with the fields separated by tabs and the records separated by newlines.

If a programming language could input this and give the correct results, in my opinion, this is the BEST language.

The best programming language must:
1. Accept a readable style of programming.
2. Encourage a readable style of programming.

Readable and logical to the way we commonly think. To this end, I will choose Java as being my language of choice.
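For comparison with the plain-English steps quoted above, here is a minimal Python sketch that follows them one by one. The column positions 17, 27 and 34 are the ones used by other solutions in this thread; the length check that skips short lines is an added assumption, not part of the original algorithm, and the function names are invented for illustration.

```python
EMAIL, CONTACTME, SKUTITLE = 17, 27, 34  # column positions used elsewhere in this thread

def contact_records(lines):
    """Return [SKUTITLE, CONTACTME, EMAIL] rows for buyers who asked to be contacted."""
    records = []
    for line in lines:
        fields = line.rstrip('\r\n').split('\t')      # step 1: split at tabs
        record = [f.strip('"') for f in fields]       # steps 2-3: drop the quotes
        if len(record) > SKUTITLE:                    # assumption: ignore short lines
            records.append(record)                    # step 4: collect all records
    contact = [[r[SKUTITLE], r[CONTACTME], r[EMAIL]]  # step 5: keep three fields
               for r in records]
    contact.sort(key=lambda r: r[0])                  # step 6: sort by SKUTITLE
    return [r for r in contact if r[1] == '1']        # step 7: keep CONTACTME == 1

def format_records(records):
    """Step 8: tab-separated fields, newline-separated records."""
    return '\n'.join('\t'.join(r) for r in records)
```

To run it against the real file, something like print(format_records(contact_records(sys.stdin))) after importing sys should do; the header row drops out on its own because its CONTACTME field is not 1.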

Many thanks to all who have so far participated in this lengthy, but friendly and thoughtful thread.

Before I came to it, I was an experienced, long-time Perl programmer with major doubts about my chosen language. After the hour of carefully reading through this thread, I am now...drum roll...still a Perl programmer.

Sure 'word[2:4]' is cooler than 'substr($word,2,4)', but 'substr' is literal and helps to see immediately that I am dealing with a scalar and returning a substring. But arguments could be made for either, and neither is really a good reason for choosing a language.
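As a side note on the two notations: they are not quite equivalent, because a Python slice takes a start and an end index while Perl's substr takes an offset and a length. The sketch below illustrates the difference; the substr() function is a hypothetical helper written here to mimic the Perl call for non-negative arguments, and the sample word is arbitrary.

```python
def substr(s, offset, length):
    """Hypothetical helper mimicking Perl's substr(EXPR, OFFSET, LENGTH)
    for non-negative offset and length."""
    return s[offset:offset + length]

word = 'scripting'

by_slice = word[2:4]             # start and end index: characters 2 and 3
by_substr = substr(word, 2, 4)   # offset and length: four characters from index 2

print(by_slice, by_substr)
```

So the slice that matches substr($word,2,4) would be word[2:6].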

So, why Perl? Gavin's comment, "Perhaps your time would be better spent becoming a better Perl programmer than to learn a new language?" was the watershed moment. If I was a new coder, never having worked in any of them, I might not choose Perl, but I'm not a new programmer.

Bottom line:
1. I know Perl
2. It has worked well for me
3. It can do anything
4. I already write well-structured, easy-to-read, non-obfuscated code
5. CPAN (one of the real clinchers)
6. Perlmonks

--Brad

Why do you use such complex languages? We use PHP for such programs. Map, grep, filter are too complex to support. IMHO good, supportable program should not have them - it requires knowing language to read them.

I've done quite a lot of Perl, enough Ruby and Python.

I found all versions of all programs above very readable (perhaps I'm gifted? ;).

IMHO Python is the least readable language of the three for maintenance as it's more verbose. Both Perl5 and Ruby are reasonably terse, and can have both good function-point density and comment density in a page of code.

Every other aspect of readability boils down to two things - 1) do you know the language, and 2) does your code follow the project/company standards previously established.

If you have developers/maintainers who can't grasp the language, you've got much bigger problems waiting for you.

To those (like J) who think that languages like Perl5, Ruby and Python are "complex languages"... I'm afraid that you'll find your skill-relevance will have evaporated in about 5 years.

Those features (map/L.C, sort, grep/filter) can be grasped fully by 12 year olds (I've taught them).

Languages (all modern ones) will get more (lower-order) functional in nature, not less. You need to upgrade your brain, not downgrade your language.

Java is slooowly gaining the (lower-order) functional features that C# has been piling on recently.

Perl6 is a quantum-leap toward a full-featured dynamic programming platform.

What I'm looking forward to is a time when:
the programming retards have changed profession
simple applications are programmed using languages like Perl6, Java7, C#4, etc
demanding/important applications are programmed using languages like Haskell, Scala, Erlang, etc.

The likelihood of that mostly rests on education of future programmers (right now we're churning out idiots)... time will tell.

Terrible Sponge

A lot of people complain that Perl doesn't look like English. The solution is simple:

use English;

Seriously. It's that easy. Then you get nice names such as $ARG, $OS_ERROR, $CHILD_ERROR, and $EVAL_ERROR instead of $_, $!, $?, and $@.

If it were not for this module (it's been a core module for a while now, so it works anywhere) I would have switched languages long ago.

In this thread, developers of different levels shared their views. I am planning to use a scripting language for the first time, for common administrative purposes. I used to program in good old COBOL. I know well that I have to learn afresh.

This thread makes me more confused about selecting the right scripting language. Can someone clearly decipher it?

My experience with writing code has always been very project specific. Though I'm not a sophisticated programmer, as a course six alum from MIT I like to think I'm not a complete rookie either. Throughout the years the occasions when I had to actually sit down and write code varied substantially in size and scope. As such, I have had a chance to program in a number of languages, both systemic and dynamic, such as Lisp, C, C++, Fortran, Pascal, Perl, Python, Ruby, .Net, Visual Basic, Sed, Awk, Tcl/Tk, etc. I own a website that has around 200 thousand unique visitors a day, and that was done in PHP (though I only sat down to write the core engine of the site, which is very math and DB intensive; I have rewritten some of it in Python recently). A couple of months ago I had to design a program to assemble a custom-made catalogue with hundreds of photos on the fly for our clients, in the format of a pdf file. This task, which required 2 full time employees to assemble the catalogue every fortnight, is now done by the program and through the internet, sparing my employees who can now perform more productive work. In my first job I had to program monte carlo simulations and markov-chain algorithms in C, which after compilation was sent to mini computers that required a couple of hours to come back with the results. Currently one of my hobbies is to play with math intensive coding that explores the intricacies of the zeta function and the Riemann hypothesis (all programming done in Perl). I mention all this for the sake of showing that I believe that my view on how programming languages compare with each other comes after many years of using and exploring a variety of them in different contexts, and for different needs. I don't claim to write beautiful code, and honestly I suspect I don't. If anything, the fact that I never get to code in any single language for long enough certainly doesn't help in mastering the idiomatic strengths of any of the languages I used.
My focus is always on getting the job done. Fast. And by and large I have succeeded at it. But the bottom line is that when presented with a new problem, if the platform I work with poses no constraints in terms of the language I can use, I will normally choose Perl. Its strength and flexibility make it perfect for fast development/prototyping. As for those that find it cryptic or hard to maintain, I must say that there were times I went a full five years without writing a single line of Perl code and yet was able to go back and re-use thousands of lines I had written long before. CPAN in itself could be an excuse for one to choose Perl (PDF::API2 comes to mind, for instance, allowing me to do image manipulation and typesetting work with Chinese/Unicode, the latter being something that not even Knuth's TeX would allow me to accomplish). Recently I wondered if I should rewrite the Zeta function test I run in C in order to speed things up a bit. The result? Of 5 routines I rewrote in C, one ran faster than the original Perl code, 3 of them ran as fast as... and one ran MORE SLOWLY than the original Perl code!!! I was surprised but that apparently had to do with Perl's dynamic allocation of memory. Anyway, one must be comfortable with the tools one chooses. In my case -- and it's been 18 years since I wrote my first Perl program -- Perl has time and again proven its value in the most diverse and unexpected circumstances. My 2 cents.

I started programming with C++ and adopted its efficiency and terseness philosophy. Nowadays though, the programmer's time should be worth more than the computer's, so rapid development time and maintainability are key. I find that Perl/Ruby are more of the old paradigm. Of course you can write clean code in those, but the language structures support it less. There's also the environment; if you have a huge IDE with tons of automated tasks and refactorings, it won't be a hurdle to clean your code, so your code will be cleaner. That's why I chose Java, and I would be more inclined to use Python as a scripting language. The days of one-liners are over; you may spend 5 minutes writing that one line, and make people (including yourself) lose 1000 hours reading it and expanding its meaning in their head. Something like aString.removeAllWhiteSpaces() is pretty clear, gibberish like s/"//g isn't.

Until you learn what a regular expression is. Once you have, you will never want to go back to dumb wrappers like .removeAllWhiteSpaces(). Can you even imagine writing a module containing wrapper methods around the infinity of possibilities regular expressions give you? Talk about a waste of time: finding names for those methods would be a huge one. s/\s+//g is perfectly readable. If you find regular expressions difficult, break them down into smaller pieces using qr//.
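Both sides of this exchange can be shown side by side. The following is a sketch in Python rather than Perl, with re.compile standing in loosely for qr//; the function and constant names are invented for illustration.

```python
import re

WHITESPACE = re.compile(r'\s+')   # counterpart of s/\s+//g
QUOTES = re.compile(r'"')         # counterpart of s/"//g

def remove_all_whitespace(text):
    """A named wrapper around the whitespace substitution."""
    return WHITESPACE.sub('', text)

def remove_quotes(text):
    """A named wrapper around the quote substitution."""
    return QUOTES.sub('', text)

# Building a larger pattern from named pieces, roughly what qr// allows in Perl.
QUOTED_FIELD = r'"[^"]*"'
TAB = r'\t'
TWO_FIELDS = re.compile('(' + QUOTED_FIELD + ')' + TAB + '(' + QUOTED_FIELD + ')')

print(remove_all_whitespace(' a b\tc\n'))
print(remove_quotes('"foo@bar.com"'))
print(TWO_FIELDS.match('"Product1"\t"0"').groups())
```

Note that the two substitutions quoted in the thread do different things: s/"//g removes quotation marks, while s/\s+//g removes whitespace.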

What an amazing thread. I am impressed with the opinions and insights. I still don't know which language is better to become more familiar with. I guess I need to learn at least two out of the three. I might add that PERL seems to have the most support here and offers up a variety of ways to solve the initial problem.

I always found PERL to rescue me in the toughest of the circumstances :)

I guess once you learn one of them (Perl, Python or Ruby) you will just hold to it.

Recent post from /. http://tinyurl.com/24jou4 shows Python as growing most in 2007. Being widely used at Google helps Python; Microsoft's support for IronPython and the Jython integration on the Java side of the world also make Python, in my opinion, the first choice if you haven't yet got a scripting language in your programmer's toolbox.

Ruby is very cool.

And Perl well... see http://tinyurl.com/yob2nm

Contrary to /.'s post of Jan 25, I don't code in a given language just cuz I know it. Like I mentioned in my comment (Jan 7), I look at each new problem/project individually, and try to select the tools accordingly. In the case of my website for instance, I decided to rewrite the core engine in Python, even though I had never coded in that language (not Perl, contrary to /.'s argument). The reason being that the team that would maintain it afterwards knew Python the best. So I took it up to learn a new language even though I would be perfectly comfortable writing it in PHP or Perl (or C/C++, for that matter). At this point I defer to Tom C's brilliant comment: "The language usually doesn't make anywhere near as much of a difference as the programmer's skill does. I've seen beautiful, clean, maintainable, performant Perl code, and horrible, ugly, slow Python/Ruby code." And I absolutely agree with David's point when he mentions Python being very verbose, and the need to strike a balance between function point and comment density. In general I find that blindly following the hold-on-to-the-language-you-know approach tends to correlate with poor programming skills.

ARGF.each { |line| p line.split("\t").values_at(17, 27, 34) if line.gsub!('"', '').split("\t").values_at(27).include?("1") }

I think all of you have missed a vital element to writing any program.....the customer. JAPH got the closest when he talked about writing in Python so that the people that would inherit the code would understand it, but ultimately, we have to face the fact that we are generally not going to own the code once it is handed off to the real owners, those that told us to write it in the first place. Granted, sometimes that is ourselves, but in those cases, like so many of us have pointed out, we'll just do things our own way and cleanliness and commenting go out the window, fast. :-)

My point here is that you absolutely must consider who the code will be viewed by and whether they will be able to replace you as their primary programmer after accepting your code. What happens if you get hit by that bus on the way home? People that write one liners will have their name cursed...those of you who write poorly documented code won't fare much better either. People who insist upon using what is most comfortable for them just because of that fact alone aren't taking into consideration the strengths of each language and utilizing them to their best.

I came here looking for a good discussion on what language was the best to use in a given situation and found a very well developed discussion on programming in general, which is good. It has helped me determine that I'm on the right course in that I should consider all aspects of every project and choose the best language for the individual task at hand. No one language is best at everything, even on a web application, so my conclusion is to learn them all and utilize them in their areas of strengths: Perl for text manipulation, shell scripting for simple operations, Ruby (most likely) for database integration. Quite frankly, I'm not exactly sure where Python would fit on the web as there aren't that many tasks that it seems overwhelmingly suited for, but perhaps that's because I don't know it well enough. Bottom line is that a good programmer will even mix his programming languages in a single project when possible to take advantage of their individual strengths.

Anyway, that's my 2c

I was trawling through the web for Python, Ruby, Perl comparisons when I saw your excellent discussion. Although I can program in many languages I am hooked on Perl. Nevertheless I was aware of all the hullabaloo about python and ruby and I thought I may need to upgrade.

I read all the contributions with interest and I conclude that although python and ruby are nice languages and they indeed force you to write readable and maintainable code, perl has nothing to be envious of them. I already know that in perl we can write highly readable or highly idiomatic code or both if you write lots of comments. It is your choice. In the comments of those who spoke for perl I detect something like a certain infatuation with the language and I can relate to this.

In conclusion, I will carry on with perl. Thanks

PS. If JAPH wrote his first perl code 18 years ago then he was using perl before Larry Wall invented the language! (I will say nothing about his "zeta function and the Riemann hypothesis... or the chinese typesetting" he he he...)

Re: Proteus

Funny we both found this old discussion so many years later. I was also looking for a comparison of the three languages, but after seeing Gavin's excellent post from December 1 at 7:17 PM, I don't really need to see more. Perl allows for very clear and concise code. It's just a matter of becoming familiar with the idioms; becoming fluent.

[...] Michael Tsai post. Smack down between Perl vs Python vs Ruby. The flame war uses List as its battlefield. The sample codes include list comprehension, filter, lambda, block, and various Perl idioms. Very good flame war! [...]

Old Unix Guy

In days of old, we had sed and awk, and we liked 'em:

#!/bin/sh

sed -e 1d -e 's/"//g' |
awk 'BEGIN { FS="\t" } $28 == 1 { printf "%s\t%s\n", $35, $18 }' |
sort

---
$ doit.sh

Old Unix Guy

Ooops. Hit and posted the durned thing!

Old Unix Guy

Okay, my apologies. I assumed the comment-taker was smart enough to HTML-ificate the text. But no, it just eats things that look like tags. Ugh.

Anyway, feed the sample file to the shell script as standard input. It works. Trust me. :-)

The whole argument that one can write easy-to-read Perl code is, to me, kind of vacuous. Yeah, one can. But get real - almost nobody DOES. Perl is an abomination. Some cool stuff written in it, though. :-)

Younger Unix Guy

Word to the Old Unix Guy.

I've been writing perl scripts for well, at very least 5 or 6 years, and I still cannot skim through one of my colleague's scripts and know exactly what it does instantly.

I do not, and have never liked Python's syntax, for whatever reason, I don't like it. It's a personal thing I suppose.

I have been recently learning and scripting in Ruby, and I could not be happier, an excellent language that has been eating up and converting some of the oldest Perl programmers.

Wow, I love this thread. I just started learning python and was interested in other scripting languages (that darned o'reilly book gave me the idea of looking up "python vs perl" (seems he's pretty popular around here)).

So, I came across this page. I'll have to say that
Comentador's comment about nightmares from programming perl really cracked me up once I saw what he was talking about. Good job there.

So once I get a bit better at Python, I'll post my solution and see how it fares against these others (there hasn't been a new example in a few years).

Ruby, from what it sounds, looks pretty good as well. Though being primarily a BASIC/C programmer myself, I think I'm going to like Python/C and feel at home.

One thing that Python has and GCC doesn't: have you ever seen a current, stable GCC Windows XP x64 binary? I didn't think so; the only thing that I've seen for Windows x64 is the MS VC .NET IDE. That's one of the things that I am really impressed with in Python.

Well, there's my two cents.

Oh, and one more thing:

I was completely ignorant of the fact that you could just type "python" or "ruby" or whatever kind of oldskool stuff that Unix guy conjured up into, like, 99% of major Linux distributions.

As I only have a liveCD distro to work with atm, I was shocked really, when I opened my terminal, and to my utter surprise, when I type "python", sure enough, the interactive python interpreter comes up.

Things are looking great right now. Fedora 9 comes out in a few days, I'll use that as my development platform.

(Sorry in advance if I seem a bit too loosely connected to the subject of this thread.)

[...] corporate backing and large projects, so Python's closest competitors in popularity are Perl, Ruby, and probably PHP and Delphi. It will be interesting to see what the Tiobe rankings show after [...]

My principal tool of choice was Perl for several years; then it was Python for several years; Now, I am back to Perl.

I really do like the cleanliness of Python. Guido did an excellent job of making code look like pseudocode. This is a huge leap forward for programming.

Had it not been for the success of Python and its influence in fixing the ugliness of Perl code, I would have happily never written another line of Perl. But the Perl community took to heart the criticism of the programmers who fled to Python. If not for the aforementioned English, Perl Critic and, especially, Moose (which made Perl more like Python), I would have stayed with Python. Since the Perl community acknowledged this shortcoming and started devising solutions for it, I was more comfortable returning to Perl.

I also like CPAN as a repository better than anything similar offered by any language, period. CPAN does have its problems, but its wealth, as others have pointed out, is remarkable. Often projects become a simple stitching together of CPAN modules.

And for what I do, Python has two annoying quirks, both related to performance.

1) Threading. The global interpreter lock problem is a serious impediment. Once the task becomes CPU bound on one of the processors of your SMP box, you are stuck. You can, of course, incur an expensive fork, but then passing values between the separate processes becomes expensive. I am a little surprised that compute-intensive companies such as Google have not pushed the Python folks to overcome this limitation.

2) Performance. Object orientation comes at a price (~20% from some simple benchmarks). Perhaps some things, such as strings, should (optionally?) not be objects.

I suspect that the Python community will figure these out ... and not take as long as the Perl community's release of Perl 6 ... and when they do, I will probably be back to Python.

xtopher: (1) Google has such big computations that they have to split them across multiple processes (and machines), anyway. (2) Cite, please.

Now that Google has released the Google Apps framework which is authored in Python, do you think that Python will become a more popular choice than Ruby?

Did you program SpamSieve in Python? I've been a SpamSieve user for many years (and love it).

Brian: Certainly, it can only help Python's popularity; however, I was under the impression that Python was already more widely used than Ruby. SpamSieve is written in Objective-C, but I use Python in EagleFiler and DropDMG, as well as during the development process for all my apps.

Hi,
Very interesting posts! I have been using perl for many years now, and have also been involved with much Java and some .Net. I have toyed with Ruby and Python, and both, while "ok", just didn't do it for me.

I agree with several other posters, though, that good, readable code is "documented code". With any code, it is going to be difficult for another programmer to know exactly what the previous programmer wrote and why. Many times now I have revisited my old code, only to find that I hadn't documented it. Other times, I have found documented code in whatever language, and it is sooo much easier. Looking at other folks' code (ahem, Java developers, listen up), I find the lack of try...catch... disturbing, and the constant drone in logs about java.nullpointer...blah...blah..blah completely disgusting. Test your variables please; it will make all of us very happy!

But then again, Perl's syntax doesn't bother me at all, and I can usually read right through it (it's kind of like a warm blanket ;-).

It also goes to point out that I agree (with whoever now ;-), that knowing multiple languages is invaluable. Those that only bother with one are destined to be the guy that gets laid off when the company moves on without them. As someone who has worked in everything ranging from Windows/VB -> Solaris/Java -> Linux/PHP/Perl, I have only found my skills getting stronger with each new task/project.

Much of any language is going to be planning, and figuring out what works, does so consistently, and provides benefits over other languages.

Oh, I also agree that Perl is the swiss army knife hell. I have never found a language so powerful, and so "suited" to UNIX. Oh, and for the old guy: I agree that sed/awk/etc. were fairly easy, but man, what a pain in the arse, and to boot, I had so many of those darn things running around, I thought I was going to lose my mind ;-).

Regarding maintainability.

Standard and defacto standard ways of doing common tasks make for better maintainability.

I argue that this:
$t=~s/\s+//g;

is more maintainable than:

t.removeAllWhiteSpaces()

Because real perl programmers would know exactly what the first line did without having to look at some other function/method, and if they understand the rest of the code, they might know instantly whether that line is doing the right thing or not (otherwise hope the comments/docs are good ;) ).

Whereas the second requires you to check what the removeAllWhiteSpaces method is actually doing, and if it isn't doing it, then you have the problem of figuring out whether that behaviour is correct or not (better hope the comments/docs are good ;) ).

So, to me, the first way of doing it is more maintainable.
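
For comparison, here is a minimal Python sketch of the same operation, offered as an illustration rather than as anyone's production code. It uses the standard library's `re.sub` with the same `\s+` pattern as the Perl line. The function name `remove_all_whitespace` is hypothetical, standing in for the method in the second example.

```python
import re

def remove_all_whitespace(text):
    # Delete every run of whitespace, like the Perl substitution with /g.
    return re.sub(r"\s+", "", text)

print(remove_all_whitespace(" a b\tc\n"))  # prints "abc"
```

Wrapping the regex in a named function gives you both: the name for the reader in a hurry, and a one-line body that anyone who knows regexes can verify at a glance.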

For a similar reason something like CPAN is a great factor in maintainability.

Since there's often a _good_ CPAN module/library that does something common, it results in perl programmers using the same libraries, in understandable ways.

While Java also has lots of standard libraries, Java tends to be a lower level language (more lines of code to look through), and many of the libraries appear to be written to fulfill some spec rather than to accomplish common tasks more easily (whereas CPAN modules tend to be written to be used by mere mortals).

[...] http://mjtsai.com/blog/2002/11/25/perl_vs_python_v… I’m evaluating Python and Ruby as replacements for Perl. I’ve been using Perl for [...]

Michael:

Understand this:

1) We can write maintainable code in Perl too.
Python is naturally clean anyway.

Perl: 1, Python: 1

2) Perl is faster than Python.
Even using Psyco doesn't help (it offers at most a 4x speedup over the original version). Only a few people can play with Pyrex.

Perl: 2, Python: 1

3) mod_perl is better than mod_python.
We can tweak Apache better with Perl.

Perl: 3, Python: 1

4) DBI:x (no such facility in Python)

Perl: 4, Python: 1

5) Great template engines (equal)

Perl: 5, Python: 2

6) Good web framework
(Django is better than Catalyst)

Perl: 5, Python: 3

7) Optimized scientific modules (graph & math)
Python is better than Perl.

Perl: 5, Python: 4

8) Programming web services with ease (equal)

Perl: 6, Python: 5

So the two are nearly equal, with a bit of an advantage for Perl. Also, I can reiterate that we can write clean code with Perl too. Further, having $, @, % helps me know the datatype at any point in the program, rather than just a name as used in Python, Java, C++, etc.

Summary:
-------
For heavy calculation and graphics - Use Python
[SciPy and Wx are just great]

For standalone apps - Use Python

For web apps - Use mod_perl with Catalyst.
[Though Django is better than Catalyst (less time to process requests), Catalyst has a better ORM in DBIC]

Love u,
Sam

Wow. Seldom (I think this is the third time) do I comment on discussions/blogs/forums etc.

A real discussion on the internet with gentlemen (sorry if any of the posts were from ladies).

I realised early on in my career that I was no programmer (with fortran, pascal, c). I could program what everyone else could but it took me five times longer and so my career path took a different turn (config mngmt).
A few years ago I discovered Perl. The immediacy of the language really charmed me, as I could now write programs that made sense to me very quickly and with little effort. Things just worked! Since that day, I have been writing Perl scripts every week for my developers to parse information, sort files, manipulate text, and even build the odd GUI. I do this for them because I want them to get on and "develop" (using more traditional languages). My scripts are more often than not treated with a weird awe, as they can't understand how someone who isn't a "techie" can produce results so quickly... even stupid little scripts such as

open (UNSORTED,"file1.txt");
open (OUT,">file2.txt");
@orig_list=<UNSORTED>;
@sorted_list=sort(@orig_list);
print OUT @sorted_list;
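
For anyone curious how that little sorter looks in Python, here is a rough equivalent. It is only a sketch: the function name `sort_file_lines` is made up for illustration, and it sorts lines as plain strings, as the Perl `sort` above does by default.

```python
def sort_file_lines(src_path, dst_path):
    # Read every line of the input file into memory.
    with open(src_path) as src:
        lines = src.readlines()
    # Write the lines back out in sorted (string) order.
    with open(dst_path, "w") as dst:
        dst.writelines(sorted(lines))
```

Calling `sort_file_lines("file1.txt", "file2.txt")` would mirror the Perl snippet. The `with` blocks close both files automatically, and a missing input file raises an exception instead of silently producing an empty output.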

However, my scripts seem to reach critical mass at a certain number of lines (this could well be my lack of skill). After a few hundred lines the ability to quickly surmise what's going on becomes suddenly more challenging. I think the syntax also makes the program (at this point I no longer consider it a script) "look" like a garbled mess even though it's still as clear as its smaller cousins.

This appearance may be what makes some developers shy away from learning Perl (some of my developers I still consider lazy-minded).

I have recently started to dabble with Python and, thanks to this great article/discussion, will also look at Ruby. I like the "clean" look of Python and will consider it for larger scripts/programs, especially if other people are going to pick it up and run with it.

Modules, support, structure, community, and the internet/open source mentality are paving the way for great new computer ideas and languages, as well as changing opinion that programming is just for "geeks". However, given a small computing task that needs to be done *right now* with little fuss I'll probably always do it in Perl and probably will be doing until my fingers can no longer type.

Matt B.

Useless talks...
Perl is art like poetry.
Python is practical like recipe.
Ruby is bicycle version 2.0.

Perl is a very powerful sharp knife, with bazillions of ways to accomplish a task. Regex is Perl's greatest strength. Legibility near the level of line-noise is Perl's greatest weakness -- Perl developers who write legible code are the exception, not the norm.

Ruby is a pure O-O language that should appeal to Perl programmers. Ruby on Rails is a home run in that problem domain. Ruby follows the Perl idiom of bazillions of ways to accomplish the task, but only gajillions (which is much less than bazillions) of them are Ruby-esque. Ruby follows Perl with its integrated regex support. Ruby's author took the best of Perl and the best of Python to create Ruby (other developers' ideas of "best" may vary).

Python is a hybrid O-O language that is a good replacement for BASIC as an introductory tutorial language, yet has sufficient power to displace Perl for many tasks in the Perl domain. Perl fans may find Python's non-Perl-ness off-putting. Legibility fans may find Python's emphasis on fun and legibility to be a major plus, along with its emphasis on "only one way to do something" (the Pythonic way). The blocks-by-indentation may be extremely distasteful to curly-brace lovers. Regex is not part of the core language, but is part of the Python standard library, which means that the core language is not intimately regex-ified (this may be a plus or a minus, depending on the programmer and the problem domain).

I blogged about this same topic... to me Ruby comes out way ahead in almost all cases... I won't repeat all the reasons here, but you can read my post if you like at http://www.strangeblueplanet.com/2008/10/perl-python-ruby.html

SBP Editor: It sounds like you have some misconceptions about how Python works.

I'm trying to create this blog on programming in different languages. Stop on by and check it out! :)

Perl has been, is, and remains the most powerful piece of software.
Now with Parrot, no language can compete with Perl!

I happen to love perl. I know your input data was very specific. You didn't have to worry about embedded delimiters, records spanning multiple lines, and escaped quotes. On the other hand, it doesn't take any additional effort to write code that can be future proofed against data that would otherwise break.

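#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;

# Positions of the fields inside each [EMAIL, SKUTITLE, CONTACTME] record
use constant {SKU_TITLE => 1, CONTACT_ME => 2};

my $file = $ARGV[0] or die "Usage: $0 <file>\n";
open(my $fh, '<', $file) or die "Cannot open $file: $!";
my $csv = Text::CSV->new({sep_char => "\t"});
$csv->column_names($csv->getline($fh));

# Column names follow the article (SKUTITLE, CONTACTME, EMAIL)
my @contactRecords;
while (my $rec = $csv->getline_hr($fh)) {
    push @contactRecords, [$rec->{EMAIL}, $rec->{SKUTITLE}, $rec->{CONTACTME}];
}

# Sort by title, then keep only the buyers who asked to be contacted
@contactRecords = grep $_->[CONTACT_ME], sort {$a->[SKU_TITLE] cmp $b->[SKU_TITLE]} @contactRecords;

for my $rec (@contactRecords) {
    $csv->combine(@$rec);
    print $csv->string, "\n";
}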
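
The same parser-based idea carries over to Python's standard `csv` module. What follows is only a sketch under a few assumptions: the first line is a header row, the column names are SKUTITLE, CONTACTME and EMAIL as in the article, and `contact_records` is a made-up helper name.

```python
import csv

def contact_records(lines):
    # DictReader takes the column names from the first row and handles
    # quoted fields, including ones with embedded tabs or newlines.
    reader = csv.DictReader(lines, delimiter="\t")
    wanted = [
        [row["SKUTITLE"], row["CONTACTME"], row["EMAIL"]]
        for row in reader
        if row["CONTACTME"] == "1"
    ]
    # Sort the surviving records by product title.
    return sorted(wanted, key=lambda rec: rec[0])
```

Passing `sys.stdin` (or any open file) as `lines` gives the same behaviour as the script above. Filtering before sorting just means fewer records to sort; the result is the same either way.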

Hi All,
Really interesting thread, I learned a lot...Thanks!
I'd like to add my 2p to the discussion.

I'm a biologist who sometimes has to analyze large datasets so a few months ago I convinced myself to learn a scripting language. Like many others, I had to face the biblical dilemma perl vs python (I hardly knew about the existence of Ruby at that time).
I decided to give it a go with perl for two good reasons 1) Perl is the English of bioinformatics 2) Lucky coincidence, some people in our department started a reading group of Tisdall's "Beginning perl for bioinformatics" which I promptly joined (good book by the way).

Well, after having gone through all the chapters and done many of the exercises, I realized that... Perl and my brain go in different directions. I often found Perl to be counterintuitive (the concept of a default variable appeared mysterious and unnecessary to me, and the look of $_ didn't help; the same goes for reference/dereference). And this is to say nothing about the syntax.

Then I tried Python and that was another story. I learned the basics quite quickly and soon I could write useful scripts to make my life easier. I much prefer Python's 'There must be one obvious way of doing it' over Perl's 'There is more than one way...'.

I think that in the decision to learn perl or python one should consider also how intensively s/he is going to use them. As I said, I'm not a full time programmer so for me an intuitive and english-like style is a big plus since I might not touch a script for months.
I could make sense of my Python scripts even weeks after I wrote them. With Perl, I forgot what I was doing even after a coffee break.

I'd like to finish with a couple of questions. Does anyone have an idea of the number of Perl vs Python users, roughly? How is this trend changing (is Python gaining or losing popularity relative to Perl)? And what is the situation in the field of bioinformatics?

Bye

to all above: nice discussion.

to ddber: I'm also a biologist, but feel comfortable (actually, couldn't be more so) with Perl. You simply cannot survive without regular expressions (IMHO, the real Swiss army knife) at your back when you are thrown into the tremendous desert of biological text data.

If you want an intuitive tool (in these programmers' words, object-oriented style) to deal with large datasets in biology, why don't you just try R (www.r-project.com), the statistical software package with the features they mentioned above for Python, like classes, functions, and English-like style, along with even more powerful extension packages designed for specific biological analyses?

Being a biologist, I have found that Perl+R+MySQL can do almost everything one can encounter.

I've been learning Python for a few months now, and it's surprisingly easy to learn. It seems very common sense to somebody coming from another language. It takes what other languages accomplish, does it faster, with less code, and makes it fun to write.

When I think of Ruby I think of:

if user.is.not.logged.in
create.new.table.in.the.database
end

As a programmer, it's easier to remember how to do things across languages if they at least appear similar, instead of reinventing the wheel for the sake of making your code appear to have the same exact lexical value as English, at the expense of the language's speed, and usability.

""Perl have been, is and remains the most powerful piece of software.
Now with Parrot, no language can compete with Perl!""
NO
Code that works and is easy to understand is powerful. Programming is complex; why would I use a language that's hard to read and understand? That just seems stupid.
------------learn----------------------------
------------and-----------------------------
-------------USE---------------------------
-------------PYTHON------------------------

Python may or may not be the language that brings about a Computer-Programming-for-Everybody world. But it is currently the best contender. When there is a better horse, I'll switch my bet.

Perl may be cryptic.... yes, especially because of the sigils thing. Always changing! And the pass by reference...

But if you write "noisy" code, then COMMENT... dammit

In my case (and in my organization) we comment everything that is not clear enough at first sight.

Example.

foreach my $r ( @records )
{
push(@contactRecords, [$$r[$SKUTITLE], $$r[$CONTACTME], $$r[$EMAIL]]);
}

....
....
....
hundreds of lines of code here
....
....
....
foreach my $r ( @contactRecords )
{
my @rec = @{$r}; #--- contactRecords is an array of records
print join("\t", @rec), "\n";
}

Of course I KNOW that every Perl programmer will assume that @contactRecords is like that... But many times, we have non-Perl programmers looking at the code (we use many languages, from Java, Visual Basic, and ASP to C and Perl), and many systems interact here, each part written by different companies with different language choices.

So, the solution is: REGARDLESS OF THE LANGUAGE, DO COMMENT WHEN IT IS (even barely) WORTH IT.

BTW, the comment "#--- contactRecords is an...." is just an example... it could have been anything; of course, it just has to be meaningful in the context.. Anyway

I found Python to be a good choice also. Having to read a lot of other programmers' code, having them forced to indent correctly seems like a nice feature.

The only thing I hate about Perl 5 is the lack of a nicer OOP syntax.

BTW, excellent discussion you have here :)

Hello friends, I am looking for a good in-depth Perl tutorial. I found a couple of tutorials through Google, but they are not good enough. I want a tutorial from a job perspective, or if anyone has personal notes they can share with me... please send them to me, buddy.

Please, friends, help me. I don't even have enough money to buy books or to join any classes, but I promise I'll pay you back once I get a job...

For God's sake, help me.

God bless you

Just another Programming Language Designer

Try http://perldoc.perl.org/ then click on "Tutorials". There are many tutorials about specific sub-topics in Perl, but maybe you had better try http://perldoc.perl.org/perlintro.html first. The Perl Intro is a short introduction to Perl and complete enough to get you started in earnest.

Well, the best easy book for learning Perl I've ever read is Perl and CGI for the World Wide Web: Visual QuickStart Guide, by Liz Castro. It is really cheap, and short. Really a quick start. It is not an in-depth manual. For an in-depth view, just check the documentation of Perl itself, available through the console, or just google PERLDOC PERL.

This can be achieved in a SINGLE line of bash shell scripting:

cat inputfile | awk -F'\t' '$27 == "\"1\""' | sort -t$'\t' -k34,34 | awk -F'\t' -v OFS='\t' '{print $34, $17}'

Assumptions:
1) inputfile is the input data file
2) As shown in an example above, EMAIL is the 17th field, CONTACTME is the 27th field and SKUTITLE is the 34th field.
3) Fields are separated by tabs and each field is wrapped in quotation marks, as described in the article.

The power of Python and Perl is not put to proper use when you don't do the kind of stuff those languages are strong at.

You cannot compare perl and python with such examples/problems.

Generally perl is preferred over python MAINLY because of the advantages perl modules offer over that of python.

If the modules are equivalent then python is preferred because its programming style helps the programmer avoid much documentation.
Good perl code needs to be documented very heavily in order to be readable over time. Python needs it as well, but its syntax makes the program code more obvious than its perl counterpart.

Let me tell about my experience with perl.
We are quite a big project (well, more than 100 requests per second).
What we do to avoid complex constructions is we have our internal rules which developers should follow. For example, we don't write ifs on the same line like:
print if $something_happened;
we use
if ($something_happened) {
print;
}
instead. That improves readability when you try to follow the code.
Another thing we do is review all commits to follow those rules. (We also review other possible problems in the code. All developers receive those updates).

So there's no readability problem in our project. Even if you are concerned by readability, perl is very customizable. I even think it is possible to convert perl to any language without even changing to other interpreter :)

What we really like in Perl is the huge CPAN archive. It really helps save time that would otherwise be spent on programming quite simple things. For example, date conversion, file format detection, server process handling, protocol handling and many, many more...

This is in response to a question of Perl vs Python in the field of bioinformatics. There is no question about it. If there were no Perl, biologists would be up to their a***s in C code or something even more horrible. Perl finishes the race and Python is not even out of the starting blocks. Although the thread has argued long and hard over readability of code, I believe this finally boils down to individual style. It's possible to write unreadable code in any language. It may be a bit more true for Perl, as its developers have candidly confessed: "more than enough rope to hang yourself". But my point is, why should you go and hang yourself at all? Perl provides absolutely simple ways to document your code. So my advice to users (of any language) is to learn to document your code if you are concerned with readability. My own Perl code is quite readable and usable even after years and years. I wouldn't touch Python with a long pole if I had to; I'm now completely addicted to the Perl way of things and beyond hope. As someone in the thread pointed out ... Perl is like art, and I agree. While Python allows you to do a job, Perl allows you to do it "and" dream.

peace

Here's another way to do it, using a CPAN module called Sort::Fields. I've deliberately tried not to be too terse.

#!/usr/bin/perl

use strict;
use Sort::Fields;

# you can use these as comments
#my $EMAIL = 17;
#my $CONTACTME = 27;
#my $SKUTITLE = 34;

my @selectedRecords;
while (<STDIN>) {
    chomp;
    my @array = split (/\t/);
    # fields are still wrapped in quotes here, so compare against "1" with quotes
    next if $array[27] ne "\"1\"";
    push (@selectedRecords,"$array[17]\t$array[27]\t$array[34]");
}

print map { "$_\n" } fieldsort '\t' , [3], @selectedRecords;

I realize this is a 7-year-old weblog (blog) post, but I wonder what Michael Tsai is using as a programming language when dealing with text now.

BTW. the website indeed has a list of how many perl jobs, php jobs, ruby jobs and python jobs are available. They started the tracking in 2005 and it is current to date. I took the liberty to state the jobs greatest to smallest for folks who may not check this out for themselves.

I do a lot of Perl maintenance programming and may run perltidy on some programs to clean them up a bit, but Perl continues to be a beautiful girl to me.

Alpha Monk: I’m using Python, and I’m very happy with it.

Recently I was required to learn Visual Basic (and VBA) for a project. The Microsoft world is a nightmare. It is the hell where condemned programmers go for their sins. I am back to my beloved Perl and I will never sin again.

P.S. Python sounds like a 70's joke and Ruby sounds jewish. :->

I don't understand how people can like Ruby. It's clean compared to Perl, or Java, etc. but it doesn't have a recognizable syntax. Too many @ signs, etc. It's like going from learning Italian, then Spanish, then Portuguese, and then Arabic.

The thing I like about Perl over Ruby is the fact that Perl's functions are very versatile. Ruby has a truckload more functions to do things, which means you have to remember them, whereas Perl has a small set of functions that can be manipulated to do all kinds of things. And this idiom is used throughout Perl. Once you get the hang of these idioms, you can use them everywhere.

Kind of reminds me of PHP where there are a lot of functions that serve almost the same purpose. Why have 50 almost-the-same regexp functions, then another 50 for case insensitive operations?

My 2c.

Only one or two people have picked up on this, and not as a major point, but when it comes to maintenance, it is often not the person who wrote the code who has to do it.

As such, the best language to use is the one that's most likely to be known by the maintainer. Since most of the time you don't know who this will be, I'd always go for Perl, as it's the most widely known scripting language, even if it gives you the most rope to hang yourself.

My 2p.

Thanks for the comparison, I prefer Python over PERL for complicated programs.

Python tutorial has moved to a new URL : http://www.python.org/doc/current/tutorial/index.html

If Python 'has only one way to do it', how come we've got dozens of variants above?

It seems very ignorant/arrogant to choose a language mostly based on how much one can understand without knowing it. Although that is somewhat interesting, it only matters in the very beginning.

'Which language can teach me more' is at least as important as 'which language fits my brain'.

So, not only learn these, but Lisps, Haskell, TCL, etc.. and yes, regex.

wow.

this has got to be the most surreal thread I've ever read.

I felt like I just went through a 7-year long discussion, and never did it stray away from its initial focus (of Python, Perl, Ruby). Also, the absence of language-religious fanatics or dumb/flames is a rare find nowadays in the year 2009.

I never really did proper learning for programming, only bits of C or PHP here and there, and as such usually write horrid code. Recently a friend asked me if I could help him write a script for manipulating a dataset in text-file for his research project.

Remembering someone mentioning Perl being very well suited for such purposes (ie bioinformatics) I went online for a quick tutorial and very quickly came up with the script which did the job and had fun doing it =)

Now, I am actually planning a little pet project for home automation, and am torn between Perl and Python (thus I'm here; no Ruby tho, my brain just refuses to comprehend it no matter how much I read). As mentioned before, Perl is very, very attractive, as the components I'm going to use are incidentally already written in Perl, yet I am afraid that if the project gets large, Perl might get very messy.

Tho after spending a really long time going through this page, I think I'm settling on Perl. Even though I only knew it for days, it feels, weirdly comforting and nice (ref. to the warm blanket above).

AWK was the best suited then, now and in future for the task mentioned above. Please refer to the comment made by Anonymous on January 27, 2009 12:31 AM to see the simplicity of Awk.

I know this thread is old, but I couldn't resist posting my Perl version. I've been programming Perl for more than 10 years now, and I have to admit that I really love it.

I have my own conventions when I program Perl so I won't get confused when I read the code later. One of the things that I do is to use a lot of hashes.

-------------
use strict;

my @contactRecords;

my $fields_hash ={EMAIL=>17, CONTACTME =>27, SKUTITLE=>34};
my $fields_array =[qw/EMAIL CONTACTME SKUTITLE/];

while (my $line = <STDIN>) {
    chomp $line;
    $line =~ s/"//g;    # strip the quotes wrapped around every field
    my @record = split /\t/, $line;
    if ($record[$fields_hash->{CONTACTME}] == 1) {
        my $r = {};
        foreach my $k (@$fields_array) { $r->{$k} = $record[$fields_hash->{$k}]; }
        push @contactRecords, $r;
    }
}

foreach my $r (@contactRecords){
my $sep="";
foreach my $k (@$fields_array){
print $sep. $r->{$k};
$sep="\t";
}
print "\n";
}
-------------

Christopher wrote:
"No, no. You don't understand. Its PERL. Its an acronym."
Um, it's not. It's a "backronym".

Edward C wrote:

> Even though I only knew it for days, it feels, weirdly comforting and nice (ref. to the warm blanket above).

Yes. That right there I think is one of the keys to Perl's success. Larry and Company have gone through great pains -- years of polishing and fine tuning -- to get Perl to have that well-worn comfortable feel.

My hunch/hope is that they're doing/have-done the same thing with Perl 6, but I haven't learned Perl 6 yet.

Generators! You must learn the power of generators in python.

Here is my preferred solution:
http://paste.lisp.org/display/90009

Also, read PEP 8; in Python we eschew the use of camelCase. Leave that to the Java peeps, since they can't get enough widgetWhichProducesBSFactory. I like that with mine it is clearer which field we are sorting and filtering on.
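
To give a flavour of the generator style without following the link, here is a minimal sketch. It is not the code from the paste above, just an illustration; the function names and the idea of passing the field positions in as arguments are assumptions made for the example.

```python
def parse(lines):
    # Lazily split each tab-delimited line and strip the surrounding quotes.
    for line in lines:
        yield [field.strip('"') for field in line.rstrip("\n").split("\t")]

def wants_contact(records, contact_index):
    # Lazily keep only the records whose CONTACTME field is "1".
    return (rec for rec in records if rec[contact_index] == "1")

def sorted_contacts(lines, sku_index, contact_index, email_index):
    picked = wants_contact(parse(lines), contact_index)
    # sorted() is the one step that has to see every record before it can return.
    ordered = sorted(picked, key=lambda rec: rec[sku_index])
    return [(rec[sku_index], rec[email_index]) for rec in ordered]
```

Each stage pulls one record at a time from the stage before it, so nothing is held in memory until the sort. The named index arguments also make it obvious which field is being filtered on and which is being sorted on.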

Awesome thread. I have been in management a long time but LOVE programming. I was a mainframe ALC/COBOL programmer for years before venturing into JAVA/C++ a little then was "promoted" (lol) to management. I miss programming.

The reason I found this is that I was asked to help with something in my company which may require file/text manipulation for a conversion of data from one system to another.

I was very proficient with REXX (if anyone knows what that is), which I loved. Seeing that it's no longer relevant, I did a search on interpreted languages for file/text manipulation and eventually found this thread.

Many thanks to the person who started this topic and to EVERYONE for all the replies. As a result, I will go with Perl (I dabbled in it a LONG time ago) and will play with the others for fun.

Wow, nice article. I like Python better than Ruby better than Perl. I simply like expressive, clean languages.

John said that REXX is no longer relevant. I am not sure that is true :-) REXX continues to be distributed with IBM systems and there has been an ANSI standard for REXX since 1996. There are a number of open source REXX's available including several object-oriented REXX's. All are good implementations.

I used REXX for a long time, but use Python almost exclusively now. I never liked Perl. Why in the WORLD did someone come up with the idea in the late 1980s that there had to be TWO sets of comparison operators, one for numbers and one for strings? That looks like SUCH a "lazy compiler/interpreter writer" thing that one has to ask WTF? I mean even lowly BASIC didn't need that kludge. The first time I encountered that, I pretty much figured that Perl was a poorly designed language hack, even if it did make some things "easy to do".
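
To make the contrast concrete, here is a tiny Python sketch (an illustration only, with a made-up helper name). Perl scalars convert freely between strings and numbers, so the operator has to say which comparison is meant (`<` vs `lt`, `==` vs `eq`). In Python the operand types decide, so one set of operators is enough.

```python
def compare(a, b):
    # One operator; the operand types decide numeric vs. string ordering.
    return a < b

print(compare(10, 9))      # numeric comparison: False
print(compare("10", "9"))  # string comparison, character by character: True
```

The flip side is that Python 3 refuses to compare a number with a string at all, so the conversion has to be written out explicitly.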

Read Eric Raymond's 2000 article Why Python? to hear from someone who loved Perl, and who is well respected in the industry as a user of programming languages, why Python is definitively a "better language".

BTW, I write/have written in MANY languages over the years, including like John, in ALC for the 370, Cobol, PL/I, C, C++, Objective-C (before it was used on the Mac :-), Java, REXX, Python, Ruby, SQL, etc. There are many languages, and there is no one "perfect" language. The good languages each have a problem domain (or several) that they are best suited for.

However, there are "poor" languages, languages that make things difficult or obscure or are just poorly designed. At the risk of LOTS of flames, I consider Perl to be one of these, ditto for PHP. This obviously ignores the fact that both are used by MANY MANY people, but that doesn't make them "better" by any means. Acceptance is as much an accident of history as it is a matter of goodness. Python, Perl and indeed PHP have many similar problem domains, and there is no question, that if I am in one of those domains, I prefer Python every time.

Jon
-------------------------------
"The difference between theory and reality is that in theory, there is no difference, but in reality, there is."

Great stuff all around. I am a Perl guy and just wanted to add my notes and clarify a few points, especially about the 200-line 'mental limit' for Perl scripts.

I have read Eric Raymond's discussion on why he prefers Python. I think he has good points, but he wrote those comments in 2000. The Perl community has come a long way. The best rejoinder is probably Perl Best Practices, by Damian Conway. But always remember: You can write Fortran in any language. :)

As for scalability of Perl scripts, I have heard that Perl programs get quite difficult to maintain after they get much larger than about 200 lines. I agree with that - maintaining a sense of context becomes mind-bending. My rule of thumb is that if my script gets much longer than 200 lines, it probably has some ideas that I can (and should) abstract into modules.

I recently ran into this very problem with a really simple simulator for my research. The program reached maybe 500 or 600 lines before I finally refactored it into many modules. Now I am back to writing 15 line scripts using my newly created libraries. It was painful but very enlightening to refactor and now I have an extensible (extendable) library that I can use for my research that should carry me through my thesis and quite possibly beyond.

By the way, you might think that my collection of modules should have the same 200-line mental limitations, but I find that I can work with a module with many thousands of lines of code and it doesn't bother me. This is because the modules tend to focus on a much more clearly defined target, which makes the context much narrower. Also, I tend to be much stricter with my libraries than with my scripts, especially when it comes to documentation.

Is this 200+ line mental limit on Perl scripts a feature or a bug? Obviously it depends on what you're trying to do, but for me, it has turned out to be a great feature.

P.S. Just to be sure everybody is clear: Perl can do serious computational stuff, at the computational level of C/Fortran. It's called PDL. I do all sorts of numerical stuff with PDL as a grad student in physics, doing (single-core) simulations at the moment. I linked to PDL's web page, and while it looks old, the modules are still in active maintenance and development. And the mailing list is awfully nice, too!

I started with Perl, then learned some Java and C/C++. Surrounded by many Pythonistas, I fully intended to make the jump to Python, but it felt awkward to me. Later I tried Ruby and very quickly fell in love. I'd been writing Perl for many years, but I found I could accomplish the same tasks in much cleaner ways with far fewer mistakes coding in Ruby. I've had to use Python (matplotlib and pymol scripting in particular) for some projects since then; it is a great language with fantastic libraries. Ruby and Python are far more similar than they are different. For me, Ruby, despite her few warts, flows in a way that I haven't been able to duplicate in Python. I guess I'd rather write my own library/extension in/for Ruby (which one has to do less frequently these days) than use an existing solution in Python or Perl. I do think the crosstalk between the languages is a good thing and appreciate all the great code that comes out of each camp. Interesting blog and comments.

Really an interesting discussion. Michel started with Perl but settled on (and is happy with) Python. Readability definitely helps a programmer, or even a non-programmer, who wants to learn and improve the code but doesn't have the technical competence in that language. Perhaps in that sense Python scores better. Python's >>> prompt is where we can check the code interactively. OK, now a different kind of problem. Suppose we have to automate the server configuration process. Say we have to modify /etc/inittab and change the default runlevel to 3. In that case, which scores better? Perl, Python or Ruby ... or a shell script, as the old Unix guy says?
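As one possible answer to the inittab question, here is a minimal Python sketch. It assumes the classic SysV layout in which the default runlevel lives on a line like id:5:initdefault:. The function name set_default_runlevel is made up for illustration, and it only transforms a string; reading and writing /etc/inittab itself (as root, after taking a backup) is left out on purpose.

```python
import re

# Matches an uncommented "id:runlevels:initdefault:" entry (SysV inittab layout assumed).
_INITDEFAULT = re.compile(r"^(\w+):\d*:(initdefault:.*)$", re.MULTILINE)

def set_default_runlevel(inittab_text: str, level: int) -> str:
    """Return inittab_text with the initdefault entry switched to `level`.

    Hypothetical helper for illustration; other lines are left untouched.
    """
    return _INITDEFAULT.sub(
        lambda m: f"{m.group(1)}:{level}:{m.group(2)}", inittab_text
    )

sample = "# default runlevel\nid:5:initdefault:\nsi::sysinit:/etc/rc.d/rc.sysinit\n"
print(set_default_runlevel(sample, 3))
```

A shell one-liner with sed would be just as short for this job; the Python version mostly buys you something once the edit needs validation or error handling around it.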

I know this is OT, but seriously, gentlemen, IF you want to use console & friends, then please do so responsibly ;->

I say forget grep, awk and whatnot. sed has it all, if you can be sure the input is in a fixed format.

This is what you need:
wget -O - http://www.mjtsai.com/blog/files/2002-11-25-sample-input.txt 2>/dev/null | sed -ne '/^\("[^"]*"\t\)\{27\}"1"/s#^\("[^"]*"\t\)\{17\}"\([^"]*\)".*$#\2#p'

and I guess this is also as fast as you can get...
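For anyone who finds the sed expression hard to read, roughly the same filter can be sketched in Python. This assumes the zero-based column positions used by the Perl and Ruby snippets in this thread (EMAIL at 17, CONTACTME at 27) and that every field is wrapped in quotation marks; the function name contact_emails is invented for the example.

```python
EMAIL = 17       # zero-based column indexes, assumed from the other snippets in this thread
CONTACTME = 27

def contact_emails(lines):
    """Yield the EMAIL field of each line whose CONTACTME field is "1"."""
    for line in lines:
        # Split on tabs and drop the quotation marks around each field.
        fields = [f.strip('"') for f in line.rstrip("\r\n").split("\t")]
        if len(fields) > CONTACTME and fields[CONTACTME] == "1":
            yield fields[EMAIL]
```

To use it as a filter, import sys and loop with for email in contact_emails(sys.stdin): print(email). It will almost certainly be slower than sed, but the column numbers are named instead of being buried in repetition counts.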

Jon:
"Why in the WORLD did someone come up with the idea in the late 1980s that there had to be TWO sets of comparison operators, one for numbers and one for strings?"

Perl can't do all the work for you. If you want it to decide according to context whether "4.5" is a string or a number, you have to provide a context. I don't know about you, but I'd rather juggle two sets of operators than have to convert my numberish strings to numbers explicitly.
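To make the trade-off concrete, here is a small Python sketch of the other side of it: there is one set of comparison operators, so the programmer chooses the meaning by converting (or not converting) the strings. The two helper names are made up for illustration.

```python
def less_as_strings(a: str, b: str) -> bool:
    """Compare character by character, like Perl's lt."""
    return a < b

def less_as_numbers(a: str, b: str) -> bool:
    """Convert explicitly, then compare numerically, like Perl's <."""
    return float(a) < float(b)

print(less_as_strings("10", "9"))   # True: "1" sorts before "9"
print(less_as_numbers("10", "9"))   # False: 10.0 is not less than 9.0
```

Either way the intent has to be stated somewhere: Perl puts it in the operator, Python puts it in the conversion.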

There is a comparison for Python, Perl, and Ruby as far as job demand and median salary goes at http://www.odinjobs.com/Odin/marketstatcompare?id=51907&q=python+vs+perl+vs+ruby

Paul Pomerleau

You can write ugly code in any language. If you choose to ignore the things which make perl (or again, any language) readable, easy to understand, easy to maintain, etc. then of course you can make things unpleasant.

But if the aim here is to accomplish a goal and speed is not a crucial factor, then the following perl code would be a better solution, as it is more self-documenting. It is more wasteful of space and slightly slower than some of the above examples, but unless you have a whole lot of records, that shouldn't matter.

For the input below

"fred@1.com" 0 Director of research
"fred@3.com" 1 Director of research
"fred@4.com" 1 Director of research
"fred@5.com" 1 Director of research
"fred@2.com" 1 Director of research

I recommend this as a solution:

#!/usr/bin/perl -w

use strict;

my @contactable_records;

while (<>) {
    if (/\S/) { # Skip blank lines
        my @fields;
        my %record;

        s/[\r\n\"]//g; # Strip line endings and quotation marks
        @fields = split("\t", $_);

        ($record{eMail}, $record{Contact_Me}, $record{SKU_Title}) =
            ($fields[0], $fields[1], $fields[2]);

        push (@contactable_records, \%record) if ($record{Contact_Me} == 1);
    }
}

foreach my $record ( sort { $a->{eMail} cmp $b->{eMail} } @contactable_records ) {
    print "$record->{eMail} $record->{Contact_Me} $record->{SKU_Title}\n";
}

You can certainly make a much shorter perl program than this, but you generally want something which makes sense when you look at it.
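For comparison, the same named-fields idea can be sketched in Python. This is only an illustration under the same assumptions as the Perl above: three tab-separated columns in the order e-mail, contact flag, title. The function name contactable_records and the dictionary keys are invented here, and the flag is compared as the string "1" rather than numerically.

```python
def contactable_records(lines):
    """Return records that asked to be contacted, sorted by e-mail address."""
    records = []
    for line in lines:
        if not line.strip():          # skip blank lines
            continue
        # Drop line endings and quotation marks, then split into the three columns.
        cleaned = line.rstrip("\r\n").replace('"', "")
        email, contact_me, sku_title = cleaned.split("\t")[:3]
        record = {"email": email, "contact_me": contact_me, "sku_title": sku_title}
        if record["contact_me"] == "1":
            records.append(record)
    return sorted(records, key=lambda r: r["email"])

sample = [
    '"fred@1.com"\t0\tDirector of research\n',
    '"fred@3.com"\t1\tDirector of research\n',
    '"fred@2.com"\t1\tDirector of research\n',
]
for r in contactable_records(sample):
    print(r["email"], r["contact_me"], r["sku_title"])
```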

I've only been programming in Perl for a few months, but...
wouldn't it be better to write the Perl like this?

my ($EMAIL,$CONTACTME,$SKUTITLE,@contactRecords)=(17,27,34);
push @contactRecords,[$_->[$SKUTITLE],$_->[$CONTACTME],$_->[$EMAIL]] for @records;

and not like this:
my $EMAIL = 17;
my $CONTACTME = 27;
my $SKUTITLE = 34;

my @contactRecords = ();
foreach my $r ( @records )
{
    push(@contactRecords, [$$r[$SKUTITLE], $$r[$CONTACTME], $$r[$EMAIL]]);
}

The comment thread clearly needs more code examples:

#!/usr/bin/env ruby
b = []
a = `curl -s http://www.mjtsai.com/blog/files/2002-11-25-sample-input.txt`
a.gsub!('"', '')
a.each_line {|l| b << l.split("\t").values_at(34, 27, 17)}
b[1..-1].sort!.each {|l| puts l.join("\t") if l[1] == "1"}

It is now nearly nine (9) years later. RubyGems has got to be close to rivaling CPAN in breadth, if not in depth (I'm referring to a comment up there ^^^ from 2006). All three languages have had 9 years of additional development and maturity.

What are you using these days to solve problems similar to the original one?

Thanks

>What are you using these days to solve problems similar to the original one?
grep & awk of course :D
or Perl if on topic...or maybe I should learn Ruby more hehe. Somehow I do not like Python...

The idea is to stay away from any version of BASIC, and everything will be just fine ;)

It's now 2050. Rupy replaced Ruby and Python and is now an RF-neural language. All you need is to be near a Rupy RF console, have a mental password and imagine the solution you need for a problem. In a few seconds a blue hologram shows you the code for approval.

(sorry. just kidding. it's amazing that this thread started in 2002!)