Interpreted Languages: PHP, Perl, Python, Ruby (Sheet Two)

a side-by-side reference sheet

libraries and modules
	php	perl	python	ruby
load library	require_once("foo.php");	require 'Foo.pm'; # or require Foo; # or use Foo;	import foo	require 'foo' # or require 'foo.rb'
reload library	require("foo.php");	do 'Foo.pm';	reload(foo)	load 'foo.rb'
library path	$o = ini_get("include_path"); $n = $o . ":/some/path"; ini_set("include_path", $n);	push @INC, "/some/path";	sys.path.append('/some/path')	$: << "/some/path"
library path environment variable	none	PERL5LIB	PYTHONPATH	RUBYLIB
library path command line option	none	-I	none	-I
main in library		unless (caller) { code }	if __name__ == '__main__': code	if $0 == __FILE__ code end
module declaration	namespace Foo;	package Foo; require Exporter; our @ISA = ("Exporter"); our @EXPORT_OK = qw(bar baz);	put declarations in foo.py	class Foo or module Foo
submodule declaration	namespace Foo\Bar;	package Foo::Bar;	create directory foo in library path containing file bar.py	module Foo::Bar or module Foo module Bar
module separator	\Foo\Bar\baz();	Foo::Bar::baz();	foo.bar.baz()	Foo::Bar.baz
import all definitions in module	none, but a long module name can be shortened	# imports symbols in @EXPORT: use Foo;	from foo import *	include Foo
import definitions	only class names can be imported	# bar and baz must be in # @EXPORT or @EXPORT_OK: use Foo qw(bar baz);	from foo import bar, baz	none
managing multiple installations			$ virtualenv -p /usr/bin/python foo $ source foo/bin/activate $ echo $VIRTUAL_ENV $ deactivate	$ ruby-build 1.9.3-p0 \ ~/.rbenv/versions/foo $ rbenv shell foo $ rbenv version $ rbenv shell system
list installed packages, install a package	$ pear list $ pear install Math_BigInteger	$ perldoc perllocal $ cpan -i Moose	$ pip freeze $ pip install jinja2	$ gem list $ gem install rails
package specification format			in setup.py: #!/usr/bin/env python from distutils.core import setup setup( name='foo', author='Joe Foo', version='1.0', description='a package', py_modules=['foo'])	in foo.gemspec: spec = Gem::Specification.new do |s| s.name = "foo" s.authors = "Joe Foo" s.version = "1.0" s.summary = "a gem" s.files = Dir["lib/*.rb"] end
objects
	php	perl	python	ruby
define class	class Int { public $value; function __construct($int=0) { $this->value = $int; } }	package Int; use Moose; has value => (is => 'rw', default => 0, isa => 'Int'); no Moose; 1;	class Int: def __init__(self, v=0): self.value = v	class Int attr_accessor :value def initialize(i=0) @value = i end end
create object	$i = new Int(); $i2 = new Int(7);	my $i = new Int(); # or my $i = Int->new(); my $i2 = new Int(value => 7);	i = Int() i2 = Int(7)	i = Int.new i2 = Int.new(7)
get and set attribute	$v = $i->value; $i->value = $v+1;	my $v = $i->value; $i->value($v+1);	v = i.value i.value = v+1	v = i.value i.value = v+1
instance variable accessibility	must be declared	private by default	public; attributes starting with underscore private by convention	private by default; use attr_reader, attr_writer, attr_accessor to make public
define method	function plus($i) { return $this->value + $i; }	# in package: sub plus { my $self = shift; $self->value + $_[0]; }	def plus(self,v): return self.value + v	def plus(i) value + i end
invoke method	$i->plus(7)	$i->plus(7)	i.plus(7)	i.plus(7)
destructor	function __destruct() { echo "bye, $this->value\n"; }	# in package: sub DEMOLISH { my $self = shift; my $v = $self->value; print "bye, $v\n"; }	def __del__(self): print('bye, %d' % self.value)	val = i.value ObjectSpace.define_finalizer(i) { puts "bye, #{val}" }
method missing	function __call($name, $args) { $argc = count($args); echo "no def: $name " . "arity: $argc\n"; }	# in package: our $AUTOLOAD; sub AUTOLOAD { my $self = shift; my $argc = scalar(@_); print "no def: $AUTOLOAD" . " arity: $argc\n"; }	def __getattr__(self, name): s = 'no def: '+name+' arity: %d' return lambda *a: print(s % len(a))	def method_missing(name, *a) puts "no def: #{name}" + " arity: #{a.size}" end
inheritance	class Counter extends Int { private static $instances = 0; function __construct($int=0) { Counter::$instances += 1; parent::__construct($int); } function incr() { $this->value++; } static function getInstances() { return self::$instances; } }	package Counter; use Moose; extends 'Int'; my $instances = 0; sub BUILD { $instances += 1; } sub incr { my $self = shift; my $v = $self->value; $self->value($v + 1); } sub instances { $instances; } no Moose;	class Counter(Int): instances = 0 def __init__(self, v=0): Counter.instances += 1 Int.__init__(self, v) def incr(self): self.value += 1	class Counter < Int @@instances = 0 def initialize(i=0) @@instances += 1 super end def incr self.value += 1 end def self.instances @@instances end end
invoke class method	Counter::getInstances()	Counter->instances();	Counter.instances	Counter.instances
operator overloading				class Fixnum def /(n) self.fdiv(n) end end
reflection
	php	perl	python	ruby
object id			id(o)	o.object_id
inspect type	gettype(array()) == "array" returns object for objects	ref([]) eq "ARRAY" returns empty string if argument not a reference; returns package name for objects	type([]) == list	[].class == Array
basic types	NULL boolean integer double string array object resource unknown type	SCALAR ARRAY HASH CODE REF GLOB LVALUE FORMAT IO VSTRING Regexp	NoneType bool int long float str SRE_Pattern datetime list array dict object file	NilClass TrueClass FalseClass Fixnum Bignum Float String Regexp Time Array Hash Object File
inspect class	returns FALSE if not an object: get_class($o) == "Foo"	ref($o) eq "Foo"	o.__class__ == Foo isinstance(o, Foo)	o.class == Foo o.instance_of?(Foo)
inspect class hierarchy	get_parent_class($o)		o.__class__.__bases__	o.class.superclass o.class.included_modules
has method?	method_exists($o, "reverse")	$o->can("reverse")	hasattr(o, 'reverse')	o.respond_to?("reverse")
message passing	for ($i = 1; $i <= 10; $i++) { call_user_func(array($o, "phone$i"), NULL); }	for $i (1..10) { $meth = "phone$i"; $o->$meth(undef); }	for i in range(1, 11): getattr(o, 'phone'+str(i))(None)	(1..10).each do |i| o.send("phone#{i}", nil) end
eval	eval evaluates to argument of return statement or NULL: while ($line = fgets(STDIN)) { echo eval($line) . "\n"; }	while(<>) { print ((eval), "\n"); }	argument of eval must be an expression: while True: print(eval(sys.stdin.readline()))	loop do puts eval(gets) end
inspect methods	get_class_methods($o)	$class = ref($o); keys eval "%${class}::";	[m for m in dir(o) if callable(getattr(o,m))]	o.methods
inspect attributes	get_object_vars($o)	keys %$o;	dir(o)	o.instance_variables
pretty print	$d = array("lorem"=>1, "ipsum"=>array(2,3)); print_r($d);	use Data::Dumper; %d = (lorem=>1, ipsum=>[2, 3]); print Dumper(\%d);	import pprint d = {'lorem':1, 'ipsum':[2,3]} pprint.PrettyPrinter().pprint(d)	require 'pp' d = {"lorem"=>1, "ipsum"=>[2,3]} pp d
source line number and file name	__LINE__ __FILE__	__LINE__ __FILE__	import inspect cf = inspect.currentframe() cf.f_lineno cf.f_code.co_filename	__LINE__ __FILE__
web
	php	perl	python	ruby
http get	$url = 'http://www.google.com'; $s = file_get_contents($url);	use LWP::UserAgent; $url = 'http://www.google.com'; $r = HTTP::Request->new(GET=>$url); $ua = LWP::UserAgent->new; $resp = $ua->request($r); my $s = $resp->content();	import httplib url = 'www.google.com' f = httplib.HTTPConnection(url) f.request("GET",'/') s = f.getresponse().read()	require 'net/http' url = 'www.google.com' r = Net::HTTP.start(url, 80) do |f| f.get('/') end s = r.body
url encode/decode	urlencode("lorem ipsum") urldecode("lorem+ipsum")	use CGI; CGI::escape('lorem ipsum') CGI::unescape('lorem%20ipsum')	# Python 3 location: urllib.parse import urllib urllib.quote_plus("lorem ipsum") urllib.unquote_plus("lorem+ipsum")	require 'cgi' CGI::escape('lorem ipsum') CGI::unescape('lorem+ipsum')
base64 encode	$s = file_get_contents('foo.png'); echo base64_encode($s);	use MIME::Base64; open my $f, '<', 'foo.png'; my $s = do { local $/; <$f> }; print encode_base64($s);	import base64 s = open('foo.png', 'rb').read() print(base64.b64encode(s))	require 'base64' s = File.open('foo.png').read puts Base64.strict_encode64(s)
json	$a = array('t'=>1, 'f'=>0); $s = json_encode($a); $d = json_decode($s, TRUE);	# cpan -i JSON use JSON; $raw = { t => 1, f => 0 }; $json = JSON->new->allow_nonref; $s = $json->encode($raw); $d = $json->decode($s);	import json s = json.dumps({'t':1, 'f':0}) d = json.loads(s)	Ruby 1.8: sudo gem install json require 'json' s = {'t'=> 1,'f'=> 0}.to_json d = JSON.parse(s)
build xml	$xml = '<a></a>'; $sxe = new SimpleXMLElement($xml); $sxe->addChild('b', 'foo'); echo $sxe->asXML();	# cpan -i XML::Writer use XML::Writer; my $writer = XML::Writer->new; $writer->startTag('a'); $writer->startTag('b'); $writer->characters('foo'); $writer->endTag('b'); $writer->endTag('a'); $writer->end;	import xml.etree.ElementTree as ET builder = ET.TreeBuilder() builder.start('a', {}) builder.start('b', {}) builder.data('foo') builder.end('b') builder.end('a') et = builder.close() print(ET.tostring(et))	# gem install builder require 'builder' builder = Builder::XmlMarkup.new xml = builder.a do |child| child.b("foo") end puts xml
parse xml	$xml = '<a><b>foo</b></a>'; $doc = simplexml_load_string($xml); foreach ($doc->children() as $c) { break; } echo $c;	# cpan -i XML::Twig use XML::Twig; my $t= XML::Twig->new(); my $xml = '<a><b>foo</b></a>'; $t->parse($xml); my $doc= $t->root; print $doc->first_child('b')->text;	from xml.etree import ElementTree xml = '<a><b>foo</b></a>' doc = ElementTree.fromstring(xml) print(doc[0].text)	require 'rexml/document' xml = '<a><b>foo</b></a>' doc = REXML::Document.new(xml) puts doc[0][0].text
xpath	$x = '<a><b><c>foo</c></b></a>'; $d = simplexml_load_string($x); $n = $d->xpath('/a/b/c'); echo $n[0];	# cpan -i XML::XPath use XML::XPath; my $x = '<a><b><c>foo</c></b></a>'; my $xp = XML::XPath->new(xml => $x); my $node = $xp->find('/a/b/c'); print $node->string_value();	from xml.etree import ElementTree xml = '<a><b><c>foo</c></b></a>' doc = ElementTree.fromstring(xml) node = doc.find("b/c") print(node.text)	require 'rexml/document' include REXML xml = '<a><b><c>foo</c></b></a>' doc = Document.new(xml) node = XPath.first(doc,'/a/b/c') puts node.text
tests
	php	perl	python	ruby
test class		# cpan -i Test::Class Test::More package FooTest; use Test::Class; use Test::More; use base qw(Test::Class); sub test_01 : Test { ok(1); } 1;	import unittest class FooTest(unittest.TestCase): def test_01(self): assert(True) if __name__ == '__main__': unittest.main()	require 'test/unit' class FooTest < Test::Unit::TestCase def test_01 assert(true) end end
run tests, run test method		$ cat FooTest.t use FooTest; Test::Class->runtests; $ perl ./FooTest.t	$ python foo_test.py $ python foo_test.py FooTest.test_01	$ ruby foo_test.rb $ ruby foo_test.rb -n test_01
equality assertion		my $s = "do re me"; is($s, "do re me");	s = 'do re me' self.assertEqual('do re me', s)	s = "do re me" assert_equal("do re me", s)
regex assertion		my $s = "lorem ipsum"; like($s, qr/lorem/);	s = 'lorem ipsum' # uses re.search, not re.match: self.assertRegexpMatches(s, 'lorem')	s = "lorem ipsum" assert_match(/lorem/, s)
exception assertion		use Test::Fatal; ok(exception { 1 / 0 });	a = [] with self.assertRaises(IndexError): a[0]	assert_raises(ZeroDivisionError) do 1 / 0 end
setup		# in class FooTest: sub make_fixture : Test(setup) { print "setting up"; };	# in class FooTest: def setUp(self): print('setting up')	# in class FooTest: def setup puts "setting up" end
teardown		# in class FooTest: sub teardown : Test(teardown) { print "tearing down"; };	# in class FooTest: def tearDown(self): print("tearing down")	# in class FooTest: def teardown puts "tearing down" end
debugging and profiling
	php	perl	python	ruby
check syntax	$ php -l foo.php	$ perl -c foo.pl	import py_compile # precompile to bytecode: py_compile.compile('foo.py')	$ ruby -c foo.rb
flags for stronger and strongest warnings	none	$ perl -w foo.pl $ perl -W foo.pl	$ python -t foo.py $ python -3t foo.py	$ ruby -w foo.rb $ ruby -W2 foo.rb
lint			$ sudo pip install pylint $ pylint foo.py
run debugger		$ perl -d foo.pl	$ python -m pdb foo.py	$ sudo gem install ruby-debug $ rdebug foo.rb
debugger commands		h l n s b c T ?? ?? p q	h l n s b c w u d p q	h l n s b c w u down p q
benchmark code		use Benchmark qw(:all); $t = timeit(1_000_000, '$i += 1;'); print timestr($t);	import timeit timeit.timeit('i += 1', 'i = 0', number=1000000)	require 'benchmark' n = 1_000_000 i = 0 puts Benchmark.measure { n.times { i += 1 } }
profile code		$ perl -d:DProf foo.pl $ dprofpp	$ python -m cProfile foo.py	$ sudo gem install ruby-prof $ ruby-prof foo.rb
java interoperation
	php	perl	python	ruby
version			Jython 2.5	JRuby 1.4
repl			$ jython	$ jirb
interpreter			$ jython	$ jruby
compiler			none in 2.5.1	$ jrubyc
prologue			import java	none
new			rnd = java.util.Random()	rnd = java.util.Random.new
method			rnd.nextFloat()	rnd.next_float
import			from java.util import Random rnd = Random()	java_import java.util.Random rnd = Random.new
non-bundled java libraries			import sys sys.path.append('path/to/mycode.jar') import MyClass	require 'path/to/mycode.jar'
shadowing avoidance			import java.io as javaio	module JavaIO include_package "java.io" end
convert native array to java array			import jarray jarray.array([1,2,3],'i')	[1,2,3].to_java(Java::int)
are java classes subclassable?			yes	yes
are java classes open?			no	yes

Library and Module Footnotes

How terminology is used in this sheet:

library: code in its own file that can be loaded by client code. For interpreted languages, loading a library means parsing the library into the intermediate representation used by the interpreter VM. It is of little use to load a library and not make its definitions available under names in the client code. Hence languages tend to import identifiers defined in the library automatically when the library is loaded.
module: a set of names that can be imported as a unit. Importing an identifier means adding it to a scope. Importing a module means adding all the identifiers in the module to a scope.
package: a library that can be installed by a package manager.

A few notes:

According to our terminology, Perl and Java packages are modules, not packages.

PHP and C++ namespaces are another example of modules.

We prefer to reserve the term namespace for divisions of the set of names imposed by the parser. For example, the identifier foo in the Perl variables $foo and @foo belongs to different namespaces. Another example of namespaces in this sense is the Lisp-1 vs. Lisp-2 distinction: Scheme is a Lisp-1 and has a single namespace, whereas Common Lisp is a Lisp-2 and has multiple namespaces.

Some languages (e.g. Python, Java) impose a one-to-one mapping between libraries and modules. All the definitions for a module must be in a single file, and there are typically restrictions on how the file must be named and where it is located on the filesystem. Other languages allow the definitions for a module to be spread over multiple files or permit a file to contain multiple modules. Ruby and C++ are such languages.

load library

Execute the specified file. Normally this is used on a file which only contains declarations at the top level.

php:

include_once behaves like require_once except that it is not fatal if an error is encountered executing the library.

If it is desirable to reload the library even if it might already have been loaded, use require or include.

perl:

The last expression in a perl library must evaluate to true. When loading a library with use, the suffix of the file must be .pm.

The do directive will re-execute a library even if it has already been loaded.

reload library

How to reload a library. Altered definitions in the library will replace previous versions of the definition.
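
python:

A minimal sketch of reloading in Python 3, where importlib.reload takes the place of the built-in reload shown in the table. The module name reload_demo and the temporary directory are made up for illustration; the example writes a library, loads it, rewrites it, and reloads it.

```python
import importlib
import os
import sys
import tempfile

# Skip bytecode caching so the rewritten source is always re-read.
sys.dont_write_bytecode = True

# Create a throwaway library (hypothetical name: reload_demo).
lib_dir = tempfile.mkdtemp()
path = os.path.join(lib_dir, "reload_demo.py")
with open(path, "w") as f:
    f.write("def greet():\n    return 'hello'\n")

# Augment the library path, then load the library.
sys.path.append(lib_dir)
importlib.invalidate_caches()
import reload_demo

first = reload_demo.greet()

# Alter the definition on disk and reload; the new definition
# replaces the old one in the existing module object.
with open(path, "w") as f:
    f.write("def greet():\n    return 'goodbye, world'\n")
importlib.reload(reload_demo)

second = reload_demo.greet()
print(first, second)
```

Names imported earlier with from reload_demo import greet would still refer to the old function; only lookups through the module object see the new definition.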

library path

How to augment the library path by calling a function or manipulating a global variable.

library path environment variable

How to augment the library path by setting an environment variable before invoking the interpreter.

library path command line option

How to augment the library path by providing a command line option when invoking the interpreter.

main in library

How to put code in a library which executes when the file is run as a top-level script and not when the file is loaded as a library.
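
python:

A small sketch of the idiom from the table. The functions add and main are placeholders invented for the example.

```python
def add(x, y):
    """Library function that client code can import."""
    return x + y


def main():
    # Only meant to run when the file is executed as a script.
    print(add(2, 3))


# __name__ is '__main__' only when this file is the top-level script;
# when the file is imported, __name__ is the module name instead.
if __name__ == '__main__':
    main()
```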

module declaration

How to declare a section of code as belonging to a module.

submodule declaration

How to declare a section of code as belonging to a submodule.

module separator

The punctuation used to separate the labels in the full name of a submodule.

import all definitions in module

How to import all the definitions in a module.

import definitions

How to import specific definitions from a module.

managing multiple installations

How to manage multiple versions of the interpreter on the same machine; how to manage multiple versions of 3rd party libraries for the interpreter.

The examples show how to (1) create an installation, (2) enter the environment, (3) display the current environment, and (4) exit the environment.

While in the environment, executing the interpreter by its customary name will invoke the version of the interpreter specified when the environment was created. 3rd party libraries installed while in the environment will only be available to processes running in the environment.

python:

virtualenv

virtualenv can be downloaded and installed by running this in the virtualenv source directory:

sudo python setup.py install

When virtualenv is run it creates a bin directory with copies of the python executable, pip, and easy_install. When the activate script is sourced the bin directory is prepended to the PATH environment variable.

By default the activate script puts the name of the environment in the shell prompt variable PS1. A different name can be provided with the --prompt flag when virtualenv is run. To remove the name completely it is necessary to edit the activate script.

ruby:

To use rbenv, check out the code from GitHub and put it in your home directory at .rbenv. Edit your shell PATH so that ~/.rbenv/bin is in your search path. Also put the output of the following in your shell startup file:

rbenv init -

To create a new environment, build the desired version of ruby in ~/.rbenv/versions. This can be done from source or with ruby-build.

One can switch environments with rbenv shell, rbenv local, or rbenv global. When these commands are run the environment is recorded in the RBENV_VERSION shell environment variable, the file .rbenv-version in the current directory, and the file ~/.rbenv/version, respectively. This is also the precedence that is observed when determining which environment is in effect. Specifically, rbenv first checks RBENV_VERSION, and if that fails it looks in the current directory and each parent directory in turn for a .rbenv-version file, and if that fails it checks ~/.rbenv/version.
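
The lookup order can be sketched in Python as follows. This is an illustration of the precedence described above, not rbenv's own code (rbenv is implemented as shell scripts). The function name resolve_rbenv_version and its parameters are hypothetical, and the fallback to "system" when nothing is configured is an assumption.

```python
import os


def resolve_rbenv_version(environ, start_dir, home):
    """Return the version name chosen by the lookup order described above.

    Hypothetical helper for illustration only.
    """
    # 1. The RBENV_VERSION environment variable wins if it is set.
    version = environ.get("RBENV_VERSION")
    if version:
        return version

    # 2. Look for .rbenv-version in start_dir and then in each parent.
    directory = os.path.abspath(start_dir)
    while True:
        candidate = os.path.join(directory, ".rbenv-version")
        if os.path.isfile(candidate):
            with open(candidate) as f:
                return f.read().strip()
        parent = os.path.dirname(directory)
        if parent == directory:  # reached the filesystem root
            break
        directory = parent

    # 3. Fall back to the global setting in ~/.rbenv/version.
    global_file = os.path.join(home, ".rbenv", "version")
    if os.path.isfile(global_file):
        with open(global_file) as f:
            return f.read().strip()

    # Assumption: with nothing configured, the system ruby is used.
    return "system"
```

A call such as resolve_rbenv_version(os.environ, os.getcwd(), os.path.expanduser("~")) would report the environment in effect for the current shell and directory.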

rbenv puts ~/.rbenv/shims in the search PATH and creates executables in ~/.rbenv/shims to control which version of the ruby executables get invoked.

list installed packages, install a package

How to show the installed 3rd party packages, and how to install a new 3rd party package.

perl:

cpanm is an alternative to cpan which is said to be easier to use.

How to use cpan to install cpanm:

$ sudo cpan -i App::cpanminus

How to install a module with cpanm:

$ sudo cpanm Moose

python:

Two ways to list the installed modules and the modules in the standard library:

$ pydoc modules

$ python
>>> help('modules')

Most 3rd party Python code is packaged using distutils, which is in the Python standard library. The code is placed in a directory with a setup.py file. The code is installed by running the Python interpreter on setup.py:

$ sudo python setup.py install

package specification format

The format of the file used to specify a package.

python:

distutils.core reference

Here is an example of how to create a Python package using distutils. Suppose that the file foo.py contains the following code:

def add(x, y):
    return x+y

In the same directory as foo.py create setup.py with the following contents:

#!/usr/bin/env python

from distutils.core import setup

setup(name='foo',
      version='1.0',
      py_modules=['foo'],
     )

Create a tarball of the directory for distribution:

$ tar cf foo-1.0.tar foo
$ gzip foo-1.0.tar

To install from the tarball, perform the following:

$ tar xf foo-1.0.tar.gz
$ cd foo
$ sudo python setup.py install

If you want people to be able to install the package with pip, upload the tarball to the Python Package Index.

ruby:

gemspec attributes

For an example of how to create a gem, create a directory called foo. Inside it create a file called lib/foo.rb which contains:

def add(x, y)
  x + y
end

Then create a file called foo.gemspec containing:

spec = Gem::Specification.new do |s|
  s.name = 'foo'
  s.authors = 'Joe Foo'
  s.version = '1.0'
  s.summary = 'a gem'
  s.files = Dir['lib/*.rb']
end

To create the gem, run this command:

$ gem build foo.gemspec

A file called foo-1.0.gem is created. To install the gem, run this command:

$ gem install foo-1.0.gem

Object Footnotes

define class

php:

Properties (i.e. instance variables) must be declared public, protected, or private. Methods can optionally be declared public, protected, or private. Methods without a visibility modifier are public.

perl:

The sheet shows how to create objects using the CPAN module Moose. To the client of an object, Moose objects and traditional Perl objects are largely indistinguishable. Moose provides convenience functions to aid in the definition of a class, and as a result a Moose class definition and a traditional Perl class definition look quite different.

The most common keywords used when defining a Moose class are has, extends, and subtype.

The before, after, and around keywords are used to define method modifiers. The with keyword indicates that a Moose class implements a role.

The no Moose; statement at the end of a Moose class definition removes class definition keywords, which would otherwise be visible to the client as methods.

Here is how to define a class in the traditional Perl way:

package Int;

sub new {
  my $class = shift;
  my $v = $_[0] || 0;
  my $self = {value => $v};
  bless $self, $class;
  $self;
}

sub value {
  my $self = shift;
  if ( @_ > 0 ) {
    $self->{'value'} = shift;
  }
  $self->{'value'};
}

sub add {
  my $self = shift;
  $self->value + $_[0];
}

sub DESTROY {
  my $self = shift;
  my $v = $self->value;
  print "bye, $v\n";
}

python:

As of Python 2.2, classes are of two types: new-style classes and old-style classes. The class type is determined by the type of class(es) the class inherits from. If no superclasses are specified, then the class is old-style. As of Python 3.0, all classes are new-style.

New-style classes have these features which old-style classes don't:

universal base class called object.
descriptors and properties. Also the __getattribute__ method for intercepting all attribute access.
change in how the diamond problem is handled. If a class inherits from multiple parents which in turn inherit from a common grandparent, then when checking for an attribute or method, all parents will be checked before the grandparent.
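
The diamond lookup order can be checked with a short sketch. The class names are invented for the example; the classes inherit from object explicitly so that they are new-style under Python 2 as well.

```python
class Grandparent(object):
    def name(self):
        return "grandparent"


class Left(Grandparent):
    pass


class Right(Grandparent):
    def name(self):
        return "right"


class Child(Left, Right):
    pass


# Method resolution order: both parents come before the grandparent.
mro_names = [cls.__name__ for cls in Child.__mro__]
print(mro_names)

# Left does not define name(), so Right is consulted before Grandparent.
print(Child().name())
```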

create object

How to create an object.

get and set attribute

How to get and set an attribute.

perl:

Other getters:

$i->value()
$i->{'value'}

Other setters:

$i->{'value'} = $v;

python:

Defining explicit setters and getters in Python is considered poor style. If it becomes necessary to add extra logic to attribute access, this can be achieved without disrupting the clients of the class by creating a property:

def getValue(self):
  print("getValue called")
  return self.__dict__['value']
def setValue(self,v):
  print("setValue called")
  self.__dict__['value'] = v
value = property(fget=getValue, fset = setValue)
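
The same idea can be written with decorator syntax inside a complete class. This is a sketch: the class mirrors the Int class from the sheet, while the backing attribute _value and the type check are illustrative choices, not part of the original example.

```python
class Int(object):
    def __init__(self, v=0):
        self.value = v  # goes through the setter below

    @property
    def value(self):
        """Getter: clients still read i.value."""
        return self._value

    @value.setter
    def value(self, v):
        """Setter: extra logic added without changing client code."""
        if not isinstance(v, int):
            raise TypeError("value must be an int")
        self._value = v


i = Int(7)
i.value = i.value + 1
print(i.value)
```

Client code is unchanged from the plain-attribute version: it still reads and assigns i.value.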

instance variable accessibility

How instance variable access works.

define method

How to define a method.

invoke method

How to invoke a method.

perl:

If the method does not take any arguments, the parens are not necessary to invoke the method.

destructor

How to define a destructor.

perl:

Perl uses reference counting, so the destructor is called as soon as the last reference to an object goes away. Objects that are part of a reference cycle are not destroyed until the interpreter exits. In traditional Perl OO, the destructor is named DESTROY, but in Moose OO it is named DEMOLISH.

python:

A Python destructor is not guaranteed to be called when all references to an object go out of scope, although CPython, which uses reference counting, does call it at that point unless the object is part of a reference cycle.

ruby:

Ruby lacks a destructor. It is possible to register a block to be executed before the memory for an object is released by the garbage collector. A ruby interpreter may exit without releasing memory for objects that have gone out of scope and in this case the finalizer will not get called. Furthermore, if the finalizer block holds on to a reference to the object, it will prevent the garbage collector from freeing the object.

method missing

How to handle when a caller invokes an undefined method.

php:

Define the method __callStatic to handle calls to undefined class methods.

python:

__getattr__ is invoked when an attribute (instance variable or method) is missing. By contrast, __getattribute__, which is available only in new-style classes (all classes in Python 3), is always invoked, and can be used to intercept access to attributes that exist. __setattr__ and __delattr__ are invoked on every attempt to set or delete an attribute, whether or not the attribute exists. The del statement is used to delete an attribute.

ruby:

Define the method self.method_missing to handle calls to undefined class methods.

inheritance

How to use inheritance.

perl:

Here is how inheritance is handled in traditional Perl OO:

package Counter;

our @ISA = "Int";

my $instances = 0;
our $AUTOLOAD;

sub new {
  my $class = shift;
  my $self = Int->new(@_);
  $instances += 1;
  bless $self, $class;
  $self;
}

sub incr {
  my $self = shift;
  $self->value($self->value + 1);
}
sub instances {
  $instances;
}

sub AUTOLOAD {
  my $self = shift;
  my $argc = scalar(@_);
  print "undefined: $AUTOLOAD " .
    "arity: $argc\n";
}

invoke class method

How to invoke a class method.

Reflection Footnotes

object id

How to get an identifier for an object or a value.

inspect type

php:

The PHP manual says that the strings returned by gettype are subject to change and advises using the following predicates instead:

is_null
is_bool
is_numeric
is_int
is_float
is_string
is_array
is_object
is_resource

perl:

ref returns the empty string when the argument is not a scalar containing a reference. If the argument is a reference, ref returns the package name if it points to a blessed object. Otherwise it returns the name of the built-in type.

basic types

php:

All possible return values of gettype are listed.

perl:

All the built-in types are listed.

inspect class

How to get the class of an object.

inspect class hierarchy

has method?

perl:

$o->can() returns a reference to the method if it exists, otherwise it returns undef.

python:

hasattr(o,'reverse') will also return True if there is an instance variable named 'reverse', since it tests for any attribute and not just for methods.
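
One way to test specifically for a method is to combine getattr with callable. In this sketch the class Deck and the helper has_method are hypothetical names.

```python
class Deck(object):
    def __init__(self):
        self.reverse = False   # an instance variable, not a method

    def shuffle(self):
        pass


def has_method(obj, name):
    """True only if the attribute exists and can be called."""
    return callable(getattr(obj, name, None))


d = Deck()
print(hasattr(d, 'reverse'))      # True: the attribute exists
print(has_method(d, 'reverse'))   # False: it is not callable
print(has_method(d, 'shuffle'))   # True
```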

message passing

eval

How to interpret a string as code and return its value.

php:

The value of the string is the value of the return statement that terminates execution. If execution falls off the end of the string without encountering a return statement, the eval evaluates as NULL.

python:

The argument of eval must be an expression or a SyntaxError is raised. The Python version of the mini-REPL is thus considerably less powerful than the versions for the other languages. It cannot define a function or even create a variable via assignment.
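
The restriction can be seen in a short sketch. The exec function used at the end is not covered by the sheet; it is shown here only as the usual way to run statements from a string.

```python
# eval accepts only an expression and returns its value.
result = eval('1 + 2')

# An assignment is a statement, so eval rejects it.
try:
    eval('x = 1')
    rejected = False
except SyntaxError:
    rejected = True

# exec runs statements; it returns None, so results are read back
# from the namespace dictionary passed to it.
namespace = {}
exec('x = 1\ndef double(n):\n    return 2 * n', namespace)
doubled = namespace['double'](namespace['x'])

print(result, rejected, doubled)
```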

inspect methods

perl:

The following code

$class = ref($o);
keys eval "%${class}::";

gets all symbols defined in the namespace of the class of which $o is an instance. The can method can be used to narrow the list to instance methods.

inspect attributes

perl:

keys %$o assumes the blessed object is a hash reference.

python:

dir(o) returns methods and instance variables.
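
To separate the two, vars(o) can be used for the instance variables alone. A sketch using the Int class from the sheet:

```python
class Int(object):
    def __init__(self, v=0):
        self.value = v

    def plus(self, v):
        return self.value + v


i = Int(7)

# vars(i) is the instance's own __dict__: instance variables only.
instance_vars = sorted(vars(i))

# dir(i) also includes methods and inherited names; keep the callable
# ones and drop the double-underscore names inherited from object.
methods = [m for m in dir(i)
           if callable(getattr(i, m)) and not m.startswith('__')]

print(instance_vars, methods)
```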

pretty print

How to display the contents of a data structure for debugging purposes.

source line number and file name

How to get the current line number and file name of the source code.

Web Footnotes

http get

How to make an HTTP GET request and read the response into a string.

url encode/decode

How to URL encode and URL decode a string.

URL encoding, also called percent encoding, is described in RFC 3986. It replaces all characters except for the letters, digits, and a few punctuation marks with a percent sign followed by their two digit hex encoding. The characters which are not escaped are:

A-Z a-z 0-9 - _ . ~

URL encoding can be used to encode UTF-8, in which case each byte of a UTF-8 character is encoded separately.

When form data is sent from a browser to a server via an HTTP GET or an HTTP POST, the data is percent encoded but spaces are replaced by plus signs + instead of %20. The MIME type for form data is application/x-www-form-urlencoded.

perl:

CGI::escape replaces spaces with %20. CGI::unescape will replace both + and %20 with a space, however.

python:

In Python 3 the functions quote_plus, unquote_plus, quote, and unquote moved from urllib to urllib.parse.

urllib.quote replaces a space character with %20.

urllib.unquote does not replace + with a space character.
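
A sketch of the four functions at their Python 3 location, urllib.parse. The expected results in the comments follow from the rules described above.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# Percent encoding: a space becomes %20.
print(quote('lorem ipsum'))          # lorem%20ipsum

# Form encoding: a space becomes a plus sign.
print(quote_plus('lorem ipsum'))     # lorem+ipsum

# unquote leaves + alone; unquote_plus turns it back into a space.
print(unquote('lorem+ipsum'))        # lorem+ipsum
print(unquote_plus('lorem+ipsum'))   # lorem ipsum

# Non-ASCII text is encoded as UTF-8, one escape per byte.
print(quote('\u00e9'))               # %C3%A9

# Letters, digits, hyphen, underscore, and period are never escaped.
print(quote('A-z_0.9'))              # A-z_0.9
```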

base64 encode

How to encode binary data in ASCII using the Base64 encoding scheme.

json

How to encode data in a JSON string; how to decode such a string.

build xml

How to build an XML document.

An XML document can be constructed by concatenating strings, but the techniques illustrated here guarantee the result to be well-formed XML.
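
python:

A sketch of the difference, using text that contains markup characters. The sample text is made up for the example.

```python
import xml.etree.ElementTree as ET

text = '1 < 2 & 3'

# String concatenation copies the markup characters through unescaped,
# so the result is not well-formed.
by_hand = '<a><b>' + text + '</b></a>'
try:
    ET.fromstring(by_hand)
    by_hand_ok = True
except ET.ParseError:
    by_hand_ok = False

# Building a tree lets the library escape the text on output.
root = ET.Element('a')
ET.SubElement(root, 'b').text = text
built = ET.tostring(root, encoding='unicode')

print(by_hand_ok)
print(built)   # <a><b>1 &lt; 2 &amp; 3</b></a>
```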

parse xml

How to parse XML.

xpath

How to extract data from XML using XPath.

Test Footnotes

test class

How to define a test class.

perl:

If there is more than one assertion in a test, then set the Test attribute to the number of assertions in the test method signature to suppress a warning:

sub test_01 : Test(2) {
  ok(1);
  ok(2);
}

run tests; run test method

How to run all the tests in a test class; how to run a single test from the test class.

equality assertion

How to test for equality.

regex assertion

How to test that a string matches a regex.

exception assertion

How to test whether an exception is raised.

setup

How to define a setup method which gets called before every test.

teardown

How to define a cleanup method which gets called after every test.
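
python:

The pieces from the table rows above can be combined into one test class. This sketch targets Python 3, where assertRegex is the newer name for assertRegexpMatches; the fixture data and the helper run_foo_tests are invented for the example.

```python
import unittest


class FooTest(unittest.TestCase):
    def setUp(self):
        # Runs before every test method.
        self.words = ['do', 're', 'mi']

    def tearDown(self):
        # Runs after every test method, even if the test failed.
        self.words = None

    def test_equality(self):
        self.assertEqual('do re mi', ' '.join(self.words))

    def test_regex(self):
        # Like assertRegexpMatches, this uses re.search, not re.match.
        self.assertRegex('lorem ipsum', 'ipsum')

    def test_exception(self):
        with self.assertRaises(IndexError):
            self.words[3]


def run_foo_tests():
    """Run FooTest without calling sys.exit, unlike unittest.main()."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FooTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_foo_tests()
```

Running the suite through a TextTestRunner returns a result object that can be inspected, which is convenient when the tests are driven from other code rather than from the command line.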

Debugging and Profiling Footnotes

check syntax

How to check the syntax of code without executing it.
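
python:

Besides py_compile, the built-in compile function checks syntax without running the code or writing a bytecode file. A sketch, with syntax_ok as a hypothetical helper name:

```python
def syntax_ok(source, filename='<string>'):
    """Return True if source compiles; the code is never executed."""
    try:
        compile(source, filename, 'exec')
        return True
    except SyntaxError:
        return False


good = syntax_ok('print("not run")')   # compiles; the print is not executed
bad = syntax_ok('def broken(:')        # malformed parameter list
print(good, bad)
```

From the shell, the table's py_compile call can also be run as python -m py_compile foo.py.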

flags for stronger and strongest warnings

Flags to increase the warnings issued by the interpreter.

perl:

The use warnings; pragma is the same as the -w flag except that warnings are only issued for constructs in the current scope.

python:

The -t flag warns about inconsistent use of tabs in the source code. The -3 flag is a Python 2.X option which warns about syntax which is no longer valid in Python 3.X.

lint

A lint tool.

run debugger

How to run a script under the debugger.

debugger commands

A selection of commands available when running the debugger. The gdb commands are provided for comparison.

cmd	perl -d	python -m pdb	rdebug	gdb
help	h	h	h	h
list	l [first, last]	l [first, last]	l [first, last]	l [first, last]
next statement	n	n	n	n
step into function	s	s	s	s
set breakpoint	b	b [file:]line b function	b [file:]line b class[.method]	b [file:]line
list breakpoints	L	b	info b	i b
delete breakpoint	B num	cl num	del num	d num
continue	c	c	c	c
show backtrace	T	w	w	bt
move up stack		u	u	u
move down stack		d	down	do
print expression	p expr	p expr	p expr	p expr
(re)run	R	restart [arg1[, arg2 …]]	restart [arg1[, arg2 …]]	r [arg1[, arg2 …]]
quit debugger	q	q	q	q

benchmark code

How to run a snippet of code repeatedly and get the user, system, and total wall clock time.
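One possible way to collect all three figures in Python 3 is sketched below. It assumes os.times() for user and system CPU time and time.perf_counter() for wall clock time; the function name benchmark and the workload are made up for the example. The standard timeit module is a common alternative when only elapsed time is needed.

```python
import os
import time


def benchmark(func, repeat=1000):
    """Call func repeat times; return user, system, and wall clock seconds."""
    cpu_start = os.times()
    wall_start = time.perf_counter()
    for _ in range(repeat):
        func()
    wall_end = time.perf_counter()
    cpu_end = os.times()
    return {
        "user": cpu_end.user - cpu_start.user,
        "system": cpu_end.system - cpu_start.system,
        "wall": wall_end - wall_start,
    }


result = benchmark(lambda: sorted(range(1000, 0, -1)), repeat=200)
print(result)
```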

profile code

How to run the interpreter on a script and get the number of calls and total execution time for each function or method.

perl:

perl -d:DProf writes the profiling information to the file tmon.out in the current directory. dprofpp reads that file.
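For Python, a comparable per-function report can be produced with the standard cProfile and pstats modules, either with python -m cProfile foo.py on the command line or programmatically. The sketch below assumes Python 3; fib and profile_call are hypothetical names used only to have something to measure.

```python
import cProfile
import io
import pstats


def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def profile_call(func, *args):
    """Run func under the profiler; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, out.getvalue()


value, report = profile_call(fib, 15)
print(report)
```

The report lists the number of calls (ncalls) and the total and cumulative time for each function.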

Java Interoperation Footnotes

Both Python and Ruby have JVM implementations. It is possible to compile both Python code and Ruby code to Java bytecode and run it on the JVM. It is also possible to run a version of the Python interpreter or the Ruby interpreter on the JVM which reads Python code or Ruby code, respectively.

version

Version of the scripting language JVM implementation used in this reference sheet.

repl

Command line name of the repl.

interpreter

Command line name of the interpreter.

compiler

Command line name of the tool which compiles source to java byte code.

prologue

Code necessary to make java code accessible.

new

How to create a java object.

method

How to invoke a java method.

import

How to import names into the current namespace.

import non-bundled java library

How to import a non-bundled Java library.

shadowing avoidance

How to import Java names which are the same as native names.

convert native array to java array

How to convert a native array to a Java array.

are java classes subclassable?

Can a Java class be subclassed?

are java classes open?

Can a Java class be monkey patched?

History

php version history
perl version history
python version history
ruby version history

History of Scripting Languages

Scripting the Operating System

Every program is a "script": a set of instructions for the computer to follow. But early in the evolution of computers a need arose for scripts composed not just of machine instructions but of other programs.

IBM introduced Job Control Language (JCL) with System/360 in 1964. Apparently, before JCL, IBM machines were run by operators who fed programs through the machine one at a time. JCL provided the ability to run a sequence of jobs as specified on punch cards without manual intervention. The language was rudimentary, not having loops or variable assignment, though it did have parameterized procedures. In the body of a procedure a parameter was preceded by an ampersand: &. The language had conditional logic for taking actions depending upon the return code of a previously executed program. The return code was an integer and zero was used to indicate success.

Also in 1964 Louis Pouzin wrote a program called RUNCOM for the CTSS operating system which could run scripts of CTSS commands. Pouzin thought that shells or command line interpreters should be designed with scriptability in mind and he wrote a paper to that effect.

Unix

The first Unix shell was the one Ken Thompson wrote in 1971. It was scriptable in that it supported if and goto as external commands. It did not have assignment or variables.

In the late 1970s the Unix shell scripting landscape took shape. The Bourne shell replaced the Thompson shell in 7th Edition Unix which shipped in 1979. The Bourne shell dispensed with the goto and instead provided an internally implemented if statement and while and for loops. The Bourne shell had user defined variables which used a dollar sign sigil ($) for access but not assignment.

The C-shell also made its appearance in 1979 with the 2nd Berkeley Software Distribution of Unix. It was so named because its control structures resembled the control structures of C. The C-shell eventually acquired a bad reputation as a programming environment. Its true contribution was the introduction of job control and command history. Later shells such as the Korn Shell (1982) and the Bourne Again Shell would attempt to incorporate these features in a manner backward compatible with the Bourne shell.

Another landmark in Unix shell scripting was awk which appeared in 1977. awk is a specialized language in that there is an implicit loop and the commands are by default executed on every line of input. However, this was a common pattern in the text file oriented environment of Unix.

more IBM developments

The PC made its appearance in 1981. It came with a command interpreter called COMMAND.COM which could run batch files. PC-DOS for that matter was patterned closely on CP/M, the reigning operating system of home computers at the time, which itself borrowed from various DEC operating systems such as TOPS-10. I'm not certain whether CP/M or even TOPS-10 for that matter had batch files. As a programming environment COMMAND.COM was inferior to the Unix shells. Modern Windows systems make this programming environment available with CMD.EXE.

IBM released a scripting language called Rexx for its mainframe operating systems in 1982. Rexx was superior as a programming environment to the Unix shells of the time, and in fact Unix didn't have anything comparable until the appearance of Perl and Tcl in the late 1980s. IBM also released versions of Rexx for OS/2 and PC-DOS.

Perl

In 1987, while he was working for Unisys in Los Angeles, Larry Wall released the first version of a language which would define the scripting language genre. Wall seems to have been both proficient and dissatisfied with Unix shell scripting. Performance was probably one of Wall's complaints, because the interpreter for the new language no longer resolved unrecognized symbols by trying to run an external command in the search path. Internal functions were provided to do the work of many of the traditional Unix utilities. A side benefit of internal functions is that the special character escaping issues that sometimes plague shell scripting go away.

The performance of arithmetic operations was improved by permitting both strings and numbers to be stored in variables. This caused no inconvenience for the programmer because Perl would call atof or sprintf under the hood to convert one data type to the other when needed.

Perl introduced two container data types: the array and the hash. As a side note, hashes were not in the original Perl, but they had been introduced by Perl 3.0 in 1989. Variables holding arrays or hashes were identified by the sigils @ and %. With these two types Perl seems to have found the sweet spot. They are a significant reason Perl programming is more pleasant than shell programming, with its anemic arrays that are actually strings of white space separated values. Wall nevertheless saw an advantage in the way shell scripting arrays work: they made it easy to store all the arguments for a command in a single variable. In Perl the same effect was achieved by having arrays automatically expand to separate values when passed to a function.

Perl was both more powerful and easier to use than shell scripting. The easier to use part was achieved through consistency here and there. For example Perl always prefixed scalar variables with dollar signs ($) in contrast to shells which did not use the sigils in assignments. With Perl 2.0 Wall began expanding on the regular expression language. He introduced backslash sequences such as \s and \d for whitespace characters and digits. Since the backslash was the escape character for characters which are special to the regular expression language, Wall adopted the rule that a backslashed punctuation character always matches itself.

the camel book

Wall co-authored a book called Programming Perl for O'Reilly & Associates in 1991. On p. xiv the book states Wall's paradoxical "great virtues of a programmer": laziness, impatience, and hubris. On p. 4 we are told that with Perl There's More Than One Way To Do It. As an example of TMTOWTDI, p. 5 illustrates three ways of running a Perl program:

$ perl -e 'print "Howdy, world!\n";'
Howdy, world!

$ cat howdy
print "Howdy, world!\n";

$ perl howdy
Howdy, world!

$ cat howdy
#!/usr/bin/perl
print "Howdy, world!\n";

$ howdy
Howdy, world!

Perl 5

Scripting the Web

The Original HTTP as defined in 1991
HTML Specification Draft June 1993
WorldWideWeb Browser
Mosaic Web Browser

Tim Berners-Lee created the web in 1990. It ran on a NeXT cube. The browser and the web server communicated via a protocol invented for the purpose called HTTP. The documents were marked up in a type of SGML called HTML. The key innovation was the hyperlink. If the user clicked on a hyperlink, the browser would load the document pointed to by the link. A hyperlink could also take the user to a different section of the current document.

The initial version of HTML included these tags:

html, head, title, body, h1, h2, h3, h4, h5, h6, pre, blockquote, b, i, a, img, ul, ol, li, dl, dt, dd

The browser developed by Berners-Lee was called WorldWideWeb. It was graphical, but it wasn't widely used because it only ran on NeXT. Nicola Pellow wrote a text-only browser and ported it to a variety of platforms in 1991. Mosaic was developed by Andreessen and others at NCSA and released in February 1993. Mosaic was the first browser which could display images in-line with text. It was originally released for X Windows, and it was ported to Macintosh a few months later. Ports for the Amiga and Windows were available in October and December of 1993.

CGI and Forms

RFC 3875: CGI Version 1.1 2004
HTML 2.0 1995
NSAPI Programmer's Guide (pdf) 2000
Apache HTTP Server Project
History of mod_perl
FastCGI Specification 1996

The original web permitted a user to edit a document with a browser, provided he or she had permission to do so. But otherwise the web was static. The group at NCSA developed forms so users could submit data to a web server. They developed the CGI protocol so the server could invoke a separate executable and pass form data to it. The separate executable, referred to as a CGI script in the RFC, could be implemented in almost any language. Perl was a popular early choice. What the CGI script writes to standard out becomes the HTTP response. Usually this would contain a dynamically generated HTML document.

HTML 2.0 introduced the following tags to support forms:

form input select option textarea

The input tag has a type attribute which can be one of the following:

text password checkbox radio image hidden submit reset

If the browser submits the form data with a GET, the form data is included in the URL after a question mark (?). The form data consists of key value pairs. Each key is separated from its value by an equals (=), and the pairs are separated from each other by ampersands (&). The CGI protocol introduces an encoding scheme for escaping the preceding characters in the form data or any other characters that are meaningful or prohibited in URLs. Typically, the web server will set a QUERY_STRING environment variable to pass the GET form data to the CGI script. If the browser submits the data with POST, the form data is encoded in the same manner as for GET, but the data is placed in the HTTP request body. The media type is set to application/x-www-form-urlencoded.
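The encoding described above can be tried out with Python 3's standard urllib.parse module. This is only a sketch: encode_form and handle_get are hypothetical helper names, and the dictionary passed to handle_get stands in for the environment a web server would set up for a CGI script.

```python
from urllib.parse import parse_qs, urlencode


def encode_form(fields):
    """Encode key/value pairs as application/x-www-form-urlencoded."""
    return urlencode(fields)


def handle_get(environ):
    """Decode form data the way a CGI script reading QUERY_STRING might."""
    query = environ.get("QUERY_STRING", "")
    # parse_qs maps each key to a list because a key may appear more than once.
    return {key: values[0] for key, values in parse_qs(query).items()}


encoded = encode_form({"name": "Tom & Jerry", "q": "a=b?"})
print(encoded)  # name=Tom+%26+Jerry&q=a%3Db%3F
print(handle_get({"QUERY_STRING": encoded}))
```

The ampersand, equals sign, and question mark inside the values are percent-escaped, and spaces become plus signs, so they cannot be confused with the separators.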

Andreessen and others at NCSA joined the newly founded company Netscape, which released a browser in 1994. Netscape also released a web server with a plug-in architecture. The architecture was an attempt to address the fact that handling web requests with CGI scripts was slow: a separate process was created for each request. With the Netscape web server, the equivalent of a CGI script would be written in C and linked in to the server. The C API that the developer used was called NSAPI. Microsoft developed a similar API called ISAPI for the IIS web server.

The NCSA web server had no such plug-in architecture, but it remained the most popular web server in 1995 even though development had come to a halt. The Apache web server project started up that year; it used the NCSA httpd 1.3 code as a starting point and it was the most popular web server within a year. Apache introduced the Apache API, which permitted C style web development in the manner of NSAPI and ISAPI. The Apache extension mod_perl, released in March 1996, was a client of the Apache API. By means of mod_perl an Apache web server could handle a CGI request in memory using an embedded perl interpreter instead of forking off a separate perl process.

Ousterhout on Scripting Languages

Scripting: Higher Level Programming for the 21st Century Ousterhout

Ousterhout wrote an article for IEEE Computer in 1998 which drew a distinction between system programming languages and scripting languages. As examples of scripting languages Ousterhout cited Perl, Python, Rexx, Tcl, Visual Basic, and the Unix shells. To Ousterhout the biggest difference between the two classes of language is that system programming languages are strongly typed whereas scripting languages are typeless. Being typeless was in Ousterhout's mind a necessary trait for a scripting language to serve as "glue language" to connect the components of an application written in other languages. Ousterhout also noted that system programming languages are usually compiled whereas scripting languages are usually interpreted, and he predicted that the relative use of scripting languages would rise.

Later Web Developments

HTML Templates

PHP/FI Version 2.0
PHP Usage

Web development with CGI scripts written in Perl was easier than writing web server plug-ins in C. The task of writing Perl CGI scripts was made easier by libraries such as cgi-lib.pl and CGI.pm. These libraries made the query parameters available in a uniform fashion regardless of whether a GET or POST request was being handled and also took care of assembling the headers in the response. Still, CGI scripts tended to be difficult to maintain because of the piecemeal manner in which the response document is assembled.

Rasmus Lerdorf adopted a template approach for maintaining his personal home page. The document to be served up was mostly static HTML with an escaping mechanism for inserting snippets of code. In version 2.0 the escapes were <? code > and <?echo code >. Lerdorf released the code for the original version, called PHP/FI and implemented in Perl, in 1995. The original version was re-implemented in C and version 2.0 was released in 1997. For version 3.0, released in 1998, the name was simplified to PHP. Versions 4.0 and 5.0 were released in 2000 and 2004. PHP greatly increased in popularity with the release of version 4.0. Forum software, blogging software, wikis, and other content management systems (CMS) are often implemented in PHP.

Microsoft added a template engine called Active Server Pages (ASP) for IIS in 1996. ASP uses <% code %> and <%= code %> for escapes; the code inside the script could be any number of languages but was usually a dialect of Visual Basic called VBScript. Java Server Pages (JSP), introduced by Sun in 1999, uses the same escapes to embed Java.

MVC Frameworks

The template approach to web development has limitations. Consider the case where the web designer wants to present a certain page if the user is logged in, and a completely unrelated page if the user is not logged in. If the request is routed to an HTML template, then the template will likely have to contain a branch and two mostly unrelated HTML templates. The page that is presented when the user is not logged in might also be displayed under other circumstances, and unless some code sharing mechanism is devised, there will be duplicate code and the maintenance problem that entails.

The solution is for the request to initially be handled by a controller. Based upon the circumstances of the request, the controller chooses the correct HTML template, or view, to present to the user.

Websites frequently retrieve data from and persist data to a database. In a simple PHP website, the SQL might be placed directly in the HTML template. However, this results in a file which mixes three languages: SQL, HTML, and PHP. It is cleaner to put all database access into a separate file or model, and this also promotes code reuse.
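The division of labor described in the last three paragraphs can be sketched in a few lines of Python 3. Everything here is hypothetical and simplified for illustration: an in-memory dict stands in for the database, plain functions stand in for HTML templates, and a dict stands in for the session attached to a request.

```python
# Model: all data access lives here (a dict stands in for a database).
USERS = {"alice": {"name": "Alice"}}


def find_user(session):
    return USERS.get(session.get("user_id"))


# Views: templates that only format the data handed to them.
def home_view(user):
    return "<h1>Welcome back, %s</h1>" % user["name"]


def login_view():
    return "<h1>Please log in</h1>"


# Controller: inspects the request and picks the view to present.
def home_controller(session):
    user = find_user(session)
    if user is None:
        return login_view()
    return home_view(user)


print(home_controller({"user_id": "alice"}))
print(home_controller({}))
```

Because login_view is a separate unit, any other controller that needs to show the login page can reuse it without duplicating the template.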

The Model-View-Controller design pattern was conceived in 1978. It was used in Smalltalk for GUI design. It was perhaps in Java that the MVC pattern was introduced to web development.

Early versions of Java were more likely to be run in the browser as an applet than in the server. Sun finalized the Servlet API in June 1997. Servlets handled requests and returned responses, and thus were the equivalent of controllers in the MVC pattern. Sun worked on a reference web server which used servlets. This code was donated to the Apache foundation, which used it in the Tomcat webserver, released in 1999. The same year Sun introduced JSP, which corresponds to the view of the MVC pattern.

The Struts MVC framework was introduced in 2000. The Spring MVC framework was introduced in 2002; some prefer it to Struts because it doesn't use Enterprise JavaBeans. Hibernate, introduced in 2002, is an ORM and can serve as the model of an MVC framework.

Ruby on Rails was released in 2004. Ruby has a couple of advantages over Java when implementing an MVC framework. The models can inspect the database and create accessor methods for each column in the underlying table on the fly. Ruby is more concise than Java and has better string manipulation features, so it is a better language to use in HTML templates. Other dynamic languages have built MVC frameworks, e.g. Django for Python.

PHP Version History

1.0 (1995-06-08)

Originally called PHP/FI and implemented in Perl. Reimplemented in C.

2.0 (1997-11-01)

<? code > and <?echo code > are the syntax used to insert code into an HTML template.

3.0 (1998-06-06)

Name simplified to PHP.

4.0 (2000-05-22)

5.0 (2004-07-13)

5.1 (2005-11-24)

5.2 (2006-11-02)

5.3 (2009-06-30)

Namespaces are added.

Perl Version History

Perl 1.0 (gzipped tar) Dec 18, 1987

Perl 2.0 (gzipped tar) Jun 5, 1988

New regexp routines derived from Henry Spencer's.
- Support for /(foo|bar)/.
- Support for /(foo)*/ and /(foo)+/.
- \s for whitespace, \S for non-, \d for digit, \D nondigit
Local variables in blocks, subroutines and evals.
Recursive subroutine calls are now supported.
Array values may now be interpolated into lists: unlink 'foo', 'bar', @trashcan, 'tmp';
File globbing.
Use of <> in array contexts returns the whole file or glob list.
New iterator for normal arrays, foreach, that allows both read and write.
Ability to open pipe to a forked off script for secure pipes in setuid scripts.
File inclusion via do 'foo.pl';
More file tests, including -t to see if, for instance, stdin is a terminal. File tests now behave in a more correct manner. You can do file tests on filehandles as well as filenames. The special filetests -T and -B test a file to see if it's text or binary.
An eof can now be used on each file of the <> input for such purposes as resetting the line numbers or appending to each file of an inplace edit.
Assignments can now function as lvalues, so you can say things like ($HOST = $host) =~ tr/a-z/A-Z/; ($obj = $src) =~ s/\.c$/.o/;
You can now do certain file operations with a variable which holds the name of a filehandle, e.g. open(++$incl,$includefilename); $foo = <$incl>;
Warnings are now available (with -w) on use of uninitialized variables and on identifiers that are mentioned only once, and on reference to various undefined things.
There is now a wait operator.
There is now a sort operator.
The manual is now not lying when it says that perl is generally faster than sed. I hope.

Perl 3.0 (gzipped tar) Oct 18, 1989

Perl can now handle binary data correctly and has functions to pack and unpack binary structures into arrays or lists. You can now do arbitrary ioctl functions.
You can now pass things to subroutines by reference.
Debugger enhancements.
An array or associative array may now appear in a local() list.
Array values may now be interpolated into strings.
Subroutine names are now distinguished by prefixing with &. You can call subroutines without using do, and without passing any argument list at all.
You can use the new -u switch to cause perl to dump core so that you can run undump and produce a binary executable image. Alternately you can use the "dump" operator after initializing any variables and such.
You can now chop lists.
Perl now uses /bin/csh to do filename globbing, if available. This means that filenames with spaces or other strangenesses work right.
New functions: mkdir and rmdir, getppid, getpgrp and setpgrp, getpriority and setpriority, chroot, ioctl and fcntl, flock, readlink, lstat, rindex, pack and unpack, read, warn, dbmopen and dbmclose, dump, reverse, defined, undef.

Perl 4.0 (gzipped tar) Mar 21, 1991

According to Wikipedia, this was not a major change. The version was bumped solely because of the publication of the camel book.
There were 36 updates numbered 4.0.01 to 4.0.36. These were distributed as patches and usually contained a single commit. The 4.0.36 patch was released in 1993.

Perl 5.0 (Oct 18, 1994)

Objects.
The documentation is much more extensive and perldoc along with pod is introduced.
Lexical scoping available via my. eval can see the current lexical variables.
The preferred package delimiter is now :: rather than '.
New functions include: abs(), chr(), uc(), ucfirst(), lc(), lcfirst(), chomp(), glob()
There is now an English module that provides human readable translations for cryptic variable names.
Several previously added features have been subsumed under the new keywords use and no.
Pattern matches may now be followed by an m or s modifier to explicitly request multiline or singleline semantics. An s modifier makes . match newline.
@ now always interpolates an array in double-quotish strings. Some programs may now need to use backslash to protect any @ that shouldn't interpolate.
It is no longer syntactically legal to use whitespace as the name of a variable, or as a delimiter for any kind of quote construct.
The -w switch is much more informative.
=> is now a synonym for comma. This is useful as documentation for arguments that come in pairs, such as initializers for associative arrays, or named arguments to a subroutine.

Perl 5.4 (aka 5.004) (May 1997)

Perl 5.5 (aka 5.005) (July 1998)

experimental threads implementation
experimental compiler implementation
new regular expression constructs: (?<=RE), (?<!RE), (?{ CODE }), (?i-x), (?i:RE), (?(COND)YES_RE|NO_RE), (?>RE), \z

Perl 5.6 Mar 28, 2000

Several experimental features, including: support for Unicode, fork() emulation on Windows, 64-bit support, lvalue subroutines, weak references, and new regular expression constructs. See below for the full list.
Standard internal representation for strings is UTF-8. (EBCDIC support has been discontinued because of this.)
Better support for interpreter concurrency.
Lexically scoped warning categories.
"our" declarations for global variables.
String literals can be written using character ordinals. For example, v102.111.111 is the same as "foo".
New syntax for subroutine attributes. (The attrs pragma is now deprecated.)
Filehandles can be autovivified. For example: open my $foo, $file or die;
open() may be called with three arguments to avoid magic behavior.
Support for large files, where available (will be enabled by default.)
CHECK blocks. These are like END blocks, but will be called when the compilation of the main program ends.
POSIX character class syntax supported, e.g. /[[:alpha:]]/
pack() and unpack() support null-terminated strings, native data types, counted strings, and comments in templates
Support for binary numbers.
exists() and delete() work on array elements. Existence of a subroutine (as opposed to its defined-ness) may also be checked with exists(&sub).
Where possible, Perl does the sane thing to deal with buffered data automatically.
binmode() can be used to set :crlf and :raw modes on dosish platforms. The open pragma does the same in the lexical scope, allowing the mode to be set for backticks.
Many modules now come standard, including Devel::DProf, Devel::Peek, and Pod::Parser.
Many modules have been substantially revised or rewritten.
The JPL ("Java Perl Lingo") distribution comes bundled with Perl.
Most platform ports have improved functionality. Support for EBCDIC platforms has been withdrawn due to standardization on UTF-8.
Much new documentation in the form of tutorials and reference information has been added.
Plenty of bug fixes.

Perl 5.8 (July 2002)

new threading implementation (5.005 threads are deprecated)
better unicode support, including support in regular expressions
64bit support
new modules in core library: Digest::MD5, File::Temp, Filter::Simple, libnet, List::Util, Memoize, MIME::Base64, Scalar::Util, Storable, Switch, Test::More, Test::Simple, Text::Balanced, Tie::File

Perl 5.10 (Dec 2007)

smart match operator: ~~
switch statement
named captures in regular expressions
recursive regular expressions
state variables (like static variables in C functions)
defined-or operator: //
field hashes for inside-out objects

Perl 5.12 (Apr 2010)

Unicode implemented according to Unicode standard
improvements to time, fix to Y2038
version numbers in package statements

Perl 5.14 (May 2011)

/r flag to make s/// non-destructive
package Foo {} syntax

Python Version History

Python History Blog by Guido van Rossum
Brief Timeline of Python:
History of Python

0.9 (Feb 20, 1991)

classes with inheritance
exception handling
list, dict, str datatypes
modules

1.0 (Jan 26, 1994)

lambda, map, filter, reduce

1.1 (Oct 11, 1994)

1.2 (Apr 13, 1995)

1.3 (Oct 13, 1995)

1.4 (Oct 25, 1996)

1.5 (Jan 3, 1998)

exceptions are classes, not strings
re module (perl style regular expressions) replaces regex
nested modules (i.e. hierarchical namespace)

2.0 (Oct 16, 2000)

unicode
list comprehensions
augmented assignment (+=, *=, etc.)
cycle detection added to garbage collector

2.1 (Apr 17, 2001)

nested lexical scope
classes able to override all comparison operators individually

2.2 (Dec 21, 2001)

introduction of new-style classes
descriptors and properties
ability to subclass built-in classes
class and static methods
callbacks for object property access
ability to restrict settable attributes to class defined set
base class object added for all built-ins
for generalized from sequences to all objects with iter()
integers and long integers unified, removing some overflow errors
// is floor (integer) division; / performs float division when from __future__ import division is in effect
add generators and the yield statement

2.3 (Jul 29, 2003)

sets module
generators/yield no longer optional
source code encodings (for string literals only): # -*- coding: UTF-8 -*-
boolean type added

2.4 (Nov 30, 2004)

sets built-in
Template class for string substitutions (in addition to older % operator)
generator expressions (when list comprehensions use too much memory)
decorator functions for type checking
decimal data type with user specified precision

2.5 (Sep 16, 2006)

conditional expressions
partial function evaluation: partial()
unified try/except/finally
with statement and context management protocol (must be imported from __future__)

2.6 (Oct 1, 2008)

-3 command line switch warns about use of features absent in python 3.0
with statement does not require import
multiprocessing package (processes communicate via queues, threadlike syntax for process management)
str.format()
as keyword in except clause: except TypeError as exc (old syntax still supported: except TypeError, exc)
Abstract Base Classes (ABC): Container, Iterable; user can define an ABC with ABCMeta
binary string literal b'', binary and octal numbers 0b01010 and 0o1723
class decorators
number hierarchy expansion: Number, Fraction, Integral, Complex

3.0 (Dec 3, 2008)

old style classes no longer available
print is function instead of statement (parens around argument mandatory)
comma in except clause discontinued: (except TypeError, exc)
1/2 returns float instead of int (1//2, available since 2.2, returns int)
strings (str) are always Unicode; bytes data type available for arbitrary 8-bit data

3.1 (Jun 27, 2009)