Shiny Perls of code

  • Perl has sometimes been called "the duct tape of the Internet". It is probably the first Unix-inspired programming language to combine the power of shell scripts and C. Newer languages have since become more popular, including Python, Ruby and new entrants like Go. It is very difficult to get quantitative data on the relative popularity of programming languages, though some attempts have been made. Depending on the yardsticks chosen, Perl continues to be about as popular as Python, and is less popular than Java, PHP, C, and C++.

  • Why Perl

    At one level, the reasons why our most senior engineers adopted Perl fifteen years ago remain largely valid today. Perl is probably the best general-purpose programming language for systems programming in the Unix environment. It is no longer the best for building Web-based interfaces -- there are other choices, including Java, Python and PHP. But when it comes to writing network daemons, system management programs, batch processing programs, etc., there is probably no other programming language in the Unix environment which beats Perl. Perl gives the programmer the text processing power of shell scripts together with the performance and low-level system programming access of C.

    Perl has its critics. The most visible criticism concerns Perl syntax, which is said to be unreadable. We say that Perl, like C, is an acquired taste.

  • Our work in Perl

    We have built entire complex software products in Perl, including network daemons, browser-based interfaces, report generators, communication programs, and more.

    Our most senior engineers started working with Perl 4 and moved to Perl 5 when it stabilised. We learned Perl best practices from the Internet community and eschewed pyrotechnics (always a temptation with a taut and dynamic programming language) in favour of strict processes and safe programming practices. All our Perl code runs with -w and is warning-free.
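    As a minimal sketch of what running warning-free under -w means in practice, the script below captures warnings in an array and compares a careless summing routine with a careful one. The function names and sample data are illustrative assumptions, not taken from our code base.

```perl
#!/usr/bin/perl -w
use strict;
use warnings;

# Capture warnings so that a test run can insist there are none.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

# Careless version: adding undef triggers "Use of uninitialized value".
sub sum_careless {
    my $total = 0;
    $total += $_ for @_;
    return $total;
}

# Careful version: treat a missing field as zero, explicitly.
sub sum_careful {
    my $total = 0;
    $total += (defined $_ ? $_ : 0) for @_;
    return $total;
}

my @fields = (3, undef, 4);

my $careless = sum_careless(@fields);
my $noisy    = scalar @warnings;   # warnings raised by the careless version

@warnings = ();
my $careful = sum_careful(@fields);
my $quiet   = scalar @warnings;    # warnings raised by the careful version

print "careless=$careless warnings=$noisy; careful=$careful warnings=$quiet\n";
```

    Both routines return the same sum; only the careful one does so without raising a warning, which is the property a warning-free policy checks for.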

  • C and Perl

    We first integrated Perl with a C library when we began using the GD library to create GIF images from within Perl code. Many years later, we needed to communicate using Modbus with some industrial control devices. We used and debugged a Modbus library written in C, and then linked the C library to the Perl interpreter through the Perl XS interface. This library has been in production use on factory shop floors for more than four years.

    In 2011, we wrote a string search library in C with a highly specialised algorithm. This algorithm traverses large in-memory data structures to deliver very fast searches. Since creating these data structures is very time-consuming, we modified them to live in a single chunk of RAM, which we then load from and store to a disk file using the Linux mmap() system call. Once this was stable, we integrated the library into Perl through the XS interface. We are not aware of any other C library which uses mmap() and which exports a Perl-callable API.

  • External interaction

    Two types of communication are used heavily from our Perl code: databases and networks.

    We have written Perl code which works reliably with Oracle, IBM DB2, MySQL, PostgreSQL, mSQL, SQLite, and Microsoft SQL Server. We have used direct SQL strings as well as the prepare() and execute() approach. We have invoked stored procedures. We have written code to detect database connection errors and re-attempt connection.
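    The reconnect logic mentioned above can be sketched roughly as follows. This is an illustration under stated assumptions, not our production code: connect_with_retry and its parameters are hypothetical names, and the connection attempt is passed in as a code reference so that the sketch runs without a database. With DBI, the callback would wrap a DBI->connect call, as shown in the trailing comment with a made-up DSN, table and credentials.

```perl
use strict;
use warnings;

# Hypothetical helper: call $arg{connect} (a code ref that returns a handle,
# returns nothing, or dies) up to max_tries times, pausing between attempts.
sub connect_with_retry {
    my (%arg) = @_;
    my $connect   = $arg{connect};
    my $max_tries = $arg{max_tries} || 3;
    my $delay     = defined $arg{delay} ? $arg{delay} : 1;   # whole seconds

    my $last_error = 'unknown error';
    for my $attempt (1 .. $max_tries) {
        my $handle = eval { $connect->() };
        return $handle if $handle;
        $last_error = $@ || 'connect returned no handle';
        sleep $delay if $delay && $attempt < $max_tries;
    }
    die "giving up after $max_tries attempts: $last_error";
}

# Demonstration with a stub that fails twice before succeeding.
my $calls  = 0;
my $handle = connect_with_retry(
    connect   => sub { die "server down\n" if ++$calls < 3; return 'HANDLE' },
    max_tries => 5,
    delay     => 0,
);
print "connected after $calls attempts\n";

# With DBI (assumed DSN, table and credentials), the same helper would be used as:
#   my $dbh = connect_with_retry(
#       connect => sub {
#           DBI->connect('dbi:Pg:dbname=reports', 'app', 'secret',
#                        { RaiseError => 1, AutoCommit => 1 });
#       },
#   );
#   my $sth = $dbh->prepare('SELECT name FROM customers WHERE id = ?');
#   $sth->execute(42);
```

    Passing the connection attempt as a callback keeps the retry policy independent of any one database driver, which matters when the same code has to work against several database products.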

    We have written network code for TCP and UDP communication. We have implemented long-running network daemons, where reliability depends on an absence of memory leaks. We have integrated such network daemons with the Sendmail mail transfer agent through its milter interface, and with the Squid caching proxy server through its authenticator and URL redirector interfaces.

    In one particularly challenging situation, we had to keep a large read-only data structure in memory in our daemon, taking its RAM consumption to about 2 GB. For performance reasons, we had to keep 5-10 instances of this daemon active to share the workload. If each instance of the daemon had built its own copy of this data structure, the total RAM requirement would have reached 10 GB or more and would have hit system limits. Therefore, we chose to start one daemon instance first. This daemon builds the in-memory data structure, then fork()s the other daemon instances. The copy-on-write memory management of modern Unixen ensures that the total physical RAM requirement for all these instances of the daemon only slightly exceeds that of the first instance.
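    The build-once-then-fork approach described above can be sketched as below. This is a toy illustration, not our daemon: the hash %lookup stands in for the large read-only structure, the worker count is arbitrary, and each child performs one lookup and exits where a real worker would enter its service loop. How many pages actually stay shared depends on the operating system and on how the data is accessed.

```perl
use strict;
use warnings;

# Build the read-only structure once, in the parent, before forking.
# (A small stand-in for the multi-gigabyte structure in the text.)
my %lookup = map { ("key$_" => $_ * 2) } 1 .. 100_000;

my $workers = 4;   # illustrative worker count
my @pids;

for my $n (1 .. $workers) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: %lookup is already populated, inherited from the parent
        # without being rebuilt. A real worker would serve requests here.
        my $ok = $lookup{"key$n"} == $n * 2;
        exit($ok ? 0 : 1);
    }
    push @pids, $pid;   # parent remembers each child
}

# Parent: wait for every worker and count any that reported a problem.
my $failed = 0;
for my $pid (@pids) {
    waitpid($pid, 0);
    $failed++ if $? != 0;
}
print "workers=$workers failed=$failed\n";
```

    The key point is the ordering: the expensive build happens before the first fork(), so every child starts with the finished structure already mapped into its address space.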

  • Future perls

    Perl is not expected to die anytime soon. The Perl community is watching the development of Perl 6 and hoping that it will give the language a new lease on life. One serious limitation of Perl 5 is its lack of support for true multi-threaded programming. There are excellent co-routine libraries, but they do not support shared-memory applications with true pre-emptive OS-level threads which can draw on the power of SMP hardware. We have also stopped using Perl for Web application front-end development -- we find PHP and Java more suitable for such work. But there is no replacement for Perl for systems programming in the Unix environment, and we will continue to work in Perl.


  • Perl: official site

    The most popular all-purpose interpreted language on Unix. See also the Wikipedia page.

  • CPAN

    The Comprehensive Perl Archive Network, hosting all the publicly available Perl modules, libraries and other resources

  • Perl Coro

    Co-routine implementation for Perl programmers, in the absence of OS-supported pre-emptive thread support

  • The Catalyst framework

    The most popular agile Perl MVC Web framework

  • Perl 6

    The Perl 6 site, dedicated to the next major version of Perl