
A Julia Meta Tutorial


If you are thinking about taking Julia, the hot new mathematical, statistical, and data-oriented programming language, for a test drive, you might need a little bit of help. In this blog we round up some great posts discussing various aspects of Julia to get you up and running faster.

Why We Created Julia

If only you could always read through the intentions and thoughts of the creators of a language! With Julia you can. Head over to the original post to get the perspectives of four of the original developers: Jeff Bezanson, Stefan Karpinski, Viral Shah, and Alan Edelman.

We are power Matlab users. Some of us are Lisp hackers. Some are Pythonistas, others Rubyists, still others Perl hackers. There are those of us who used Mathematica before we could grow facial hair. There are those who still can’t grow facial hair. We’ve generated more R plots than any sane person should. C is our desert island programming language.

We love all of these languages; they are wonderful and powerful. For the work we do — scientific computing, machine learning, data mining, large-scale linear algebra, distributed and parallel computing — each one is perfect for some aspects of the work and terrible for others. Each one is a trade-off.

We are greedy: we want more.

An IDE for Julia

If you are looking for an IDE for Julia, check out Julia Studio. Even better, Forio, the maker of this IDE, offers a nice series of beginner, intermediate, and advanced tutorials to help you get up and running.

Julia Documentation

By far the most comprehensive source of help and information on Julia is the ever-growing Julia Docs, which include a Manual for the language (with a useful getting started guide), details of the Standard Library, and an overview of available packages. Not to be missed are the two sections detailing noteworthy differences from Matlab and from R.

MATLAB, R, and Julia: Languages for data analysis

Avi Bryant provides a very nice overview and comparison of Matlab, R, Julia, and Python. Definitely recommended reading if you are considering a new data analysis language.

An R Programmer Looks at Julia

This post is from mid-2012, so a lot has changed with Julia since then. Still, it is an extensive look at the language from an experienced R developer.

There are many aspects of Julia that are quite intriguing to an R programmer. I am interested in programming languages for "Computing with Data", in John Chambers' term, or "Technical Computing", as the authors of Julia classify it. I believe that learning a programming language is somewhat like learning a natural language in that you need to live with it and use it for a while before you feel comfortable with it and with the culture surrounding it. Read more ...

The State of Statistics in Julia - Late 2012 

Continuing on this theme of statistics and Julia, John Myles White provides a great view of using Julia for statistics which he updated in December of last year.

A Matlab Programmer's Take on Julia - Mid 2012

A quick but insightful look at Julia from the perspective of a Matlab programmer.

Julia is a new language for numerical computing. It is fast (comparable to C), its syntax is easy to pick up if you already know Matlab, supports parallelism and distributed computing, has a neat and powerful typing system, can call C and Fortran code, and includes a pretty web interface. It also has excellent online documentation. Crucially, and contrary to SciPy, it indexes from 1 instead of 0.  Read more ...

Why I am Not on the Julia Bandwagon Yet

Finally, we leave you, good reader, with a contrarian viewpoint.

Python vs R vs SPSS ... Can't All Programmers Just Get Along?

Programmers have long been proud of and loyal to their tools, and often very vocal about them. This has led to well-contested rivalries and "fights" about which tool is better:

  • emacs or vi;
  • Java or C++;
  • Perl or Python;
  • Django or Rails;
  • and, for data geeks, the SAS/SPSS/R/Matlab fight.


The truth is, very few of us data geeks (data scientists, data analysts, statisticians, or whatever we call ourselves [editor note: Data Practitioners]) use only a single tool for all of our work. We will often extract data from a SQL database, munge it using Perl or Python, and then do statistical analysis using R or SAS, reporting the results using Word or, increasingly, the web. Especially for data analysis, there is often no single tool that can do the end-to-end workflow well, however much we would like to believe that there is. Each tool has its strengths and weaknesses, and often a mixture works best. The trick is in finding the right "glue" that can string our workflow together.
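To make the extract, munge, analyze, and report chain concrete, here is a minimal sketch of such a workflow squeezed into a single Python script. Everything in it is made up for illustration: the measurements table, its values, and the helper names (make_demo_db, extract, munge, analyze, report) are assumptions, and Python's built-in statistics module merely stands in for the R or SAS step.

```python
import sqlite3
import statistics


def make_demo_db():
    """Create an in-memory SQLite database holding a few made-up measurements."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE measurements (grp TEXT, value REAL)")
    conn.executemany(
        "INSERT INTO measurements VALUES (?, ?)",
        [("A ", 1.0), ("a", 3.0), ("B", 10.0), ("b", None), ("B", 14.0)],
    )
    return conn


def extract(conn):
    """Step 1: pull the raw rows out of the SQL database."""
    return conn.execute("SELECT grp, value FROM measurements").fetchall()


def munge(rows):
    """Step 2: drop missing values, normalize group labels, group the values."""
    groups = {}
    for grp, value in rows:
        if value is None:
            continue
        groups.setdefault(grp.strip().lower(), []).append(float(value))
    return groups


def analyze(groups):
    """Step 3: mean and sample standard deviation per group.

    statistics.stdev needs at least two values in each group.
    """
    return {
        name: (statistics.mean(values), statistics.stdev(values))
        for name, values in groups.items()
    }


def report(summary):
    """Step 4: render the results as plain text, one line per group."""
    lines = [
        f"{name}: mean={mean:.2f}, sd={sd:.2f}"
        for name, (mean, sd) in sorted(summary.items())
    ]
    return "\n".join(lines)


print(report(analyze(munge(extract(make_demo_db())))))
```

Each step is a separate function on purpose: any one of them could be swapped for a call out to a different tool without touching the others, which is the kind of glue the rest of this post is about.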

There are now several interface packages available to talk between open-source languages. I'll speak to the interfaces with R, which I'm most familiar with, but I'm sure that the community will point out other useful interfaces. R is neither the fastest nor the most elegant of languages, but it has by far the richest ecosystem of cutting-edge data analysis packages. There are now ways to communicate with R from other general programming languages like Java (through the rJava package and JNI), Perl (Statistics::R, available in CPAN), and Python (rpy2 and PypeR, available in PyPI). R packages also allow communication in the other direction, from R out to general-purpose languages, through RSPython and RSPerl (both available at Omegahat) and rJava. Most commercial statistical packages, like SAS, SPSS and Statistica, allow you to write R code to send to R and then get back the results. An especially nice SAS macro to do this for those without the latest versions of SAS is %Proc_R, available here. One can also call R from Matlab. There are also many ways of interfacing with R using web-based tools like Rserve or, on Windows, the rcom interface to utilize COM and connect with, among other things, Word and Excel.
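For a feel of what the simplest possible glue looks like, here is a rough sketch that calls R from Python without any of the interface packages named above. It is not how rpy2 or PypeR work internally; it just shells out to the Rscript command-line program that ships with R and reads back whatever R prints. The helper names (r_command, r_vector, run_r) are hypothetical, and the sketch assumes Rscript is on your PATH, returning None when it is not.

```python
import shutil
import subprocess


def r_command(expression):
    """Build the argument list that asks Rscript to evaluate one R expression."""
    return ["Rscript", "-e", expression]


def r_vector(values):
    """Render a Python sequence of finite numbers as an R c(...) literal."""
    return "c(" + ", ".join(repr(float(v)) for v in values) + ")"


def run_r(expression):
    """Evaluate an R expression and return what R printed.

    Returns None when Rscript is not installed (or not on the PATH).
    """
    if shutil.which("Rscript") is None:
        return None
    result = subprocess.run(
        r_command(expression),
        capture_output=True,
        text=True,
        check=True,  # raise if R itself reports an error
    )
    return result.stdout.strip()


# Ask R for the mean of a small vector built in Python.
output = run_r("cat(mean(" + r_vector([1, 2, 3, 4]) + "))")
print(output if output is not None else "Rscript not found; skipping the R call")
```

Passing data as text like this gets clumsy quickly, which is exactly why the dedicated interface packages exist: they handle converting data structures between the two languages for you.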

More recently I have been excited about platforms where code can be written in different languages and integrated using literate programming (i.e., the weaving of the results of code with text to create reports).

  • Babel is a part of org-mode in Emacs which allows different programming languages to be used in the same document to perform an analysis and report. There are several examples of how this is done.
  • The latest IPython distribution now allows you to integrate other languages using user-contributed magic functions. The initial languages available are R, Octave and, very recently, Julia. The first two are already integrated into IPython. Using these magic functions, you can use the power of R, Octave and Julia along with all the tools available in Python like NumPy, SciPy, matplotlib, pandas and the like on one platform. Literate programming is easily achieved through the excellent HTML notebook that is now part of IPython distributions. Update: A sql magic function was just added to the ecosystem.

The interfacing tools I've described now allow us to create a greater ecosystem where different tools can be integrated toward a common goal rather easily. Instead of fighting over which tool is better, we're now going to a place where that doesn't matter; what matters is being able to use the right tools for each piece of the job and getting the tools working together to do the best job possible. We can, after all, all get along.

PS: For translating code between Matlab/Octave, Python and R, there is a great little site called Mathesaurus.

 

(Note, DataCommunityDC is an Amazon Affiliate. Thus, if you click the image in the post and buy the book, we will make approximately $0.43 and retire to a small island).