Wednesday, May 10, 2023

Mojo: Is it the Last Programming Paradigm We Need?

I think it’s safe to say that anyone who has been writing code for any length of time has had to learn multiple languages. This is partly the nature of who we are: anyone doing some kind of engineering or science is always learning and on the lookout for a new language that handles certain situations better than their current “main” language. Maybe it runs faster, uses memory more efficiently or more safely, or works with a new OS or hardware. Maybe it’s specialized for the front end or the back end, works close to the metal or is abstracted from the hardware, or just follows new and better software design principles.

Regardless of the reason, constantly keeping up with changes in programming languages is a requirement for working in software. So the question is whether we really need yet another language, that language being Mojo from a company named Modular.

Before we get into why I think the answer could be yes, a little history. My very first language was Fortran, because at the time it was the best language, meaning the fastest, for doing any kind of mathematical or scientific programming. At that time, there were very few packaged libraries. If you wanted special clustering options for a K-means cluster, you wrote that from scratch. If you wanted to convert a string to a number, you looped through each character in the array, subtracted 48 (the ASCII code for “0”) from it, and built the number up. And there was always at least one person in an organization who knew Assembler for those heavily travelled pieces of functionality. Good times.

But I quickly moved on from Fortran to C/C++ because “modern language”, “Windows programming!”, and Object Oriented Programming. OOP was going to be the ultimate way of structuring large, complicated ideas into code, and the OOP paradigm ruled across many different languages for many years (well, until functional programming challenged some of its core ideas).

From C/C++, I went to C# and .NET for many years (which I loved), and of course Java, a myriad of JavaScript-based front-end languages, and many others. All great languages, but none of them compared to Python, which has been my language of choice for many years. With its easy-to-read syntax, productivity, and all of its supporting libraries, Python is really like having a superpower.

This is all to set up why I hope a new language called Mojo can take Python to an even higher level and maybe take me and others off the rollercoaster of changing programming paradigms, at least for a while.

So what is Mojo?

Mojo is being put out by a company called Modular, led by Chris Lattner. If that name sounds familiar, it’s because he’s the person who created Swift, LLVM, and Clang. LLVM has many parts, but at its core is a low-level, assembly-like intermediate representation (IR) that many, many languages use or can use, including C#, Scala, Ruby, Julia, Kotlin, Rust, and the list goes on and on.

LLVM is an important part of the Mojo story because, even though Mojo is not built directly on LLVM, it is built on a newer technology that grew out of LLVM called MLIR, another project led by Chris Lattner. One of the many things MLIR does is abstract away the targeting of a variety of hardware features and devices, such as threads, vectors, GPUs, and TPUs, which are really important for modern AI programming.

All of this is to say that Mojo has years of technical expertise in compiler design behind it.

But what does Mojo mean to the regular data engineer or scientist who doesn’t care about all of the details of how it compiles and just wants to get stuff done? 

The short answer is that Mojo is (or hopes to be) a superset of Python that is much faster than Python and also makes it much easier to target typical machine learning hardware.

A “superset” of Python? Yes, the goal is that all existing Python code and libraries will work without changing anything. The idea of a superset language may bring to mind C++ and TypeScript as supersets of C and JavaScript respectively. I’m not sure the comparison will turn out to be completely accurate, though. For one, TypeScript has enough idiosyncrasies that some would argue about whether it’s correct, or even important, to call it a superset. And as for C++, I think the transition for Python programmers writing Mojo code might be easier than it was for pure C programmers adding C++ to their C code, but this all remains to be seen.


Advantages of Mojo over Python:

Beyond the aforementioned advantage of Mojo being a superset of Python, the main advantage is speed, pure and simple. One of the advantages of Python is that it’s interpreted, which increases productivity and ease of use, but it comes at the price of being really slow compared to languages like C/C++ and Rust.
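To make the speed point a bit more concrete, here is a minimal sketch of the kind of loop-heavy numeric code where interpreter overhead tends to dominate. This is my own illustration in plain Python and not code from any Mojo demo; the function name `matmul_naive` is made up for the example.

```python
def matmul_naive(a, b):
    """Multiply two matrices stored as lists of lists, using plain Python loops."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            # Every pass through this inner loop is executed by the interpreter
            # on dynamically typed objects, which is where the time goes.
            for k in range(inner):
                total += a[i][k] * b[k][j]
            out[i][j] = total
    return out


if __name__ == "__main__":
    print(matmul_naive([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
    # prints [[19.0, 22.0], [43.0, 50.0]]
```

Triple-nested loops like this are exactly the sort of code that compiled languages handle well and that Python programmers usually hand off to a library such as NumPy.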

Enter Mojo. As you can see in this demo, under the right circumstances Mojo can be up to 35,000x faster than Python. In the video, you can see Jeremy Howard, who is an advisor on Mojo, step through different optimizations that speed up a Python use case. But even without any of those optimizations, you can see starting at 1:27 that taking Python code and changing nothing except running it with the Mojo compiler gave him over an 8x speedup.

There are many speedup opportunities, too many to list, but it’s important to know that Mojo can do explicit parallelism in a way that Python simply can’t, and it can elegantly take advantage of different hardware types like GPUs because of MLIR.

And because of MLIR, it’s not just targeting GPUs; it could potentially take advantage of any emerging hardware, which is why it could have real staying power.

Finally, another important advantage is that Mojo offers better memory management, similar to Rust, and allows for strict type checking, which can eliminate many errors.
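As a small illustration of why stricter type checking matters, consider how plain Python treats type hints today. This is a sketch in Python, not Mojo, and the function name `add_ints` is made up for the example.

```python
def add_ints(a: int, b: int) -> int:
    # The annotations document intent, but CPython does not enforce them at runtime.
    return a + b


if __name__ == "__main__":
    print(add_ints(1, 2))      # prints 3
    print(add_ints("1", "2"))  # wrong types, yet no error: prints 12 (string concatenation)
```

The second call runs without complaint and quietly returns the string "12". A separate static checker such as mypy would flag it, and a language with strict type checking built in could reject it before the program ever runs.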


Reservations about Mojo:

Okay, this all sounds great, but what are the potential downsides? Well, there are a few, none of which I believe are big enough to dissuade anyone from trying out Mojo.

  • It’s not open source – yet

This, I think, is the biggest concern. Their intention is to open source it “soon”, once it reaches a certain level. Their rationale is that they want to iron out the fundamentals and that they can move faster with their dedicated team before open sourcing it, which I can understand. I’m not an open source absolutist, but the concern is that the best way for a new language to break through the noise and reach wide adoption is for it to be open source. It might have been better to wait until they were ready to open source it before making the announcement.

  • It’s not generally available yet 

You can sign up and be put on a waitlist for access to try it out on their notebook server. For a new language, it’s important that anyone can download it and test it locally on their own machine, but even once you get access you can’t run it locally yet. So again, it might have been better to hold the release announcement until this was possible.

  • Productionizing

I think with any new language or even new versions of Python, there are always questions of how this will work in an existing pipeline and infrastructure. Not a huge problem, but something to note.


So what should you do now? 

If you are intrigued by Mojo, you should read through their documentation and definitely sign up to get access here: https://www.modular.com/mojo. That doesn’t mean the great speedups that have been happening in Python 3.1x should be ignored or discounted. Nor does it mean that, if you have tried to solve speed problems with Rust code talking to Python or by using Julia, those efforts should be discarded.

But Mojo is something to have on the radar. If they are able to accomplish what they are setting out to do, it could turn out to define not just AI programming but a new programming paradigm for a whole host of applications.

 
