Pros and cons of iOS machine learning APIs

by Matthijs Hollemans
23 July 2017


If Apple’s announcements at WWDC have got you excited about adding machine learning to your iOS apps, you’re probably thinking, “I should use Core ML for this.”

However, Core ML is not the only choice. In this blog post I will list the possible ways you can add machine learning to your iOS apps.

These are the APIs you can choose from:

- a machine learning cloud service
- Core ML (iOS 11)
- the MPS graph API (iOS 11)
- low-level Metal Performance Shaders (iOS 10 and 11)
- third-party libraries such as TensorFlow and Caffe
- rolling your own implementation

Note that none of these APIs offer training on the device — they are for making predictions only. For the time being, training is still something you’ll need to do offline or in the cloud. (Interestingly, Clarifai recently announced a mobile SDK that allows for training on the device.)

Before I go into the pros and cons of these APIs, I want to make an important point about Core ML first…

Warning: Core ML is not magic

Even though it’s now easier than ever to include machine learning models in your apps, you still need to know a thing or two about machine learning.

It's not magic!

With Core ML it’s literally possible to drag-and-drop a pre-trained model such as Inception-v3 into your app and then use it to make predictions — without having any further knowledge about machine learning. However, that limits you to only ever using other people’s models.

Most app developers will want to use their own models, or use existing models that work on their own data. If that’s you, this means you’ll have to learn how to build and train your own models.

And for that, you need to understand how machine learning works. If you’re new to the field, get ready to do some studying.

Note: Despite the hype, mobile devices are still quite limited in what they can do when it comes to machine learning. Don’t expect to see the latest deep learning research running in real-time on an iPhone or iPad. We’re at the point where deep learning is only just starting to become feasible on mobile devices. No doubt machine learning on mobile will be ubiquitous in two or three years’ time, but right now you’ll need to work within — or around — some very real limitations.

With that out of the way, let’s look at the possible machine learning APIs that are currently available for iOS.

The cloud

By far the easiest way to add machine learning to your apps is to use a cloud service such as Clarifai, Google Cloud Vision, IBM Watson, Microsoft Azure Cognitive Services, Amazon Rekognition, and many others.

With these kinds of cloud services, machine learning is simply a matter of calling a web API — any iOS developer knows how to do that. You can also host your own models in the cloud.
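
To give an idea of what that looks like, here’s a minimal sketch of calling such a service with URLSession. The endpoint, the authentication header, and the JSON format are made up for illustration — every provider has its own API, so check the documentation for the real request format.

```swift
import Foundation

// Hypothetical endpoint and request body -- every cloud service has its own
// URL, authentication scheme, and JSON format, so treat this as a sketch.
func classifyImage(_ imageData: Data, completion: @escaping (String?) -> Void) {
    var request = URLRequest(url: URL(string: "https://api.example.com/v1/classify")!)
    request.httpMethod = "POST"
    request.setValue("Bearer YOUR_API_KEY", forHTTPHeaderField: "Authorization")
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try? JSONSerialization.data(withJSONObject: [
        "image": imageData.base64EncodedString()
    ])

    URLSession.shared.dataTask(with: request) { data, _, _ in
        guard let data = data,
              let json = (try? JSONSerialization.jsonObject(with: data)) as? [String: Any],
              let label = json["label"] as? String else {
            completion(nil)
            return
        }
        completion(label)   // e.g. "golden retriever"
    }.resume()
}
```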

Pros:

Cons:

For an in-depth look at the tradeoffs between cloud and mobile machine learning, see my article Machine learning on mobile: on the device or in the cloud?

Core ML (iOS 11)

If you don’t want to use the cloud but run machine learning algorithms directly on the user’s device, then Core ML is the easiest solution. Just take a model and drop it into your Xcode project. Done.

Well, that’s not entirely true… For many machine learning models you’ll still need to process the input and output data that goes into / comes out of the model.

Models are very particular about the inputs they accept, so you’ll need to massage your data into a format that the model understands. Likewise, you have to convert the model’s output into something that your app can use.

For image data this is easy (use the Vision framework); for other data you’re on your own. But at least you won’t need to worry about doing any of the machine learning computations — that’s Core ML’s job.
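
Here’s a minimal sketch of what that looks like with Vision driving a Core ML image classifier. It assumes you’ve dropped Apple’s Inception-v3 model into the project, so that Xcode generates the Inceptionv3 class; with your own model the class name will differ.

```swift
import Vision
import CoreML
import UIKit

// A minimal sketch of using the Vision framework to drive a Core ML image model.
// "Inceptionv3" is the class Xcode generates when you add Apple's Inception-v3
// model to the project; swap in your own model class.
func classify(_ image: UIImage) {
    guard let cgImage = image.cgImage,
          let visionModel = try? VNCoreMLModel(for: Inceptionv3().model) else { return }

    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        // Vision hands back the model's output as classification observations,
        // sorted by confidence.
        if let best = (request.results as? [VNClassificationObservation])?.first {
            print("\(best.identifier): \(best.confidence)")
        }
    }
    // Vision takes care of scaling and converting the image into the
    // pixel format the model expects.
    request.imageCropAndScaleOption = .centerCrop

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try? handler.perform([request])
}
```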

Note: If you’re using Core ML or any of the other on-device solutions, you will need to obtain a model that is suitable for your app. The models that Apple makes available for download are only capable of a limited number of tasks. For most apps you’ll have to train your own models, which requires special expertise. And machine learning experts are in high demand these days, so they don’t come cheap. 💰

Pros of Core ML:

Cons:

Update Dec 2017: As of iOS 11.2, Core ML supports custom layers in neural networks. This fixes the biggest complaint I had about Core ML and makes it a lot more powerful!

See also Alex Sosnovshchenko’s critique of Core ML: Why Core ML will not work for your app (most likely).

* * *

Regardless of whether you’re training your own model or you’re using a freely available one, you’re still going to have to convert the model to Core ML. And to do this you need to understand a little about how the model was trained, and the software that was used for training.

To convert models you don’t need to be an expert but you do need to have a grasp on the fundamentals of machine learning. The most common thing I see go wrong is that someone converts a model but skips over the step that converts the input (such as an image) into the format the model expects, and then the model will not make the correct predictions. As they say, garbage in equals garbage out.

It’s possible that we’re going to see online marketplaces for pre-trained Core ML models that are ready to use — but if you have a unique set of data that you want to train on, or a unique problem you want to solve, you’re going to have to get your hands dirty.

If you’re just getting your feet wet with machine learning on mobile, then Core ML is the way to go. But chances are you’ll run into some of its limitations before long. If it works for you, great. If not, then you’ll need to use one of the APIs I’ll describe next.

MPS graph API (iOS 11)

MPS stands for Metal Performance Shaders and is a collection of classes that let you unleash the power of Metal without having to write any GPU code — or really without having to know much about Metal at all.

As of iOS 11, you can create neural networks using the new graph API. You simply add the layers in your neural net to this graph and then MPS does the rest.

Note: I’m making a distinction here between the higher-level MPS graph API — MPSNNGraph and related classes — and the lower-level MPS kernels such as MPSCNNConvolution (covered in the next section).
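
To give a flavor of the graph API, here’s a minimal sketch that chains a couple of weight-free layers. A real network would also contain convolution nodes, which need an MPSCNNConvolutionDataSource object to supply the trained weights — that part is left out to keep the example short.

```swift
import MetalPerformanceShaders

// A minimal sketch of the MPS graph API (iOS 11), using only weight-free layers.
func buildGraph(device: MTLDevice) -> MPSNNGraph? {
    let input = MPSNNImageNode(handle: nil)

    // Chain a few layers: ReLU followed by 2x2 max pooling.
    let relu = MPSCNNNeuronReLUNode(source: input)
    let pool = MPSCNNPoolingMaxNode(source: relu.resultImage, filterSize: 2)

    // The graph is built backwards from the image you want as the result.
    return MPSNNGraph(device: device, resultImage: pool.resultImage)
}

// Usage:
// let graph = buildGraph(device: MTLCreateSystemDefaultDevice()!)
// graph?.executeAsync(withSourceImages: [inputImage]) { outputImage, error in
//     // outputImage holds the network's result as an MPSImage
// }
```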

Pros of the graph API:

Cons:

I like the idea of the MPS graph API but in practice I’m just not feeling it. The world of deep learning moves fast. By using the graph API you’re limiting yourself to the operations that Apple has decided to implement. If next week a new paper comes out with a new layer type or a new activation function, then you cannot implement this with the MPS graph API. And even if Apple does make an effort to support new layer types in the future, you’ll always have to wait until the next OS upgrade to get them.

Note: I think we as developers would benefit from frameworks such as Core ML and the MPS graph API being decoupled from OS updates — and preferably even being open source. Every other machine learning framework is open source and they all have thriving communities of contributors. I can understand that Apple wants to keep some portions of their frameworks secret, and running an open source project takes up time and resources, but by keeping their frameworks behind closed doors Apple will always be a step behind everyone else. </rant>

Low-level MPS (iOS 10 and 11)

For full control over your GPU compute pipeline, Metal Performance Shaders (MPS) are the answer. Didn’t we just talk about MPS? Yes and no.

MPS is a framework that contains a lot of powerful functionality that runs on the GPU. This includes:

When you use the graph API, you take the MPS building blocks — known as kernels — and stick them into graph nodes. Then as you execute the graph, MPS will make all these nodes run on the GPU automatically and in the correct order.

But what I’m talking about here is using the MPS kernels by hand.

Instead of building a graph — where you have to play by the rules of the graph — you now instantiate the layer objects you need, such as MPSCNNConvolution, and you encode them yourself into a Metal command buffer. It’s less convenient than building the graph, but in return you get full control.

Often you’ll combine MPS with your own Metal compute kernels. You use MPS to handle things like convolutional layers, but you’ll use your own Metal code for layer types that MPS does not support.
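
As a small illustration of the hand-rolled approach, here’s a sketch that encodes a single MPS kernel into a command buffer. In a real network you’d encode a whole chain of convolution, pooling, and custom kernels into the same command buffer, with your own compute shaders mixed in wherever MPS falls short.

```swift
import MetalPerformanceShaders

// A minimal sketch of encoding an MPS kernel by hand (iOS 10+): a max pooling
// kernel runs over an input MPSImage and writes the result to an output MPSImage.
func runPooling(device: MTLDevice, commandQueue: MTLCommandQueue,
                input: MPSImage, output: MPSImage) {
    let pool = MPSCNNPoolingMax(device: device,
                                kernelWidth: 2, kernelHeight: 2,
                                strideInPixelsX: 2, strideInPixelsY: 2)

    guard let commandBuffer = commandQueue.makeCommandBuffer() else { return }

    // Encode the kernel; you decide the order and can interleave your own
    // Metal compute shaders on the same command buffer.
    pool.encode(commandBuffer: commandBuffer,
                sourceImage: input,
                destinationImage: output)

    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()
}
```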

If Core ML doesn’t support what you want to do — and the MPS graph API isn’t helping out either — then these low-level compute kernels are what you turn to.

Pros:

Cons:

Even though it’s more work than Core ML or the graph API, using the MPS kernels in combination with your own compute shaders is where you’ll need to go if the other options don’t support the functionality you need.

Note: To make it easier to implement neural networks with Metal, I wrote an open source library called Forge. It lets you use the MPS kernels without most of the boilerplate. It also comes with a bunch of cool demos.

Third-party APIs

There are several non-Apple APIs for machine learning. I’ll highlight the two most popular: TensorFlow and Caffe.

TensorFlow

You’ve probably heard about TensorFlow. It’s the most popular tool for building machine learning models and there’s a version that runs natively on iOS.

To get a better idea of what’s involved in using TensorFlow on iOS, check out the blog post I wrote about it.

Pros:

Cons:

Even though I don’t like how slow TensorFlow is, it’s still a great option while you’re iterating on your model. It only takes a few lines of code to load the model into your app, which gives you a quick way to evaluate how well your model does in practice — without having to spend days implementing it from scratch in Metal first. Once you have a model that works, you can decide to speed it up by converting it to Metal.

Sometime this fall Google is planning to release TensorFlow Lite, which supposedly is optimized for mobile devices. I’m curious to see how fast this will run on iOS, or whether Google’s priority is Android first.

Note: An interesting alternative to using the TensorFlow library is Bender. This open source project lets you load TensorFlow models into your app but uses Metal Performance Shaders to actually run the models. This makes it a lot faster than using the TensorFlow library. Of course, this approach only supports a subset of the full TensorFlow functionality, but it’s worth checking out if you’ve got a TensorFlow model!

Caffe

Caffe is one of the original deep learning tools and it’s still being used by many researchers. Many pretrained models are available in Caffe format.

There are some unofficial Caffe ports for iOS on GitHub. Caffe2 is a modern rewrite of the original and has native support for iOS and Metal.

The pros and cons are similar to those of TensorFlow: loading your models is fairly easy, but you have to use C++, and it tends to be slower than re-implementing the model yourself from scratch using Metal Performance Shaders.

Roll your own

As a last-ditch option, you can skip the Apple APIs altogether and implement your own machine learning algorithms from scratch.

Why do this? At the moment, Core ML does not support Naive Bayes classification, for example. Neither does MPS.

Does that mean you can’t use a Naive Bayes model on iOS? Of course you can… but you’ll have to program it yourself. (Tip: search GitHub first.)

For models that are not deep neural networks, I recommend using the Accelerate framework. Accelerate has loads of functionality for writing really fast applications. It’s one of those obscure libraries that tend to scare people away — some of the naming conventions go back to 1960s FORTRAN libraries — but it’s a great tool to have in your arsenal.
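
For example, the dot product at the heart of a linear classifier is a one-liner with vDSP:

```swift
import Accelerate

// A minimal sketch of using Accelerate for the number crunching in a simple
// model: vDSP_dotpr computes the dot product of the feature vector with the
// weights, which is the core of a linear classifier.
func linearScore(features: [Float], weights: [Float], bias: Float) -> Float {
    precondition(features.count == weights.count)
    var dot: Float = 0
    vDSP_dotpr(features, 1, weights, 1, &dot, vDSP_Length(features.count))
    return dot + bias
}
```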

And if all else fails, you can write many machine learning algorithms directly in Swift or Objective-C. For basic models, straight Swift code will be fast enough. And more often than not, a basic model is all you need.
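
As an illustration, here’s a sketch of the prediction step for a Gaussian Naive Bayes classifier in plain Swift. The class priors, feature means, and variances are assumed to come from training done offline — only the prediction runs on the device.

```swift
import Foundation

// A minimal sketch of Gaussian Naive Bayes prediction in plain Swift.
struct GaussianNaiveBayes {
    let priors: [Float]        // one prior probability per class
    let means: [[Float]]       // per class, one mean per feature
    let variances: [[Float]]   // per class, one variance per feature

    func predict(_ x: [Float]) -> Int {
        var bestClass = 0
        var bestLogProb = -Float.greatestFiniteMagnitude
        for c in 0..<priors.count {
            // Work in log space to avoid underflow.
            var logProb = logf(priors[c])
            for i in 0..<x.count {
                let diff = x[i] - means[c][i]
                logProb -= 0.5 * (logf(2 * Float.pi * variances[c][i])
                                  + diff * diff / variances[c][i])
            }
            if logProb > bestLogProb {
                bestLogProb = logProb
                bestClass = c
            }
        }
        return bestClass
    }
}
```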

Pros of the DIY approach:

Cons:

If Core ML does not support your particular model type and it’s not a deep neural network, then implementing the algorithms yourself is a valid strategy. There are plenty of books and online courses where you can learn how to do this.

Conclusion

As you can see, there are a lot of ways to do machine learning on iOS. Core ML is definitely the first thing you should try. And if your model is not supported by Core ML, it is usually possible to get it working using TensorFlow or Metal. 🤘

Written by Matthijs Hollemans.
First published on Sunday, 23 July 2017.
If you liked this post, say hi on Twitter @mhollemans or LinkedIn.
Find the source code on my GitHub.
