CreateML and CoreML go Great Together

  • Posted on: 31 January 2019
  • By: C.J.
Image showing Swift code used to create an MLDataTable in Apple's Xcode

Apple makes it easy to build classifiers for text and images, as well as regressors. Cogsworth never sounded so smart!

I learned a trick about building text classifiers using Apple's CreateML. First, you have to upgrade to an OS that has full Siri capabilities, so that's macOS Mojave and iOS 12 (except on the iPhone 5s, which does support iOS 12 but doesn't have full Siri support).

Actually, those are just the requirements. The trick is to arrange and encapsulate your data for the classifier. Specifically, the text classifier doesn't know about negatives on its own, so your training data has to teach the model the context. I started with just 100 vocabulary items, but my training data grew to more than 900 phrases across seven different dimensions. Once you have enough data, you can process your source and create an MLDataTable, but you won't get it right on your first try.
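Here is a minimal sketch of what arranging the data might look like. It is not the code from my project: the `LabeledPhrase` type, the `makeColumns` helper, the sample phrases, and the "positive"/"negative" labels are all made up for illustration, and it reads "negatives" as negated phrases, which is only one way to take it. The idea is that every negated phrase gets its own labeled row, since the model only learns the context it is shown. The CreateML part only compiles where that framework is available.

```swift
import Foundation
#if canImport(CreateML)
import CreateML
#endif

/// One training row: a phrase and the label it should map to.
/// (Hypothetical type, for illustration only.)
struct LabeledPhrase {
    let text: String
    let label: String
}

// Hypothetical sample data. Each negated phrase appears as its own row
// with its own label, so the classifier sees the negation in context.
let phrases = [
    LabeledPhrase(text: "I am happy", label: "positive"),
    LabeledPhrase(text: "I am not happy", label: "negative"),
    LabeledPhrase(text: "I am tired", label: "negative"),
    LabeledPhrase(text: "I am not tired", label: "positive"),
]

/// Splits the rows into two parallel columns, one for text and one for labels.
func makeColumns(from rows: [LabeledPhrase]) -> (text: [String], label: [String]) {
    return (rows.map { $0.text }, rows.map { $0.label })
}

let columns = makeColumns(from: phrases)

#if canImport(CreateML)
do {
    // MLDataTable(dictionary:) takes one array per column name.
    // The column names "text" and "label" are my own choice.
    let data: [String: MLDataValueConvertible] = [
        "text": columns.text,
        "label": columns.label,
    ]
    let table = try MLDataTable(dictionary: data)
    print(table)
} catch {
    print("Could not build the data table: \(error)")
}
#endif
```

Keeping the rows in a plain Swift array first makes it easy to inspect and rearrange the data before it ever reaches the MLDataTable.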

As part of the build process, you can segment your data for training and validation. But with fewer than 1,000 data points, I decided to do a full regression, training on all of it. When I did so, I found contradictions that I had missed earlier. I also found out that arranging my data in a reverse hierarchy made for better backpropagation of errors. My training accuracy (generated by a maximum entropy test that added noise to the original segmented data) grew from ~50% to ~85%, with the full regression achieving 96% accuracy.
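The sketch below shows one way to look for contradictions before training, followed by a split-and-train pass. It is an illustration under assumptions, not how I actually found mine: the `findContradictions` helper, the sample rows, the column names, and the 80/20 split are all hypothetical. Here a "contradiction" means the same phrase (ignoring case and surrounding whitespace) showing up with more than one label. A data set this tiny is far too small to train on, so expect the CreateML part to print an error or meaningless numbers unless you swap in real data.

```swift
import Foundation
#if canImport(CreateML)
import CreateML
#endif

// Hypothetical sample rows; the last two contradict each other.
let rows: [(text: String, label: String)] = [
    (text: "I am happy", label: "positive"),
    (text: "I am not happy", label: "negative"),
    (text: "I am tired", label: "negative"),
    (text: "I am not tired", label: "positive"),
    (text: "i am not tired ", label: "negative"),
]

/// Groups labels by normalized phrase and keeps only the phrases
/// that were given more than one distinct label.
func findContradictions(in rows: [(text: String, label: String)]) -> [String: Set<String>] {
    var labelsByText: [String: Set<String>] = [:]
    for row in rows {
        let key = row.text.lowercased()
            .trimmingCharacters(in: .whitespacesAndNewlines)
        labelsByText[key, default: []].insert(row.label)
    }
    return labelsByText.filter { $0.value.count > 1 }
}

let contradictions = findContradictions(in: rows)
for (text, labels) in contradictions {
    print("Contradiction: \"\(text)\" is labeled \(labels.sorted())")
}

#if canImport(CreateML)
do {
    let table = try MLDataTable(dictionary: [
        "text": rows.map { $0.text },
        "label": rows.map { $0.label },
    ])

    // Hold back 20% of the rows for evaluation. To train on everything
    // instead, pass `table` itself as the training data.
    let (trainingData, testingData) = table.randomSplit(by: 0.8, seed: 5)

    let classifier = try MLTextClassifier(trainingData: trainingData,
                                          textColumn: "text",
                                          labelColumn: "label")

    // classificationError is a fraction, so accuracy is its complement.
    let trainingAccuracy = (1.0 - classifier.trainingMetrics.classificationError) * 100
    let evaluation = classifier.evaluation(on: testingData,
                                           textColumn: "text",
                                           labelColumn: "label")
    let evaluationAccuracy = (1.0 - evaluation.classificationError) * 100

    print("Training accuracy: \(trainingAccuracy)%")
    print("Evaluation accuracy: \(evaluationAccuracy)%")
} catch {
    print("Training failed: \(error)")
}
#endif
```

Running the contradiction check first is cheap, and it catches the kind of conflicting rows that otherwise only show up as a puzzling accuracy number after training.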

I think that's a good start! And Cogsworth sounds like it has empathy when I tell it how much I've worked!