
I'm enhancing a commercial App which until now has used cloud AI models to analyse data and make predictions.

The enhancement is moving the models onto the app for applications with no or limited network access.

These models represent significant IP to our clients and it is essential that we protect any data downloaded to a device from theft.

The App is iOS only for now, and I was intrigued by WWDC 2020's CoreML update, which includes support for encrypting models. This would be ideal, but we can't use CoreML at the moment because its API doesn't support the methods our models require. It's nice to know, though, that this is a recognised issue with in-app ML model usage.

What is the best method and what options are available in iOS (>11.0) right now that won't fall foul of encryption export laws or even Apple's App Store rules, etc.?

Our models are JavaScript, which we run in a JavaScriptCore VM, with additional data files loaded from JSON string files.

My current thinking is to use something like the iOS AES encryption: rather than hard-wiring the key in the app, pass it over HTTPS after a user logs in and store it in the keychain. Decrypt the data strings in memory before loading into the JS VM.
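Roughly, the flow I have in mind would look something like the sketch below (this is only an illustration: CryptoKit/AES-GCM is used purely for brevity and needs iOS 13+, so iOS 11-12 would have to fall back to CommonCrypto, and the keychain identifier and payload layout are placeholders):

```swift
import Foundation
import Security
import CryptoKit        // iOS 13+, used here only for brevity; iOS 11-12 would need CommonCrypto instead
import JavaScriptCore

// Sketch only: the keychain account name, error cases and AES-GCM payload layout are placeholders.
enum ModelVaultError: Error { case keychain(OSStatus), badPayload }

struct ModelVault {
    private static let account = "model-decryption-key"   // illustrative keychain identifier

    // 1. After login, store the symmetric key received over HTTPS in the keychain.
    static func storeKey(_ keyData: Data) throws {
        let base: [String: Any] = [
            kSecClass as String: kSecClassGenericPassword,
            kSecAttrAccount as String: account
        ]
        SecItemDelete(base as CFDictionary)                // replace any previously stored key
        var add = base
        add[kSecValueData as String] = keyData
        add[kSecAttrAccessible as String] = kSecAttrAccessibleWhenUnlockedThisDeviceOnly
        let status = SecItemAdd(add as CFDictionary, nil)
        guard status == errSecSuccess else { throw ModelVaultError.keychain(status) }
    }

    // 2. Read the key back when the models need to be loaded.
    static func loadKey() throws -> SymmetricKey {
        let query: [String: Any] = [
            kSecClass as String: kSecClassGenericPassword,
            kSecAttrAccount as String: account,
            kSecReturnData as String: true
        ]
        var item: AnyObject?
        let status = SecItemCopyMatching(query as CFDictionary, &item)
        guard status == errSecSuccess, let data = item as? Data else {
            throw ModelVaultError.keychain(status)
        }
        return SymmetricKey(data: data)
    }

    // 3. Decrypt the downloaded blob in memory and evaluate it in the JS VM;
    //    the plaintext is never written to disk.
    static func loadModel(encrypted: Data, into context: JSContext) throws {
        let key = try loadKey()
        let box = try AES.GCM.SealedBox(combined: encrypted)   // assumes nonce || ciphertext || tag
        let plaintext = try AES.GCM.open(box, using: key)
        guard let script = String(data: plaintext, encoding: .utf8) else {
            throw ModelVaultError.badPayload
        }
        _ = context.evaluateScript(script)
    }
}
```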

I can see the obvious weaknesses with this approach and would be keen to hear how others have approached this?

Seoras
  • If your data is valuable enough and your app can be reverse engineered, then this provides no protection, at least against attackers who can access a user account (perhaps by paying to become a customer). – President James K. Polk Jul 10 '20 at 02:11
  • Right, that's why I said "I can see the obvious weaknesses". I'm looking for a solution that isn't obvious to me. There may not be one, but the fact that Apple is including one in CoreML gives me hope that there is a good way to do this. – Seoras Jul 10 '20 at 02:11
  • You could use an ephemeral ECDH session to connect to a server and fetch a symmetric key, decrypt and then use it, zeroing it after the session is finished. Or some variant. – Woodstock Jul 10 '20 at 11:33
  • @Woodstock: That assumes his app is connecting and not my IP-stealing app that I made by reverse engineering his app. I'm a user of the system and I have valid authentication credentials. – President James K. Polk Jul 10 '20 at 15:25
  • Good point @PresidentJamesK.Polk - I agree some extra auth is needed. – Woodstock Jul 10 '20 at 16:48
  • The crux of this problem, in my mind, is getting the private key onto the device and into the iOS keychain in a way that can't be spoofed by a user with login credentials. I'm assuming that the iOS keychain is protected from users or jailbreakers? For B2B the client is unlikely to want to steal their own IP. For B2C the problem of potential theft is real. Maybe I should be asking Apple if their method can be extended outside of CoreML. I do understand that there is no good way to hardwire API or encryption keys into the App binary. I think what I'm looking for is platform/OS support to help here. – Seoras Jul 11 '20 at 05:14

1 Answer


The Data

The enhancement is moving the models onto the app for applications with no or limited network access.

These models represent significant IP to our clients and it is essential that we protect any data downloaded to a device from theft.

From the moment you make the data/secrets public, in the sense that you include them with your mobile app binary or later download them to the device and store them encrypted, you need to consider them compromised. There is no bulletproof way around this; no matter what you try, you can only make the theft harder. With all the instrumentation frameworks available to introspect and instrument code at runtime, your encrypted data can be extracted from the very function that decrypts it:

Decrypt the data strings in memory before loading into the JS VM.

An example of a very popular instrumentation framework is Frida:

Inject your own scripts into black box processes. Hook any function, spy on crypto APIs or trace private application code, no source code needed. Edit, hit save, and instantly see the results. All without compilation steps or program restarts.

The Private Key

My current thinking is to use something like the iOS AES encryption: rather than hard-wiring the key in the app, pass it over HTTPS after a user logs in and store it in the keychain.

While not hard-coding the key in the app is a wise decision, it doesn't prevent an attacker from performing a man-in-the-middle (MitM) attack to steal it, or from using an instrumentation framework to hook into the code that stores it in the keychain. You may or may not already be aware of this; it's not clear from:

I can see the obvious weaknesses with this approach...
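As an illustration of one common mitigation for the MitM part of this, the app can pin the server certificate on the HTTPS request that fetches the key. A minimal Swift sketch follows; the pinned hash is a placeholder, and note that on a jailbroken device a skilled attacker can still bypass pinning at runtime:

```swift
import Foundation
import Security
import CryptoKit   // iOS 13+ for SHA256; on iOS 11-12 CommonCrypto could compute the digest instead

// Minimal certificate-pinning sketch for the HTTPS request that fetches the key.
// The pinned hash is a placeholder; it must be rotated whenever the server certificate changes.
final class PinnedSessionDelegate: NSObject, URLSessionDelegate {
    private let pinnedCertHash = "REPLACE_WITH_BASE64_SHA256_OF_SERVER_CERT"

    func urlSession(_ session: URLSession,
                    didReceive challenge: URLAuthenticationChallenge,
                    completionHandler: @escaping (URLSession.AuthChallengeDisposition, URLCredential?) -> Void) {
        guard let trust = challenge.protectionSpace.serverTrust,
              let certificate = SecTrustGetCertificateAtIndex(trust, 0) else {
            completionHandler(.cancelAuthenticationChallenge, nil)
            return
        }
        let certData = SecCertificateCopyData(certificate) as Data
        let hash = Data(SHA256.hash(data: certData)).base64EncodedString()
        if hash == pinnedCertHash {
            completionHandler(.useCredential, URLCredential(trust: trust))
        } else {
            completionHandler(.cancelAuthenticationChallenge, nil)   // presented certificate does not match the pin
        }
    }
}

// Usage: only fetch the model key through the pinned session.
let pinnedSession = URLSession(configuration: .ephemeral,
                               delegate: PinnedSessionDelegate(),
                               delegateQueue: nil)
```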

In my opinion, and as a side note, I think that first you and the business need to consider whether the benefit to the user of having predictions made locally on their device outweighs the huge risk being taken by moving the data from the cloud onto the device. Data protection laws also need to be taken into consideration, because the fines when a data breach occurs can have a huge impact on the organization's future.

iOS Solutions

What is the best method and what options are available in iOS (>11.0) right now that won't fall foul of encryption export laws or even Apple's App Store rules, etc.?

I am not an expert in iOS, thus I cannot help you much here, other than recommending that you use as many obfuscation techniques and runtime application self-protections (RASP) as you can on top of the solution you have already devised to protect your data, so that you make an attacker's life harder.

RASP:

Runtime application self-protection (RASP) is a security technology that uses runtime instrumentation to detect and block computer attacks by taking advantage of information from inside the running software.

RASP technology is said to improve the security of software by monitoring its inputs, and blocking those that could allow attacks, while protecting the runtime environment from unwanted changes and tampering.
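For illustration, a very small RASP-style check in Swift might look like the sketch below. The specific file paths and the debugger check are common examples only, the names are mine, and real RASP products use many more (and obfuscated) signals; a determined attacker on a jailbroken device can defeat checks like these:

```swift
import Foundation
import Darwin

// Very small, illustrative RASP-style checks: common jailbreak artifacts and an attached debugger.
enum RuntimeChecks {

    static var looksJailbroken: Bool {
        let suspiciousPaths = [
            "/Applications/Cydia.app",
            "/Library/MobileSubstrate/MobileSubstrate.dylib",
            "/bin/bash",
            "/usr/sbin/sshd"
        ]
        return suspiciousPaths.contains { FileManager.default.fileExists(atPath: $0) }
    }

    static var debuggerAttached: Bool {
        var info = kinfo_proc()
        var size = MemoryLayout<kinfo_proc>.stride
        var mib: [Int32] = [CTL_KERN, KERN_PROC, KERN_PROC_PID, getpid()]
        guard sysctl(&mib, UInt32(mib.count), &info, &size, nil, 0) == 0 else { return false }
        return (info.kp_proc.p_flag & P_TRACED) != 0
    }
}

// Usage: refuse to fetch or decrypt the models when the environment looks hostile.
// if RuntimeChecks.looksJailbroken || RuntimeChecks.debuggerAttached { /* degrade or abort */ }
```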

You can also try to use advanced biometric solutions to ensure that a real user is present while the mobile app is being used, but bear in mind that the more skilled attackers will always find a way to extract the data to a command and control server. It's not a question of whether they will be able to, but of when it will happen, and when it happens it's a data breach; you need to have planned ahead to deal with its business and legal consequences.
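As a sketch of that biometrics idea, gating decryption of the models behind Face ID / Touch ID with the LocalAuthentication framework could look like this (the reason string is illustrative, and again this only deters casual extraction, it does not stop a determined attacker):

```swift
import Foundation
import LocalAuthentication

// Sketch: require Face ID / Touch ID before decrypting the on-device models.
func authenticateUser(completion: @escaping (Bool) -> Void) {
    let context = LAContext()
    var error: NSError?
    guard context.canEvaluatePolicy(.deviceOwnerAuthenticationWithBiometrics, error: &error) else {
        completion(false)   // biometrics unavailable or not enrolled on this device
        return
    }
    context.evaluatePolicy(.deviceOwnerAuthenticationWithBiometrics,
                           localizedReason: "Unlock the on-device models") { success, _ in
        DispatchQueue.main.async { completion(success) }
    }
}
```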

So after you apply the most suitable in-app defenses, you still have one issue left to resolve, which boils down to ensuring your API server knows what is making the request, because it seems you have already implemented user authentication to establish on behalf of whom the request is being made.

The Difference Between WHO and WHAT is Accessing the API Server

When downloading the data onto the device you need to consider how you will ensure that your API server is indeed accepting the download requests from what you expect, a genuine instance of your mobile app, and not from a script, a bot, etc. I need to alert you that user authentication only says on behalf of whom the request is being made, not what is making it.

I wrote a series of articles around API and mobile security, and in the article Why Does Your Mobile App Need An Api Key? you can read in detail the difference between who and what is accessing your API server, but I will extract here the main takeaways from it:

The what is the thing making the request to the API server. Is it really a genuine instance of your mobile app, or is it a bot, an automated script or an attacker manually poking around your API server with a tool like Postman?

The who is the user of the mobile app that we can authenticate, authorize and identify in several ways, like using OpenID Connect or OAUTH2 flows.

Think about the who as the user your API server will be able to Authenticate and Authorize access to the data, and think about the what as the software making that request on behalf of the user.

I see this misconception arise over and over, even among experienced developers, devops and devsecops, because our industry is more geared towards identifying the who not the what.
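To make the idea concrete, a naive way for the app to present something about the what is to send an API key or an HMAC signature computed with a secret shipped in the binary, as in the sketch below. The header names and secret are mine, purely illustrative, and any secret in the binary can be extracted by reverse engineering, so this only raises the bar; it does not solve the problem:

```swift
import Foundation
import CryptoKit   // iOS 13+

// Illustrative only: a shared secret compiled into the app lets the server check
// something about *what* is calling, but it can be extracted by reverse engineering.
struct RequestSigner {
    // Placeholder secret; a real app would at least obfuscate or derive it at runtime.
    private static let appSecret = SymmetricKey(data: Data("REPLACE_WITH_APP_SECRET".utf8))

    static func sign(_ request: inout URLRequest) {
        let body = request.httpBody ?? Data()
        let signature = HMAC<SHA256>.authenticationCode(for: body, using: appSecret)
        request.setValue(Data(signature).base64EncodedString(), forHTTPHeaderField: "X-App-Signature")
        // A nonce helps the server reject replayed requests, provided it tracks them.
        request.setValue(UUID().uuidString, forHTTPHeaderField: "X-Request-Nonce")
    }
}
```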

Others' Approaches

I can see the obvious weaknesses with this approach and would be keen to hear how others have approached this?

As I said previously, I am not an expert in iOS and I don't have more to offer you than what I have already mentioned in the iOS Solutions section, but if you want to learn how you can lock your mobile app to the API server, so that it only replies with a very high degree of confidence to requests from a genuine instance of your mobile app, then I recommend you read my accepted answer to the question How to secure an API REST for mobile app?, specifically the sections Securing the API server and A Possible Better Solution, where you will learn how the Mobile App Attestation concept may be a possible solution for this problem.

Do you want to go the Extra Mile?

In any response to a security question I always like to reference the amazing work from the OWASP foundation.

For Mobile Apps

OWASP Mobile Security Project - Top 10 risks

The OWASP Mobile Security Project is a centralized resource intended to give developers and security teams the resources they need to build and maintain secure mobile applications. Through the project, our goal is to classify mobile security risks and provide developmental controls to reduce their impact or likelihood of exploitation.

OWASP - Mobile Security Testing Guide:

The Mobile Security Testing Guide (MSTG) is a comprehensive manual for mobile app security development, testing and reverse engineering.

For APIs

OWASP API Security Top 10

The OWASP API Security Project seeks to provide value to software developers and security assessors by underscoring the potential risks in insecure APIs, and illustrating how these risks may be mitigated. In order to facilitate this goal, the OWASP API Security Project will create and maintain a Top 10 API Security Risks document, as well as a documentation portal for best practices when creating or assessing APIs.

Exadra37
  • Thanks for answering my question. I'd run a parallel question over on Apple's developer forums and someone from Apple answered. They drew a good parallel between what I was trying to do and DRM, pointing out that there is no secure DRM. Anything can be cracked. In your other answer you talk about "Attestation", which I think is what Apple are doing in iOS 14's new "DeviceCheck" https://developer.apple.com/documentation/devicecheck. We are in talks with our client(s) about the risks, and perhaps waiting for iOS 14 and making it a requirement will be the way forward. – Seoras Jul 28 '20 at 01:46
  • From a first glance, the new DeviceCheck capability of storing a unique key per device in the Secure Enclave and using it to sign requests with a challenge from the server looks like an attempt to limit spoofing/replaying requests to your API, but it doesn't seem to do real-time integrity checks in order to attest that it is the same exact, untampered app binary you uploaded to the Apple store. It looks like it is still vulnerable to being instrumented by instrumentation frameworks, like Frida, but I need to do a deep dive into all these newly added capabilities to ensure my claims are accurate. – Exadra37 Jul 28 '20 at 11:48
  • Apple posted an update yesterday on DeviceCheck with news about their new "Attest" API. https://developer.apple.com/documentation/devicecheck/preparing_to_use_the_app_attest_service – Seoras Aug 03 '20 at 20:29
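For reference, a minimal sketch of the App Attest flow mentioned in the comments above (iOS 14+ only). The challenge handling and error reporting are simplified, the function and error domain names are mine, and the server-side verification of the attestation object against Apple's App Attest root certificate is not shown:

```swift
import Foundation
import DeviceCheck   // iOS 14+
import CryptoKit

// Rough sketch of the App Attest flow: the challenge comes from your API server,
// and the resulting attestation object must be verified server-side.
func attestApp(serverChallenge: Data,
               completion: @escaping (Result<(keyId: String, attestation: Data), Error>) -> Void) {
    let service = DCAppAttestService.shared
    guard service.isSupported else {
        completion(.failure(NSError(domain: "AppAttest", code: -1)))   // e.g. Simulator or unsupported device
        return
    }
    // 1. Generate a hardware-backed key pair; only an opaque key ID is handed to the app.
    service.generateKey { keyId, error in
        guard let keyId = keyId else {
            completion(.failure(error ?? NSError(domain: "AppAttest", code: -2)))
            return
        }
        // 2. Ask Apple to attest the key, binding it to a hash of the server's challenge.
        let clientDataHash = Data(SHA256.hash(data: serverChallenge))
        service.attestKey(keyId, clientDataHash: clientDataHash) { attestation, error in
            guard let attestation = attestation else {
                completion(.failure(error ?? NSError(domain: "AppAttest", code: -3)))
                return
            }
            // 3. Send keyId + attestation to the API server for verification and enrolment.
            completion(.success((keyId, attestation)))
        }
    }
}
```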