Hello All,
I’m working on a computer-vision–heavy iOS application that uses the camera, LiDAR depth maps, and semantic segmentation to reason about the environment (object identification, localization and measurement - not just visualization).
Current architecture
I initially built the image pipeline around CIImage as a unifying abstraction. It seemed like a good idea because:
CIImage integrates cleanly with Vision, ARKit, AVFoundation, Metal, Core Graphics, etc.
It provides a rich set of out-of-the-box transforms and filters.
It is immutable and thread-safe, which significantly simplified concurrency in a multi-queue pipeline.
The LiDAR depth maps, semantic segmentation masks, etc. were treated as CIImages, with conversion to CVPixelBuffer or MTLTexture only at the edges when required.
Problem
I’ve run into cases where Core Image transformations do not preserve numeric fidelity for non-visual data.
Example:
Rendering a CIImage-backed segmentation mask into a larger CVPixelBuffer can cause label values to change in predictable but incorrect ways.
This occurs even when:
using nearest-neighbor sampling
disabling color management (workingColorSpace / outputColorSpace = NSNull)
applying identity or simple affine transforms
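For concreteness, the failing path is roughly the following (a simplified sketch with illustrative names and pixel format, not my exact code):

import CoreImage
import CoreVideo

// Render a CIImage-backed mask into a larger single-channel CVPixelBuffer,
// with color management disabled and nearest-neighbor sampling requested.
func renderMask(_ mask: CIImage, toWidth width: Int, height: Int) -> CVPixelBuffer? {
    var buffer: CVPixelBuffer?
    CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                        kCVPixelFormatType_OneComponent8, nil, &buffer)
    guard let destination = buffer else { return nil }

    let context = CIContext(options: [
        .workingColorSpace: NSNull(),
        .outputColorSpace: NSNull()
    ])

    let scaled = mask
        .samplingNearest()
        .transformed(by: CGAffineTransform(scaleX: CGFloat(width) / mask.extent.width,
                                           y: CGFloat(height) / mask.extent.height))

    context.render(scaled, to: destination)
    return destination
}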
I’ve confirmed via controlled tests that:
Metal → CVPixelBuffer paths preserve values correctly
CIImage → CVPixelBuffer paths can introduce value changes when resampling or expanding the render target
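The value-preserving Metal path in that test is essentially a raw byte copy, roughly like this (sketch only; assumes a CPU-accessible, non-private texture whose layout matches the buffer):

import Metal
import CoreVideo

// Copy the texture's raw bytes straight into the pixel buffer, with no resampling.
func copy(texture: MTLTexture, into buffer: CVPixelBuffer) {
    CVPixelBufferLockBaseAddress(buffer, [])
    defer { CVPixelBufferUnlockBaseAddress(buffer, []) }
    guard let base = CVPixelBufferGetBaseAddress(buffer) else { return }
    let region = MTLRegionMake2D(0, 0, texture.width, texture.height)
    texture.getBytes(base,
                     bytesPerRow: CVPixelBufferGetBytesPerRow(buffer),
                     from: region,
                     mipmapLevel: 0)
}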
This makes CIImage unsafe as a source of numeric truth for segmentation masks and depth-based logic, even though it works well for visualization, and I should have realized this much sooner.
Direction I’m considering
I’m now considering refactoring toward more intent-based abstractions instead of a single image type, for example:
Visual images: CIImage (camera frames, overlays, debugging, UI)
Scalar fields: depth / confidence maps backed by CVPixelBuffer + Metal
Label maps: segmentation masks backed by integer-preserving buffers (no interpolation, no transforms)
In this model, CIImage would still be used extensively — but primarily for visualization and perceptual processing, not as the container for numerically sensitive data.
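As a rough sketch, the split might look like this (type names are placeholders, not settled API):

import CoreImage
import CoreVideo

// Perceptual data: camera frames, overlays, debug renders.
struct VisualImage {
    let image: CIImage
}

// Continuous scalar data: depth / confidence, never resampled through Core Image.
struct ScalarField {
    let buffer: CVPixelBuffer   // e.g. kCVPixelFormatType_DepthFloat32
}

// Categorical data: per-pixel class labels, no interpolation, no transforms.
struct LabelMap {
    let buffer: CVPixelBuffer   // e.g. kCVPixelFormatType_OneComponent8
    let classNames: [UInt8: String]
}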
Thread safety concern
One of the original advantages of CIImage is that it is thread-safe by design, and that was a big part of why I chose it.
For CVPixelBuffer / MTLTexture–backed data, I’m considering enforcing thread safety explicitly via:
Swift Concurrency (actor-owned data, explicit ownership)
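Something along these lines (minimal sketch, assuming a Float32 depth buffer; under strict concurrency the CVPixelBuffer would still need an explicit Sendable story):

import CoreVideo

actor DepthStore {
    private var latest: CVPixelBuffer?

    func update(_ buffer: CVPixelBuffer) {
        latest = buffer
    }

    // Read a single depth value; assumes kCVPixelFormatType_DepthFloat32.
    func depth(x: Int, y: Int) -> Float? {
        guard let buffer = latest,
              x < CVPixelBufferGetWidth(buffer),
              y < CVPixelBufferGetHeight(buffer) else { return nil }
        CVPixelBufferLockBaseAddress(buffer, .readOnly)
        defer { CVPixelBufferUnlockBaseAddress(buffer, .readOnly) }
        guard let base = CVPixelBufferGetBaseAddress(buffer) else { return nil }
        let rowBytes = CVPixelBufferGetBytesPerRow(buffer)
        let row = (base + y * rowBytes).assumingMemoryBound(to: Float.self)
        return row[x]
    }
}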
Questions
For those who have experience with CV / AR / imaging-heavy iOS apps, I was hoping to get your thoughts on the following:
Is this separation of image intent (visual vs numeric vs categorical) a reasonable architectural direction?
Do you generally keep CIImage at the heart of your pipeline, or push it to the edges (visualization only)?
How do you manage thread safety and ownership when working heavily with CVPixelBuffer and Metal? Do you use actor-based abstractions, GCD, or something ad hoc?
Are there any best practices or gotchas around using Core Image with depth maps or segmentation masks that I should be aware of?
I’d really appreciate any guidance or experience-based advice. I suspect I’ve hit a boundary of Core Image’s design, and I’m trying to refactor in a way that doesn’t involve too much immediate tech debt and remains robust and maintainable long-term.
Thank you in advance!
Hi all, I spent the last few months developing an MLX/Ollama local AI benchmarking suite for Apple Silicon, written in pure Swift and signed with an Apple Developer Certificate; it is open source, GPL-licensed, and free. I would love some feedback to continue development. It is the only benchmarking suite I know of that natively supports live power metrics and MLX, along with quick exports of benchmark results and an arena mode (Model A vs. Model B, with history). I really want this project to succeed and see widespread use, and reaching 75 stars on the GitHub repo would make it eligible for Homebrew Cask distribution.
GitHub Repo
Topic: Machine Learning & AI
SubTopic: Core ML
I can no longer achieve 100% ANE usage since upgrading to macOS 26 Beta 5. I used to be able to get 100%. Has Apple activated throttling or power-saving features in the new betas? Is there any new rate limiting on the API? I can hardly get above 3 W or 40%.
I have an M4 Pro Mac mini (64 GB) with the High Power energy setting, running macOS 26 Beta 5.
Topic: Machine Learning & AI
SubTopic: Foundation Models
Hello,
I’m experiencing a severe performance degradation when running CoreML models on a live AVFoundation video feed compared to offline or synthetic inference. This happens across multiple models I've converted (including SCI, RTMPose, and RTMW) and affects multiple devices.
The Environment
OS: macOS 26.3, iOS 26.3, iPadOS 26.3
Hardware: Mac14,6 (M2 Max), iPad Pro 11 M1, iPhone 13 mini
Compute Units: cpuAndNeuralEngine
The Numbers
When testing my SCI_output_image_int8.mlpackage model, the inference timings are drastically different:
Synthetic/Offline Inference: ~1.34 ms
Live Camera Inference: ~15.96 ms
Preprocessing is completely ruled out as the bottleneck. My profiling shows total preprocessing (nearest-neighbor resize + feature provider creation) takes only ~0.4 ms in camera mode. Furthermore, no frames are being dropped.
What I've Tried
I am building a latency-critical app and have implemented almost every recommended optimization to try and fix this, but the camera-feed penalty remains:
Matched the AVFoundation camera output format exactly to the model input (640x480 at 30/60fps).
Used IOSurface-backed pixel buffers for everything (camera output, synthetic buffer, and resize buffer).
Enabled outputBackings.
Loaded the model once and reused it for all predictions.
Configured MLModelConfiguration with reshapeFrequency = .frequent and specializationStrategy = .fastPrediction.
Wrapped inference in ProcessInfo.processInfo.beginActivity(options: .latencyCritical, reason: "CoreML_Inference").
Set DispatchQueue to qos: .userInteractive.
Disabled the idle timer and enabled iOS Game Mode.
Exported models using coremltools 9.0 (deployment target iOS 26) with ImageType inputs/outputs and INT8 quantization.
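For reference, the model loading and configuration is roughly the following (simplified sketch, not my exact code; compiledModelURL is a placeholder):

import Foundation
import CoreML

func makeModel(at compiledModelURL: URL) throws -> MLModel {
    let config = MLModelConfiguration()
    config.computeUnits = .cpuAndNeuralEngine
    config.optimizationHints.reshapeFrequency = .frequent
    config.optimizationHints.specializationStrategy = .fastPrediction
    // Loaded once and reused for every prediction.
    return try MLModel(contentsOf: compiledModelURL, configuration: config)
}

// Inference is wrapped in a latency-critical activity.
let activity = ProcessInfo.processInfo.beginActivity(
    options: .latencyCritical,
    reason: "CoreML_Inference"
)
// ... run predictions here ...
ProcessInfo.processInfo.endActivity(activity)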
Reproduction
To completely rule out UI or rendering overhead, I wrote a standalone Swift CLI script that isolates the AVFoundation and CoreML pipeline. The script clearly demonstrates the ~15ms latency on live camera frames versus the ~1ms latency on synthetic buffers.
(I have attached camera_coreml_benchmark.swift and the Core ML model (a very light low-light enhancement model) to this repo on GitHub: https://github.com/pzoltowski/apple-coreml-camera-latency-repro).
My Question:
Is this massive overhead expected behavior for AVFoundation + Core ML on live feeds, or is this a framework/runtime bug? If expected, what is the Apple-recommended pattern to bypass this camera-only inference slowdown?
One thing I found interesting: when running in debug mode, inference was faster (not as fast as in the performance benchmark, but faster than 16 ms). Also, if I did some dummy calculation on a different DispatchQueue, the model also seemed to get slightly faster. So maybe this is related to ANE power-state issues (jitter / SoC wake), with the ANE going to sleep too quickly and taking a long time to wake up? Doing a dummy calculation on a background thread is probably not a real solution, though.
Thanks in advance for any insights!
I have been using "apple" to test Foundation Models.
I thought this was local, but today the answer changed: halfway through an explanation, a guardrailViolation error was suddenly triggered! And since yesterday, all references to "Apple II" and "Apple III" now refer me to consult apple.com!
Does Foundation Models connect to the Internet for answers? Using beta 3.
Topic: Machine Learning & AI
SubTopic: Foundation Models
Hi everyone,
I'm experiencing an inconsistent behavior with the Translation framework on iOS 18. The LanguageAvailability.status() API reports language models as .installed, but translation fails with Code 16.
Setup:
Using translationTask modifier with TranslationSession
Batch translation with explicit source/target languages
Languages: Portuguese→English, German→English
Issue:
let status = await LanguageAvailability().status(from: sourceLang, to: targetLang) // Returns: .installed
// But translation fails:
let responses = try await session.translations(from: requests)
// Error: TranslationErrorDomain Code=16 "Offline models not available"
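For context, the surrounding setup looks roughly like this (simplified sketch with placeholder view and text names):

import SwiftUI
import Translation

struct TranslatorView: View {
    @State private var configuration: TranslationSession.Configuration?
    private let requests: [TranslationSession.Request] = [
        .init(sourceText: "Guten Morgen")
    ]

    var body: some View {
        Text("Translate")
            .translationTask(configuration) { session in
                do {
                    // Batch translation with explicit source/target languages.
                    let responses = try await session.translations(from: requests)
                    print(responses.map(\.targetText))
                } catch {
                    print("Translation failed:", error)   // Code 16 shows up here
                }
            }
            .onAppear {
                configuration = TranslationSession.Configuration(
                    source: Locale.Language(identifier: "de"),
                    target: Locale.Language(identifier: "en")
                )
            }
    }
}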
Logs:
Language model installed: pt -> en
Language model installed: de -> en
Starting translation: de -> en
Error Domain=TranslationErrorDomain Code=16 "Translation failed" NSLocalizedFailureReason=Offline models not available for language pair
What I've tried:
Re-downloading languages in Settings
Using source: nil for auto-detection
Fresh TranslationSession.Configuration each time
Questions:
Is there a way to force model re-validation/re-download programmatically?
Should translationTask show download popup when Code 16 occurs?
Has anyone found a reliable workaround?
I've seen similar reports in threads 791357 and 777113. Any guidance appreciated!
Thanks!
Topic: Machine Learning & AI
SubTopic: General
Documentation on adapter training lacks any details about training on datasets that include tool calling. And the page about tool calling itself only explains how to use it from Swift, without any internal details that would be useful for training.
The question is: what should the schema look like for including tool calling in a dataset?
Topic: Machine Learning & AI
SubTopic: Foundation Models
Hi, I am a new iOS developer, trying to learn how to integrate Apple Foundation Models.
My setup is:
Mac M1 Pro
MacOS 26 Beta
Version 26.0 beta 3
Apple Intelligence & Siri --> On
Here is the code:
func generate() {
    Task {
        isGenerating = true
        output = "⏳ Thinking..."
        do {
            let session = LanguageModelSession(instructions: """
                Extract time from a message. Example
                Q: Golfing at 6PM
                A: 6PM
                """)
            let response = try await session.respond(to: "Go to gym at 7PM")
            output = response.content
        } catch {
            output = "❌ Error: \(error)"
            print(output)
        }
        isGenerating = false
    }
}
and I get these errors
guardrailViolation(FoundationModels.LanguageModelSession.GenerationError.Context(debugDescription: "Prompt may contain sensitive or unsafe content", underlyingErrors: [Asset com.apple.gm.safety_embedding_deny.all not found in Model Catalog]))
Can you help me get through this?
I'm trying to use Apple's new Visual Intelligence API for recommending content through screenshot image search. The problem I encountered is that the SemanticContentDescriptor labels are either completely empty or super misleading, making it impossible to query for similar content on my app. Even the closest matching example was inaccurate, returning a single label ["cardigan"] for a Supreme T-Shirt.
I see other apps using this API like Etsy for example, and I'm wondering if they're using the input pixel buffer to query for similar content rather than using the labels?
If anyone has had a similar experience, or knows of something that wasn't called out in the documentation, please let me know! Thanks.
I've created the following Foundation Models Tool, which uses the .anyOf guide to constrain the LLM's generation of suitable input arguments. When calling the tool, the model is only allowed to request one of a fixed set of sections, as defined in the sections array.
struct SectionReader: Tool {
    let article: Article
    let sections: [String]

    let name: String = "readSection"
    let description: String = "Read a specific section from the article."

    var parameters: GenerationSchema {
        GenerationSchema(
            type: GeneratedContent.self,
            properties: [
                GenerationSchema.Property(
                    name: "section",
                    description: "The article section to access.",
                    type: String.self,
                    guides: [.anyOf(sections)]
                )
            ]
        )
    }

    func call(arguments: GeneratedContent) async throws -> String {
        let requestedSectionName = try arguments.value(String.self, forProperty: "section")
        ...
    }
}
However, I have found that the model will sometimes call the tool with invalid (but plausible) section names, meaning that .anyOf is not actually doing its job (i.e. requestedSectionName is sometimes not a member of sections).
The documentation for the .anyOf guide says, "Enforces that the string be one of the provided values."
Is this a bug or have I made a mistake somewhere?
Many thanks for any help you provide!
Topic: Machine Learning & AI
SubTopic: Foundation Models
I am following this tutorial:
https://apple.github.io/coremltools/docs-guides/source/convert-a-torchvision-model-from-pytorch.html
I have obtained similar results using the Python code.
However, when I view the model in Xcode, the preview's prediction confidence percentage is way off. I suspect this is because the model's output is already a percentage, and Xcode multiplies it by 100 again, leading to this result. Please give me any feedback on how to fix this, thank you.
Hi all,
I'm trying to find out if/when we can expect mxfp8/mxfp4 support on Apple Silicon. I've noticed that mlx now has casting data types, but all computation is still done in bf16. Would be great to reduce power consumption with support for these lower precision data types since edge inference is already typically done at a lower precision!
Thanks in advance.
Topic: Machine Learning & AI
SubTopic: Core ML
Using highly optimized Metal Shading Language (MSL) code, I pushed the MacBook Air M2 to its performance limits with the deformable_attention_universal kernel. The results demonstrate both the efficiency of the code and the exceptional power of Apple Silicon.
The total computational workload exceeded 8.455 quadrillion FLOPs, equivalent to processing 8,455 trillion operations. On average, the code sustained a throughput of 85.37 TFLOPS, showcasing the chip’s remarkable ability to handle massive workloads. Peak instantaneous performance reached approximately 673.73 TFLOPS, reflecting near-optimal utilization of the GPU cores.
Despite this intensity, the cumulative GPU runtime remained under 100 seconds, highlighting the code’s efficiency and time optimization. The fastest iteration achieved a record processing time of only 0.051 ms, demonstrating minimal bottlenecks and excellent responsiveness.
Memory management was equally impressive: peak GPU memory usage never exceeded 2 MB, reflecting efficient use of the M2’s Unified Memory. This minimizes data transfer overhead and ensures smooth performance across repeated workloads.
Overall, these results confirm that a well-optimized Metal implementation can unlock the full potential of Apple Silicon, delivering exceptional computational density, processing speed, and memory efficiency. The MacBook Air M2, often considered an energy-efficient consumer laptop, is capable of handling highly intensive workloads at performance levels typically expected from much larger GPUs. This test validates both the robustness of the Metal code and the extraordinary capabilities of the M2 chip for high-performance computing tasks.
I got 3203.23 GFLOPS (FP16) on the M3 MacBook Pro and only 2833.24 GFLOPS (FP16) on the M4 MacBook Air for 4096x4096 matrix multiplications in a PyTorch MPS FP16 benchmark. Wasn't the performance supposed to be twice as high on the M4 compared to the M3, even with the thermal throttling on the MacBook Air? What went wrong?
I downloaded the new developer beta and then installed Xcode. I completed the downloads, but I couldn't download the Predictive Code Completion Model. When I try to download it, I get the error "The operation couldn’t be completed. (ModelCatalog.CatalogErrors.AssetErrors error 1.)". I am using the M3 Pro model.
Topic: Machine Learning & AI
SubTopic: Apple Intelligence
Hi all,
I’m encountering an issue when trying to run Apple Foundation Models in a blank project targeting iOS 26.
Below are the details:
Xcode: Latest version with iOS 26 SDK
macOS: macOS 26 Tahoe (installed on main disk)
Mac: 16” MacBook Pro with M2 Pro chip
Apple Intelligence: Available and functional on this machine
Problem:
I created a new blank iOS project, set the deployment target to iOS 26, and ran the following minimal code using Foundation Models. However, I get no response at all in the output - not even an error. The app runs, but the model does not produce any output.
#Playground {
    let session = LanguageModelSession()
    let response = try await session.respond(to: "Tell me a story")
}
Then, I tried to catch an error with this code:
#Playground {
    let session = LanguageModelSession()
    do {
        let response = try await session.respond(to: "Tell me a story")
        print(response)
    } catch {
        print("Failed to get response:", error)
    }
    print("This line, never gets executed")
}
And got these results:
I’ve done further testing and discovered something important:
I tried running the Code Along sample project, and there the #Playground macro worked without issues. The only significant difference I noticed was the Canvas run destination:
In my original project, I was using iPhone 16 Pro (iOS 26) as the run target in Canvas. Apple Intelligence was enabled on the simulator, but no response was returned when executing the prompt.
In the sample project, the Canvas was running on My Mac.
I attempted to match that setup, but at first, my destination was My Mac (Designed for iPad), which still didn’t work. The macro finally executed properly once I switched to My Mac (AppKit).
So the question is ... it seems that for now, Foundation Models and the #Playground macro only run correctly when the canvas or destination is set to “My Mac (AppKit)”?
Hello,
I am developing an iOS app that uses machine learning models.
To improve accuracy and user experience, I would like to download .mlmodel files (compiled and compressed as zip files) from our own server after the app is installed, and use them for inference within the app.
No executable code, scripts, or dynamic libraries will be downloaded—only model data files are used.
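For clarity, the runtime loading would be roughly this (illustrative sketch; function and path names are placeholders):

import Foundation
import CoreML

// Load a model file that was downloaded from the server and unzipped locally.
func loadRemoteModel(at unzippedURL: URL) throws -> MLModel {
    // Raw .mlmodel files must be compiled on-device; pre-compiled .mlmodelc
    // bundles can be loaded directly.
    let modelURL: URL
    if unzippedURL.pathExtension == "mlmodel" {
        modelURL = try MLModel.compileModel(at: unzippedURL)
    } else {
        modelURL = unzippedURL
    }
    return try MLModel(contentsOf: modelURL)
}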
According to App Store Review Guideline 2.5.2, I understand that apps may not download or execute code which introduces or changes features or functionality.
In this case, are compiled and zip-compressed .mlmodel files considered "data" rather than "code", and is it allowed to download and use them in the app?
If there are any restrictions or best practices related to this, please let me know.
Thank you.
Hello folks! Looking at https://developer.apple.com/documentation/foundationmodels, it's not clear how to use other models there.
Does anyone know if it's possible to use a trained model from outside (imported) with the Foundation Models framework?
Thanks!
I generate an array of random floats using the code shown below. However, I would like to do this with Double instead of Float. Are there any BNNS random number generators for double values, something like BNNSRandomFillUniformDouble? If not, is there a way I can convert BNNSNDArrayDescriptor from float to double?
import Accelerate

let n = 100_000_000

let result = Array<Float>(unsafeUninitializedCapacity: n) { buffer, initCount in
    var descriptor = BNNSNDArrayDescriptor(data: buffer, shape: .vector(n))!
    let randomGenerator = BNNSCreateRandomGenerator(BNNSRandomGeneratorMethodAES_CTR, nil)
    BNNSRandomFillUniformFloat(randomGenerator, &descriptor, 0, 1)
    initCount = n
}
Hey guys 👋
I’ve been thinking about a feature idea for iOS that could totally change the way we interact with apps like Twitter/X.
Imagine if we could define our own recommendation algorithm, and have an AI on the iPhone that replaces the suggested tweets in the feed with ones that match our personal interests — based on public tweets, and without hacking anything.
Kinda like a personalized "AI skin" over the app that curates content you actually care about. Feels like this would make content way more relevant and less algorithmically manipulative.
Would love to know what you all think — and if Apple could pull this off 🔥
Topic: Machine Learning & AI
SubTopic: General