Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.

Posts under the Machine Learning & AI topic

Post · Replies · Boosts · Views · Activity

Accessibility & Inclusion
We are adapting Apple Intelligence for foreign markets on iPhone 17 and later models. When the system language and the Siri language do not match (for example, the system in English and Siri in Chinese), Apple Intelligence can become unusable. Are there any other conditions that could cause Apple Intelligence to be unavailable within the app, even when it has been enabled?
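
A minimal way to diagnose this at runtime is to check the system model's availability and the reported reason via the Foundation Models framework. A sketch, assuming iOS 26 or later and the documented unavailability reasons:

    import FoundationModels

    // A small sketch: ask the framework why the on-device model is
    // unavailable instead of guessing. The reasons below are the
    // documented ones; how an app reacts to each is up to the app.
    func checkAppleIntelligenceAvailability() {
        let model = SystemLanguageModel.default
        switch model.availability {
        case .available:
            print("Model is available")
        case .unavailable(.deviceNotEligible):
            print("Device does not support Apple Intelligence")
        case .unavailable(.appleIntelligenceNotEnabled):
            print("Apple Intelligence is not enabled in Settings")
        case .unavailable(.modelNotReady):
            print("Model assets are still downloading or otherwise not ready")
        case .unavailable(let other):
            print("Unavailable for another reason: \(other)")
        }
    }

Logging the reason on launch should at least distinguish a language/Siri mismatch from device eligibility or pending asset downloads.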
0
0
481
Dec ’25
Is it possible to pass the streaming output of Foundation Models down a function chain
I am writing a custom package wrapping Foundation Models which provides a chain-of-thought with intermittent self-evaluation, among other things. At first I designed this package with the command line in mind, but after seeing how well it augments the models and makes them more intelligent, I wanted to try building a SwiftUI wrapper around the package. When I started I was using synchronous generation rather than streaming, but to give the best user experience (as shown in the WWDC sessions) it is necessary to provide constant feedback to the user that something is happening. I have created a much simplified example of my setup so it is easier to understand.

First, there is the Reasoning conversation item, which can be converted to an XML representation that is then fed back into the model (I've found XML works best for structured input):

    public typealias ConversationContext = XMLDocument

    extension ConversationContext {
        public func toPlainText() -> String {
            return xmlString(options: [.nodePrettyPrint])
        }
    }

    /// Represents a reasoning item in a conversation, which includes a title and reasoning content.
    /// Reasoning items are used to provide detailed explanations or justifications for certain
    /// decisions or responses within a conversation.
    @Generable(description: "A reasoning item in a conversation, containing content and a title.")
    struct ConversationReasoningItem: ConversationItem {
        @Guide(description: "The content of the reasoning item, which is your thinking process or explanation")
        public var reasoningContent: String

        @Guide(description: "A short summary of the reasoning content, digestible in an interface.")
        public var title: String

        @Guide(description: "Indicates whether reasoning is complete")
        public var done: Bool
    }

    extension ConversationReasoningItem: ConversationContextProvider {
        public func toContext() -> ConversationContext {
            // <ReasoningItem title="${title}">
            //   ${reasoningContent}
            // </ReasoningItem>
            let root = XMLElement(name: "ReasoningItem")
            root.addAttribute(XMLNode.attribute(withName: "title", stringValue: title) as! XMLNode)
            root.stringValue = reasoningContent
            return ConversationContext(rootElement: root)
        }
    }

Then there is the generator, which creates a reasoning item from a user query and previously generated items:

    struct ReasoningItemGenerator {
        var instructions: String {
            """
            <omitted for brevity>
            """
        }

        func generate(from input: (String, [ConversationReasoningItem])) async throws
            -> sending LanguageModelSession.ResponseStream<ConversationReasoningItem> {
            let session = LanguageModelSession(instructions: instructions)

            // Build the context for the reasoning item out of the user's
            // query and the previous reasoning items.
            let userQuery = "User's query: \(input.0)"
            let reasoningItemsText = input.1
                .map { $0.toContext().toPlainText() }
                .joined(separator: "\n")
            let context = userQuery + "\n" + reasoningItemsText

            let reasoningItemResponse = try await session.streamResponse(
                to: context,
                generating: ConversationReasoningItem.self)
            return reasoningItemResponse
        }
    }

I'm not sure if returning LanguageModelSession.ResponseStream<ConversationReasoningItem> is the right move; I am just trying to imitate what session.streamResponse returns. Then there is the orchestrator, which I can't figure out. It receives the streamed ConversationReasoningItems from the generator and is responsible for streaming those to SwiftUI later, and also for evaluating each reasoning item after it completes to see whether it needs to be regenerated (to keep the model on track).
I want the users of the orchestrator to receive partially generated reasoning items as they are being generated by the generator. Later, when an item finishes, if the evaluation passes, the item is kept; if it fails, the reasoning item should be removed from the stream before a new one is generated. So in-flight reasoning items should be output aggressively. I am really having trouble figuring this out, so if someone with more knowledge about asynchronous Swift, or, even better, someone who has worked on the Foundation Models framework, could point me in the right direction, that would be awesome!
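
One common shape for this kind of relay is to wrap the generator's stream in an AsyncThrowingStream of explicit events, so the UI can both render partials and react to a retraction. The following is a sketch only, not Foundation Models API: ReasoningEvent, finalize, and evaluate are hypothetical stand-ins, and the stream's element type is simplified to the partially generated content:

    import FoundationModels

    // Hypothetical event type the orchestrator publishes to SwiftUI.
    enum ReasoningEvent {
        case partial(ConversationReasoningItem.PartiallyGenerated)  // in-flight snapshot
        case committed(ConversationReasoningItem)                   // finished and passed evaluation
        case discardedLast                                          // finished but failed; UI drops the last partial
    }

    struct ReasoningOrchestrator {
        let generator: ReasoningItemGenerator

        // Hypothetical hooks: turn the final snapshot into a complete
        // item, and decide whether it stays.
        var finalize: (ConversationReasoningItem.PartiallyGenerated) -> ConversationReasoningItem?
        var evaluate: (ConversationReasoningItem) -> Bool

        func events(for query: String, maxSteps: Int = 5) -> AsyncThrowingStream<ReasoningEvent, Error> {
            AsyncThrowingStream { continuation in
                let task = Task {
                    do {
                        var kept: [ConversationReasoningItem] = []
                        while kept.count < maxSteps {
                            let stream = try await generator.generate(from: (query, kept))
                            var last: ConversationReasoningItem.PartiallyGenerated?
                            for try await snapshot in stream {
                                last = snapshot
                                continuation.yield(.partial(snapshot))  // push in-flight items aggressively
                            }
                            guard let item = last.flatMap(finalize) else { continue }
                            if evaluate(item) {
                                kept.append(item)
                                continuation.yield(.committed(item))
                            } else {
                                continuation.yield(.discardedLast)      // regenerated on the next loop pass
                            }
                        }
                        continuation.finish()
                    } catch {
                        continuation.finish(throwing: error)
                    }
                }
                continuation.onTermination = { _ in task.cancel() }
            }
        }
    }

On the SwiftUI side, a .task modifier can iterate events(for:) and append, replace, or pop the last element of an @State array accordingly.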
0
0
282
Jul ’25
Inquiry Regarding Siri–AI Integration Capabilities
Hello, I’m seeking clarification on whether Apple provides any framework or API that enables deep integration between Siri and advanced AI assistants (such as ChatGPT), including system-level functions like voice interaction, navigation, cross-platform syncing, and operational access similar to Siri’s own capabilities. If no such option exists today, I would appreciate guidance on the recommended path or approved third-party solutions for building a unified, voice-first experience across Apple’s ecosystem. Thank you for your time and insight.
0
0
147
Nov ’25
Feature Request: Allow Foundation Models in MessageFilter Extensions
I’d like to submit a feature request regarding the availability of Foundation Models in MessageFilter extensions.

Background
MessageFilter extensions play a critical role in protecting users from spam, phishing, and unwanted messages. With the introduction of Foundation Models and Apple Intelligence, Apple has provided powerful on-device natural language understanding capabilities that are highly aligned with the goals of MessageFilter. However, Foundation Models are currently unavailable in MessageFilter extensions.

Why Foundation Models Are a Great Fit for MessageFilter
Message filtering is fundamentally a natural language classification problem. Foundation Models would significantly improve:
- Detection of phishing and scam messages
- Classification of promotional vs. transactional content
- Understanding intent, tone, and semantic context beyond keyword matching
- Adaptation to evolving scam patterns without server-side processing
All of this can be done fully on-device, preserving user privacy and aligning with Apple’s privacy-first design principles.

Current Limitations
Today, MessageFilter extensions are limited to relatively simple heuristics or lightweight models. This often results in:
- Higher false positives
- Lower recall for sophisticated scam messages
- Increased development complexity to compensate for limited NLP capabilities

Request
Could Apple consider one of the following:
- Allowing Foundation Models to be used directly within MessageFilter extensions
- Providing a constrained or optimized Foundation Model API specifically designed for MessageFilter
- Enabling a supported mechanism for MessageFilter extensions to delegate inference to the containing app using Foundation Models
Even limited access (e.g. short text only, strict execution limits) would be extremely valuable.

Closing
Foundation Models have the potential to significantly raise the quality and effectiveness of message filtering on Apple platforms while maintaining strong privacy guarantees. Supporting them in MessageFilter extensions would be a major improvement for both developers and users. Thank you for your consideration and for continuing to invest in on-device intelligence.
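
For context, here is a minimal sketch of the kind of on-device classification this request envisions. It uses the Foundation Models API as it works in a full app process today (which is exactly where MessageFilter extensions cannot reach it); SpamVerdict and the prompt wording are illustrative assumptions:

    import FoundationModels

    // Illustrative only: this runs in a full app, not in a MessageFilter
    // extension; that gap is the subject of the feature request.
    @Generable(description: "A classification verdict for an incoming SMS message.")
    struct SpamVerdict {
        @Guide(description: "One of: transaction, promotion, junk")
        var category: String
        @Guide(description: "True if the message looks like phishing or a scam")
        var suspectedScam: Bool
    }

    func classify(message: String) async throws -> SpamVerdict {
        let session = LanguageModelSession(
            instructions: "Classify incoming SMS messages. Be conservative about marking messages as junk.")
        let response = try await session.respond(
            to: "Classify this message: \(message)",
            generating: SpamVerdict.self)
        return response.content
    }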
1
0
491
Jan ’26
Inquiry About Building an App for Object Detection, Background Removal, and Animation
Hi all! Nice to meet you. I am planning to build an iOS application that can:
- Capture an image using the camera or select one from the gallery.
- Remove the background and keep only the detected main object.
- Add a border (outline) around the detected object’s shape.
- Apply an animation along that border (e.g., a moving light or glowing effect).
- Include a transition animation when removing the background, for example breaking the background into pieces as it disappears.

The app Capword has a similar feature for object isolation, and I’d like to build something like that. Could you please provide any guidance, frameworks, or sample code related to:
- Object segmentation and background removal in Swift (Vision or Core ML).
- Applying custom borders and shape animations around detected objects.
- Recognizing the object name (e.g., “person”, “cat”, “car”) after segmentation.

Thank you very much for your support. Best regards, SINN SOKLYHOR
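
For the segmentation and background-removal step, one starting point is Vision's subject-lifting request (iOS 17 and later). A minimal sketch of just that piece, with error handling kept simple:

    import Vision

    // A sketch (iOS 17+): lift the main subject out of an image with
    // Vision, returning a masked pixel buffer with the background removed.
    // Border drawing and animations would be layered on top of this mask.
    func liftSubject(from cgImage: CGImage) throws -> CVPixelBuffer? {
        let request = VNGenerateForegroundInstanceMaskRequest()
        let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
        try handler.perform([request])

        guard let observation = request.results?.first else { return nil }

        // Apply the mask for all detected foreground instances, cropped
        // to the subject's extent.
        return try observation.generateMaskedImage(
            ofInstances: observation.allInstances,
            from: handler,
            croppedToInstancesExtent: true)
    }

For naming the detected object, running VNClassifyImageRequest (or a Core ML classifier) on the cropped result is one option.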
0
0
193
Nov ’25
VNDetectFaceRectanglesRequest does not use the Neural Engine?
I'm on Tahoe 26.1 on an M3 MacBook Air. I'm using VNDetectFaceRectanglesRequest as correctly as I can, as in the minimal command-line program below. For some reason, I always get:

MLE5Engine is disabled through the configuration

printed. I couldn't find any note in the developer docs saying that VNDetectFaceRectanglesRequest cannot use the Apple Neural Engine. I assume something is wrong with my code, but I wasn't able to find any hint in the documentation about what it might be, nor could I find the message above online. I would appreciate your help a lot, and thank you in advance.

The code below captures video from AVCaptureDevice.DeviceType.builtInWideAngleCamera. It currently picks the 0th format, which has the largest resolution (Full HD on my M3 MBA) and 4:2:0 video-range ("420v") chroma-subsampled pixel format. It then performs a VNDetectFaceRectanglesRequest on each frame. It prints "VNDetectFaceRectanglesRequest completionHandler called" many times, then prints the message above, then continues printing the completion-handler line until the user quits.

To run it in Xcode: File > New > Project > macOS Command Line Tool, paste the code below, then click the project root > Targets > Signing & Capabilities > Hardened Runtime > Resource Access > Camera.

A possible explanation could be that Apple's internal Core ML code for this request runs on GPU/CPU only, or that it doesn't accept 420v as supplied by the MacBook Air camera.

    import AVKit
    import Vision

    var videoDataOutput = AVCaptureVideoDataOutput()
    var detectionRequests: [VNDetectFaceRectanglesRequest]?
    var videoDataOutputQueue = DispatchQueue(label: "queue")

    class XYZ: /*NSViewController or NSObject*/ NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {

        func viewDidLoad() {
            //super.viewDidLoad()
            let session = AVCaptureSession()
            let inputDevice = try! self.configureFrontCamera(for: session)
            self.configureVideoDataOutput(for: inputDevice.device,
                                          resolution: inputDevice.resolution,
                                          captureSession: session)
            self.prepareVisionRequest()
            session.startRunning()
        }

        fileprivate func highestResolution420Format(for device: AVCaptureDevice)
            -> (format: AVCaptureDevice.Format, resolution: CGSize)? {
            let deviceFormat = device.formats[0]
            print(deviceFormat)
            let dims = CMVideoFormatDescriptionGetDimensions(deviceFormat.formatDescription)
            let resolution = CGSize(width: CGFloat(dims.width), height: CGFloat(dims.height))
            return (deviceFormat, resolution)
        }

        fileprivate func configureFrontCamera(for captureSession: AVCaptureSession) throws
            -> (device: AVCaptureDevice, resolution: CGSize) {
            let deviceDiscoverySession = AVCaptureDevice.DiscoverySession(
                deviceTypes: [AVCaptureDevice.DeviceType.builtInWideAngleCamera],
                mediaType: .video,
                position: AVCaptureDevice.Position.unspecified)
            let device = deviceDiscoverySession.devices.first!
            let deviceInput = try! AVCaptureDeviceInput(device: device)
            captureSession.addInput(deviceInput)

            let highestResolution = self.highestResolution420Format(for: device)!
            try! device.lockForConfiguration()
            device.activeFormat = highestResolution.format
            device.unlockForConfiguration()
            return (device, highestResolution.resolution)
        }

        fileprivate func configureVideoDataOutput(for inputDevice: AVCaptureDevice,
                                                  resolution: CGSize,
                                                  captureSession: AVCaptureSession) {
            videoDataOutput.setSampleBufferDelegate(self, queue: videoDataOutputQueue)
            captureSession.addOutput(videoDataOutput)
        }

        fileprivate func prepareVisionRequest() {
            let faceDetectionRequest = VNDetectFaceRectanglesRequest { request, error in
                print("VNDetectFaceRectanglesRequest completionHandler called")
            }
            // Start with detection.
            detectionRequests = [faceDetectionRequest]
        }

        // MARK: AVCaptureVideoDataOutputSampleBufferDelegate
        // Handle delegate method callback on receiving a sample buffer.
        public func captureOutput(_ output: AVCaptureOutput,
                                  didOutput sampleBuffer: CMSampleBuffer,
                                  from connection: AVCaptureConnection) {
            var requestHandlerOptions: [VNImageOption: AnyObject] = [:]
            let cameraIntrinsicData = CMGetAttachment(
                sampleBuffer,
                key: kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix,
                attachmentModeOut: nil)
            if cameraIntrinsicData != nil {
                requestHandlerOptions[VNImageOption.cameraIntrinsics] = cameraIntrinsicData
            }

            let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!

            // No tracking object detected, so perform initial detection.
            let imageRequestHandler = VNImageRequestHandler(
                cvPixelBuffer: pixelBuffer,
                orientation: CGImagePropertyOrientation.up,
                options: requestHandlerOptions)
            try! imageRequestHandler.perform(detectionRequests!)
        }
    }

    let X = XYZ()
    X.viewDidLoad()
    sleep(9999999)
0
0
460
Nov ’25
How to Ensure Controlled and Contextual Responses Using Foundation Models ?
Hi everyone, I’m currently exploring the use of Foundation Models on Apple platforms to build a chatbot-style assistant within an app. While the integration part is straightforward using the new FoundationModels APIs, I’m trying to figure out how to control the assistant’s responses more tightly, particularly:
- Ensuring the assistant adheres to a specific tone, context, or domain (e.g. hospitality, healthcare, etc.)
- Preventing hallucinations or unrelated outputs
- Constraining responses based on app-specific rules, structured data, or recent interactions

I’ve experimented with prompts, system messages, and few-shot examples to steer outputs, but even with carefully constructed prompts, the model occasionally produces incorrect or out-of-scope responses. Additionally, when using multiple tools, I'm unsure how best to structure the setup so the model can select the correct pathway/tool and respond appropriately. Is there a recommended approach to guiding the model's decision-making when several tools or structured contexts are involved? Looking forward to hearing your thoughts or being pointed toward related WWDC sessions, Apple docs, or sample projects.
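
One approach that tends to help: put the domain and tone constraints in the session's instructions (which take precedence over user prompts) and give each tool a narrow name and description so the model can route between them. A minimal sketch, with a hypothetical hotel-domain tool whose output is returned as a plain String:

    import FoundationModels

    // A sketch: a narrowly scoped tool. The name, description, and the
    // hotel domain are illustrative assumptions.
    struct RoomAvailabilityTool: Tool {
        let name = "checkRoomAvailability"
        let description = "Checks availability of hotel rooms for given dates. Use only for booking questions."

        @Generable
        struct Arguments {
            @Guide(description: "Check-in date in ISO 8601 format")
            var checkIn: String
            @Guide(description: "Number of nights")
            var nights: Int
        }

        func call(arguments: Arguments) async throws -> String {
            // Replace with a real lookup against app data.
            "Rooms are available from \(arguments.checkIn) for \(arguments.nights) night(s)."
        }
    }

    func askConcierge(_ question: String) async throws -> String {
        let session = LanguageModelSession(
            tools: [RoomAvailabilityTool()],
            instructions: """
                You are a concierge assistant for a hotel app. Answer only \
                questions about this hotel and its services. If a question is \
                out of scope, say so briefly. Use tools for factual data \
                instead of guessing.
                """)
        let response = try await session.respond(to: question)
        return response.content
    }

Keeping each tool's description specific about when to use it (and when not to) is what gives the model a routing signal when several tools are registered.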
0
0
130
Jul ’25
Can MPSGraphExecutable automatically leverage Apple Neural Engine (ANE) for inference?
Hi, I'm currently using a Metal Performance Shaders Graph executable (MPSGraphExecutable) to run neural-network inference as part of a Metal rendering pipeline. I profiled the Neural Engine while running inference through MPSGraphExecutable, but the trace shows no sign of Neural Engine usage. However, when I inspect the same Core ML model in Xcode and run a performance report, it is able to use the ANE. Does MPSGraphExecutable automatically utilize the Apple Neural Engine when running inference, or does it only execute on the GPU? My model (a Core ML package) was converted from a PyTorch model using coremltools with the ML program type, targeting iOS 17.0+. Any insights or documentation references would be greatly appreciated!
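
For comparison, the Core ML load path does let you request the Neural Engine explicitly through compute units; a small sketch, where MyModel is a hypothetical generated model class:

    import CoreML

    // A sketch: load the same .mlpackage through Core ML and ask for the
    // Neural Engine. Core ML decides per-operation what the ANE can run.
    func loadModel() throws -> MyModel {
        let config = MLModelConfiguration()
        config.computeUnits = .cpuAndNeuralEngine  // or .all to let Core ML choose
        return try MyModel(configuration: config)
    }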
0
0
478
Nov ’25
How to create updatable models using Create ML app
I've built a model using Create ML, but I can't, for the love of God, make it updatable. I can't find any checkbox or anything related. It's an Activity Classifier, if that matters. I want to continue training it on-device using MLUpdateTask, but the model, as exported from Create ML, fails with this error: Domain=com.apple.CoreML Code=6 "Failed to unarchive update parameters. Model should be re-compiled." UserInfo={NSLocalizedDescription=Failed to unarchive update parameters. Model should be re-compiled.}
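
For reference, a sketch of the on-device update call that produces this error when the compiled model carries no updatable parameters; the URL and batch provider are placeholders:

    import CoreML

    // A sketch: modelURL must point to a compiled .mlmodelc whose spec
    // marks parameters as updatable; trainingBatch is a placeholder
    // MLBatchProvider of labeled samples.
    func startUpdate(modelURL: URL, trainingBatch: MLBatchProvider) throws {
        let task = try MLUpdateTask(
            forModelAt: modelURL,
            trainingData: trainingBatch,
            configuration: nil,
            completionHandler: { context in
                // Persist the updated model for the next launch.
                try? context.model.write(to: modelURL)
            })
        task.resume()
    }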
0
0
368
Nov ’25
CreateML Training Object Detection Not using MPS
Hi everyone, I'm currently developing an object detection model that should identify up to seven classes in an image. While I usually do development with plain Python and the ultralytics library, I thought I would give Create ML a shot. The experience is actually very nice, except that the model does not seem to be using the ANE or GPU (MPS) for accelerated training. https://developer.apple.com/machine-learning/create-ml/ states: "On-device training: Train models blazingly fast right on your Mac while taking advantage of CPU and GPU." Am I doing something wrong? I'm running the training on an Apple M1 Pro, 16 GB, macOS 26.1 (Tahoe), Xcode 26.1 (Build 17B55). It would be super nice to get some feedback or instructions. Thank you in advance!
0
0
296
Nov ’25
Do App Intent Domains work with Siri already?
Hi, guys. I'm writing about Apple Intelligence, and I've reached the point where I have to explain App Intent domains (https://developer.apple.com/documentation/AppIntents/app-intent-domains), but I noticed a note explaining that these services are not available with Siri. I tried the example provided by Apple at https://developer.apple.com/documentation/AppIntents/making-your-app-s-functionality-available-to-siri and I can only make the intents work from the Shortcuts app, not from Siri. Is this correct? Are App Intent domains still not available with Siri? Thanks
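
For reference, the intents in question are the assistant-schema ones; a minimal sketch of the shape from Apple's sample, where AssetEntity is the matching @AssistantEntity type defined in that sample:

    import AppIntents

    // A sketch of an assistant-schema intent from the photos domain.
    // Whether Siri (as opposed to Shortcuts) can invoke it depends on
    // Apple Intelligence support for that schema on the installed OS.
    @AssistantIntent(schema: .photos.openAsset)
    struct OpenAssetIntent: OpenIntent {
        var target: AssetEntity

        @MainActor
        func perform() async throws -> some IntentResult {
            // Navigate to the asset in the app (illustrative).
            return .result()
        }
    }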
0
0
474
Nov ’25
Embedding model missing once transferred to Xcode
I've created a "Transfer Learning BERT Embeddings" model with the default "Latin" language family and "Automatic" language setting. This model performs exceptionally well against the test data set and functions as expected when I preview it in Create ML. However, when I add it to the Xcode project of the application I am deploying it to, I get runtime errors suggesting it can't find the embedding resources: Failed to locate assets for 'mul_Latn' - '5C45D94E-BAB4-4927-94B6-8B5745C46289' embedding model Note that I am adding the model to the app project the same way I added an earlier "Maximum Entropy" model, which had no runtime issues. So it seems there is a problem getting hold of the embeddings at runtime. For now, "runtime" means in the Simulator; I intend to deploy my application to iOS devices once GM 26 is released (the app also uses AFM). I'm developing on Tahoe 26 beta, running on iOS 26 beta, using Xcode 26 beta. Is this a known/expected issue? Are the embeddings expected to be a resource in the model? Is there a workaround? I did try opening the model in Xcode and saving it as an mlpackage, then adding that to my app project, but that didn't resolve the issue either.
1
0
505
Sep ’25
Foundational Model - Image as Input? Timeline
Hi all, I am interested in unlocking unique applications with the new foundation models. I have a few questions about the availability of the following features:
- Image input: the June 2025 update mentions "image" 44 times (https://machinelearning.apple.com/research/apple-foundation-models-2025-updates), but I can't find any information about using images as the input/prompt for the foundation models. When will this be available? I understand there are existing Vision ML APIs, but I want image input to a multimodal on-device LLM (VLM) instead, for features like "Which player is holding the ball in the image?" (image understanding).
- Cloud foundation model: when will this be available?
Thanks! Clement :)
1
0
578
Sep ’25
[MPSGraph runWithFeeds:targetTensors:targetOperations:] randomly crash
I'm implementing an LLM with Metal Performance Shaders Graph, but I've encountered very strange behavior: occasionally, the model reports an error like this: LLVM ERROR: SmallVector unable to grow. Requested capacity (9223372036854775808) is larger than maximum value for size type (4294967295) and crashes; the stack backtrace screenshot is attached. Note that the 5th frame is mlir::getIntValues<long long> and the 6th frame is llvm::SmallVectorBase<unsigned int>::grow_pod. It looks like MLIR mistakenly took a 64-bit value for a 32-bit type (9223372036854775808 is 2^63, consistent with a negative 64-bit value reinterpreted as unsigned). Unfortunately, I could not find the source code of mlir::getIntValues; maybe it's in Apple's closed-source fork of LLVM for the MPS implementation? Anyway, any opinion or suggestion on this?
0
0
228
Mar ’25
CoreML Inference Acceleration
Hello everyone, I have a convolutional vision model and a video that has been decoded into many frames. When I run inference on each frame in a loop, it is a bit slow. So I started four threads, each running inference simultaneously, but the overall speed is the same as serial inference, and every single forward pass is slower. I used the mactop tool to check GPU utilization, and it was only around 20%. Is this normal? How can I accelerate it?
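
One thing worth trying before manual threading is Core ML's batch prediction API, which lets the framework schedule the work itself; a minimal sketch, where the feature providers are whatever the model's generated interface expects:

    import CoreML

    // A sketch: run a whole batch of frames in one call instead of a
    // Swift loop or manual threads. Core ML pipelines the batch
    // internally, which usually keeps the GPU/ANE busier than issuing
    // one prediction per frame.
    func predictBatch(model: MLModel, frames: [MLFeatureProvider]) throws -> MLBatchProvider {
        let batch = MLArrayBatchProvider(array: frames)
        return try model.predictions(from: batch, options: MLPredictionOptions())
    }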
2
0
700
Sep ’25
Unwrapping LanguageModelSession.GenerationError details
Apologies if this is obvious to everyone but me... I'm using the Tahoe AI foundation models. When I get an error, I'm trying to handle it properly. I see the errors described here: https://developer.apple.com/documentation/foundationmodels/languagemodelsession/generationerror/context, as well as in the headers. But all I can figure out how to inspect is error.localizedDescription, which doesn't give me much to go on. For example, an error's description is: The operation couldn’t be completed. (FoundationModels.LanguageModelSession.GenerationError error 2.) How do I get the actual error number/enum value out of this, short of parsing that text to look for the int at the end? This one is: case guardrailViolation(LanguageModelSession.GenerationError.Context) So I'd like to know how to get from the catch for session.respond to something I can act on. I feel like it's there, but I'm missing it. Thanks!
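
One way to get at the enum value is to cast the thrown error back to the concrete GenerationError type and switch over its cases; a short sketch covering a couple of the documented cases:

    import FoundationModels

    // A sketch: recover the typed error instead of parsing
    // localizedDescription.
    func ask(_ session: LanguageModelSession, _ prompt: String) async {
        do {
            let response = try await session.respond(to: prompt)
            print(response.content)
        } catch let error as LanguageModelSession.GenerationError {
            switch error {
            case .guardrailViolation(let context):
                print("Guardrail violation: \(context)")
            case .exceededContextWindowSize(let context):
                print("Context window exceeded: \(context)")
            default:
                print("Other generation error: \(error)")
            }
        } catch {
            print("Unrelated error: \(error)")
        }
    }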
1
0
359
Jul ’25