This post documents an issue I reported in feedback FB19610114 and asks whether anyone knows of a workaround. Here is a copy of the feedback.
Short version
Manipulation (SwiftUI OR RealityKit) fails to translate entities after changing rooms. By changing rooms, I mean a human wearing an Apple Vision Pro leaving one room and entering another room. Once this issue occurs, it impacts all apps that use these features. A device restart is the only solution I have to fix it.
Feedback FB19610114
This is an odd one. I'm using the new Manipulation Component in visionOS 26. Most of the time this works well. Sometimes it stops working, and when it does, the only way to get it working again is to reboot the headset.
When this happens, I can continue to rotate and scale items, but translation no longer works. It is as if the item is stuck to a fixed point in the parent scene (window, volume, etc). When this bug occurs, it affects every app across the entire operating system that is using manipulation, including the RealityKit component AND the SwiftUI version. This is not limited to one app and is not limited to apps that I am working on. Once this error occurs, it affects literally any application across the operating system that is using this API, including apps from Apple.
I won't speculate on the cause of this, but I do know of one way where I can always get it to happen.
Here is how to reproduce it:
Make an Xcode project with a single entity that uses the Manipulation Component (a minimal sketch follows the steps below). There is no need to customize the configuration of this component; the default implementation will work.
Build and run this app on device. You can keep it running from Xcode or quit and relaunch the app normally on the device.
Open the app and manipulate the entity - it should work as expected.
Physically walk into another room. It is vital that you leave the current room that you are in and enter a different room entirely.
Use the Digital Crown to recenter your view and bring your window or volume to you.
Test the manipulation on the entity again - it should still be working as expected at this point.
Physically move yourself and your headset back into the original room where you started.
Use the Digital Crown to recenter your view and bring your window or volume to you.
Test the manipulation on the entity again - you should now see the issue.
When I follow the steps above, manipulation translation stops working 100% of the time at this point. It impacts any application using this API. The only way to fix it is to restart my headset.
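For reference, here is a minimal sketch of the setup in step 1, assuming the visionOS 26 API where ManipulationComponent.configureEntity(_:) installs the default configuration; the entity and its size are placeholders:

import SwiftUI
import RealityKit

struct ManipulationReproView: View {
    var body: some View {
        RealityView { content in
            // A plain entity; the default manipulation configuration
            // is enough to reproduce the issue.
            let box = ModelEntity(
                mesh: .generateBox(size: 0.2),
                materials: [SimpleMaterial(color: .blue, isMetallic: false)]
            )
            // Adds ManipulationComponent plus the input-target and
            // collision components the gesture system requires.
            ManipulationComponent.configureEntity(box)
            content.add(box)
        }
    }
}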
A few points to keep in mind
It does not matter if an app is actively being run from Xcode.
When this occurs, it impacts every single app, not just one.
When this occurs, rotation and scaling continue to work, but the entity/view cannot be translated.
This impacts BOTH the SwiftUI version and the RealityKit version.
When this occurs, the only way to "fix" it is to reboot the device.
Hi everyone,
I'm encountering a memory overflow issue in my visionOS app, and I'd like to confirm whether this is expected behavior or if I'm missing something in my cleanup.
App Context
The app showcases apartments in real scale using AR.
Apartments are heavy USDZ models (hundreds of thousands of triangles, high-resolution textures).
Users can walk inside the apartments, and performance is good even close to hardware limits.
Flow
The app starts in a full immersive space (RealityView) for selecting the apartment.
When an apartment is selected, a new ImmersiveSpace opens and the apartment scene loads.
The scene includes multiple USDZ models, EnvironmentResources, and dynamic textures for skyboxes.
When the user dismisses the experience, we attempt cleanup:
Nulling out all entity references.
Removing ModelComponents.
Clearing cached textures and skyboxes.
Forcing dictionaries/collections to empty.
Despite this cleanup, memory usage remains very high.
Problem
After dismissing the ImmersiveSpace, memory does not return to baseline.
See the attached screenshot of the profiling captured in Instruments:
Initial state: ~30MB (main menu).
After loading models sequentially: ~3.3GB.
Skybox textures bring it near ~4GB.
After dismissing the experience (at ~01:00 mark): memory only drops slightly (to ~2.66GB).
When loading the second apartment, memory continues to increase until ~5GB, at which point the app crashes due to memory pressure.
The issue is consistently visible under VM: IOSurface in Instruments. No leaks are detected.
So it looks like RealityKit (or a lower-level framework) keeps meshes and textures cached and does not free them when the RealityView ends. But for my use case, these resources should be fully released once the ImmersiveSpace is dismissed, since each new apartment loads entirely different models and textures.
Cleanup Code Example
Here’s a simplified version of the cleanup I’m doing:
func clearAllRoomEntities() {
    for (entityName, entity) in entityFromMarker {
        entity.removeFromParent()
        if let modelEntity = entity as? ModelEntity {
            modelEntity.components.removeAll()
            modelEntity.children.forEach { $0.removeFromParent() }
            modelEntity.clearTexturesAndMaterials()
        }
        entityFromMarker[entityName] = nil
        removeSkyboxPortals(from: entityName)
    }
    entityFromMarker.removeAll()
}

extension ModelEntity {
    func clearTexturesAndMaterials() {
        guard var modelComponent = self.model else { return }
        // Drop every texture reference, then drop the materials themselves.
        for index in modelComponent.materials.indices {
            removeTextures(from: &modelComponent.materials[index])
        }
        modelComponent.materials.removeAll()
        self.model = modelComponent
        self.model = nil
    }

    private func removeTextures(from material: inout any Material) {
        if var pbr = material as? PhysicallyBasedMaterial {
            pbr.baseColor.texture = nil
            pbr.emissiveColor.texture = nil
            pbr.metallic.texture = nil
            pbr.roughness.texture = nil
            pbr.normal.texture = nil
            pbr.ambientOcclusion.texture = nil
            pbr.clearcoat.texture = nil
            material = pbr
        } else if var simple = material as? SimpleMaterial {
            simple.color.texture = nil
            material = simple
        }
    }
}
Questions
Is this expected RealityKit behavior (textures/meshes cached internally)?
Is there a way to force RealityKit to release GPU resources tied to USDZ models when they’re no longer used?
Should dismissing the ImmersiveSpace automatically free those IOSurfaces, or do I need to handle this differently?
Any guidance, best practices, or confirmation would be hugely appreciated.
Thanks in advance!
https://developer.apple.com/documentation/realitykit/videomaterial
The documentation: "Video materials support transparency if the source video’s file format also supports transparency."
I have a transparent video (Hand.mov, HEVC with alpha). I can show the video with a transparent background correctly in the Vision Pro simulator, but on a physical device the video has a black background. I'm sure the video format is OK because I can get the texture from the video and display it on an UnlitMaterial.
How can I show the transparency video correctly with the RealityKit/VideoMaterial?
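For reference, this is roughly the setup in question, as a minimal sketch assuming the movie is bundled and displayed on a plane:

import RealityKit
import AVFoundation

// Play an HEVC-with-alpha movie on a plane via VideoMaterial.
guard let url = Bundle.main.url(forResource: "Hand", withExtension: "mov") else {
    fatalError("Hand.mov missing from bundle")
}
let player = AVPlayer(url: url)
let material = VideoMaterial(avPlayer: player)
let plane = ModelEntity(
    mesh: .generatePlane(width: 0.5, height: 0.5),
    materials: [material]
)
player.play()
// Transparency works in the simulator; renders a black background on device.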
Is there any interest in this forum from those developing for the spatial web and Safari? I can't seem to find any relevant posts here.
Since updating to iOS 26.0 (and confirmed on 26.1), ARBodyTrackingConfiguration no longer detects a valid ARBodyAnchor on devices with LiDAR (e.g., iPhone 15 Pro, iPhone 17 Pro Max).
This issue reproduces in custom projects and Apple’s official sample “Capturing Body Motion in 3D”.
The AR session runs normally, but the delegate call:
func session(_ session: ARSession, didUpdate anchors: [ARAnchor])
never yields an ARBodyAnchor with valid joint transforms.
All joints return nil when calling:
body.skeleton.modelTransform(for: jointName)
resulting in 0 valid joints per frame.
Environment
• Device: iPhone 17 Pro Max (LiDAR)
• iOS: 26.0 / 26.1
• Xcode: 16.0 (stable)
• Framework: ARKit + RealityKit
• Configuration used:
config.worldAlignment = .gravityAndHeading
config.isAutoFocusEnabled = true
config.environmentTexturing = .none
session.run(config)
Also tested: with and without frameSemantics = .bodyDetection
Expected Behavior
ARBodyAnchor should be detected and body.skeleton should contain ~89 valid joints with continuous updates.
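For reference, a condensed sketch of the repro, assuming a plain ARSession delegate; the joint list is illustrative, not exhaustive:

import ARKit

final class BodyTrackingChecker: NSObject, ARSessionDelegate {
    let session = ARSession()

    func start() {
        guard ARBodyTrackingConfiguration.isSupported else { return }
        let config = ARBodyTrackingConfiguration()
        config.worldAlignment = .gravityAndHeading
        session.delegate = self
        session.run(config)
    }

    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        for case let body as ARBodyAnchor in anchors {
            // Count joints that currently report a valid model transform.
            let joints: [ARSkeleton.JointName] = [.root, .head, .leftHand, .rightHand]
            let valid = joints.compactMap { body.skeleton.modelTransform(for: $0) }
            print("valid joints: \(valid.count)")  // 0 on iOS 26.0/26.1 per the report
        }
    }
}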
Using the example code posted here:
https://developer.apple.com/documentation/visionOS/tracking-images-in-3d-space
I can register multiple ReferenceImages with an ImageTrackingProvider, but only one updates at a time: to get real-time updates, I can only have one ImageAnchor in my field of view at a time.
Is it possible to track multiple ImageAnchors at the same time in the same field of view? That is, can the Apple Vision Pro track several ImageAnchors and update entities to the anchors' transforms in the same frame?
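For context, a condensed sketch of the setup from that sample, assuming a reference-image group named "ReferenceImages" in the asset catalog:

import ARKit

func trackImages() async throws {
    // Register several reference images with one provider and watch updates.
    let provider = ImageTrackingProvider(
        referenceImages: ReferenceImage.loadReferenceImages(inGroupNamed: "ReferenceImages")
    )
    let session = ARKitSession()
    try await session.run([provider])

    for await update in provider.anchorUpdates {
        // On device, only one anchor appears to receive updates at a time,
        // even when several registered images are in view.
        print(update.anchor.id, update.anchor.isTracked)
    }
}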
For the M2 Apple Vision Pro, there's a general guideline: "we recommend no more than 500 thousand triangles for an immersive scene, with 250 thousand for applications in the shared space" (https://developer.apple.com/videos/play/wwdc2024/10186/?time=147).
Is there a revised recommendation for the M5 Apple Vision Pro?
After adding TextComponents to my entities on visionOS, I have observed that visualBounds ignores the TextComponents.
The documentation states that a TextComponent should render a rounded-rectangle mesh. These meshes are visible on the device, but they are not visible in the debugger ("Capture Entity Hierarchy") and are ignored by visualBounds.
Am I missing something?
static func makeDirection(_ direction: Direction) -> Entity {
    let text = Entity()
    text.name = direction.rawValue
    text.setScale(SIMD3(repeating: 5), relativeTo: nil)
    text.transform.rotation = direction.rotation
    text.components.set(direction.textComponent)
    return text
}
My workaround is to add a disabled ModelEntity and take its bounds (sketched below) 😬
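Roughly, the workaround looks like this, reusing the names from the snippet above and assuming a generated text mesh approximates the TextComponent's extent:

// Attach a disabled ModelEntity with a text mesh of the same string
// and measure that instead of the TextComponent itself.
let proxy = ModelEntity(mesh: .generateText(direction.rawValue))
proxy.isEnabled = false
text.addChild(proxy)
let bounds = proxy.visualBounds(relativeTo: nil)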
Hi guys,
I noticed that Apple created a really engaging visual effect for browsing spatial videos in the app. The video appears embedded in a glass panel with glowing edges, and it even shows a parallax effect as you move around. When I tried to display a stereo video using RealityView, however, the video entity always floats above the panel.
How does visionOS implement this effect? Is there an approach, or example code, I could use to achieve it in my own app?
Thanks!
Hello, I am trying to build an AVP app for real-time "zero-latency" spatial video streaming. I am trying to figure out, at a high level, the best way to do this.
Currently this is my method:
Server sends stereo images via a WebRTC service (e.g., LiveKit).
The WebRTC stream is converted to CVPixelBuffers, which are written to a file, played via AVPlayer, and displayed by applying a VideoMaterial to a plane entity.
However, this is a bit hacky, and it seems like it won't be compatible with Apple's spatial experiences. To my understanding, Apple supports HLS streaming for spatial experiences and APMP content. However, HLS (and even Low-Latency HLS) introduces a second or more of latency, likely due to the segmented nature of HLS. Thus, HLS will not work for us.
An alternative I've considered is streaming the live video via WebRTC from the server to a local computer on the AVP's network, then using LL-HLS to stream from that computer to the Vision Pro. Still, it seems this would introduce latency on the order of seconds.
Is my current approach the best way to implement this? Or could anyone suggest a better way, perhaps something compatible with the AVP's spatial experiences?
Game Controller Input Limitations in visionOS Volumetric Windows
Hello Apple Developer Community,
I'm developing a game for visionOS and have encountered significant limitations with game controller input when using volumetric windows (WindowGroup with .volumetric style). I'd appreciate clarification on whether this is expected behavior and any guidance on best practices.
🧩 Issue Summary
When using a DualSense controller with a volumetric window in visionOS, only a subset of controller inputs are available to the app. The remaining inputs appear to be reserved by the system for UI navigation.
✅ Working Inputs (Volumetric Window)
D-Pad (all directions)
L3 (left thumbstick button click)
R3 (right thumbstick button click)
Menu button
Options button
❌ Not Working Inputs (Volumetric Window)
Left thumbstick analog movement (used for UI scrolling instead)
Right thumbstick analog movement (used for UI scrolling instead)
Face buttons (Cross, Circle, Square, Triangle / A, B, X, Y)
Shoulder buttons (L1, R1)
Triggers (L2, R2)
Key observation: When moving the left thumbstick in a volumetric window, the window's UI scrolls vertically instead of sending input to my app's GameController handlers. Similarly, face buttons seem to be reserved for system UI interactions.
⚙️ Implementation Details
I'm using the standard GameController framework (handler setup sketched after this list):
Connect to controller via GCController.controllers()
Access extendedGamepad profile
Set up valueChangedHandler and pressedChangedHandler for all inputs
Handlers confirmed registered via logging
Working inputs (D-Pad, L3, R3) trigger immediately and consistently
Non-working inputs (thumbsticks, face buttons) never trigger
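For reference, a condensed sketch of that handler setup, with print statements standing in for game logic:

import GameController

func configure(_ controller: GCController) {
    guard let gamepad = controller.extendedGamepad else { return }
    gamepad.dpad.valueChangedHandler = { _, x, y in
        print("dpad:", x, y)            // fires in both window types
    }
    gamepad.leftThumbstick.valueChangedHandler = { _, x, y in
        print("left stick:", x, y)      // never fires in a volumetric window
    }
    gamepad.buttonA.pressedChangedHandler = { _, _, pressed in
        print("A pressed:", pressed)    // never fires in a volumetric window
    }
}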
🧠 Critical Finding: ImmersiveSpace Works Perfectly
When testing the exact same code in an ImmersiveSpace (.mixed immersion style), all controller inputs work perfectly:
✅ Both thumbsticks provide full analog input
✅ All face buttons trigger their handlers
✅ All shoulder buttons and triggers work correctly
✅ 100% success rate with no intermittent issues
This suggests the issue isn't with my code, but rather how visionOS handles controller input differently between Volumetric Windows and ImmersiveSpace.
🧪 Test Environment
I created a minimal test project (Controller-Playground) to isolate the issue:
A simple ControllerTester class that registers all GameController handlers
A visual UI showing real-time input state
No game logic, RealityKit physics, or other complexity
Results
In volumetric window: Only D-Pad, L3, R3, Menu, Options work
In ImmersiveSpace: All inputs work perfectly
This confirms the limitation exists at the visionOS platform level, not in app code.
🧰 Attempted Workarounds
I tried the following without success:
Setting GCSupportsControllerUserInteraction = false in Info.plist
Setting UIRequiresFullScreen = true
Changing window styles (.plain, .volumetric)
Polling vs. handler-based input approaches
Various threading models (MainActor, separate thread)
Result: The only way to enable full controller support is to switch to ImmersiveSpace.
❓ Questions for Apple
Is this input reservation behavior in volumetric windows intended and documented?
Are game controllers expected to have limited functionality in volumetric windows while full functionality is reserved for ImmersiveSpace?
Is there a way to request full controller input access in a volumetric window, or is ImmersiveSpace the only option for complete controller support?
Where can I find official documentation about controller input differences between window types?
Are there any APIs or configuration options to disable system controller shortcuts in volumetric windows?
🎯 Impact
This limitation has a significant effect on game design and architecture:
Volumetric windows offer a multitasking-friendly, less immersive experience
ImmersiveSpace provides full controller support but may be more immersive than some games require
Games that only need basic D-Pad and button input can work fine in volumetric windows
Games requiring analog sticks or face buttons must currently use ImmersiveSpace
It would be very helpful if Apple could clarify or reference existing documentation regarding controller input handling in different visionOS window types. If such documentation doesn't exist yet, it might be valuable to include this information in future developer guides or best-practice documents.
🕹 Current Workaround
For now, I'm using:
D-Pad for character movement (digital 8-direction)
R3 (right stick click) as a substitute for the "X" button
This setup allows the game to function within a volumetric window, though full controller support still requires ImmersiveSpace.
📄 Request
If this is expected behavior, I may have simply missed the relevant documentation — could you please point me to any existing resources that explain this design?
If there isn't one yet, it would be great if future visionOS documentation could:
Clearly outline controller input behavior across window types
Provide guidance on when to use Volumetric Windows vs. ImmersiveSpace for games
Consider adding an API option to request full controller access when appropriate
If this is not expected behavior, I'm happy to file a detailed bug report with sample code.
💻 System Information
visionOS: Latest Simulator
Xcode: Latest version
Controller: Sony DualSense
Framework: GameController (standard extendedGamepad profile)
Test project: Minimal reproducible example available
Thank you for any clarification or guidance you can provide. This information would be valuable for many developers working on visionOS games.
Hi!
I attempted running a sample project for detecting human pose in 3D with vision framework, that can be found here: https://developer.apple.com/documentation/vision/detecting-human-body-poses-in-3d-with-vision.
It works perfectly on my MacBook Pro M1, but fails on Apple Vision Pro. After selecting a photo, an endless loading screen is displayed, and the following messages are produced in the console:
Failed to initialize 2D Detection Algorithm.
Failed to initialize 2D Pose Estimation Algorithm.
Failed to initialize algorithm modules
Network path is nil: (null)
Failed to initialize 2D Detection Algorithm.
Failed to initialize 2D Pose Estimation Algorithm.
Failed to initialize algorithm modules
Unable to perform the request: Error Domain=com.apple.Vision Code=9 "Async status object reported as failed but without an error" UserInfo={NSLocalizedDescription=Async status object reported as failed but without an error}.
de-activating session 70138 after timeout
Is human pose detection expected to work on visionOS? Is there any special configuration required that I might be missing?
When assigning a ManipulationComponent to an Entity, SceneEvents.WillRemoveEntity is raised for that Entity.
Expected behavior: the Entity is not removed from the Scene (even temporarily), and no SceneEvents are triggered as a result of assigning a ManipulationComponent.
FB20872220
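A minimal sketch of how to observe this, assuming a RealityView whose content and entity already exist:

import RealityKit

// Subscribe inside the RealityView's make/update closure; retain the
// subscription so it stays alive.
let subscription = content.subscribe(to: SceneEvents.WillRemoveEntity.self) { event in
    print("WillRemoveEntity fired for \(event.entity.name)")
}
// Unexpectedly triggers the event above, even though the entity stays in the scene.
entity.components.set(ManipulationComponent())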
Sorry for the cross-post, but it's now two days in and this isn't fixed.
If you try to use Xcode 16.3b3 with visionOS, it won't download the visionOS SDK; it gives a 'network error', so you can't use the latest beta for Apple Vision Pro.
FB16927025
FB16917874
FB16910449
Hi, I have a monitoring app that takes input video from UVC, processes it using Metal, and eventually produces an MTLTexture.
The problem I'm facing is that I have to convert the MTLTexture to a CGImage and then call TextureResource.replace, which is super slow. Metal processing keeps up with the input frame rate (50 fps), but the MTLTexture -> CGImage -> TextureResource path only reaches 7 fps...
Is there any way I can make it faster?
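One possible faster path, offered as a sketch rather than a confirmed fix: keep frames on the GPU by presenting the processed MTLTexture through a TextureResource.DrawableQueue, skipping the CGImage round trip. The pixel format and dimensions below are placeholders, and the queue is assumed to be attached to the material's texture via textureResource.replace(withDrawables:).

import RealityKit
import Metal

// Create a drawable queue once and attach it to the texture the material samples.
func makeDrawableQueue() throws -> TextureResource.DrawableQueue {
    let desc = TextureResource.DrawableQueue.Descriptor(
        pixelFormat: .bgra8Unorm,      // placeholder; match your pipeline
        width: 1920, height: 1080,     // placeholder dimensions
        usage: [.shaderRead, .renderTarget],
        mipmapsMode: .none
    )
    return try TextureResource.DrawableQueue(desc)
}

// Per frame: blit the processed texture into the next drawable.
func present(_ source: MTLTexture,
             via queue: TextureResource.DrawableQueue,
             on commandQueue: MTLCommandQueue) throws {
    let drawable = try queue.nextDrawable()
    guard let commandBuffer = commandQueue.makeCommandBuffer(),
          let blit = commandBuffer.makeBlitCommandEncoder() else { return }
    blit.copy(from: source, to: drawable.texture)
    blit.endEncoding()
    commandBuffer.commit()
    drawable.present()
}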
I'm getting the following error message when compiling the Apple-provided Spaceship game sample for the Apple Vision Pro. I've already tried deleting derived data, resetting the package cache, and restarting Xcode, but I still get the following error: [xrsimulator] Exception thrown during compile: Cannot get rkassets content for path /Users/myoungkang/Downloads/CreatingASpaceshipGame/Packages/Studio/Sources/Studio/Studio.rkassets because 'The file “Studio.rkassets” couldn’t be opened because you don’t have permission to view it.'
error: Tool exited with code 1
In visionOS, I'm trying to create an immersive environment featuring several spheres inside which immersive movies are visible. I started from sample code that creates a sphere, sets an immersive movie as its material, and opens it as an immersive environment. This works fine.
But if I create a sphere in an open immersive environment using Reality Composer Pro and set its material to an immersive movie, I can see the movie on the sphere while I am outside of it, but if I try to get inside the sphere, it disappears. What would be the right way of doing this?
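One common approach, offered as a sketch under the assumption that the sphere is built in code (the radius, movieURL, and material are placeholders): flip the sphere along one axis so its surface renders facing inward, which keeps the movie visible from inside.

import RealityKit
import AVFoundation

// Placeholder material; any immersive movie material should behave the same.
let player = AVPlayer(url: movieURL)   // movieURL assumed defined elsewhere
let videoMaterial = VideoMaterial(avPlayer: player)

// A sphere large enough to stand inside, flipped along X so its
// triangles face inward instead of being culled from the inside.
let sphere = ModelEntity(
    mesh: .generateSphere(radius: 5),
    materials: [videoMaterial]
)
sphere.scale = SIMD3<Float>(-1, 1, 1)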
At a recent community meeting, we were wondering how Apple creates the soft-edge effect around occlusion cutouts. We see this effect on keyboard cutouts, iPhone cutouts, and in progressive spaces.
An example: notice the soft edges around the occlusion cutout for the keyboard.
One of our members created some Shader Graph materials to explore soft edges. These work by sending data into the opacity channel of the PreviewSurface node.
Unfortunately, the Occlusion Surface nodes lack any sort of input. If you know how to blend these concepts with RealityKit Occlusion, please let us know!
I sketched an idea for a project in Reality Composer on my iPad, thinking that when I had a chance to sit down, I would work it up in Xcode.
However, when I got back to my computer, I discovered I cannot open a file created in Reality Composer (or the exported Reality file) in Reality Composer Pro.
Am I missing something obvious here? Because this seems like a huge oversight.
If anyone can let me know how to open a file created in Reality Composer in Reality Composer Pro, I would greatly appreciate it, partly because there seem to be objects available in Reality Composer that are not in Reality Composer Pro.
Thanks
Stan
I've got an immersive scene that I want to be able to bring additional users into via SharePlay, where each user would be able to see (and hopefully interact with) the immersive scene. How does one implement that?
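The usual starting point is the GroupActivities framework. A minimal sketch with a hypothetical activity type follows; syncing entity transforms across participants (e.g., via GroupSessionMessenger) is the part left to the app:

import GroupActivities

// A hypothetical activity that represents joining the shared immersive scene.
struct SharedImmersiveActivity: GroupActivity {
    var metadata: GroupActivityMetadata {
        var meta = GroupActivityMetadata()
        meta.title = "Shared Immersive Scene"
        meta.type = .generic
        return meta
    }
}

// Activate during a FaceTime call; then await GroupSession instances via
// SharedImmersiveActivity.sessions() and sync entity state between users.
func startSharing() async throws {
    _ = try await SharedImmersiveActivity().activate()
}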