Parallel/Stream processing of Apple Intelligence

I have built a macOS machine-intelligence application that uses Apple Intelligence. Part of the application preprocesses text, and for longer content I have implemented chunking to stay within the token limit. However, the application's performance is now limited by the fact that Apple Intelligence processes requests sequentially, which has a large impact.
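For context, a minimal sketch of the kind of chunking described, splitting on paragraph boundaries under a rough size budget. The `maxChars` parameter is a hypothetical stand-in for the model's real token limit, since tokens are not directly countable from here:

```swift
import Foundation

// Split text into chunks no larger than maxChars, breaking on
// paragraph boundaries. maxChars is a hypothetical stand-in for
// the model's actual token budget.
func chunk(_ text: String, maxChars: Int) -> [String] {
    var chunks: [String] = []
    var current = ""
    for paragraph in text.components(separatedBy: "\n\n") {
        if current.isEmpty {
            current = paragraph
        } else if current.count + paragraph.count + 2 <= maxChars {
            current += "\n\n" + paragraph
        } else {
            chunks.append(current)
            current = paragraph
        }
    }
    if !current.isEmpty { chunks.append(current) }
    return chunks
}
```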

Is there any approach to operating Apple Intelligence in a parallel mode, or even through a streaming interface? Since Apple Intelligence has Private Cloud Compute, I was hoping to send multiple chunks in parallel, as that would significantly improve performance.
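If parallel requests were supported, the natural shape would be a task group with one session per chunk. A sketch assuming the Foundation Models API (`LanguageModelSession`, `respond(to:)`); in practice the on-device model serializes these requests, which is exactly the limitation being asked about:

```swift
import FoundationModels

// Sketch: one session per chunk, submitted concurrently via a task
// group. Results are reassembled in chunk order. Note: the on-device
// model currently serializes requests, so this gains no throughput.
func summarizeChunks(_ chunks: [String]) async throws -> [String] {
    try await withThrowingTaskGroup(of: (Int, String).self) { group in
        for (index, chunk) in chunks.enumerated() {
            group.addTask {
                let session = LanguageModelSession()
                let response = try await session.respond(to: "Summarize:\n\(chunk)")
                return (index, response.content)
            }
        }
        var results = [String?](repeating: nil, count: chunks.count)
        for try await (index, text) in group {
            results[index] = text
        }
        return results.compactMap { $0 }
    }
}
```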

Any suggestions would be welcome. This could also be considered a request for a future enhancement.

Would you mind sharing which APIs you are using? If you are using the Foundation Models framework, please see the discussion here.

Best,
——
Ziqiao Chen
Worldwide Developer Relations.

Ziqiao,

Thanks, I am using the Foundation Models framework. I read the discussion you pointed to, thank you. I understand the limitation is on-device resources, but I was hoping that Private Cloud Compute could be leveraged once resources on the device were used up. It looks like I cannot really use Apple Intelligence for this and will need to move to other LLMs.
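On the streaming half of the question: the framework does expose streaming for a single request via `streamResponse(to:)`, which improves perceived latency even though it does not provide cross-chunk parallelism. A sketch, assuming each streamed element is a cumulative snapshot of the text generated so far:

```swift
import FoundationModels

// Stream one response: partial snapshots arrive as the model
// generates. This helps latency on a single chunk, but chunks
// themselves are still processed one at a time.
func streamSummary(of text: String) async throws {
    let session = LanguageModelSession()
    let stream = session.streamResponse(to: "Summarize:\n\(text)")
    for try await partial in stream {
        print(partial)  // cumulative snapshot so far
    }
}
```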
