FEFreeExamDumps.in

Microsoft Certified: Azure AI Apps and Agents Developer Associate

Topic 1

Question 17

You have a Microsoft Foundry project that serves a high-volume chat app. Most requests are simple FAQs, but some require advanced reasoning. You need to reduce costs and latency for common queries, without degrading the quality of the responses to complex questions. What should you do?

  • A. Route all the requests to a smaller model.
  • B. Use a model cascade that routes the requests to different models. ✓
  • C. Increase the value of the max_tokens parameter for all the requests.
  • D. Route all the requests to the most capable model.
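
The idea behind option B can be sketched in a few lines of Python. This is a minimal illustration, not Foundry's own routing feature. The names Tier, run_cascade, small_model, large_model, the FAQ table, and the confidence thresholds are all hypothetical stand-ins; in a real app each callable would wrap a call to a deployed model, and the confidence signal would come from whatever check you trust (a classifier, a self-evaluation prompt, or a retrieval hit).

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumption: a model is any callable returning (answer, confidence in [0, 1]).
ModelFn = Callable[[str], Tuple[str, float]]


@dataclass
class Tier:
    name: str
    call: ModelFn
    min_confidence: float  # escalate to the next tier if confidence is below this


def run_cascade(prompt: str, tiers: List[Tier]) -> Tuple[str, str]:
    """Try tiers from cheapest to most capable; return (tier name, answer)."""
    if not tiers:
        raise ValueError("at least one tier is required")
    answer = ""
    for tier in tiers:
        answer, confidence = tier.call(prompt)
        if confidence >= tier.min_confidence:
            return tier.name, answer
    # No tier met its threshold: keep the last (most capable) tier's answer.
    return tiers[-1].name, answer


# Hypothetical stand-ins for real model deployments.
FAQ = {"what are your opening hours?": "We are open 9:00 to 17:00, Monday to Friday."}


def small_model(prompt: str) -> Tuple[str, float]:
    """Cheap, fast tier: confident only on known FAQ entries."""
    key = prompt.strip().lower()
    if key in FAQ:
        return FAQ[key], 0.95
    return "I'm not sure.", 0.2


def large_model(prompt: str) -> Tuple[str, float]:
    """Expensive tier: placeholder for a more capable reasoning model."""
    return f"[detailed reasoning for: {prompt}]", 0.9


TIERS = [Tier("small", small_model, 0.8), Tier("large", large_model, 0.0)]

if __name__ == "__main__":
    for question in ("What are your opening hours?",
                     "Compare two pricing plans over three years."):
        print(run_cascade(question, TIERS))
```

In this sketch the FAQ question is answered by the small tier and never reaches the large one, while the comparison question falls through to the large tier. That is the trade-off the question asks for: options A and D each send every request to a single model, and option C raises the output-token limit, which neither lowers cost nor reduces latency.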