Question 54
You have a Microsoft Foundry project that contains an agent. The agent uses a knowledge source built from documents stored in Azure Blob Storage. The documents include digitally scanned PDFs that contain multipage tables. You have an ingestion job that extracts only plain text, causing loss of table structure, headings, and page-number metadata. Users frequently ask questions that require the retrieval of specific table rows across the pages. You need to configure an ingestion job for a Retrieval Augmented Generation (RAG) pipeline that performs optical character recognition (OCR) on scanned PDFs, preserves tables and headings as structure-aware chunks, and stores page-number metadata with each chunk. How should you configure the ingestion job?
- A. Use advanced data parsing to reingest the documents.
- B. Use OCR and page-level chunking. ✓
- C. Use page-level OCR extraction and store each page as a single chunk.
- D. Use basic parsing and fixed-size chunking.
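
To make the requirement in the question concrete, here is a minimal Python sketch of what "structure-aware chunks with page-number metadata" could look like. It is not a Microsoft Foundry or Azure API: the `Element` and `Chunk` types and the `build_chunks` function are hypothetical names, and the input is assumed to be the output of some OCR/layout-analysis step that has already labeled headings, paragraphs, and table rows with their page numbers.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """One layout element from a hypothetical OCR/layout-analysis step."""
    kind: str   # "heading", "paragraph", or "table_row"
    text: str
    page: int


@dataclass
class Chunk:
    """A structure-aware chunk: one text block or one whole table."""
    heading: str                                 # nearest preceding heading
    kind: str                                    # "text" or "table"
    lines: list = field(default_factory=list)
    pages: list = field(default_factory=list)    # page numbers, in order seen


def build_chunks(elements):
    """Group elements into chunks without splitting a table at page breaks."""
    chunks = []
    heading = ""
    current = None
    for el in elements:
        if el.kind == "heading":
            heading = el.text
            current = None  # a new heading always starts a new chunk
            continue
        kind = "table" if el.kind == "table_row" else "text"
        if current is None or current.kind != kind:
            current = Chunk(heading=heading, kind=kind)
            chunks.append(current)
        current.lines.append(el.text)
        if el.page not in current.pages:
            current.pages.append(el.page)
    return chunks


# Illustrative input only: a table that continues from page 1 onto page 2.
elements = [
    Element("heading", "Quarterly totals", 1),
    Element("paragraph", "Figures are in thousands.", 1),
    Element("table_row", "Q1 | 120", 1),
    Element("table_row", "Q2 | 135", 2),
    Element("heading", "Notes", 2),
    Element("paragraph", "Q2 includes a one-off adjustment.", 2),
]
chunks = build_chunks(elements)
for chunk in chunks:
    print(chunk.heading, chunk.kind, chunk.pages, chunk.lines)
```

In this sketch the multipage table stays in a single chunk whose `pages` list records every page it spans, so a retriever could return a specific row together with its heading and page numbers. Fixed-size or one-chunk-per-page splitting would instead cut the table at the page boundary.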