LlamaIndex’s Post

199,100 followers

A hidden release that makes LlamaParse way better at RAG over complex documents: we’ve made huge improvements to markdown-based table reconstruction - this allows you to parse very complex tables while making sure the rows/columns are well aligned! 🔨 Check out the results below (source image, before fix, after fix). Tutorial on how to build advanced RAG over your PDFs with lots of text and complex tables: https://lnkd.in/g5P2CMUA
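To make "well aligned" concrete, here is a minimal sketch of one way to sanity-check a markdown table after parsing: every data row should have the same number of cells as the header. This is an illustrative check only, not how LlamaParse reconstructs tables. The function names (`parse_markdown_table`, `is_aligned`) and the sample tables are hypothetical, and the parser assumes simple pipe-delimited tables with no escaped pipes inside cells.

```python
def parse_markdown_table(md: str) -> list[list[str]]:
    """Split a pipe-delimited markdown table into rows of cell strings.

    Assumes each table line starts with '|' and cells contain no escaped pipes.
    """
    rows = []
    for line in md.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # ignore anything that is not a table line
        line = line[1:]  # drop the leading pipe
        if line.endswith("|"):
            line = line[:-1]  # drop one trailing pipe, if present
        cells = [cell.strip() for cell in line.split("|")]
        # Skip the header separator row, e.g. | --- | ---: |
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue
        rows.append(cells)
    return rows


def is_aligned(rows: list[list[str]]) -> bool:
    """Return True if the table is non-empty and every row matches the header width."""
    return len(rows) > 0 and all(len(row) == len(rows[0]) for row in rows)


# Hypothetical sample data: the second table is missing a cell in its last row.
ALIGNED = """
| Item | Q1 | Q2 |
| --- | ---: | ---: |
| Revenue | 10 | 12 |
| Costs | 4 | 5 |
"""

MISALIGNED = """
| Item | Q1 | Q2 |
| --- | ---: | ---: |
| Revenue | 10 | 12 |
| Costs | 4 |
"""
```

A check like this could run on parser output before indexing, so that tables with dropped or shifted cells are flagged for review instead of being silently embedded.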

Karthik Rajan Venkatesan

Computational Mechanics | Materials Science | Engineering & Manufacturing | Generative AI.

1mo

Do you have any recommendations on RAG workflow for parsing pdfs with plots?

Refat Ametov

Driving Business Automation & AI Integration | Co-founder of Devstark and SpreadSimple | Stoic Mindset

1mo

This is fantastic news! The improvements to markdown-based table reconstruction will surely enhance the capabilities of LlamaParse in handling complex documents. How do these changes impact the overall performance and speed of the parsing process?

Norbert Kocoń

Data Scientist @ NorthGravity

1mo

Hi LlamaIndex, we discovered that some document text extraction solutions are not always accurate. We tested SimpleDirectoryReader and LlamaParse. The problem arises when columns have multiple sub-columns (common in financial reports) or when multiple rows are stacked into one. I suspect that additional OCR enhancement could help the model recognize the text more accurately, but probably at the cost of time. Time that stakeholders do not want to waste. 🙄

Yash Y.

Machine Learning Software Engineer | Ex-Data Science Engineer | Ex-R Instructor for Biostatistics | NLP | Machine Learning | Computer Vision | PyTorch | Tensorflow | Deep Neural Networks | Information Retrieval

1mo

I am curious: I have also built something similar to this, but without LlamaParse. I want to know what makes it really good at reading complex tables. I have used Table Transformers in combination with Camelot, and with that I achieve 87%+ accuracy on really complex documents. What’s the backbone model used in this?

Retrieval-Augmented Generation (RAG) significantly enhances document analysis by enabling more accurate and comprehensive retrieval of information from large datasets. It combines the retrieval of relevant documents with generative models to synthesize coherent responses, improving the understanding of complex documents and data analysis.

Amar Harolikar

Specialist: Decision Sciences & Generative AI | Retail Banking

1mo

Wonderful. Been using LlamaParse since its release...now my go-to tool for parsing complex PDFs, especially Annual Reports and 10-Ks. Unmatched results. Easy to use via API and amazing markdown outputs, especially for tables.

Viky Wahjoedin

Full-stack Developer at GRINDA AI

1mo

Please run this on a public PDF benchmark so we can see how it compares to other solutions quantitatively.

Victory Adugbo

Growth Marketing Leader & Business Developer || Expert in Hacking Business Growth in AI, Web3, and FinTech Companies || Automation Expert

1mo

Retrieval-Augmented Generation (RAG) benefits AI applications by enhancing accuracy, efficiency, and scalability. It improves response precision and contextual relevance, making it crucial for customer service and content creation.

Retrieval-Augmented Generation (RAG) impacts AI learning efficiency by dynamically incorporating relevant external knowledge during the learning process. This approach enhances the model's understanding and reasoning capabilities, leading to more accurate and contextually relevant outputs.
