🔎 #DruidSummit Speaker Spotlight: Imply's Abhishek Balaji Radhakrishnan – Ingesting Delta Lake tables into Druid Level: 🟠 Intermediate https://bit.ly/4dBxZsm
-
Did you know how easy it is to create a stacked bar chart in DataLion? Here’s everything you need to get started. #dashboards #datavisualization
-
Principal Program Manager @ Microsoft, Azure Data CAT | Spark | Lakehouse | Blogger on all things Big Data
UPDATE: Release 0.1.2 of onelake-shortcut-tools is out. It includes the updated Delta Lake 3.1 features available in Fabric Runtime 1.3 Preview (liquid clustering, default columns). Need to evaluate which external tables can be read from and written to in Fabric? Just install my library via pip and get a clean compatibility report. See the blog in the comments for how to run it. #deltalake
-
Managing Delivery Architect at Capgemini with expertise in Azure Databricks and Data Engineering. I teach Azure Data Engineering and Databricks!
It took me a while to learn this: for Delta tables that use Hive-style partitioning instead of Liquid Clustering, running an OPTIMIZE command will not compact smaller files if the table is over-partitioned. This is because files cannot be combined or compacted across partition boundaries. As a result, you will experience poor performance when querying that Delta table unless you perform a full rewrite of the table with a new partition scheme or implement Liquid Clustering. #dataengineering
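To make the partition-boundary point concrete, here is a minimal sketch of a compaction planner that only merges files within the same partition. It is a toy model, not Delta Lake's actual OPTIMIZE implementation: the function name `plan_compaction`, the `(partition, size)` file tuples, and the 128 MB target are all assumptions chosen for illustration.

```python
from collections import defaultdict


def plan_compaction(files, target_size):
    """Plan which small files to merge, never mixing partitions.

    files: list of (partition, size_mb) tuples (hypothetical layout).
    Returns a list of bins; each bin is a list of files to merge into one.
    """
    by_partition = defaultdict(list)
    for partition, size in files:
        by_partition[partition].append((partition, size))

    bins = []
    for group in by_partition.values():
        current, current_size = [], 0
        for f in sorted(group, key=lambda item: item[1]):
            # Start a new bin once adding this file would exceed the target
            if current and current_size + f[1] > target_size:
                bins.append(current)
                current, current_size = [], 0
            current.append(f)
            current_size += f[1]
        if current:
            bins.append(current)

    # A bin with a single file is not compaction work: nothing to merge
    return [b for b in bins if len(b) > 1]


# Over-partitioned: four partitions holding one 1 MB file each
over_partitioned = [(f"day={d}", 1) for d in range(4)]
# Same four files in a single partition
single_partition = [("all", 1) for _ in range(4)]

print(plan_compaction(over_partitioned, 128))   # no bins: nothing can be merged
print(plan_compaction(single_partition, 128))   # one bin containing all four files
```

In this model the over-partitioned table yields no merge work at all, because each partition already holds a single file, while the same data in one partition collapses into a single bin.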
-
Python developer | Programmer | Data science | Data analyst | MySQL | Pandas | NumPy | Jupyter | BCA from Virendra Swarup Institute of Computer Studies | MCA from Asian International University
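A program to generate multiplication tables from 2 to 20 and write each one to a separate file.

import os

def generateTable(n):
    # Build the multiplication table for n as a single string
    table = ""
    for i in range(1, 11):
        table += f"{n} X {i} = {n*i}\n"
    with open(f"tables/table_{n}.txt", "w") as f:
        f.write(table)

# open() fails if the output directory does not exist, so create it first
os.makedirs("tables", exist_ok=True)
for i in range(2, 21):
    generateTable(i)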
-
Technical Lead 👨💻 | Data Engineer 📊 | Data Analysis 📈| Business Intelligence 🤖 | Cloud Engineering ☁️ | snowflake ❄️ | AWS 🌨️ | ETL 💠 | IICS 📑|Databricks
Hello everyone, in this presentation I discuss three ways to convert Parquet tables to Delta tables, depending on the use case. The three methods are "CONVERT TO DELTA", "shallow cloning", and "deep cloning".
Convert Parquet Tables into Delta Tables
https://www.loom.com
-
I screwed up. I implemented the change. Microbenchmarks were promising, but real SQL benchmarks with serious datasets were disappointing. Some queries were running three times slower. It felt awful, I felt awful! I thought I would document my findings, blame pointer-chasing for the slowdown, and move on. But then my dear #QuestDB colleagues Andrei and Vlad pushed me to investigate a bit more. Sure enough, the slowdown was not inherent to the design, but was merely due to an implementation bug. The bug was subtle enough to go undetected in tests with a limited dataset size. After fixing the bug, the SQL benchmarks showed a nice 30% speed-up in some queries. Moral of the story: 1. Microbenchmarks are fine, but nothing replaces real-world-like datasets. 2. Do not give up. 3. Benchmark numbers are just data. To gain reusable insights, you need to follow up on why the numbers are the way they are ;-)
Jaromir Hamala is cooking something fancy. That's a specialized single string key hash table for the #QuestDB query engine. The idea is to avoid redundant allocations and memcpy and, instead, store mmapped memory pointers in the table entries. The memory footprint is much lower and the table is more efficient.
-
UG-CSE'27 || AI Enthusiast | SQL | Passionate about problem solving and DSA | C, C++, Java, Python | Looking for Internship
#Day27 of #MonthOfGraphs: LQ. 1319) Number of Operations to Make Network Connected (Medium). Today I solved a question related to spanning trees, which are part of graph theory, and I used Kruskal's algorithm to solve it. I first attempted it with my own logic and passed many test cases, but one case kept getting rejected. When I looked into the discussion, I saw that many others were facing the same situation, so I watched Striver's approach. He used the same logic, but with one small change. I corrected my solution by analyzing some more examples, and it was accepted. #DFS #BFS #GraphAlgorithms #Undirected #LearningDSA Raj Vikramaditya Mani Bhargavi Bendalam, Pradeep Kumar Puvvala, Praveen Kumar Inti, Sahu Akshaya
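The post does not include its code, so the following is only a sketch of one common way to solve this kind of problem. It assumes the usual statement: `n` computers numbered 0 to n-1, a list of cable `connections`, and the goal of finding the minimum number of cables to move so that everything is connected, or -1 if that is impossible. It uses the union-find structure that underlies Kruskal's algorithm; the function name `make_connected` is an illustrative choice.

```python
def make_connected(n, connections):
    """Return the minimum number of cable moves, or -1 if impossible."""
    # Connecting n nodes needs at least n - 1 cables in total
    if len(connections) < n - 1:
        return -1

    parent = list(range(n))

    def find(x):
        # Walk up to the root, halving the path as we go
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for a, b in connections:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # This cable joins two separate components
            parent[root_a] = root_b
            components -= 1

    # Each move of a redundant cable joins two components
    return components - 1


print(make_connected(4, [[0, 1], [0, 2], [1, 2]]))
print(make_connected(6, [[0, 1], [0, 2], [0, 3], [1, 2]]))
```

The early cable-count check matters: with at least n - 1 cables, every cable inside an already-connected component is redundant and can be reused, so the answer is simply the number of components minus one.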
-
Senior QA Engineer @ ABN AMRO Bank N.V. | Test Automation Coach | DevOps, REST APIs | GenAI Enthusiast
All of us are trying to make our models more efficient and more reliable. To do that, we are either fine-tuning or implementing a RAG pipeline. Here is a good article from LlamaIndex on how to build a GraphRAG: https://lnkd.in/e7wdUyUf
-
The column mapping feature allows Delta table columns and the underlying #Parquet file columns to use different names. 🙌 📌 This enables Delta schema evolution operations such as RENAME COLUMN and DROP COLUMNS on a Delta table without the need to rewrite the underlying Parquet files. ✍ 📌 It also allows users to name Delta table columns using characters that Parquet does not allow, such as spaces, so users can ingest CSV or JSON data directly into Delta without renaming columns to work around the earlier character constraints. 🔗 Check out the documentation for more: https://lnkd.in/eviHRkRr #opensource #oss #linuxfoundation #deltalake
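The idea can be illustrated with a toy model in which logical column names point at stable physical names, so a rename or drop touches only metadata. This is a conceptual sketch, not Delta Lake's implementation or API: the class `MappedTable`, its methods, and the `col-N` physical naming scheme are all hypothetical.

```python
class MappedTable:
    """Toy table whose logical column names map to stable physical names."""

    def __init__(self, columns):
        # Physical names are assigned once and never change (hypothetical scheme)
        self.mapping = {name: f"col-{i}" for i, name in enumerate(columns)}
        self.rows = []  # stands in for the data files, keyed by physical name

    def append(self, row):
        self.rows.append({self.mapping[k]: v for k, v in row.items()})

    def rename_column(self, old, new):
        # Metadata-only change: stored rows are left untouched
        self.mapping[new] = self.mapping.pop(old)

    def drop_column(self, name):
        # Metadata-only change: the physical data stays but is no longer visible
        del self.mapping[name]

    def read(self):
        return [
            {logical: row.get(physical) for logical, physical in self.mapping.items()}
            for row in self.rows
        ]


table = MappedTable(["id", "first name"])  # a space is fine in a logical name
table.append({"id": 1, "first name": "Ada"})
stored_before = [dict(r) for r in table.rows]

table.rename_column("first name", "given_name")
print(table.read())
print(table.rows == stored_before)  # stored data was not rewritten
```

Because readers resolve logical names through the mapping, the stored rows never need to change when a column is renamed or dropped.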