About
15 years of experience building distributed systems (and having fun) 🛠️ ⚙️ ⌨️ 😀
7.5…
Articles by Zhe
-
Impact of Large Requests in Shared Services
By Zhe Zhang
Activity
-
Yesterday was my last day at Anyscale. I am beyond grateful for an opportunity to have worked with and learned from the world-class talent there and…
Liked by Zhe Zhang
-
If this calculation doesn't appeal to your entrepreneurial side, I don't know what does. It's from a little app I recently built after reading…
Liked by Zhe Zhang
-
Come and join us in #RaySummit to see how Ray transformed Samsara’s AI innovation!
Liked by Zhe Zhang
Experience
Education
Publications
Patents
Projects
-
HDFS Erasure Coding
Currently, HDFS triplicates each block by default, for several purposes: 1) protection against DataNode failures; 2) better locality for MapReduce tasks; 3) avoidance of overloaded DataNodes by choosing among multiple replicas. Replication is expensive -- the default triplication scheme has 200% overhead in storage space and other resources (e.g., NameNode memory usage). However, for "cold" and "warm" datasets with low I/O activity, secondary block replicas are rarely accessed during normal operation -- while consuming the same amount of resources as the primary ones. Therefore, a natural improvement is to use Erasure Coding (EC) in place of replication, which provides the same level of fault tolerance with much less storage space. In typical EC setups the storage overhead is ≤ 50%.
This project aims to build erasure coding support as a first-class citizen in HDFS. In doing so, it will improve on the performance, security, and robustness of prior solutions, including HDFS-RAID, and enable storage savings for a much wider range of workloads.
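The overhead figures above can be checked with a little arithmetic. The following is a minimal sketch, not part of the project itself: the function names are hypothetical, and the (6 data, 3 parity) and (10 data, 4 parity) layouts are illustrative assumptions rather than settings stated in the description.

```python
def replication_overhead(replicas: int) -> float:
    """Extra storage, as a fraction of the logical data size, for n-way replication."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # One copy is the data itself; every additional copy is overhead.
    return float(replicas - 1)


def erasure_coding_overhead(data_units: int, parity_units: int) -> float:
    """Extra storage, as a fraction of the logical data size, for a (data, parity) code."""
    if data_units < 1 or parity_units < 0:
        raise ValueError("need at least one data unit and non-negative parity")
    # Parity units are the only extra bytes stored.
    return parity_units / data_units


def tolerated_failures_replication(replicas: int) -> int:
    """Number of lost copies a block survives under n-way replication."""
    return replicas - 1


def tolerated_failures_ec(parity_units: int) -> int:
    """Number of lost units a stripe survives under an MDS code such as Reed-Solomon."""
    return parity_units


if __name__ == "__main__":
    # Triplication: the 200% overhead quoted in the description.
    print(f"3x replication: {replication_overhead(3):.0%} overhead, "
          f"tolerates {tolerated_failures_replication(3)} failures")
    # Assumed example layouts (hypothetical choices for illustration).
    for data, parity in [(6, 3), (10, 4)]:
        print(f"EC ({data},{parity}): {erasure_coding_overhead(data, parity):.0%} overhead, "
              f"tolerates {tolerated_failures_ec(parity)} failures")
```

Under these assumed layouts the overhead stays at or below 50%, while each stripe survives at least as many lost units as a triplicated block survives lost copies.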
Honors & Awards
-
Outstanding Technical Achievement Award
-
For contributions in the Software License Management project.
Languages
-
Chinese
Native or bilingual proficiency
-
English
Full professional proficiency
Recommendations received
2 people have recommended Zhe
More activity by Zhe
-
Linus was invited to attend KubeCon in Hong Kong. On the way to a small dinner gathering, I was fortunate enough to sit next to the world’s best…
Liked by Zhe Zhang
-
BetterCloud is the ONLY leader across all FIVE SaaS management categories from G2! Get the free report to find out which tools your peers are raving…
Liked by Zhe Zhang
-
One of the most common questions I get is "How is Anyscale different than OSS Ray", we just released a great page on this: https://lnkd.in/gnRFJVbq
Liked by Zhe Zhang