Press Release

15 Jan 2015

WANdisco announces significant new contribution to Apache Hadoop open source project

New HDFS TRUNCATE Feature Sponsored by WANdisco Delivers Major Enhancement to Hadoop Big Data Transaction Handling

WANdisco (LSE: WAND), a leading provider of continuous-availability software for global enterprises to meet the challenges of Big Data, announced today that it had contributed code to the Apache Hadoop open source project that enables changes to the Hadoop Distributed File System (HDFS) to be undone automatically when a transaction is aborted. This new feature, referred to as TRUNCATE, is a standard capability of transactional systems. Previously, if a user mistakenly appended data to an existing file stored in HDFS, their only recourse was to recreate the file by rewriting the contents. In addition, software engineers developing Hadoop Big Data applications were forced to write code to work around this limitation.

Dr. Konstantin Shvachko, WANdisco’s Chief Architect for Big Data, who is also a senior committer on the Hadoop Project Management Committee and one of the original developers of HDFS, led the TRUNCATE effort. Other members of the team included Dr. Konstantin Boudnik, Plamen Jeliazkov, and Byron Wong. All Hadoop distributions will be able to leverage this major enhancement, and both users and application developers will benefit from it.

“TRUNCATE represents a significant step forward that all Hadoop users and application developers will benefit from,” said David Richards, CEO and Co-Founder of WANdisco. “WANdisco has been a sponsor of the Apache Software Foundation for many years, with senior committers on staff who have made significant contributions to Apache open source projects. Our work on TRUNCATE further demonstrates WANdisco’s deep and continued commitment to the Apache open source community.”

Further details about HDFS TRUNCATE can be found at:


About WANdisco

WANdisco is the world leader in Active Data Replication. Its patented WANdisco Fusion technology enables the replication of continuously changing data to the cloud and on-premises data centers with guaranteed consistency, no downtime and no business disruption. It also allows distributed development teams to collaborate as if they were all working in one location. WANdisco has an OEM agreement with IBM as well as partnerships with Amazon Web Services, Cisco, Google Cloud, Hewlett Packard Enterprise, Microsoft Azure, and Oracle to resell its patented technology. WANdisco also works directly with Fortune 1000 companies around the world to ensure their data can give them the real insight they need.

For additional information, please visit

WANdisco plc
Alexandra Gee
VP Marketing & Communications