Hadoop is an open-source software framework for storing and processing extremely large datasets in a distributed computing environment. All Hadoop modules are designed on the fundamental assumption that hardware failures are common and should be handled automatically by the framework. Hadoop is an Apache project, developed under the Apache Software Foundation.