What is Apache Hadoop and how is it used in big data processing? How does Hadoop store and process large datasets across distributed systems? What are the main components of Hadoop such as HDFS, MapReduce, and YARN? How does Hadoop ensure scalability and fault tolerance in data processing? What are the common real-world applications and limitations of using Apache Hadoop in data engineering?