What is XGBoost, and how is it used in machine learning tasks? How does XGBoost improve upon traditional gradient boosting algorithms? What are its key features, such as regularization and parallel processing? To what types of problems, such as classification and regression, is XGBoost commonly applied? What are the advantages and limitations of using XGBoost in real-world applications?
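
As a concrete handle on the regularization question, here is a minimal pure-Python sketch of the regularized leaf weight and split gain that XGBoost's tree objective is built on. It assumes squared-error loss (gradient = prediction minus target, hessian = 1). The function names `leaf_weight` and `split_gain` are hypothetical helpers for illustration and are not part of the xgboost library; `reg_lambda` and `gamma` are named after the corresponding XGBoost parameters (L2 penalty on leaf weights and minimum loss reduction per split).

```python
# Illustrative sketch only: hypothetical helpers, not the xgboost API.
# Assumes squared-error loss 0.5 * (y - pred)^2, so g = pred - y and h = 1.

def leaf_weight(grads, hessians, reg_lambda=1.0):
    """Optimal leaf value w* = -G / (H + lambda)."""
    G = sum(grads)
    H = sum(hessians)
    return -G / (H + reg_lambda)


def split_gain(g_left, h_left, g_right, h_right, reg_lambda=1.0, gamma=0.0):
    """Loss reduction from splitting one leaf into a left and a right leaf."""
    def score(G, H):
        # Structure score of a single leaf (up to the factor 1/2).
        return G * G / (H + reg_lambda)

    GL, HL = sum(g_left), sum(h_left)
    GR, HR = sum(g_right), sum(h_right)
    gain = 0.5 * (score(GL, HL) + score(GR, HR) - score(GL + GR, HL + HR))
    return gain - gamma  # gamma penalizes every additional leaf


# Toy data: four targets, initial prediction 0 for every row.
targets = [1.0, 1.2, 3.0, 3.2]
preds = [0.0] * len(targets)
grads = [p - y for p, y in zip(preds, targets)]
hessians = [1.0] * len(targets)

# With lambda = 0 the leaf value is the plain mean of the residuals;
# a positive lambda shrinks it toward zero.
w_unregularized = leaf_weight(grads, hessians, reg_lambda=0.0)
w_regularized = leaf_weight(grads, hessians, reg_lambda=1.0)

# Candidate split: first two rows go left, last two go right.
gain_no_penalty = split_gain(grads[:2], hessians[:2], grads[2:], hessians[2:])
gain_with_penalty = split_gain(
    grads[:2], hessians[:2], grads[2:], hessians[2:], gamma=0.5
)

print(w_unregularized, w_regularized)
print(gain_no_penalty, gain_with_penalty)
```

In this sketch a larger `reg_lambda` pulls leaf values toward zero, and a split is only worth keeping when its gain stays positive after subtracting `gamma`, which is how the two penalties discourage overly complex trees.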