What is Principal Component Analysis (PCA), and why is it used in machine learning and data analysis?
How does PCA reduce the dimensionality of large datasets while preserving important information?
What are principal components, and how are they calculated?
How does PCA help improve visualization and model performance?
What are the advantages and limitations of using PCA in real-world applications?
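
For the "how are they calculated" part, here is a minimal sketch of one common approach: center the data, compute the covariance matrix, take its eigenvectors, and project onto the directions with the largest eigenvalues. The use of NumPy, the function name `pca`, and the synthetic data are all assumptions made for illustration; the question does not name a library or dataset.

```python
import numpy as np


def pca(X, n_components):
    """Hypothetical helper: PCA via eigendecomposition of the covariance matrix.

    X is an (n_samples, n_features) array. Returns the projected data,
    the principal directions (one per row), and the fraction of total
    variance each kept component explains.
    """
    X = np.asarray(X, dtype=float)

    # Step 1: center each feature so the covariance is measured around the mean.
    Xc = X - X.mean(axis=0)

    # Step 2: covariance matrix of the features (n_features x n_features).
    cov = np.cov(Xc, rowvar=False)

    # Step 3: eigendecomposition. eigh suits symmetric matrices and
    # returns eigenvalues in ascending order, so reverse them.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Step 4: keep the top directions; these are the principal components.
    components = eigvecs[:, :n_components].T
    explained_ratio = eigvals[:n_components] / eigvals.sum()

    # Step 5: project the centered data onto the kept components.
    scores = Xc @ components.T
    return scores, components, explained_ratio


# Synthetic example (assumed data): two strongly correlated features plus noise.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([
    t,
    2.0 * t + 0.1 * rng.normal(size=200),
    0.1 * rng.normal(size=200),
])

scores, components, ratio = pca(X, n_components=2)
print("reduced shape:", scores.shape)
print("explained variance ratio:", ratio)
```

In this toy data the first component should capture nearly all of the variance, because the first two features move together. That is the sense in which PCA reduces dimensionality while preserving information. The sketch also hints at two limitations: the components are linear combinations of the original features, which makes them harder to interpret, and the result depends on feature scale, so standardizing the features first is often advisable.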