What is a Sequence-to-Sequence (Seq2Seq) model, and how does it work in machine learning?

- What are the main components of a Seq2Seq architecture, such as the encoder and decoder?
- How is Seq2Seq used in tasks like machine translation and text summarization?
- What are the advantages and limitations of Seq2Seq models?
- How have modern approaches improved upon traditional Seq2Seq architectures?
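As a starting point for the encoder/decoder question above, here is a toy sketch of the Seq2Seq control flow: the encoder processes the whole input sequence into a context, and the decoder then emits output tokens one step at a time until an end-of-sequence marker. This is illustrative only, not a trained model: in a real Seq2Seq system the context is a learned fixed-size vector produced by an RNN/LSTM/GRU (or, in modern variants, a set of vectors attended over), and the decoder picks each token from a learned probability distribution. Here the "task" is sequence reversal, a classic Seq2Seq toy problem, so only the loop structure is real; the names `encode`, `decode`, and `seq2seq` are placeholders of this sketch.

```python
EOS = "<eos>"  # end-of-sequence marker the decoder stops on

def encode(src_tokens):
    # Real encoder: update a hidden state per input token and return the
    # final state as a fixed-size context vector. Toy version: keep the
    # raw token list so the encode-then-decode flow stays visible.
    return list(src_tokens)

def decode(context, max_len=10):
    # Real decoder: at each step, feed the previous output token plus the
    # context into the network and choose the most likely next token.
    # Toy "model": emit the source tokens in reverse order, then EOS.
    out = []
    for step in range(max_len):
        if step < len(context):
            token = context[len(context) - 1 - step]
        else:
            token = EOS
        if token == EOS:
            break  # generation ends when the model emits EOS
        out.append(token)
    return out

def seq2seq(src_tokens):
    # Encode once, then decode step by step -- the core Seq2Seq pattern.
    return decode(encode(src_tokens))

print(seq2seq(["a", "b", "c"]))  # -> ['c', 'b', 'a']
```

One design point worth noting even in this sketch: the decoder never sees the raw input directly, only the context the encoder produced. That fixed-size bottleneck is exactly the limitation that attention mechanisms (and later Transformers) were introduced to relax.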