What are Markov chains, and how are they used in machine learning and probability modeling?
How does the Markov property define state transitions in a system?
What are real-world applications of Markov chains in prediction and sequence modeling?
How are Markov chains used in natural language processing and time-series analysis?
What are the advantages and limitations of using Markov chains in AI systems?
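
To make the question about the Markov property concrete, here is a minimal sketch of a two-state chain in Python. The weather states, the transition probabilities, and the helper names (`TRANSITIONS`, `next_state`, `simulate`, `step_distribution`) are all hypothetical and chosen only for illustration. The point it tries to show is that the next state is sampled from the current state alone, with no reference to earlier history.

```python
import random

# Hypothetical two-state weather model. The states and probabilities are
# illustrative assumptions, not data from any source. Each row sums to 1.
TRANSITIONS = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}


def next_state(current, rng):
    """Sample the next state using only the current state (Markov property)."""
    row = TRANSITIONS[current]
    states = list(row)
    weights = [row[s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]


def simulate(start, steps, seed=0):
    """Generate a path of `steps` transitions beginning at `start`."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(steps):
        # Only path[-1] is consulted; the earlier history is ignored.
        path.append(next_state(path[-1], rng))
    return path


def step_distribution(dist):
    """Push a probability distribution over states one step forward."""
    out = {state: 0.0 for state in TRANSITIONS}
    for state, prob in dist.items():
        for target, trans_prob in TRANSITIONS[state].items():
            out[target] += prob * trans_prob
    return out


if __name__ == "__main__":
    print(simulate("sunny", 10))
    print(step_distribution({"sunny": 1.0, "rainy": 0.0}))
```

`simulate` corresponds to the sequence-generation use of a chain, while `step_distribution` corresponds to the prediction use: applying it repeatedly gives the probability of each state several steps ahead. The same table-lookup structure also hints at a common limitation, since anything that depends on more than the current state cannot be captured without enlarging the state space.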