{"id":2901,"date":"2026-04-05T06:02:33","date_gmt":"2026-04-05T06:02:33","guid":{"rendered":"https:\/\/aiopsschool.com\/blog\/?p=2901"},"modified":"2026-04-05T06:03:28","modified_gmt":"2026-04-05T06:03:28","slug":"databricks-tutorials-part-3-ai-ml-end-to-end-lab-using-serverless-notebook-compute","status":"publish","type":"post","link":"https:\/\/aiopsschool.com\/blog\/databricks-tutorials-part-3-ai-ml-end-to-end-lab-using-serverless-notebook-compute\/","title":{"rendered":"Databricks Tutorials \u2013 Part 3 \u2013 AI\/ML end-to-end lab using Serverless notebook compute"},"content":{"rendered":"\n<p>This is a <strong>notebook-first, fresh-workspace, self-contained Databricks AI\/ML tutorial<\/strong> built around <strong>Serverless notebook compute<\/strong>. It is based on the current Databricks docs as of late March 2026, including serverless notebooks, the official ML quickstart, Unity Catalog model lifecycle, Model Serving, AI Playground, and the current retrieval-agent tutorial. The key fit for this setup is that <strong>serverless notebooks<\/strong> are the right place for Python, MLflow, training, and experiments, while sample data is already available in Databricks through <code>samples<\/code> and <code>\/databricks-datasets<\/code>. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/compute\/serverless\/notebooks\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What this tutorial will show<\/h2>\n\n\n\n<p>This lab lets a student do the full flow in one workspace: create a notebook, load built-in sample data, write Python, train a model, track experiments with MLflow, register the model in Unity Catalog, deploy it with Mosaic AI Model Serving, then move into the GenAI side with AI Playground and a Databricks-provided retrieval-agent notebook that is explicitly described as <strong>standalone<\/strong> and <strong>ready to run with no setup or data required<\/strong>. 
(<a href=\"https:\/\/docs.databricks.com\/aws\/en\/getting-started\/ml-get-started\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Important reality check<\/h2>\n\n\n\n<p>For this tutorial, <strong>Serverless notebook compute is enough for the Python\/ML portion<\/strong>. Databricks says that if serverless interactive compute is enabled, all users in the workspace can attach notebooks to serverless compute with no extra permissions beyond the workspace capability itself. For model registration in Unity Catalog, the student still needs the target catalog\/schema permissions: <code>USE CATALOG<\/code>, <code>USE SCHEMA<\/code>, <code>CREATE TABLE<\/code>, and <code>CREATE MODEL<\/code>. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/compute\/serverless\/notebooks\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">Databricks AI\/ML end-to-end lab using Serverless notebook compute<\/h1>\n\n\n\n<h2 class=\"wp-block-heading\">1. Learning outcome<\/h2>\n\n\n\n<p>By the end of this lab, the student will have created and run a real notebook-driven workflow that covers:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Python development in a Databricks notebook<\/li>\n\n\n\n<li>MLflow experiment tracking<\/li>\n\n\n\n<li>model training and comparison<\/li>\n\n\n\n<li>Unity Catalog model registration<\/li>\n\n\n\n<li>custom model deployment with Mosaic AI Model Serving<\/li>\n\n\n\n<li>no-code agent prototyping in AI Playground<\/li>\n\n\n\n<li>a code-first agent capstone using Databricks\u2019 official retrieval-agent tutorial notebook (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/getting-started\/ml-get-started\">Databricks Documentation<\/a>)<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">2. 
High-level flow<\/h2>\n\n\n\n<p>Use this mental model throughout the class:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>flowchart LR\n    A&#91;Serverless Notebook] --&gt; B&#91;Create schema in Unity Catalog]\n    B --&gt; C&#91;Load built-in wine sample data]\n    C --&gt; D&#91;Train model with scikit-learn]\n    D --&gt; E&#91;Track runs in MLflow]\n    E --&gt; F&#91;Register best model in Unity Catalog]\n    F --&gt; G&#91;Deploy with Mosaic AI Model Serving]\n    G --&gt; H&#91;Test endpoint]\n    A --&gt; I&#91;AI Playground]\n    I --&gt; J&#91;Export agent notebook]\n    J --&gt; K&#91;Run official retrieval-agent notebook]\n<\/code><\/pre>\n\n\n\n<p>This mirrors the current Databricks platform flow: notebooks for development, MLflow for tracking, Unity Catalog for governed model lifecycle, Model Serving for deployment, and AI Playground \/ Agent Framework for agent prototyping and deployment workflows. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/machine-learning\/manage-model-lifecycle\/\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">3. Prerequisites<\/h1>\n\n\n\n<p>Before starting, verify these in your workspace:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Unity Catalog is enabled<\/strong>.<\/li>\n\n\n\n<li><strong>Serverless notebook compute<\/strong> is available.<\/li>\n\n\n\n<li><strong>Serving<\/strong> appears in the left menu.<\/li>\n\n\n\n<li><strong>Playground<\/strong> appears in the left menu.<\/li>\n\n\n\n<li>You can create objects in at least one catalog and schema.<\/li>\n<\/ol>\n\n\n\n<p>Databricks documents Unity Catalog as required for serverless notebooks and for the current recommended model lifecycle in Unity Catalog. AI Playground also requires workspace access to foundation models plus Unity Catalog and Agent Framework availability. 
(<a href=\"https:\/\/docs.databricks.com\/aws\/en\/compute\/serverless\/notebooks\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><strong>Instructor note<\/strong><br>If <code>main<\/code> is not writable in your workspace, keep the tutorial exactly the same and only change <code>CATALOG_NAME<\/code> to a catalog where the student has create permissions.<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">4. Create the notebook<\/h1>\n\n\n\n<p>Create a new notebook in Workspace called:<\/p>\n\n\n\n<p><code>01_databricks_ai_ml_end_to_end<\/code><\/p>\n\n\n\n<p>Attach it to <strong>Serverless<\/strong> from the compute drop-down. Databricks says that in serverless-enabled workspaces, new notebooks default to serverless on execution if nothing else is selected. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/compute\/serverless\/notebooks\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Expected output<\/h3>\n\n\n\n<p>The notebook shows <strong>Serverless<\/strong> as the active compute.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">5. Optional notebook environment setup<\/h1>\n\n\n\n<p>Databricks now provides an <strong>Environment<\/strong> side panel for serverless notebooks where you can manage dependencies, memory, usage policy, and base environment. If you later hit package or memory issues, use that panel instead of guessing. The same panel can switch the notebook to higher memory if needed. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/compute\/serverless\/dependencies\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><strong>Instructor note<\/strong><br>For this lab, standard serverless memory is usually enough. 
Only move to high memory if the notebook throws out-of-memory errors.<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">6. Install Python libraries<\/h1>\n\n\n\n<h3 class=\"wp-block-heading\">Cell 1 \u2014 Python<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>%pip install -U mlflow scikit-learn pandas matplotlib hyperopt\ndbutils.library.restartPython()\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Expected output<\/h3>\n\n\n\n<p>Package installation logs, then Python restarts.<\/p>\n\n\n\n<p>This package set matches the official Databricks ML getting-started flow: <code>scikit-learn<\/code> for the model, <code>MLflow<\/code> for experiment tracking, and <code>Hyperopt<\/code> for automated hyperparameter tuning. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/getting-started\/ml-get-started\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">7. 
Define your catalog, schema, and model names<\/h1>\n\n\n\n<h3 class=\"wp-block-heading\">Cell 2 \u2014 Python<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>import mlflow\nimport pandas as pd\nimport sklearn.metrics\nimport sklearn.model_selection\nimport sklearn.ensemble\nimport matplotlib.pyplot as plt\n\nfrom hyperopt import fmin, tpe, hp, SparkTrials, STATUS_OK\nfrom hyperopt.pyll import scope\n\n# Change only if main is not writable in your workspace\nCATALOG_NAME = \"main\"\nSCHEMA_NAME = \"ai_ml_starter\"\nMODEL_NAME = f\"{CATALOG_NAME}.{SCHEMA_NAME}.wine_quality_model\"\n\n# Explicitly use Unity Catalog for model registry\nmlflow.set_registry_uri(\"databricks-uc\")\n\nspark.sql(f\"CREATE SCHEMA IF NOT EXISTS {CATALOG_NAME}.{SCHEMA_NAME}\")\n\nprint(\"Catalog:\", CATALOG_NAME)\nprint(\"Schema:\", SCHEMA_NAME)\nprint(\"Model:\", MODEL_NAME)\nprint(\"Registry URI set to databricks-uc\")\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Expected output<\/h3>\n\n\n\n<p>Printed values for catalog, schema, and full model name.<\/p>\n\n\n\n<p>Databricks recommends <strong>Models in Unity Catalog<\/strong> for governing and deploying models, and the current docs note that in MLflow 3 the default registry URI is <code>databricks-uc<\/code>, which is the Unity Catalog-backed registry. Setting it explicitly here keeps the tutorial unambiguous for students. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/machine-learning\/manage-model-lifecycle\/\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">8. Load built-in sample data<\/h1>\n\n\n\n<p>Databricks provides sample datasets in two convenient places: the <code>samples<\/code> catalog for table-style data and <code>\/databricks-datasets<\/code> for file-based data. 
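Before reading anything, it can help to peek at what is actually inside the sample folder. A small sketch (`dbutils` exists only inside a Databricks notebook, so this falls back gracefully anywhere else; the path matches the dataset used below):

```python
def list_sample_files(path="/databricks-datasets/wine-quality"):
    """List built-in sample files via dbutils; return None outside Databricks."""
    try:
        # dbutils is predefined in Databricks notebooks only.
        return [(f.path, f.size) for f in dbutils.fs.ls(path)]  # noqa: F821
    except NameError:
        return None

listing = list_sample_files()
if listing is None:
    print("dbutils is only available inside a Databricks notebook")
else:
    for path, size in listing:
        print(path, size)
```

In the workspace this should show the red and white wine CSV files that the next cell reads.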
The official ML tutorial uses the wine-quality dataset from <code>\/databricks-datasets<\/code>, so we\u2019ll use that exact path to keep the workflow fresh-workspace friendly and self-contained. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/discover\/databricks-datasets\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Cell 3 \u2014 Python<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>white_wine = spark.read.csv(\n    \"\/databricks-datasets\/wine-quality\/winequality-white.csv\",\n    sep=\";\",\n    header=True,\n    inferSchema=True\n)\n\nred_wine = spark.read.csv(\n    \"\/databricks-datasets\/wine-quality\/winequality-red.csv\",\n    sep=\";\",\n    header=True,\n    inferSchema=True\n)\n\nfor c in white_wine.columns:\n    white_wine = white_wine.withColumnRenamed(c, c.replace(\" \", \"_\"))\n\nfor c in red_wine.columns:\n    red_wine = red_wine.withColumnRenamed(c, c.replace(\" \", \"_\"))\n\ndisplay(white_wine.limit(5))\ndisplay(red_wine.limit(5))\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Expected output<\/h3>\n\n\n\n<p>Two preview tables showing red and white wine records.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">9. 
Save the sample data as your own governed tables<\/h1>\n\n\n\n<h3 class=\"wp-block-heading\">Cell 4 \u2014 Python<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>white_wine.write.mode(\"overwrite\").saveAsTable(f\"{CATALOG_NAME}.{SCHEMA_NAME}.white_wine\")\nred_wine.write.mode(\"overwrite\").saveAsTable(f\"{CATALOG_NAME}.{SCHEMA_NAME}.red_wine\")\n\nprint(f\"Created {CATALOG_NAME}.{SCHEMA_NAME}.white_wine\")\nprint(f\"Created {CATALOG_NAME}.{SCHEMA_NAME}.red_wine\")\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Expected output<\/h3>\n\n\n\n<p>Printed confirmation that both tables were created.<\/p>\n\n\n\n<p>This is an important teaching step because it moves the student from \u201creading built-in sample files\u201d to \u201cworking with their own managed Unity Catalog tables,\u201d which is the right foundation for lineage, governance, and downstream model lifecycle. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/machine-learning\/manage-model-lifecycle\/\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">10. Validate the data with SQL inside the same notebook<\/h1>\n\n\n\n<h3 class=\"wp-block-heading\">Cell 5 \u2014 SQL<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT quality, COUNT(*) AS cnt\nFROM main.ai_ml_starter.red_wine\nGROUP BY quality\nORDER BY quality;\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Expected output<\/h3>\n\n\n\n<p>A result grid showing counts for each wine quality score.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><strong>Instructor note<\/strong><br>Pause here and explain the first platform lesson: one serverless notebook can mix Python and SQL while operating on the same Unity Catalog tables.<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">11. 
Prepare the machine learning dataset<\/h1>\n\n\n\n<p>The official Databricks tutorial frames this as a classification problem: predict whether a wine is \u201chigh quality\u201d from its chemical properties. It also combines the red and white datasets and adds an <code>is_red<\/code> feature. We\u2019ll follow that same pattern so the student is aligned with the official learning path. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/getting-started\/ml-get-started\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Cell 6 \u2014 Python<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>white_pdf = spark.read.table(f\"{CATALOG_NAME}.{SCHEMA_NAME}.white_wine\").toPandas()\nred_pdf = spark.read.table(f\"{CATALOG_NAME}.{SCHEMA_NAME}.red_wine\").toPandas()\n\nwhite_pdf&#91;\"is_red\"] = 0.0\nred_pdf&#91;\"is_red\"] = 1.0\n\ndata_df = pd.concat(&#91;white_pdf, red_pdf], axis=0)\n\n# Label: high-quality wine means quality &gt;= 7\nlabels = data_df&#91;\"quality\"].astype(\"int\") &gt;= 7\nfeatures = data_df.drop(&#91;\"quality\"], axis=1)\n\nX_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(\n    features,\n    labels,\n    test_size=0.2,\n    random_state=1\n)\n\nprint(\"Train shape:\", X_train.shape)\nprint(\"Test shape:\", X_test.shape)\nprint(\"Positive rate:\", float(labels.mean()))\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Expected output<\/h3>\n\n\n\n<p>Training shape, test shape, and the proportion of \u201chigh-quality\u201d wines.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">12. Train a baseline model and log it with MLflow<\/h1>\n\n\n\n<p>Databricks\u2019 ML tutorial uses MLflow to track the development process and recommends comparing runs rather than treating the notebook as a dead-end script. That is exactly the behavior we want students to see. 
(<a href=\"https:\/\/docs.databricks.com\/aws\/en\/getting-started\/ml-get-started\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Cell 7 \u2014 Python<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>mlflow.autolog()\n\nwith mlflow.start_run(run_name=\"gradient_boost_baseline\"):\n    model = sklearn.ensemble.GradientBoostingClassifier(random_state=0)\n    model.fit(X_train, y_train)\n\n    predicted_probs = model.predict_proba(X_test)\n    roc_auc = sklearn.metrics.roc_auc_score(y_test, predicted_probs&#91;:, 1])\n\n    roc_display = sklearn.metrics.RocCurveDisplay.from_estimator(model, X_test, y_test)\n    roc_display.figure_.savefig(\"\/tmp\/roc_curve.png\")\n\n    mlflow.log_metric(\"test_auc\", roc_auc)\n    mlflow.log_artifact(\"\/tmp\/roc_curve.png\")\n\n    print(\"Baseline test AUC:\", roc_auc)\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Expected output<\/h3>\n\n\n\n<p>AUC printed in the notebook, plus an MLflow run link in the output.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><strong>Instructor note<\/strong><br>Ask the student to click the run and inspect metrics, parameters, artifacts, and the model. This is the first moment they really \u201csee\u201d experiments rather than just code execution.<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">13. Run hyperparameter tuning to create multiple experiments<\/h1>\n\n\n\n<p>The official tutorial uses Hyperopt to automate hyperparameter tuning and compare model runs. We\u2019ll keep the sweep small so it is classroom-friendly, but still enough to visualize what experimentation looks like. 
(<a href=\"https:\/\/docs.databricks.com\/aws\/en\/getting-started\/ml-get-started\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Cell 8 \u2014 Python<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>search_space = {\n    \"n_estimators\": scope.int(hp.quniform(\"n_estimators\", 20, 300, 1)),\n    \"learning_rate\": hp.loguniform(\"learning_rate\", -3, 0),\n    \"max_depth\": scope.int(hp.quniform(\"max_depth\", 2, 6, 1)),\n}\n\ndef train_model(params):\n    mlflow.autolog()\n    with mlflow.start_run(nested=True):\n        model_hp = sklearn.ensemble.GradientBoostingClassifier(\n            random_state=0,\n            **params\n        )\n        model_hp.fit(X_train, y_train)\n\n        predicted_probs = model_hp.predict_proba(X_test)\n        auc = sklearn.metrics.roc_auc_score(y_test, predicted_probs&#91;:, 1])\n\n        mlflow.log_metric(\"test_auc\", auc)\n        return {\"loss\": -auc, \"status\": STATUS_OK}\n\nspark_trials = SparkTrials(parallelism=4)\n\nwith mlflow.start_run(run_name=\"gb_hyperopt\"):\n    best_params = fmin(\n        fn=train_model,\n        space=search_space,\n        algo=tpe.suggest,\n        max_evals=8,\n        trials=spark_trials\n    )\n\nprint(\"Best parameters:\", best_params)\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Expected output<\/h3>\n\n\n\n<p>A printed best-parameter dictionary and several nested MLflow runs.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Cell 9 \u2014 Python<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>best_run = mlflow.search_runs(\n    order_by=&#91;\"metrics.test_auc DESC\", \"start_time DESC\"],\n    max_results=1\n).iloc&#91;0]\n\nprint(\"Best run_id:\", best_run.run_id)\nprint(\"Best test_auc:\", best_run&#91;\"metrics.test_auc\"])\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Expected output<\/h3>\n\n\n\n<p>The best <code>run_id<\/code> and best AUC score.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow 
wp-block-quote-is-layout-flow\">\n<p><strong>Instructor note<\/strong><br>Open the MLflow experiment UI and sort by <code>test_auc<\/code>. This is the clearest way to teach \u201cwrite code \u2192 experiment code \u2192 compare results.\u201d<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">14. Register the best model in Unity Catalog<\/h1>\n\n\n\n<p>Databricks recommends the Unity Catalog model registry for the full ML model lifecycle. Models there inherit centralized access control, lineage, auditing, and model discovery across workspaces. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/machine-learning\/manage-model-lifecycle\/\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Cell 10 \u2014 Python<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>model_uri = f\"runs:\/{best_run.run_id}\/model\"\nregistered_model = mlflow.register_model(model_uri, MODEL_NAME)\n\nprint(\"Registered model:\", registered_model.name)\nprint(\"Version:\", registered_model.version)\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Expected output<\/h3>\n\n\n\n<p>A printed model name like <code>main.ai_ml_starter.wine_quality_model<\/code> and a version number such as <code>1<\/code>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">What to do in the UI<\/h3>\n\n\n\n<p>Open <strong>Catalog<\/strong> and navigate to your catalog and schema. The model should now appear as a governed model object alongside your tables. Databricks explicitly supports models as Unity Catalog assets managed through the same governance plane. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/machine-learning\/manage-model-lifecycle\/\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">15. Serve the model<\/h1>\n\n\n\n<p>Mosaic AI Model Serving is Databricks\u2019 managed deployment layer for AI and ML models. 
Databricks describes it as a unified interface for deploying, governing, and querying models for real-time and batch inference, exposed as REST APIs and backed by serverless compute. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/machine-learning\/model-serving\/\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Step 15.1 \u2014 Create the endpoint in the UI<\/h2>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Click <strong>Serving<\/strong> in the sidebar.<\/li>\n\n\n\n<li>Click <strong>Create serving endpoint<\/strong>.<\/li>\n\n\n\n<li>Name it:<\/li>\n<\/ol>\n\n\n\n<p><code>wine-quality-endpoint<\/code><\/p>\n\n\n\n<ol start=\"4\" class=\"wp-block-list\">\n<li>In the served model section, choose your Unity Catalog model:<br><code>main.ai_ml_starter.wine_quality_model<\/code><\/li>\n\n\n\n<li>Select the latest version.<\/li>\n\n\n\n<li>Leave the defaults unless you need special scaling.<\/li>\n\n\n\n<li>Create the endpoint.<\/li>\n<\/ol>\n\n\n\n<p>Databricks supports creating custom model endpoints through the Serving UI, REST API, or MLflow Deployments SDK; for a student lab, the Serving UI is the simplest path. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/machine-learning\/model-serving\/create-manage-serving-endpoints?utm_source=chatgpt.com\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Expected output<\/h3>\n\n\n\n<p>The endpoint status moves from provisioning to <strong>Ready<\/strong>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Step 15.2 \u2014 Test the endpoint in the UI<\/h2>\n\n\n\n<p>Databricks says the easiest way to query a served custom model is from the <strong>Query endpoint<\/strong> panel in the Serving UI, using an accepted input format such as <code>dataframe_records<\/code>. 
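The same scoring can also be done outside the UI with a plain REST call, which is a useful second demonstration for students. A hedged sketch using only the standard library (the endpoint name `wine-quality-endpoint` from step 15.1, the workspace URL, and the token are assumptions; `/serving-endpoints/{name}/invocations` is the standard scoring route for custom model endpoints):

```python
import json
import os
import urllib.request

ENDPOINT_NAME = "wine-quality-endpoint"  # assumption: the name chosen in step 15.1

# One record in the dataframe_records format that the Query endpoint panel also accepts.
payload = {
    "dataframe_records": [
        {
            "fixed_acidity": 7.4,
            "volatile_acidity": 0.70,
            "citric_acid": 0.00,
            "residual_sugar": 1.9,
            "chlorides": 0.076,
            "free_sulfur_dioxide": 11.0,
            "total_sulfur_dioxide": 34.0,
            "density": 0.9978,
            "pH": 3.51,
            "sulphates": 0.56,
            "alcohol": 9.4,
            "is_red": 1.0,
        }
    ]
}

host = os.environ.get("DATABRICKS_HOST")    # e.g. https://<workspace-url>
token = os.environ.get("DATABRICKS_TOKEN")  # a personal access token

if host and token:
    req = urllib.request.Request(
        f"{host}/serving-endpoints/{ENDPOINT_NAME}/invocations",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        print(resp.read().decode())
else:
    print("Set DATABRICKS_HOST and DATABRICKS_TOKEN to score the endpoint")
```

The field names match the renamed columns from Cell 3, which is why the earlier rename step matters for serving.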
(<a href=\"https:\/\/docs.databricks.com\/aws\/en\/machine-learning\/model-serving\/score-custom-model-endpoints?utm_source=chatgpt.com\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<p>Use this request body:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>{\n  \"dataframe_records\": &#91;\n    {\n      \"fixed_acidity\": 7.4,\n      \"volatile_acidity\": 0.70,\n      \"citric_acid\": 0.00,\n      \"residual_sugar\": 1.9,\n      \"chlorides\": 0.076,\n      \"free_sulfur_dioxide\": 11.0,\n      \"total_sulfur_dioxide\": 34.0,\n      \"density\": 0.9978,\n      \"pH\": 3.51,\n      \"sulphates\": 0.56,\n      \"alcohol\": 9.4,\n      \"is_red\": 1.0\n    }\n  ]\n}\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Expected output<\/h3>\n\n\n\n<p>A prediction response showing the model\u2019s classification output.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><strong>Instructor note<\/strong><br>This is the \u201crelease and serve\u201d moment. Make the student say out loud: \u201cMy notebook code became a model, and my model became an endpoint.\u201d<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">16. Prototype an agent in AI Playground<\/h1>\n\n\n\n<p>Databricks\u2019 current no-code GenAI tutorial uses <strong>AI Playground<\/strong> to compare LLMs, prototype tool-calling agents, and export the result to code. It also states that a tools-enabled model can call the built-in Unity Catalog function <code>system.ai.python_exec<\/code> to execute Python in a sandboxed environment. 
(<a href=\"https:\/\/docs.databricks.com\/gcp\/en\/getting-started\/gen-ai-llm-agent\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Step 16.1 \u2014 Compare LLMs<\/h2>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Click <strong>Playground<\/strong>.<\/li>\n\n\n\n<li>Ask:<\/li>\n<\/ol>\n\n\n\n<p><code>What is a machine learning classifier?<\/code><\/p>\n\n\n\n<ol start=\"3\" class=\"wp-block-list\">\n<li>Add another model with the <strong>+<\/strong> button.<\/li>\n\n\n\n<li>Ask the same question with sync enabled.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Expected output<\/h3>\n\n\n\n<p>Two model responses side by side.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Step 16.2 \u2014 Prototype a tool-calling agent<\/h2>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Choose a model labeled <strong>Tools enabled<\/strong>.<\/li>\n\n\n\n<li>Open <strong>Tools<\/strong>.<\/li>\n\n\n\n<li>Add the built-in Unity Catalog function:<\/li>\n<\/ol>\n\n\n\n<p><code>system.ai.python_exec<\/code><\/p>\n\n\n\n<ol start=\"4\" class=\"wp-block-list\">\n<li>Ask:<\/li>\n<\/ol>\n\n\n\n<p><code>Use Python to calculate the average of 9.4, 9.8, 10.0, and 10.2.<\/code><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Expected output<\/h3>\n\n\n\n<p>A response showing that the model invoked the tool and returned the computed result.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Step 16.3 \u2014 Export the agent to code<\/h2>\n\n\n\n<p>In AI Playground, click:<\/p>\n\n\n\n<p><strong>Get code \u2192 Create agent notebook<\/strong><\/p>\n\n\n\n<p>Databricks says this action generates a Python notebook that defines the agent and deploys it to a model serving endpoint, though the docs also note that this exported notebook currently follows a legacy Model Serving\u2013based authoring workflow and that Databricks now recommends Databricks Apps for long-term agent authoring. For a student lab, this export is still a very strong bridge from no-code to code. 
(<a href=\"https:\/\/docs.databricks.com\/gcp\/en\/getting-started\/gen-ai-llm-agent\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Expected output<\/h3>\n\n\n\n<p>A new notebook or notebook folder is created in Workspace containing the generated agent code.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">17. Run the official Databricks retrieval-agent notebook<\/h1>\n\n\n\n<p>For the capstone, do not invent your own RAG corpus. Databricks already provides an official <strong>Build, evaluate, and deploy a retrieval agent<\/strong> tutorial notebook. The docs say the example notebook contains all of the code used in the tutorial, uses a <strong>sample document corpus<\/strong>, and is <strong>ready to run with no setup or data required<\/strong>. (<a href=\"https:\/\/docs.databricks.com\/gcp\/en\/generative-ai\/tutorials\/agent-framework-notebook\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Step 17.1 \u2014 Import the notebook<\/h2>\n\n\n\n<p>Open the Databricks tutorial page for <strong>Build, evaluate, and deploy a retrieval agent<\/strong> and use the <strong>Open notebook in new tab<\/strong> or <strong>Copy link for import<\/strong> action from that page. (<a href=\"https:\/\/docs.databricks.com\/gcp\/en\/generative-ai\/tutorials\/agent-framework-notebook\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Step 17.2 \u2014 Run it from top to bottom<\/h2>\n\n\n\n<p>That notebook demonstrates a fuller agent workflow, including tool use, MLflow tracing, evaluation, and deployment. The same page shows examples of using Unity Catalog functions as tools, enabling <code>mlflow.langchain.autolog()<\/code> for traces, evaluating the agent with <code>mlflow.evaluate(..., model_type=\"databricks-agent\")<\/code>, and deploying the agent to serving after registering it in Unity Catalog. 
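To prepare students for the evaluation step, it helps to show the shape of the evaluation data before opening the notebook. A hedged sketch (the column names follow the agent-evaluation examples in the docs; the questions and answers are illustrative placeholders, and the `mlflow.evaluate` call itself only runs inside a workspace, so it is left commented):

```python
# A tiny hand-written evaluation set in the request / expected_response
# shape used by Databricks agent evaluation.
eval_set = [
    {
        "request": "What is Databricks Model Serving?",
        "expected_response": "A managed layer for deploying, governing, and querying models as REST APIs.",
    },
    {
        "request": "Which registry does Databricks recommend for the model lifecycle?",
        "expected_response": "Models in Unity Catalog.",
    },
]

for row in eval_set:
    print(row["request"], "->", row["expected_response"])

# Inside a workspace with Agent Evaluation available, the tutorial's call is
# roughly the following (agent is the logged or loaded agent; not executable locally):
# import mlflow
# results = mlflow.evaluate(
#     model=agent,
#     data=eval_set,
#     model_type="databricks-agent",  # enables the built-in agent judges
# )
```

Seeing the data shape first makes the notebook's evaluation output much easier to interpret.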
(<a href=\"https:\/\/docs.databricks.com\/gcp\/en\/generative-ai\/tutorials\/agent-framework-notebook\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Expected output<\/h3>\n\n\n\n<p>The student sees an end-to-end agent path beyond the simpler AI Playground prototype: tool registration, traces, evaluation, and deployment.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><strong>Instructor note<\/strong><br>This is the best capstone because it is official, current, and requires no custom data prep from the student.<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">18. What the student has visualized end to end<\/h1>\n\n\n\n<p>At the end of this lab, the student has seen the platform in a complete and realistic order:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Notebook on Serverless<\/strong><\/li>\n\n\n\n<li><strong>built-in sample data<\/strong><\/li>\n\n\n\n<li><strong>own Unity Catalog tables<\/strong><\/li>\n\n\n\n<li><strong>Python model code<\/strong><\/li>\n\n\n\n<li><strong>MLflow experiment runs<\/strong><\/li>\n\n\n\n<li><strong>best-model selection<\/strong><\/li>\n\n\n\n<li><strong>Unity Catalog model registration<\/strong><\/li>\n\n\n\n<li><strong>serverless deployment with Model Serving<\/strong><\/li>\n\n\n\n<li><strong>tool-calling agent prototype in AI Playground<\/strong><\/li>\n\n\n\n<li><strong>official code-first retrieval-agent workflow<\/strong> (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/getting-started\/ml-get-started\">Databricks Documentation<\/a>)<\/li>\n<\/ol>\n\n\n\n<p>That is a strong, honest visualization of Databricks AI\/ML capabilities in a fresh workspace.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">19. 
Troubleshooting notes<\/h1>\n\n\n\n<p>If package management becomes messy, use the notebook\u2019s <strong>Environment<\/strong> side panel instead of repeatedly reinstalling things. Databricks documents that panel as the central place for dependencies, base environment, and memory settings for serverless notebooks. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/compute\/serverless\/dependencies\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<p>If a student cannot create the schema or register the model, the most likely issue is missing Unity Catalog privileges. The required permissions for the catalog and schema are explicitly listed in the Databricks ML quickstart. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/getting-started\/ml-get-started\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<p>If Model Serving deployment fails, use the endpoint\u2019s logs and remember that the endpoint runs with the creator\u2019s associated identity for accessing Unity Catalog resources. Databricks documents both the creation flow and the creator-identity behavior for serving endpoints. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/machine-learning\/model-serving\/create-manage-serving-endpoints?utm_source=chatgpt.com\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<p>If Playground is missing or the tools-enabled model list is empty, the workspace may not have access to foundation models, Unity Catalog-backed AI features, or Agent Framework in that region. Databricks lists those as the prerequisites for the no-code AI Playground \/ agent workflow. (<a href=\"https:\/\/docs.databricks.com\/gcp\/en\/getting-started\/gen-ai-llm-agent\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">20. Cleanup<\/h1>\n\n\n\n<p>When the class is over, delete the serving endpoint first, then remove the model and tables if you do not need them. 
Model Serving uses serverless infrastructure, so deleting the endpoint is the most important cost-control step. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/machine-learning\/model-serving\/\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Optional cleanup cell \u2014 SQL<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>DROP MODEL IF EXISTS main.ai_ml_starter.wine_quality_model;\nDROP TABLE IF EXISTS main.ai_ml_starter.red_wine;\nDROP TABLE IF EXISTS main.ai_ml_starter.white_wine;\nDROP SCHEMA IF EXISTS main.ai_ml_starter;\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">21. One-line teaching summary<\/h1>\n\n\n\n<p><strong>This notebook-first lab shows how Databricks lets a student start from built-in sample data, write Python in a serverless notebook, track and compare ML experiments with MLflow, govern models in Unity Catalog, deploy them with Model Serving, and then move into agent development through AI Playground and the official retrieval-agent workflow.<\/strong> (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/compute\/serverless\/notebooks\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Reference<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/gist.github.com\/devops-school\/62c745223291b129008de73d76b2a994\">https:\/\/gist.github.com\/devops-school\/62c745223291b129008de73d76b2a994<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>This is a notebook-first, fresh-workspace, self-contained Databricks AI\/ML tutorial built around Serverless notebook compute. 
It is based on the [&hellip;]<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-2901","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/2901","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/comments?post=2901"}],"version-history":[{"count":2,"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/2901\/revisions"}],"predecessor-version":[{"id":2903,"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/2901\/revisions\/2903"}],"wp:attachment":[{"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/media?parent=2901"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/categories?post=2901"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/tags?post=2901"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}