{"id":2896,"date":"2026-04-05T05:07:36","date_gmt":"2026-04-05T05:07:36","guid":{"rendered":"https:\/\/aiopsschool.com\/blog\/?p=2896"},"modified":"2026-04-05T06:08:15","modified_gmt":"2026-04-05T06:08:15","slug":"databricks-tutorials-part-2-sql-ai-foundations-lab-using-serverless-starter-warehouse","status":"publish","type":"post","link":"https:\/\/aiopsschool.com\/blog\/databricks-tutorials-part-2-sql-ai-foundations-lab-using-serverless-starter-warehouse\/","title":{"rendered":"Databricks Tutorials \u2013 Part 2 \u2013 SQL &amp; AI Foundations Lab using Serverless Starter Warehouse"},"content":{"rendered":"\n<p>One important correction first: <strong>a Serverless Starter Warehouse alone cannot do the full AI\/ML lifecycle<\/strong>. A notebook attached to a SQL warehouse can run only <strong>SQL and Markdown<\/strong>, not Python. So the right end-to-end design is:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Serverless Starter Warehouse<\/strong> for SQL exploration and validation<\/li>\n\n\n\n<li><strong>Serverless notebook compute<\/strong> for Python, MLflow, training, and experiments<\/li>\n\n\n\n<li><strong>Model Serving<\/strong> for deployment<\/li>\n\n\n\n<li><strong>AI Playground \/ Agent tooling<\/strong> for agent prototyping<\/li>\n<\/ul>\n\n\n\n<p>That split matches the current Databricks product model. 
(<a href=\"https:\/\/docs.databricks.com\/aws\/en\/notebooks\/notebook-compute\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>flowchart LR\n    A&#91;Starter SQL Warehouse&lt;br\/&gt;SQL exploration] --&gt; B&#91;Unity Catalog schema]\n    B --&gt; C&#91;Serverless Notebook&lt;br\/&gt;load built-in sample data]\n    C --&gt; D&#91;MLflow experiment&lt;br\/&gt;train + compare runs]\n    D --&gt; E&#91;Register model&lt;br\/&gt;Unity Catalog]\n    E --&gt; F&#91;Model Serving&lt;br\/&gt;deploy endpoint]\n    F --&gt; G&#91;Test predictions&lt;br\/&gt;UI \/ SDK]\n    C --&gt; H&#91;AI Playground&lt;br\/&gt;LLM + tool agent prototype]\n    H --&gt; I&#91;Export \/ code-first agent path]\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">What this lab will show<\/h2>\n\n\n\n<p>By the end, a student will have done all of this in one workspace:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Used <strong>Starter Warehouse<\/strong> to query data.<\/li>\n\n\n\n<li>Created a <strong>schema<\/strong> in Unity Catalog.<\/li>\n\n\n\n<li>Loaded <strong>Databricks-provided sample data<\/strong> into managed tables.<\/li>\n\n\n\n<li>Written Python in a <strong>serverless notebook<\/strong>.<\/li>\n\n\n\n<li>Trained a model and tracked runs in <strong>MLflow experiments<\/strong>.<\/li>\n\n\n\n<li>Registered the best model in <strong>Unity Catalog Model Registry<\/strong>.<\/li>\n\n\n\n<li>Deployed it with <strong>Mosaic AI Model Serving<\/strong>.<\/li>\n\n\n\n<li>Tested the endpoint.<\/li>\n\n\n\n<li>Prototyped an <strong>AI agent<\/strong> in <strong>AI Playground<\/strong>.<\/li>\n\n\n\n<li>Run a <strong>code-first agent demo notebook<\/strong> that needs no data setup. Databricks provides sample datasets in both the <code>samples<\/code> catalog and <code>\/databricks-datasets<\/code>, and it also provides a standalone agent demo notebook that is ready to run with no setup or data required. 
(<a href=\"https:\/\/docs.databricks.com\/aws\/en\/discover\/databricks-datasets\">Databricks Documentation<\/a>)<\/li>\n<\/ol>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">Before you start<\/h1>\n\n\n\n<p>This tutorial assumes:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>your workspace is <strong>Unity Catalog-enabled<\/strong><\/li>\n\n\n\n<li><strong>Serverless Starter Warehouse<\/strong> is already working<\/li>\n\n\n\n<li><strong>serverless notebooks<\/strong> are available<\/li>\n\n\n\n<li><strong>Serving<\/strong> and <strong>Playground<\/strong> are visible in the left menu<\/li>\n<\/ul>\n\n\n\n<p>Serverless notebooks require Unity Catalog, and in most Unity Catalog-enabled workspaces they are available without extra per-user setup. AI Playground requires a workspace\/region that supports Foundation Models. Model Serving and Foundation Model features are region-dependent. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/compute\/serverless\/notebooks\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<p>If <strong>Playground<\/strong> or <strong>Serving<\/strong> is missing, stop there and verify workspace feature availability before continuing. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/large-language-models\/ai-playground\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">Part 1 \u2014 Use the Starter Warehouse first<\/h1>\n\n\n\n<h2 class=\"wp-block-heading\">Step 1. 
Open SQL Editor<\/h2>\n\n\n\n<p>In Databricks:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Click <strong>SQL Editor<\/strong><\/li>\n\n\n\n<li>Attach <strong>Serverless Starter Warehouse<\/strong><\/li>\n\n\n\n<li>Run this warm-up query<\/li>\n<\/ol>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT * \nFROM samples.tpch.customer\nLIMIT 10;\n<\/code><\/pre>\n\n\n\n<p>Databricks exposes built-in sample data through the <code>samples<\/code> catalog, so this is the fastest way to prove your warehouse is working before you build anything. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/discover\/databricks-datasets\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Step 2. Create a schema for the lab<\/h2>\n\n\n\n<p>Use this SQL in the same editor:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>CREATE SCHEMA IF NOT EXISTS main.ai_ml_starter;\n<\/code><\/pre>\n\n\n\n<p>If <code>main<\/code> is not writable in your workspace, use any catalog where you have <code>USE CATALOG<\/code>, <code>USE SCHEMA<\/code>, <code>CREATE TABLE<\/code>, and <code>CREATE MODEL<\/code>. Databricks\u2019 ML quickstart explicitly requires those Unity Catalog permissions for the schema where you will write tables and register models. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/getting-started\/ml-get-started\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<p>At this point you have:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>warehouse working<\/li>\n\n\n\n<li>schema ready<\/li>\n\n\n\n<li>no custom data yet<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">Part 2 \u2014 Create the notebook and ingest built-in sample data<\/h1>\n\n\n\n<h2 class=\"wp-block-heading\">Step 3. Create a new notebook<\/h2>\n\n\n\n<p>Create a notebook called:<\/p>\n\n\n\n<p><code>01_ml_lifecycle_serverless<\/code><\/p>\n\n\n\n<p>Then attach it to <strong>Serverless<\/strong> from the compute dropdown. 
Databricks documents that new notebooks can default to serverless on execution when no other compute is selected. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/compute\/serverless\/notebooks\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Step 4. Install the Python libraries<\/h2>\n\n\n\n<p>In the first cell, run:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>%pip install -U mlflow scikit-learn pandas matplotlib hyperopt\ndbutils.library.restartPython()\n<\/code><\/pre>\n\n\n\n<p>Serverless notebooks support dependency management through the notebook\/serverless environment, so installing these packages in the notebook is the simplest fresh-workspace path. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/compute\/serverless\/dependencies\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Step 5. Add the setup cell<\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code>import mlflow\nimport pandas as pd\nimport sklearn.metrics\nimport sklearn.model_selection\nimport sklearn.ensemble\nimport matplotlib.pyplot as plt\n\nfrom hyperopt import fmin, tpe, hp, SparkTrials, STATUS_OK\nfrom hyperopt.pyll import scope\n\nmlflow.set_registry_uri(\"databricks-uc\")\n\nCATALOG_NAME = \"main\"\nSCHEMA_NAME = \"ai_ml_starter\"\nMODEL_NAME = f\"{CATALOG_NAME}.{SCHEMA_NAME}.wine_quality_model\"\n\nspark.sql(f\"CREATE SCHEMA IF NOT EXISTS {CATALOG_NAME}.{SCHEMA_NAME}\")\n\nprint(\"Registry URI set to Unity Catalog\")\nprint(\"Model name:\", MODEL_NAME)\n<\/code><\/pre>\n\n\n\n<p>Databricks\u2019 current ML quickstart uses Unity Catalog as the model registry and follows this same pattern: set the registry URI, choose catalog\/schema, then train and register the model there. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/getting-started\/ml-get-started\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Step 6. 
Load the built-in wine dataset and save it as Unity Catalog tables<\/h2>\n\n\n\n<p>Run this cell:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>white_wine = spark.read.csv(\n    \"\/databricks-datasets\/wine-quality\/winequality-white.csv\",\n    sep=\";\",\n    header=True,\n    inferSchema=True\n)\n\nred_wine = spark.read.csv(\n    \"\/databricks-datasets\/wine-quality\/winequality-red.csv\",\n    sep=\";\",\n    header=True,\n    inferSchema=True\n)\n\nfor c in white_wine.columns:\n    white_wine = white_wine.withColumnRenamed(c, c.replace(\" \", \"_\"))\n\nfor c in red_wine.columns:\n    red_wine = red_wine.withColumnRenamed(c, c.replace(\" \", \"_\"))\n\nwhite_wine.write.mode(\"overwrite\").saveAsTable(f\"{CATALOG_NAME}.{SCHEMA_NAME}.white_wine\")\nred_wine.write.mode(\"overwrite\").saveAsTable(f\"{CATALOG_NAME}.{SCHEMA_NAME}.red_wine\")\n\nprint(\"Tables created:\")\nprint(f\"{CATALOG_NAME}.{SCHEMA_NAME}.white_wine\")\nprint(f\"{CATALOG_NAME}.{SCHEMA_NAME}.red_wine\")\n<\/code><\/pre>\n\n\n\n<p>This dataset is already available in Databricks under <code>\/databricks-datasets<\/code>, and Databricks\u2019 official ML getting-started tutorial uses these exact wine-quality files for a fresh model-building workflow. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/getting-started\/ml-get-started\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Step 7. 
Quick validation in notebook<\/h2>\n\n\n\n<p>Run a SQL cell:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT quality, COUNT(*) AS cnt\nFROM main.ai_ml_starter.red_wine\nGROUP BY quality\nORDER BY quality;\n<\/code><\/pre>\n\n\n\n<p>That confirms:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>your notebook can read the tables<\/li>\n\n\n\n<li>the tables landed in Unity Catalog<\/li>\n\n\n\n<li>the sample data is ready for training<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">Part 3 \u2014 Go back to the Starter Warehouse for SQL exploration<\/h1>\n\n\n\n<h2 class=\"wp-block-heading\">Step 8. Query the same tables from SQL Editor<\/h2>\n\n\n\n<p>Open <strong>SQL Editor<\/strong> again on the <strong>Starter Warehouse<\/strong> and run:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT COUNT(*) AS rows_red\nFROM main.ai_ml_starter.red_wine;\n\nSELECT COUNT(*) AS rows_white\nFROM main.ai_ml_starter.white_wine;\n\nSELECT quality, COUNT(*) AS cnt\nFROM main.ai_ml_starter.red_wine\nGROUP BY quality\nORDER BY quality;\n<\/code><\/pre>\n\n\n\n<p>This is the point students clearly see the platform concept:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>notebook wrote governed tables<\/li>\n\n\n\n<li>warehouse can query them immediately<\/li>\n\n\n\n<li>same governed data supports both engineering and analytics<\/li>\n<\/ul>\n\n\n\n<p>That is one of the core Databricks mental models: workspace assets and SQL assets sit on top of the same governed Unity Catalog objects. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/compute\/sql-warehouse\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">Part 4 \u2014 Train the first model in the notebook<\/h1>\n\n\n\n<h2 class=\"wp-block-heading\">Step 9. 
Prepare the training data<\/h2>\n\n\n\n<p>Back in the notebook, run:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>white_pdf = spark.read.table(f\"{CATALOG_NAME}.{SCHEMA_NAME}.white_wine\").toPandas()\nred_pdf = spark.read.table(f\"{CATALOG_NAME}.{SCHEMA_NAME}.red_wine\").toPandas()\n\nwhite_pdf&#91;\"is_red\"] = 0.0\nred_pdf&#91;\"is_red\"] = 1.0\n\ndata_df = pd.concat(&#91;white_pdf, red_pdf], axis=0)\n\n# High quality = quality &gt;= 7\nlabels = data_df&#91;\"quality\"].astype(\"int\") &gt;= 7\nfeatures = data_df.drop(&#91;\"quality\"], axis=1)\n\nX_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(\n    features,\n    labels,\n    test_size=0.2,\n    random_state=1\n)\n\nprint(\"Train shape:\", X_train.shape)\nprint(\"Test shape :\", X_test.shape)\n<\/code><\/pre>\n\n\n\n<p>This follows the same official Databricks starter flow: combine red and white wine, create a boolean <code>is_red<\/code>, convert quality to a classification label, and split into train\/test. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/getting-started\/ml-get-started\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Step 10. 
Train a baseline model and log it with MLflow<\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code>mlflow.autolog()\n\nwith mlflow.start_run(run_name=\"gradient_boost_baseline\"):\n    model = sklearn.ensemble.GradientBoostingClassifier(random_state=0)\n    model.fit(X_train, y_train)\n\n    predicted_probs = model.predict_proba(X_test)\n    roc_auc = sklearn.metrics.roc_auc_score(y_test, predicted_probs&#91;:, 1])\n\n    roc_curve = sklearn.metrics.RocCurveDisplay.from_estimator(model, X_test, y_test)\n    roc_curve.figure_.savefig(\"\/tmp\/roc_curve.png\")\n\n    mlflow.log_metric(\"test_auc\", roc_auc)\n    mlflow.log_artifact(\"\/tmp\/roc_curve.png\")\n\n    print(\"Baseline test AUC:\", roc_auc)\n<\/code><\/pre>\n\n\n\n<p>This is the cleanest first model because it is simple, fast, and aligned with Databricks\u2019 official \u201cbuild your first ML model\u201d tutorial. Databricks also documents that MLflow experiments organize training runs, and if no active experiment is set in a notebook, Databricks automatically creates a notebook experiment for you. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/getting-started\/ml-get-started\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Step 11. Tell the student where to look<\/h2>\n\n\n\n<p>In the notebook, click the <strong>Experiment<\/strong> icon on the right or the run link in the output.<\/p>\n\n\n\n<p>The student should look for:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>parameters<\/li>\n\n\n\n<li>metrics<\/li>\n\n\n\n<li>artifacts<\/li>\n\n\n\n<li>the logged model<\/li>\n<\/ul>\n\n\n\n<p>That is the first \u201cexperimenting with code\u201d moment in the tutorial. Databricks\u2019 MLflow UI is designed exactly for this compare-and-review workflow. 
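<\/p>\n\n\n\n<p>The same run information is also available programmatically. Here is a minimal offline sketch of that compare-and-review step using the MLflow API; on Databricks the tracking URI and notebook experiment are set automatically, so the local file store below is only there so the snippet runs anywhere:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import mlflow\n\n# Outside Databricks, fall back to a local file store for illustration.\nmlflow.set_tracking_uri(\"file:\/tmp\/mlruns_demo\")\nmlflow.set_experiment(\"search-runs-demo\")\n\nwith mlflow.start_run(run_name=\"demo\"):\n    mlflow.log_param(\"n_estimators\", 100)\n    mlflow.log_metric(\"test_auc\", 0.91)\n\n# Each run becomes one row, with params.* and metrics.* columns.\nruns = mlflow.search_runs(order_by=&#91;\"metrics.test_auc DESC\"])\nprint(runs&#91;&#91;\"run_id\", \"params.n_estimators\", \"metrics.test_auc\"]].head())\n<\/code><\/pre>\n\n\n\n<p>More on experiment tracking: 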
(<a href=\"https:\/\/docs.databricks.com\/aws\/en\/getting-started\/ml-get-started\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">Part 5 \u2014 Tune the model and compare runs<\/h1>\n\n\n\n<h2 class=\"wp-block-heading\">Step 12. Run a light hyperparameter search<\/h2>\n\n\n\n<p>Use a smaller search than the full docs example so the lab stays practical on a fresh workspace.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>search_space = {\n    \"n_estimators\": scope.int(hp.quniform(\"n_estimators\", 20, 300, 1)),\n    \"learning_rate\": hp.loguniform(\"learning_rate\", -3, 0),\n    \"max_depth\": scope.int(hp.quniform(\"max_depth\", 2, 6, 1)),\n}\n\ndef train_model(params):\n    mlflow.autolog()\n    with mlflow.start_run(nested=True):\n        model_hp = sklearn.ensemble.GradientBoostingClassifier(\n            random_state=0,\n            **params\n        )\n        model_hp.fit(X_train, y_train)\n\n        predicted_probs = model_hp.predict_proba(X_test)\n        auc = sklearn.metrics.roc_auc_score(y_test, predicted_probs&#91;:, 1])\n\n        mlflow.log_metric(\"test_auc\", auc)\n        return {\"loss\": -auc, \"status\": STATUS_OK}\n\nspark_trials = SparkTrials(parallelism=4)\n\nwith mlflow.start_run(run_name=\"gb_hyperopt\"):\n    best_params = fmin(\n        fn=train_model,\n        space=search_space,\n        algo=tpe.suggest,\n        max_evals=8,\n        trials=spark_trials\n    )\n\nprint(best_params)\n<\/code><\/pre>\n\n\n\n<p>Databricks\u2019 official quickstart uses Hyperopt with MLflow integration so multiple runs are logged automatically, and it uses <code>SparkTrials<\/code> for parallel sweeps. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/getting-started\/ml-get-started\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Step 13. 
Pick the best run and register the model<\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code>best_run = mlflow.search_runs(\n    order_by=&#91;\"metrics.test_auc DESC\", \"start_time DESC\"],\n    max_results=1\n).iloc&#91;0]\n\nprint(\"Best run_id:\", best_run.run_id)\nprint(\"Best test_auc:\", best_run&#91;\"metrics.test_auc\"])\n\nmodel_uri = f\"runs:\/{best_run.run_id}\/model\"\nregistered_model = mlflow.register_model(model_uri, MODEL_NAME)\n\nprint(\"Registered model:\", registered_model.name)\nprint(\"Version:\", registered_model.version)\n<\/code><\/pre>\n\n\n\n<p>This is the exact lifecycle step that turns an experiment output into a governed, reusable model artifact in Unity Catalog. Databricks\u2019 quickstart uses this same pattern with <code>mlflow.register_model(...)<\/code>. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/getting-started\/ml-get-started\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<p>At this point the student has already seen:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>code<\/li>\n\n\n\n<li>experiments<\/li>\n\n\n\n<li>comparison<\/li>\n\n\n\n<li>registration<\/li>\n<\/ul>\n\n\n\n<p>That is the classic ML lifecycle in Databricks.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">Part 6 \u2014 Serve the model<\/h1>\n\n\n\n<h2 class=\"wp-block-heading\">Step 14. 
Create the serving endpoint in the UI<\/h2>\n\n\n\n<p>Now move to the <strong>Serving<\/strong> page:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Click <strong>Serving<\/strong><\/li>\n\n\n\n<li>Click <strong>Create serving endpoint<\/strong><\/li>\n\n\n\n<li>Name it: <code>wine-quality-endpoint<\/code><\/li>\n\n\n\n<li>In <strong>Served entities<\/strong>, choose your Unity Catalog model<br><code>main.ai_ml_starter.wine_quality_model<\/code><\/li>\n\n\n\n<li>Pick the latest version<\/li>\n\n\n\n<li>Save<\/li>\n<\/ol>\n\n\n\n<p>Databricks Model Serving is the managed serverless layer for deploying custom ML models and exposing them as REST endpoints. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/machine-learning\/model-serving\/\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Step 15. Test it from the Serving UI<\/h2>\n\n\n\n<p>Once the endpoint is ready, click <strong>Query endpoint<\/strong> and use this payload:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>{\n  \"dataframe_records\": &#91;\n    {\n      \"fixed_acidity\": 7.4,\n      \"volatile_acidity\": 0.70,\n      \"citric_acid\": 0.00,\n      \"residual_sugar\": 1.9,\n      \"chlorides\": 0.076,\n      \"free_sulfur_dioxide\": 11.0,\n      \"total_sulfur_dioxide\": 34.0,\n      \"density\": 0.9978,\n      \"pH\": 3.51,\n      \"sulphates\": 0.56,\n      \"alcohol\": 9.4,\n      \"is_red\": 1.0\n    }\n  ]\n}\n<\/code><\/pre>\n\n\n\n<p>Databricks documents that custom model endpoints accept <code>dataframe_split<\/code> or <code>dataframe_records<\/code>, and the response comes back under the <code>predictions<\/code> key. The Serving UI is the simplest test path. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/machine-learning\/model-serving\/score-custom-model-endpoints\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Step 16. 
Optional: query it from code later<\/h2>\n\n\n\n<p>If you want the student to see programmatic scoring later, Databricks supports:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Serving UI<\/li>\n\n\n\n<li>REST API<\/li>\n\n\n\n<li>MLflow Deployments SDK<\/li>\n\n\n\n<li>SQL via <code>ai_query<\/code> for supported cases<\/li>\n<\/ul>\n\n\n\n<p>Those are the official supported query paths for served models. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/machine-learning\/model-serving\/score-custom-model-endpoints\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">Part 7 \u2014 Show the LLM \/ agent side of Databricks<\/h1>\n\n\n\n<p>This is a <strong>separate capability track<\/strong> from the sklearn model above. Do not force them into one object. Teach them as two platform capabilities:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>classic ML model lifecycle<\/strong><\/li>\n\n\n\n<li><strong>GenAI \/ agent lifecycle<\/strong><\/li>\n<\/ul>\n\n\n\n<p>That is the honest and clean way to teach Databricks today. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/getting-started\/gen-ai-llm-agent\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Option A \u2014 Fastest path: AI Playground<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Step 17. Open Playground<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Click <strong>Playground<\/strong><\/li>\n\n\n\n<li>Choose a hosted model<\/li>\n\n\n\n<li>Ask a simple prompt:\n<ul class=\"wp-block-list\">\n<li>\u201cExplain in simple terms what a classifier does.\u201d<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Add a second model to compare responses side by side<\/li>\n<\/ol>\n\n\n\n<p>Databricks AI Playground is the low-code place to test prompts, compare models, and prototype agents. 
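<\/p>\n\n\n\n<p>The hosted models behind Playground can also be called from code, which previews the code-first path. A hedged sketch using the MLflow Deployments SDK; the endpoint name in the example comment is only illustrative, so check which pay-per-token models your workspace actually exposes:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>def ask_hosted_model(endpoint_name, prompt):\n    \"\"\"Send one chat message to a Databricks-hosted model endpoint.\"\"\"\n    # Requires Databricks auth, e.g. DATABRICKS_HOST and DATABRICKS_TOKEN.\n    from mlflow.deployments import get_deploy_client\n\n    client = get_deploy_client(\"databricks\")\n    return client.predict(\n        endpoint=endpoint_name,\n        inputs={\"messages\": &#91;{\"role\": \"user\", \"content\": prompt}]},\n    )\n\n# Example (endpoint name is illustrative):\n# ask_hosted_model(\"databricks-meta-llama-3-3-70b-instruct\",\n#                  \"Explain in simple terms what a classifier does.\")\n<\/code><\/pre>\n\n\n\n<p>More on Playground: 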
(<a href=\"https:\/\/docs.databricks.com\/aws\/en\/large-language-models\/ai-playground\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 18. Prototype a tool-calling agent<\/h3>\n\n\n\n<p>In Playground:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Choose a <strong>Tools enabled<\/strong> model<\/li>\n\n\n\n<li>Click <strong>Tools<\/strong><\/li>\n\n\n\n<li>Add <code>system.ai.python_exec<\/code><\/li>\n\n\n\n<li>Ask something like:<br>\u201cUse Python to calculate the average of 9.4, 9.8, 10.0, and 10.2.\u201d<\/li>\n<\/ol>\n\n\n\n<p>Databricks documents <code>system.ai.python_exec<\/code> as a built-in tool students can use when prototyping tool-calling agents. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/getting-started\/gen-ai-llm-agent\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 19. Export the agent to code<\/h3>\n\n\n\n<p>In Playground:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Click <strong>Get code<\/strong><\/li>\n\n\n\n<li>Choose <strong>Create agent notebook<\/strong><\/li>\n<\/ol>\n\n\n\n<p>Databricks says this creates a notebook that defines the agent and deploys it to a serving endpoint. It is the fastest way to show students how a no-code prototype becomes code. Databricks also notes that this exported path currently uses a legacy Model Serving workflow, while Databricks now recommends <strong>Databricks Apps<\/strong> for long-term agent authoring. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/getting-started\/gen-ai-llm-agent\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Option B \u2014 Best code-first path for class: official ready-run agent notebook<\/h2>\n\n\n\n<p>This is the path I would actually use in a classroom because it is more deterministic.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 20. 
Import the official agent demo notebook<\/h3>\n\n\n\n<p>Use the official Databricks tutorial notebook:<\/p>\n\n\n\n<p><strong>Tutorial: Build, evaluate, and deploy a retrieval agent<\/strong><\/p>\n\n\n\n<p>Databricks says this notebook:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>is standalone<\/li>\n\n\n\n<li>uses a sample document corpus<\/li>\n\n\n\n<li>is <strong>ready to run with no setup or data required<\/strong><\/li>\n<\/ul>\n\n\n\n<p>That makes it perfect for your \u201cfresh environment, no sample data creation\u201d requirement. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/generative-ai\/tutorials\/agent-framework-notebook\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Step 21. Run the notebook top to bottom<\/h3>\n\n\n\n<p>That notebook teaches the student to:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>create an agent<\/li>\n\n\n\n<li>define tools<\/li>\n\n\n\n<li>use a sample corpus<\/li>\n\n\n\n<li>evaluate quality<\/li>\n\n\n\n<li>deploy the agent<\/li>\n<\/ul>\n\n\n\n<p>It is officially maintained by Databricks, so for this part I would prefer the official notebook over rewriting a fragile custom agent lab from scratch. 
(<a href=\"https:\/\/docs.databricks.com\/aws\/en\/generative-ai\/tutorials\/agent-framework-notebook\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">Part 8 \u2014 What the student has learned<\/h1>\n\n\n\n<p>By the end of this single lab, the student will have seen:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>SQL analytics<\/strong> on Starter Warehouse<\/li>\n\n\n\n<li><strong>Unity Catalog schema and tables<\/strong><\/li>\n\n\n\n<li><strong>built-in sample data ingestion<\/strong><\/li>\n\n\n\n<li><strong>serverless notebook development<\/strong><\/li>\n\n\n\n<li><strong>MLflow experiment tracking<\/strong><\/li>\n\n\n\n<li><strong>hyperparameter experiments<\/strong><\/li>\n\n\n\n<li><strong>model registration<\/strong><\/li>\n\n\n\n<li><strong>model serving<\/strong><\/li>\n\n\n\n<li><strong>LLM prompting<\/strong><\/li>\n\n\n\n<li><strong>tool-calling agents<\/strong><\/li>\n\n\n\n<li><strong>code-first agent deployment<\/strong><\/li>\n<\/ul>\n\n\n\n<p>That is already a very strong \u201cDatabricks AI\/ML end-to-end\u201d tutorial for a fresh workspace.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">Part 9 \u2014 Optional bonus with the Starter Warehouse<\/h1>\n\n\n\n<p>If your workspace supports AI Functions and pay-per-token Foundation Models, you can also show AI directly from SQL with the warehouse.<\/p>\n\n\n\n<p>Example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT ai_query(\n  'databricks-meta-llama-3-3-70b-instruct',\n  'Explain in 40 words what our wine quality classifier does.'\n) AS explanation;\n<\/code><\/pre>\n\n\n\n<p><code>ai_query<\/code> is Databricks\u2019 general AI function for invoking supported model serving endpoints directly from SQL or Python. It works on Databricks SQL, but feature availability depends on workspace\/region and model support. 
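<\/p>\n\n\n\n<p>Outside SQL, the custom endpoint from Part 6 can be scored over plain REST, one of the query paths listed in Step 16. A minimal standard-library sketch; the workspace URL and token are placeholders you supply yourself:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import json\nimport urllib.request\n\ndef build_payload(rows):\n    \"\"\"Wrap feature rows in the dataframe_records format the endpoint accepts.\"\"\"\n    return {\"dataframe_records\": rows}\n\ndef score(workspace_url, token, endpoint_name, rows):\n    # POST \/serving-endpoints\/&lt;name&gt;\/invocations; response holds \"predictions\".\n    req = urllib.request.Request(\n        f\"{workspace_url}\/serving-endpoints\/{endpoint_name}\/invocations\",\n        data=json.dumps(build_payload(rows)).encode(\"utf-8\"),\n        headers={\n            \"Authorization\": f\"Bearer {token}\",\n            \"Content-Type\": \"application\/json\",\n        },\n    )\n    with urllib.request.urlopen(req) as resp:\n        return json.loads(resp.read())&#91;\"predictions\"]\n\n# Example (placeholders, do not run as-is); pass the full feature record from Step 15:\n# score(\"https:\/\/&lt;workspace-host&gt;\", \"&lt;token&gt;\", \"wine-quality-endpoint\", &#91;record])\n<\/code><\/pre>\n\n\n\n<p>More on <code>ai_query<\/code> and feature availability: 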
(<a href=\"https:\/\/docs.databricks.com\/aws\/en\/sql\/language-manual\/functions\/ai_query\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">Part 10 \u2014 Clean up after the lab<\/h1>\n\n\n\n<p>To avoid surprise spend, tell students to clean up:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Stop or leave auto-stop enabled on the <strong>Starter Warehouse<\/strong><\/li>\n\n\n\n<li>Delete the <strong>model serving endpoint<\/strong> when finished<\/li>\n\n\n\n<li>Delete the schema objects if they no longer need them<\/li>\n\n\n\n<li>If they created agent endpoints or apps, remove those too<\/li>\n<\/ol>\n\n\n\n<p>Model Serving runs on serverless infrastructure, and Databricks Apps are billed while running. (<a href=\"https:\/\/docs.databricks.com\/aws\/en\/machine-learning\/model-serving\/\">Databricks Documentation<\/a>)<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">My recommendation for your tutorial format<\/h1>\n\n\n\n<p>Use this as <strong>three classroom chapters<\/strong>:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Chapter 1 \u2014 Data + SQL<\/h3>\n\n\n\n<p>Starter Warehouse, schema creation, built-in samples, SQL queries.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Chapter 2 \u2014 Classic ML<\/h3>\n\n\n\n<p>Serverless notebook, wine dataset, MLflow, experiments, registration, serving.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Chapter 3 \u2014 GenAI \/ Agents<\/h3>\n\n\n\n<p>Playground, tool-calling, export to code, official code-first agent demo notebook.<\/p>\n\n\n\n<p>That is the most honest and technically correct way to teach Databricks in 2026.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p>Reference URL 1 &#8211; <a 
href=\"https:\/\/gist.github.com\/devops-school\/6528d3f7935300e6d3537a4933a26727\">https:\/\/gist.github.com\/devops-school\/6528d3f7935300e6d3537a4933a26727<\/a><br>Reference URL 2 &#8211; <a href=\"https:\/\/gist.github.com\/devops-school\/3e74e74f525b22bb6ce3b65b11a14b7f\">https:\/\/gist.github.com\/devops-school\/3e74e74f525b22bb6ce3b65b11a14b7f<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>One important correction first: a Serverless Starter Warehouse alone cannot do the full AI\/ML lifecycle. A notebook attached to a [&hellip;]<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-2896","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/2896","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/comments?post=2896"}],"version-history":[{"count":4,"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/2896\/revisions"}],"predecessor-version":[{"id":2904,"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/2896\/revisions\/2904"}],"wp:attachment":[{"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/media?parent=2896"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/categories?post=2896"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/tags?post=2896"}],"curies":[{"name":"wp","href":"https:\/\/api.w.
org\/{rel}","templated":true}]}}