Install Apache Kyuubi on Top of Existing Spark

This guide walks you through setting up Apache Kyuubi on a single-node Ubuntu 24.04 server, on top of an existing Spark installation, with SSL and authentication disabled.

We assume that Apache Spark is already installed following this guide:
Deploy Apache Spark 3.5.6 on a Single Node Bare-Metal Ubuntu 24.04 Server

Important: This setup is not production-ready.
For production environments, you should enable SSL encryption and authentication to protect data and restrict access.


  1. Install Apache Kyuubi
wget https://downloads.apache.org/kyuubi/kyuubi-1.10.2/apache-kyuubi-1.10.2-bin.tgz
tar -xzf apache-kyuubi-1.10.2-bin.tgz
sudo mv apache-kyuubi-1.10.2-bin /opt/kyuubi

  2. Configure Kyuubi
sudo su - spark

Edit the Kyuubi configuration:

vi /opt/kyuubi/conf/kyuubi-defaults.conf

Add the following:

kyuubi.frontend.bind.host = 0.0.0.0
kyuubi.frontend.protocols = THRIFT_HTTP,THRIFT_BINARY
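
These two settings make Kyuubi listen on all network interfaces and enable both Thrift frontends. As a rough illustration of what the file ends up declaring, the sketch below parses the same two lines and maps each enabled protocol to its port. The helper names (parse_conf, enabled_frontends) are hypothetical, the parser is a simplification of the real properties format, and the ports are Kyuubi's defaults (10009 for THRIFT_BINARY, 10010 for THRIFT_HTTP), which only hold if kyuubi.frontend.thrift.binary.bind.port and kyuubi.frontend.thrift.http.bind.port are left unset.

```python
# Assumed default ports; they change if the corresponding bind.port keys are set.
DEFAULT_PORTS = {"THRIFT_BINARY": 10009, "THRIFT_HTTP": 10010}


def parse_conf(text: str) -> dict:
    """Parse 'key = value' or 'key value' lines, skipping blanks and # comments."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if "=" in line:
            key, value = line.split("=", 1)
        else:
            parts = line.split(None, 1)
            if len(parts) != 2:
                continue  # a key without a value carries no setting
            key, value = parts
        conf[key.strip()] = value.strip()
    return conf


def enabled_frontends(conf: dict) -> dict:
    """Map each protocol listed in kyuubi.frontend.protocols to its assumed default port."""
    protocols = conf.get("kyuubi.frontend.protocols", "")
    names = [p.strip().upper() for p in protocols.split(",") if p.strip()]
    # Protocols not in DEFAULT_PORTS map to None rather than a guessed port.
    return {name: DEFAULT_PORTS.get(name) for name in names}
```

With the configuration from this step, enabled_frontends reports THRIFT_HTTP on 10010 and THRIFT_BINARY on 10009; the Jupyter test later in this guide uses the binary port.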

  3. Confirm Spark Is Installed

Make sure your Spark installation exists at /opt/spark as set up in the previous guide.
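
One quick way to confirm this is to check for bin/spark-submit, which Kyuubi uses to launch Spark engines. The snippet below is a minimal sketch; the function name spark_home_looks_valid is hypothetical, and it only checks that the file exists, not that Spark actually runs.

```python
from pathlib import Path


def spark_home_looks_valid(spark_home: str) -> bool:
    """Return True if spark_home contains a bin/spark-submit file."""
    submit = Path(spark_home) / "bin" / "spark-submit"
    return submit.is_file()
```

Calling spark_home_looks_valid("/opt/spark") should return True on a server prepared with the previous guide.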

Configure the required environment variable in kyuubi-env.sh:

vi /opt/kyuubi/conf/kyuubi-env.sh

Paste:

export SPARK_HOME=/opt/spark

Then make the script executable and exit the spark user's shell:

chmod +x /opt/kyuubi/conf/kyuubi-env.sh
exit

  4. Create a systemd Service
sudo vi /etc/systemd/system/kyuubi.service

Add the following:

[Unit]
Description=Apache Kyuubi Server
After=network.target

[Service]
Type=forking
User=ubuntu
Group=ubuntu
Environment=KYUUBI_HOME=/opt/kyuubi
ExecStart=/opt/kyuubi/bin/kyuubi start
ExecStop=/opt/kyuubi/bin/kyuubi stop
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

  5. Start the Kyuubi Service
sudo systemctl daemon-reload
sudo systemctl enable --now kyuubi
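
The server can take a little while to open its ports after systemd reports it as started. The sketch below polls a TCP port until it accepts connections; wait_for_port is a hypothetical helper, and port 10009 in the usage note assumes the default Thrift binary port has not been changed.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, wait_for_port("localhost", 10009) returns True once the Thrift binary frontend is accepting connections. An open port only shows that the server is listening, not that queries succeed.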

  6. Test Kyuubi from a Jupyter Notebook
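# Install the client libraries into the notebook's Python environment
import sys
!{sys.executable} -m pip install pyhive thrift thrift-sasl pure-sasl

from pyhive import hive

# Connect to the Thrift binary frontend on port 10009 without authentication
conn = hive.Connection(
    host='spark.maksonlee.com',
    port=10009,
    username='spark',
    auth='NONE'
)

# Run a simple query to confirm that Kyuubi can reach Spark
cursor = conn.cursor()
cursor.execute("SHOW DATABASES")
print(cursor.fetchall())

Output:

[('default',), ('thingsboard',)]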
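
For repeated queries, it may help to wrap the connection and cursor so they are always closed. The following is a minimal sketch, not part of the original setup: run_query and connect_kyuubi are hypothetical helper names, and the connection parameters simply mirror the unauthenticated settings used in this guide.

```python
from contextlib import closing


def run_query(conn, sql: str) -> list:
    """Execute sql on a DB-API connection and return all rows, closing the cursor afterwards."""
    with closing(conn.cursor()) as cursor:
        cursor.execute(sql)
        return cursor.fetchall()


def connect_kyuubi(host: str, port: int = 10009, username: str = "spark"):
    """Open an unauthenticated PyHive connection to Kyuubi (assumes pyhive is installed)."""
    from pyhive import hive  # imported lazily so run_query works without pyhive
    return hive.Connection(host=host, port=port, username=username, auth="NONE")
```

A typical call would be: with closing(connect_kyuubi("spark.maksonlee.com")) as conn: print(run_query(conn, "SHOW DATABASES")).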
