HIVE-29638: Add AutoScaling to K8s operator#6507

Merged
ayushtkn merged 22 commits into
apache:masterfrom
ayushtkn:K8sautoscaling
Jun 19, 2026
@ayushtkn ayushtkn commented May 26, 2026

Member

What changes were proposed in this pull request?

Add auto scaling to Hive Operator

Why are the changes needed?

Better resource utilization and lower cloud costs.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Manually

Installed Dependencies (ZK, Postgres & Ozone)

helm repo add bitnami https://charts.bitnami.com/bitnami
helm install zookeeper bitnami/zookeeper \
  --set replicaCount=1 --set auth.enabled=false \
  --set image.repository=bitnamilegacy/zookeeper \
  --set image.tag=3.9.3-debian-12-r21 \
  --set global.security.allowInsecureImages=true --wait


helm install postgres bitnami/postgresql \
  --set auth.username=hive --set auth.password=hive123 \
  --set auth.database=metastore --wait


kubectl create secret generic hive-db-secret --from-literal=password=hive123


helm repo add ozone https://apache.github.io/ozone-helm-charts/
helm install ozone ozone/ozone --version 0.2.0 --wait
sleep 50
kubectl exec statefulset/ozone-om -- ozone sh volume create /s3v
kubectl exec statefulset/ozone-om -- ozone sh bucket create /s3v/hive
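
The fixed `sleep 50` above is only a guess at how long Ozone needs to come up. One possible alternative is to block on the StatefulSet rollout itself. This is a minimal sketch, not part of this PR: the helper name `wait_for_statefulset` and the 120s default are assumptions, and a completed rollout does not by itself guarantee that the Ozone Manager is ready to accept `ozone sh` commands.

```shell
# Hypothetical helper (not part of this PR): wait until a StatefulSet
# has finished rolling out instead of sleeping for a fixed time.
wait_for_statefulset() {
  local name="$1"
  local timeout="${2:-120s}"   # assumed default; tune for your cluster
  kubectl rollout status "statefulset/${name}" --timeout="${timeout}"
}

# Usage (requires a running cluster):
#   wait_for_statefulset ozone-om 180s
```

The helper returns a non-zero exit code if the rollout does not complete within the timeout, so a test script can fail early rather than continue against a half-started cluster.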

Started Hive Operator With AutoScaling Enabled (Very Low Thresholds for Testing)

helm install hive ./helm/hive-operator \
  --set cluster.database.type=postgres \
  --set cluster.database.url="jdbc:postgresql://postgres-postgresql:5432/metastore" \
  --set cluster.database.driver="org.postgresql.Driver" \
  --set cluster.database.username=hive \
  --set cluster.database.passwordSecretRef.name=hive-db-secret \
  --set cluster.database.passwordSecretRef.key=password \
  --set cluster.database.driverJarUrl="https://repo1.maven.org/maven2/org/postgresql/postgresql/42.7.5/postgresql-42.7.5.jar" \
  --set cluster.zookeeper.quorum="zookeeper:2181" \
  --set cluster.storage.coreSiteOverrides."fs\.defaultFS"="s3a://hive" \
  --set cluster.storage.coreSiteOverrides."fs\.s3a\.endpoint"="http://ozone-s3g-rest:9878" \
  --set-string cluster.storage.coreSiteOverrides."fs\.s3a\.path\.style\.access"=true \
  --set 'cluster.storage.envVars[0].name=HADOOP_OPTIONAL_TOOLS' \
  --set 'cluster.storage.envVars[0].value=hadoop-aws' \
  --set 'cluster.storage.envVars[1].name=AWS_ACCESS_KEY_ID' \
  --set 'cluster.storage.envVars[1].value=ozone' \
  --set 'cluster.storage.envVars[2].name=AWS_SECRET_ACCESS_KEY' \
  --set 'cluster.storage.envVars[2].value=ozone' \
  --set cluster.hiveServer2.autoscaling.enabled=true \
  --set cluster.metastore.autoscaling.enabled=true \
  --set cluster.llap.autoscaling.enabled=true \
  --set cluster.tezAm.autoscaling.enabled=true \
  --set-string cluster.llap.configOverrides."hive\.llap\.daemon\.task\.scheduler\.wait\.queue\.size"="1" \
  --set cluster.hiveServer2.autoscaling.scaleUpThreshold=1 \
  --set cluster.metastore.autoscaling.scaleUpThreshold=2

Launched Beeline

kubectl exec -it deployment/hive-hiveserver2 -- beeline -u "jdbc:hive2://hive-hiveserver2:10001/;transportMode=http;httpPath=cliservice"

OUTPUTS:

Initial Start -> Only 1 HMS, 1 HS2 (1 == Min Configured)

image
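
The replica counts shown in these screenshots can also be read from the command line. A minimal sketch, assuming the `hive-hiveserver2` Deployment name used in the Beeline command above; the helper name `ready_replicas` is hypothetical, and the names of the other workloads (HMS, LLAP, Tez AM) are not given here, so they are left out.

```shell
# Hypothetical helper: print the number of ready replicas of a workload.
# Note: Kubernetes omits .status.readyReplicas when no replica is ready,
# so the output is empty rather than "0" in that case.
ready_replicas() {
  local kind="$1"
  local name="$2"
  kubectl get "${kind}" "${name}" -o jsonpath='{.status.readyReplicas}'
}

# Usage (requires a running cluster):
#   ready_replicas deployment hive-hiveserver2
```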

First Beeline session arrives -> Tez AM and LLAP daemons start (min 1 configured)

image

AutoScaling HS2 to 2 & Tez AM (reduced max threshold)

image

Tez AM
image

HS2
image

Auto Scaling HMS & LLAP to 2

image

HMS
image

LLAP (load had already dropped by this point because the query had finished :-( )
image

Scale Downs (After Cooling Periods)

Scheduled
image

Done (After waiting for cool down period for specific service)
image
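
The scale-up and cooldown-gated scale-down behaviour observed above can be illustrated with a small sketch. This is not the operator's algorithm: the function names, the "load strictly greater than threshold" rule, the one-replica step, and the "scale down only at zero load" rule are all assumptions made for illustration.

```shell
# Illustrative only: NOT the operator's implementation.
# desired_replicas CURRENT LOAD UP_THRESHOLD MIN MAX
# Adds one replica when load exceeds the threshold, removes one when the
# load is zero, and clamps the result to the range [MIN, MAX].
desired_replicas() {
  local current="$1"
  local load="$2"
  local up="$3"
  local min="$4"
  local max="$5"
  local want="$current"
  if [ "$load" -gt "$up" ]; then
    want=$((current + 1))
  elif [ "$load" -eq 0 ]; then
    want=$((current - 1))
  fi
  if [ "$want" -lt "$min" ]; then want="$min"; fi
  if [ "$want" -gt "$max" ]; then want="$max"; fi
  echo "$want"
}

# cooldown_elapsed LAST_SCALE_EPOCH NOW_EPOCH COOLDOWN_SECONDS
# Succeeds (exit 0) only once the cooldown window has fully passed.
cooldown_elapsed() {
  [ $(( $2 - $1 )) -ge "$3" ]
}
```

In this sketch a scale-down would be "scheduled" when `desired_replicas` first returns a lower count and "done" only once `cooldown_elapsed` succeeds for that service, mirroring the two stages in the screenshots.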

CPU tracking

HS2
image
HMS
image

@difin difin left a comment

Contributor

LGTM

@zhangbutao

Contributor

Thanx @zhangbutao for the great insights!!!

You hit the nail on the head regarding the shift from "YARN-thinking" to "Kubernetes-native thinking."

  1. Physical vs. Logical Isolation
    You are completely right about Workload Management (WLM). Trying to carve up a single JVM's heap and CPU cycles among competing tenants is incredibly complex and never gives you 100% true isolation. By shifting to Kubernetes, we get true physical isolation via namespaces, cgroups, and dedicated pod resources.
  2. How this could work technically
    What you are describing is entirely feasible. The LLAP instances register themselves in ZooKeeper under a specific app name (defaulting to @llap0). If we update the Operator to support an array of LLAP profiles (e.g., llap-cluster1, llap-cluster2), the Operator would spin up multiple independent StatefulSets, each registering to a different ZK path.

Then, exactly as you said, a user simply sets hive.llap.daemon.service.hosts=@llap-cluster1 in their JDBC string or session. TezAM would look up that specific ZK path, find those specific pods, and route the fragments exclusively to that tenant's dedicated executors.

  3. The Autoscaling Synergy
    The best part is how it ties into the autoscaling logic in this PR! Because each tenant's LLAP cluster would be its own independent K8s StatefulSet, the autoscaler would scale llap-cluster1 and llap-cluster2 completely independently. If user1 isn't running queries, their dedicated LLAP cluster scales to zero, costing nothing, while user2 can comfortably stay scaled up to 100 pods.

This is a fantastic concept for multi-tenancy. Since the core autoscaling loop and K8s operator primitives are established in this PR, building out "Multi-Tenant LLAP Compute Groups" on top of it feels like a perfect follow-up Jira ticket. I think it is definitely worth exploring! I will definitely give it a shot :-)

Your thoughts align completely with mine—this idea is both feasible and highly valuable. The reason I came up with this idea is that other MPP-architecture OLAP analytical engines, such as StarRocks and Doris, already have similar compute-group functionality that effectively isolates multi-tenant workloads. So the solution we've conceived is absolutely feasible and has practical value. Therefore, it is well worth our effort to explore this capability in depth. Thanks @ayushtkn
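
The per-tenant routing discussed in this thread could look roughly like the following from the client side. This is a sketch of a proposed follow-up, not something this PR implements: `llap-cluster1` is the example compute-group name from the discussion, the helper name `llap_jdbc_url` is hypothetical, and whether HiveServer2 accepts a per-session override of `hive.llap.daemon.service.hosts` depends on its configuration.

```shell
# Hypothetical: build a Beeline JDBC URL that pins a session to one LLAP
# compute group. Multi-profile LLAP is NOT implemented in this PR.
llap_jdbc_url() {
  local host="$1"
  local port="$2"
  local llap_group="$3"
  # Hive JDBC URLs carry Hive configuration overrides after the '?'.
  printf 'jdbc:hive2://%s:%s/;transportMode=http;httpPath=cliservice?hive.llap.daemon.service.hosts=@%s\n' \
    "$host" "$port" "$llap_group"
}

# Usage (requires a running cluster):
#   kubectl exec -it deployment/hive-hiveserver2 -- \
#     beeline -u "$(llap_jdbc_url hive-hiveserver2 10001 llap-cluster1)"
```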

@zhangbutao zhangbutao left a comment

Contributor

+1 LGTM

@ayushtkn

Member Author

Thanx @aturoczy , @tanishq-chugh , @difin and @zhangbutao for the reviews!!!
