Splunk : Walkthrough to Set Up the Deep Learning Toolkit for Splunk with Amazon EKS

January 21, 2021 at 11:50 pm IST

By Junichi Maruyama January 22, 2021

The Splunk Deep Learning Toolkit (DLTK) is a powerful tool that lets you offload compute-intensive workloads to external container environments, including GPU and Spark environments. To learn more about building a GPU-based environment, see the previous Splunk blog post, The Power of Deep Learning Analytics and GPU Acceleration.

Splunk DLTK supports Docker, Kubernetes, and OpenShift as container environments. In this article, we walk through the setup for DLTK 3.3 with Amazon EKS as the Kubernetes environment.

Prerequisites

To manage EKS and Kubernetes, you first need to install some CLI tools on your laptop. Please refer to this document for additional details on getting started.

Install awscli
Install eksctl
Install kubectl

Note: To manage EKS, the IAM user must have AmazonEKSClusterPolicy.

Also, please install the Splunk Deep Learning Toolkit beforehand. This blog targets DLTK 3.x.

Step Flow Overview

Let's take a look at the setup flow. Amazon EKS offers both Fargate and managed node groups as compute nodes; this time we use a managed node group. The storage service must also support ReadWriteMany, so we use EFS. By the way, the default gp2 storage class can be used with DLTK 4.0.

Create EKS cluster with Managed Node
Create and Setup EFS Storage Service for ReadWriteMany support
Create StorageClass and PersistentVolume for EFS
Configure SecurityGroup for DLTK NodePort access
(Optional) Create new namespace
Setup Splunk DLTK to access EKS
Run the Pod for EKS

Step 1. Create EKS Cluster with Managed Node

First, create an EKS cluster. See here for details.


$ eksctl create cluster \
    --name <cluster-name> \
    --nodegroup-name <nodegroup-name> \
    --region <region> \
    --node-type <instance-type> \
    --nodes 1 \
    --ssh-access \
    --ssh-public-key <ssh-public-key> \
    --managed

This time, we use the t3.medium instance type and a single node for verification purposes. You can customize the other options as needed. It will take a while to create the cluster and node group.

Let's check if it has been created successfully.


$ kubectl get svc
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.100.0.1   <none>        443/TCP   14d

          

$ kubectl get node
NAME                                           STATUS   ROLES    AGE   VERSION
ip-192-168-81-176.us-east-2.compute.internal   Ready    <none>   9d    v1.18.9-eks-d1db3c
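
If you script the cluster creation, you may prefer to block until the node is ready rather than checking by hand. The following is a minimal sketch using kubectl wait; wait_for_nodes is a hypothetical helper name, and the 300-second default timeout is an arbitrary assumption.

```shell
# Block until every node reports the Ready condition, or fail after the timeout.
# wait_for_nodes is a hypothetical helper; the default timeout is an assumption.
wait_for_nodes() {
  kubectl wait --for=condition=Ready nodes --all --timeout="${1:-300s}"
}
```

Calling wait_for_nodes 600s would wait up to ten minutes before giving up.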

Step 2. Create and Set Up EFS Storage Service for ReadWriteMany Support

Splunk DLTK 3.x uses volumes with the 'ReadWriteMany' access mode for storage, so we use the Amazon EFS service.

For more information on setup, please refer to this document and proceed.

1. Deploy the Amazon EFS CSI driver to an Amazon EKS cluster


$ kubectl apply -k 'github.com/kubernetes-sigs/aws-efs-csi-driver/deploy/kubernetes/overlays/stable/ecr/?ref=release-1.0'

2. Create an Amazon EFS file system for your Amazon EKS cluster

A. Get the Cluster's CIDR information

Locate the VPC ID for your Amazon EKS cluster. You can find this ID in the Amazon EKS console, or you can use the following AWS CLI command.


$ aws eks describe-cluster --name <cluster-name> \
    --query 'cluster.resourcesVpcConfig.vpcId' --output text

Locate the CIDR range for your cluster's VPC. You can find this in the Amazon VPC console, or you can use the following AWS CLI command.
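
One way to look up the CIDR range from the command line is sketched below as a small shell function. vpc_cidr is a hypothetical helper name; pass it the VPC ID returned by the previous command.

```shell
# Print the CIDR block(s) of the given VPC.
# vpc_cidr is a hypothetical helper name; $1 is the VPC ID of the EKS cluster.
vpc_cidr() {
  aws ec2 describe-vpcs --vpc-ids "$1" --query 'Vpcs[].CidrBlock' --output text
}
```

For example, vpc_cidr <vpc-id> prints the CIDR block(s) of that VPC as plain text.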

You'll use this CIDR information in the next step.

B. Create a new security group to allow NFS access.

Create a security group that allows inbound NFS traffic for your Amazon EFS mount points.

Open the Amazon VPC console at https://console.aws.amazon.com/vpc/.
Choose Security Groups in the left navigation panel, and then choose Create security group.
Enter a name and description for your security group, and choose the VPC that your Amazon EKS cluster is using.
Under Inbound rules, select Add rule.
Under Type, select NFS.
Under Source, select Custom, and paste the VPC CIDR range that you obtained in the previous step.
Choose Create security group.
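
The console steps above can also be approximated with the AWS CLI. This is only a sketch: create_nfs_sg and the default group name efs-nfs-access are hypothetical, and it assumes the AWS CLI is already configured for the cluster's region. NFS traffic uses TCP port 2049.

```shell
# Create a security group in the given VPC and allow inbound NFS (TCP 2049)
# from the given CIDR range. Prints the new security group ID.
# create_nfs_sg and the default name "efs-nfs-access" are hypothetical.
create_nfs_sg() {
  vpc_id="$1"
  cidr="$2"
  sg_name="${3:-efs-nfs-access}"

  sg_id=$(aws ec2 create-security-group \
    --group-name "$sg_name" \
    --description "Allow NFS from the EKS cluster VPC" \
    --vpc-id "$vpc_id" \
    --query 'GroupId' --output text) || return 1

  aws ec2 authorize-security-group-ingress \
    --group-id "$sg_id" --protocol tcp --port 2049 --cidr "$cidr" \
    >/dev/null || return 1

  echo "$sg_id"
}
```

A call such as create_nfs_sg <vpc-id> <vpc-cidr> returns the group ID to select for the EFS mount targets in the next step.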

C. Create the Amazon EFS file system for your Amazon EKS cluster.

Open the Amazon Elastic File System console at https://console.aws.amazon.com/efs/.
Choose File systems in the left navigation pane, and then choose Create file system.
On the Create file system page, choose Customize.
On the File system settings page, you don't need to enter or select any information, but can if desired, and then select Next.
On the Network access page, for Virtual Private Cloud (VPC), choose your VPC.
Under Mount targets, if a default security group is already listed, select the X in the top right corner of the box with the default security group name to remove it from each mount point, select the security group that you created in a previous step for each mount target, and then select Next.
On the File system policy page, select Next.
On the Review and create page, select Create.
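
If the file system already exists and you would rather attach its mount targets from the command line, a sketch along these lines may help. create_mount_targets is a hypothetical helper; it assumes you pass one subnet per Availability Zone, since EFS allows one mount target per zone, and that the file system is already in the available state.

```shell
# Create one EFS mount target per subnet, each using the given security group.
# create_mount_targets is a hypothetical helper.
# Usage: create_mount_targets <file-system-id> <security-group-id> <subnet-id>...
create_mount_targets() {
  fs_id="$1"
  sg_id="$2"
  shift 2
  for subnet_id in "$@"; do
    aws efs create-mount-target \
      --file-system-id "$fs_id" \
      --subnet-id "$subnet_id" \
      --security-groups "$sg_id" || return 1
  done
}
```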

D. Create Access Point

By default, only the root user can access this file system, so DLTK will fail to deploy the container. Create a new access point to work around this.

Choose Access points in the left navigation pane, and then choose Create access point.
Choose the file system and enter the root directory for this access point (e.g., /dltk).
Under Root directory creation permissions, enter the owner's UID, GID, and permissions (e.g., 500/500/0777).
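
The same access point can be sketched with the AWS CLI. create_dltk_access_point is a hypothetical helper name; the 500/500/0777 values and the /dltk default path are simply the examples from the steps above, not requirements.

```shell
# Create an EFS access point whose root directory is created with the
# example owner and permissions used in this walkthrough (500/500/0777).
# create_dltk_access_point is a hypothetical helper.
# Usage: create_dltk_access_point <file-system-id> [root-path]
create_dltk_access_point() {
  aws efs create-access-point \
    --file-system-id "$1" \
    --root-directory "Path=${2:-/dltk},CreationInfo={OwnerUid=500,OwnerGid=500,Permissions=0777}" \
    --query 'AccessPointId' --output text
}
```

The function prints the new access point ID (fsap-...), which you will need for the volumeHandle in Step 3.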

Step 3. Create StorageClass and PersistentVolume for EFS

StorageClass

Save the following YAML to a file on your laptop.

storageclass.yaml


kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: <storage-class-name>  # e.g. efs-sc; must match storageClassName in pv.yaml
provisioner: efs.csi.aws.com
allowVolumeExpansion: true

Deploy this storageclass to your cluster.


$ kubectl apply -f storageclass.yaml

Verify the deployment.


$ kubectl get sc
NAME            PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
efs-sc          efs.csi.aws.com         Delete          Immediate              true                   14d
gp2 (default)   kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   false                  14d

Persistent Volume

Save the following YAML to a file on your laptop.

pv.yaml


apiVersion: v1
kind: PersistentVolume
metadata:
  name: <persistent-volume-name>
spec:
  capacity:
    storage: 20Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Delete
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: <fs-xxxxx>::<fsap-xxxxxxxx>

Change the name and the volumeHandle (the 'fs-xxxxx' file system ID and the 'fsap-xxxxxxxx' access point ID) to match your environment. You can find both IDs in the EFS configuration on the AWS console.
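
The volumeHandle joins the file system ID and the access point ID with two colons; the empty field between them is the optional subpath in the EFS CSI driver's handle format. A small sketch that builds the value and catches swapped or mistyped IDs is shown below; efs_volume_handle is a hypothetical helper name.

```shell
# Build an EFS CSI volumeHandle of the form <file-system-id>::<access-point-id>.
# efs_volume_handle is a hypothetical helper; it only checks the ID prefixes.
efs_volume_handle() {
  case "$1" in
    fs-*) ;;
    *) echo "expected a file system ID like fs-xxxxx" >&2; return 1 ;;
  esac
  case "$2" in
    fsap-*) ;;
    *) echo "expected an access point ID like fsap-xxxxxxxx" >&2; return 1 ;;
  esac
  printf '%s::%s\n' "$1" "$2"
}
```

For example, efs_volume_handle fs-xxxxx fsap-xxxxxxxx prints fs-xxxxx::fsap-xxxxxxxx.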

Deploy this persistent volume to your cluster.


$ kubectl apply -f pv.yaml

Verify the deployment.


$ kubectl get pv
NAME              CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM          STORAGECLASS   REASON   AGE
dltk-efs-volume   20Gi       RWX            Delete           Available   default/dltk   efs-sc                  25h

Step 4. Configure SecurityGroup for DLTK NodePort Access

DLTK 3.x supports Load Balancer or Node Port as the Ingress type for Kubernetes. This time, I use Node Port.

Find your EKS node in the EC2 console.
Open the assigned security group (nodegroup-ng-dltk-remoteAccess).

Add the following Node Port range to the inbound rules of that security group.

30000-32767: Node Port
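
The same inbound rule can be added with the AWS CLI, sketched below. open_nodeport_range is a hypothetical helper name; the source CIDR is an assumption you must choose yourself, for example the address of your Splunk search head rather than 0.0.0.0/0.

```shell
# Allow inbound TCP traffic on the Kubernetes NodePort range (30000-32767)
# from the given source CIDR. open_nodeport_range is a hypothetical helper.
# Usage: open_nodeport_range <security-group-id> <source-cidr>
open_nodeport_range() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" --protocol tcp --port 30000-32767 --cidr "$2"
}
```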

Step 5. (Optional) Create New Namespace

This step is optional. If you skip it, use the default namespace for DLTK.

1. Create a new YAML file called my-namespace.yaml with the contents:

my-namespace.yaml


apiVersion: v1
kind: Namespace
metadata:
  name: <namespace-name>

Replace <namespace-name> with a name of your choice.

Then run:


$ kubectl apply -f ./my-namespace.yaml

2. Verify the namespace. In this example, dltk is the new namespace.


$ kubectl get namespaces
NAME              STATUS   AGE
default           Active   15d
dltk              Active   33h
kube-node-lease   Active   15d
kube-public       Active   15d
kube-system       Active   15d
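
If you prefer not to keep a YAML file for the namespace, kubectl can create it directly. The sketch below only creates the namespace when it does not exist yet; ensure_namespace is a hypothetical helper name.

```shell
# Create the namespace only if it is not already present.
# ensure_namespace is a hypothetical helper.
ensure_namespace() {
  kubectl get namespace "$1" >/dev/null 2>&1 || kubectl create namespace "$1"
}
```

Running ensure_namespace dltk a second time does nothing, so it is safe to include in a setup script.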

Step 6. Set Up Splunk DLTK to Access EKS

In the DLTK app, go to Configuration --> Setup.

Node Port Internal Hostname : The public IP address of one of your EKS nodes.
Node Port External Hostname : The public IP address of one of your EKS nodes.
Namespace : The namespace created in Step 5 (or default if you skipped that step).
Storage Class : The storage class created in Step 3.
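
To find a public IP address for the hostname fields without opening the EC2 console, you can ask Kubernetes for each node's ExternalIP address. This is a sketch; node_external_ips is a hypothetical helper name, and nodes in private subnets will not report an ExternalIP.

```shell
# Print the ExternalIP address of each node, one node per line.
# node_external_ips is a hypothetical helper.
node_external_ips() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.status.addresses[?(@.type=="ExternalIP")].address}{"\n"}{end}'
}
```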

Step 7. Run the Pod for EKS

Go to Containers, choose Kubernetes as the cluster target, and start the container!

Useful Kubectl Commands for Troubleshooting

If you run into any errors during setup, the following commands can help with troubleshooting.

Check the Deployments status

$ kubectl get deployments --namespace=dltk
NAME   READY   UP-TO-DATE   AVAILABLE   AGE
dev    1/1     1            1           30h

$ kubectl describe deployment dev --namespace=dltk
<< More detailed information >>

Pods status

$ kubectl get pods --namespace=dltk
NAME                   READY   STATUS    RESTARTS   AGE
dev-7f9cdcc6d7-mzcdb   1/1     Running   0          30h

$ kubectl describe pod <pod-name> --namespace=dltk
<< More detailed information >>

Persistent Volume Claim

$ kubectl get pvc --namespace=dltk
NAME   STATUS   VOLUME             CAPACITY   ACCESS MODES   STORAGECLASS   AGE
dltk   Bound    dltk-efs-volume1   20Gi       RWX            efs-sc         34h

$ kubectl describe pvc <pvc-name> --namespace=dltk
<< More detailed information >>

Persistent Volume

$ kubectl get pv 
NAME               CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM          STORAGECLASS   REASON   AGE
dltk-efs-volume1   20Gi       RWX            Delete           Bound    dltk/dltk      efs-sc                  34h
 
$ kubectl describe pv <pv-name>

Container Logs

$ kubectl logs -f <pod-name> --namespace=dltk
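
When a pod never starts, there may be no container logs to read yet. In that case the namespace events, sorted by time, are often a good place to look for scheduling or volume mount errors. A minimal sketch follows; recent_events is a hypothetical helper name and the dltk default matches the namespace used in this walkthrough.

```shell
# List events in the namespace, oldest first.
# recent_events is a hypothetical helper; the namespace defaults to dltk.
recent_events() {
  kubectl get events --namespace="${1:-dltk}" --sort-by=.metadata.creationTimestamp
}
```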

Monitoring EKS by Splunk Infrastructure Monitoring

Furthermore, you can monitor Amazon EKS with Splunk Infrastructure Monitoring (formerly SignalFx) to track the learning load in real time.

We will not cover its setup in this post; please refer to the setup guide here.

Summary

Once you have set up DLTK with an EKS environment, you can easily scale compute resources up and down. Furthermore, multiple DLTK instances can share the same EKS cluster to optimize resources.

Today, we introduced the setup flow for development and testing purposes. If you need to run this in production, please talk with your local Splunk engineers.

Finally, I would like to thank Philipp Drieger for his advice and support in writing this blog.

Disclaimer

Splunk Inc. published this content on 22 January 2021 and is solely responsible for the information contained therein. Distributed by Public, unedited and unaltered, on 21 January 2021 18:19:04 UTC
