The Exciting Frontier of Custom KSQL Functions

Hi, I'm Mitch

Mailchimp

Agenda


Motivation

Why are custom KSQL functions important?

Why are custom KSQL functions exciting?

KSQL functions are shareable

They facilitate exploration of the current technological landscape

Let's explore

Agenda


Terminology

UDFs


UDAFs


Example I

Basic functions


Concepts

The process of building custom KSQL functions is easy and repeatable

Maven Archetype

Start with the business logic

Add annotations

Deploy

Verify

Invoke
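
As a rough sketch of the "start with the business logic, then add annotations" steps above: the class below is ordinary Java, and the KSQL annotations are shown only as comments so the block compiles without the KSQL UDF library. The class name MultiplyUdf, the function name "multiply", and the description strings are illustrative assumptions, not taken from the talk.

```java
// Sketch only. The KSQL annotations appear as comments so that this class
// compiles without the KSQL UDF library on the classpath.
//
// import io.confluent.ksql.function.udf.Udf;
// import io.confluent.ksql.function.udf.UdfDescription;

// @UdfDescription(name = "multiply", description = "Multiplies two numbers")
final class MultiplyUdf {

  // @Udf(description = "Multiplies two BIGINT values; NULL if either is NULL")
  public Long multiply(final Long left, final Long right) {
    // Boxed types let a NULL column value pass through instead of throwing.
    if (left == null || right == null) {
      return null;
    }
    return left * right;
  }

  public static void main(final String[] args) {
    final MultiplyUdf udf = new MultiplyUdf();
    System.out.println(udf.multiply(6L, 7L)); // prints 42
  }
}
```

With the annotations uncommented and the jar copied into the directory named by ksql.extension.dir, a function like this would typically be checked with SHOW FUNCTIONS; and DESCRIBE FUNCTION MULTIPLY; before being called in a query.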

What about UDAFs?

Build and Deploy

(same as before)
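
A minimal sketch of the shape a UDAF takes, assuming the three-method contract (initialize, aggregate, merge) that KSQL's Udaf interface used at the time of this talk. SimpleUdaf is a hypothetical stand-in declared locally so the block compiles on its own, and TotalLengthUdaf is an invented example, not one from the talk.

```java
// Sums the length of every string it sees. In a real project this class
// would be returned from a factory method annotated for KSQL; here it is
// plain Java so the sketch stays self-contained.
final class TotalLengthUdaf implements SimpleUdaf<String, Long> {

  @Override
  public Long initialize() {
    return 0L; // starting value for each new group
  }

  @Override
  public Long aggregate(final String value, final Long aggregate) {
    // Ignore NULL inputs rather than failing the whole aggregation.
    return value == null ? aggregate : aggregate + value.length();
  }

  @Override
  public Long merge(final Long first, final Long second) {
    return first + second; // combine two partial aggregates
  }

  public static void main(final String[] args) {
    final TotalLengthUdaf udaf = new TotalLengthUdaf();
    Long total = udaf.initialize();
    for (final String word : new String[] {"kafka", "ksql"}) {
      total = udaf.aggregate(word, total);
    }
    System.out.println(total); // prints 9
  }
}

// Hypothetical stand-in for the KSQL UDAF contract (assumption: value type V
// is folded into aggregate type A, and partial aggregates can be merged).
interface SimpleUdaf<V, A> {
  A initialize();

  A aggregate(V value, A aggregate);

  A merge(A first, A second);
}
```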

Agenda


Example II

Sentiment Analysis


Concepts

Sentiment Analysis


Natural Language API

Configs vs Environment Variables
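
One way to picture the trade-off, as a sketch under assumptions: a UDF that calls a remote API needs credentials, which could come from function-level configs or from the process environment. The helper below prefers a config value and falls back to an environment variable. SettingResolver and the config key are hypothetical; GOOGLE_APPLICATION_CREDENTIALS is the variable Google client libraries conventionally read.

```java
import java.util.Map;
import java.util.Optional;

final class SettingResolver {

  // Returns the config value if present, otherwise the environment value.
  // The environment is passed in as a map so the lookup is easy to test.
  static Optional<String> resolve(
      final Map<String, ?> configs,
      final String configKey,
      final Map<String, String> env,
      final String envKey) {
    final Object fromConfig = configs.get(configKey);
    if (fromConfig != null) {
      return Optional.of(fromConfig.toString());
    }
    return Optional.ofNullable(env.get(envKey));
  }

  public static void main(final String[] args) {
    // Hypothetical config key; a real UDF would receive its configs from KSQL.
    final Optional<String> credentials = resolve(
        Map.of(),
        "ksql.functions.sentiment.credentials.path",
        System.getenv(),
        "GOOGLE_APPLICATION_CREDENTIALS");
    System.out.println(credentials.orElse("<not set>"));
  }
}
```

Configs keep the setting visible alongside the rest of the server properties, while environment variables suit client libraries that read them without any extra code.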

Maximizing throughput

Example III

Conversational interfaces


Concepts

Dialogflow

"Organizations report a reduction of up to 70 percent in call, chat and/or email inquiries after implementing a VCA" - Gartner research

Use cases


Example


input sourced from user

"I would like to book a room" - user123

response generated by Dialogflow via KSQL

"I can help with that. Where would you like to reserve a room?"

hybrid training


How do we safely improve the model over time?

In event-driven architectures, this is easy

"By storing only the events and never the commands, we have a wealth of capability that supports evolutionary change" - Neil Avery
https://www.confluent.io/blog/journey-to-event-driven-part-1-why-event-first-thinking-changes-everything

Neil Avery

Error flows

.
                    _ ._  _ , _ ._
                  (_ ' ( `  )_  .__)
                ( (  (    )   `)  ) _)
              (__ (_   (_ . _) _) ,__)
                  `~~`\ ' . /`~~`
                        ;   ;
                        /   \
          _____________/_ __ \_____________         .

Robin

https://www.confluent.io/blog/kafka-connect-deep-dive-error-handling-dead-letter-queues

Agenda


Example IV

Spam detection


Concepts


Let's see how easy it is to build & export a model with h2o

Embed the model

Remote vs Embedded

Remote


Kai

https://www.confluent.io/blog/using-apache-kafka-drive-cutting-edge-machine-learning

Check out Kai's anomaly detection UDF for another h2o example

https://github.com/kaiwaehner/ksql-udf-deep-learning-mqtt-iot

Agenda


Example V

Ruby UDF


Concepts

Installing guest languages

Graal updater (gu)

$ gu install ruby

$ gu available

ComponentId              Version             Component name
----------------------------------------------------------------
python                   1.0.0-rc15          Graal.Python
R                        1.0.0-rc15          FastR
ruby                     1.0.0-rc15          TruffleRuby

Now, let's create a Polyglot UDF!

Gotchas


Possible for full integration into KSQL?

I built a proof of concept (POC)

https://github.com/magicalpipelines/docker-ksql-multilingual-udfs-poc

POC


Inline Python UDF (POC only)

The POC shows that polyglot UDFs are possible...

But inline Java UDFs may come first

https://github.com/confluentinc/ksql/pull/2605

Agenda


Recap

What did we learn through these examples?


Vision

We have the ingredients for a rich ecosystem

There should be a community for sharing KSQL functions


magicalpipelines.com/luna

Simply submit some info about your function at

github.com/magicalpipelines/luna

Then others can discover your function

Now what?

Go build something exciting

Links

Questions?