
Text Generation as a Service with Cloudera Data Science Workbench


Fortunately, there is an awesome text-generating neural network library for Python 3, textgenrnn, built on TensorFlow/Keras by Max Woolf.

It is very easy to wrap this in a REST API as a CDSW model, so that Apache NiFi or other microservices in your organization can call it.

Here is my simple CDSW Model:

import datetime
import time

import psutil
from textgenrnn import textgenrnn

# Text generation RNN: https://github.com/minimaxir/textgenrnn
# To install: pip3 install textgenrnn
def textgeneration(args):
  
  # sentence = args["sentence"]
  
  start = time.time()
  # Loads the default pretrained weights on each call
  textgen = textgenrnn()
  # With return_as_list=True, generate() returns a list of strings
  newtextstring = textgen.generate(n=1, temperature=0.5, return_as_list=True)
  end = time.time()
  row = {}
  row['starttime'] = '{0:.2f}'.format(start)
  row['sentence'] = str(newtextstring[0])
  row['endtime'] = '{0:.2f}'.format(end)
  row['runtime'] = '{0:.2f}'.format(end - start)
  row['systemtime'] = datetime.datetime.now().strftime('%m/%d/%Y %H:%M:%S')
  row['cpu'] = psutil.cpu_percent(interval=1)
  row['memory'] = psutil.virtual_memory().percent

  return row
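
The temperature=0.5 argument above controls how adventurous the generated text is. As a rough illustration of the general idea, here is a minimal standalone sketch of temperature scaling applied to a toy probability distribution. It is not textgenrnn's own code; the function name apply_temperature and the example probabilities are made up for this sketch.

```python
import math


def apply_temperature(probs, temperature):
    """Rescale a probability distribution by a sampling temperature.

    Assumes every probability is strictly greater than zero.
    """
    # Divide the log-probabilities by the temperature.
    scaled = [math.log(p) / temperature for p in probs]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    # Renormalize so the result sums to 1 again.
    total = sum(weights)
    return [w / total for w in weights]


# Toy next-character distribution (hypothetical values).
probs = [0.5, 0.3, 0.2]

sharper = apply_temperature(probs, 0.5)  # favors the most likely choice more
flatter = apply_temperature(probs, 2.0)  # spreads probability more evenly

print(sharper)
print(flatter)
```

A temperature below 1 concentrates probability on the likeliest choices, which tends to give more conservative text; a temperature above 1 flattens the distribution and tends to give more surprising text.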

Python Setup


pip3.6 install tensorflow
pip3.6 install textgenrnn
pip3.6 install psutil

Example Run


args = {}
textgeneration(args)
2019-03-13 01:31:59.772430: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
{'cpu': 0.3,
 'endtime': '1552440721.04',
 'memory': 18.6,
 'runtime': '1.29',
 'sentence': "A female programming bank - World's father",
 'starttime': '1552440719.75',
 'systemtime': '03/13/2019 01:32:01'}
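
Once the function is deployed as a CDSW model, a client such as a microservice can call it over REST. Below is a minimal sketch using only the Python standard library. The endpoint URL and access key are placeholders, and the JSON shape (an "accessKey" plus a "request" object) is my assumption about how CDSW model endpoints expect to be called, so check the sample request shown on your model's page in CDSW. The helper names build_request and call_model are made up for this example.

```python
import json
import urllib.request

# Placeholder values (assumptions): copy the real endpoint and access key
# from your deployed model's page in CDSW.
MODEL_URL = "https://modelservice.cdsw.example.com/model"
ACCESS_KEY = "your-model-access-key"


def build_request(args, url=MODEL_URL, access_key=ACCESS_KEY):
    """Build the POST request; args becomes the model function's argument."""
    payload = {"accessKey": access_key, "request": args}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_model(args, timeout=30):
    """Send the request and return the decoded JSON reply."""
    with urllib.request.urlopen(build_request(args), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage, against a real deployment:
# reply = call_model({})
# print(reply)
```

The same JSON body can be posted from Apache NiFi, for example with an InvokeHTTP processor.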

Resources:

https://minimaxir.com/2018/05/text-neural-networks/

https://github.com/minimaxir/textgenrnn