When SysOps need workflow... Introducing Apache NiFi.

If you work in SysOps, you probably deal with a lot of data generated by your servers, and to run an efficient IT department you need near real-time notifications when something goes wrong. There are already plenty of tools out there, but sooner or later we all end up writing our own scripts to tie everything together.

When I started experimenting with Apache NiFi, I came to realize that there are better ways to manage your servers' data flow...

What is Apache NiFi?

Apache NiFi is a dataflow tool that is quickly becoming quite popular in the Big Data world. According to the website, NiFi is:

...an easy to use, powerful, and reliable system to process and distribute data.

I think the Apache NiFi guys are being a bit too modest here :-) The way I would describe NiFi is:

Apache NiFi is a web-based tool that allows you to get data from almost any source, and transform/route it to almost any destination using an intuitive WYSIWYG workflow designer.

At the moment, NiFi can receive data from, or send data to, the following:

  • local files
  • HTTP/HTTPS (very handy if you want to integrate with cloud-based services like PagerDuty, HipChat, Slack, Twilio)
  • Syslog
  • S3
  • Twitter
  • FTP
  • SQS
  • Apache Kafka
  • Probably a lot more... :-)

How can Apache NiFi help in System Operations?

As a system operator, you probably already deal with a lot of data that needs to be processed and evaluated, and over the years you have probably developed your own solutions for it. Have you ever written a script for one or more of these tasks:

  • Post an alert to a website when a system goes down?
  • Ship log files to another system for further analysis (via FTP, or to S3)?
  • Send an SMS when something happens that's not supposed to happen?

If you answered yes to any of these questions, then NiFi might be an asset for your IT environment. True, writing your own scripts to solve those issues can be very satisfying, but the main problem with this approach is this:

System operators should focus on the data their environment generates, not on the code that processes that data.

Okay, some readers are probably rolling their eyes right now, but allow me to elaborate. First, let me ask you a few questions about the integration scripts you developed yourself:

  • Can your script handle a network loss when it is in the middle of processing data?
  • Does it scale up to multiple threads?
  • How well does it perform when it suddenly needs to process more data (like 10x) compared to the usual load?
  • Do you have a central dashboard that shows the data flow happening in your scripts?

As someone in system operations, you probably don't want to deal with all the "details" mentioned above. You just want to get your data, transform it into what you want it to be, and send it where it needs to go.

Maybe you have a team of coders who can handle the issues mentioned above, but they are probably busy developing your company's product and don't have the time to assist you every time. You might consider a proprietary solution, but most of the time you will be stuck with what the vendor offers. You want tools that adapt to your workflow, not the other way around. Apache NiFi is free, and allows you to create any workflow you want, with any data you want.

How does Apache NiFi compare to an ELK stack?

If you are already using an Elasticsearch-Logstash-Kibana (ELK) stack, you might wonder how Apache NiFi fits in. In my opinion, they are two different systems that complement each other:

  • ELK is great for historical analysis of your data.
  • Apache NiFi is great for real-time processing of your data.

Example 1: Building a Syslog server.

I admit, I'm a big fan of ChatOps. Having a chat-room as the primary hub of communication for your operations team encourages teamwork, and makes it a lot easier to work with remote teams in different time zones as they have access to all the conversations that happened when they were still asleep :-)

One of the things I wanted was a chat room that acts as a live feed of all the syslog messages generated by my servers. This is the first workflow I built in NiFi, and I was surprised that I had everything up and running in less than 3 hours. Mind you, I had zero experience with NiFi when I built this, so I still needed to get the hang of it. If I had to develop this in a programming language I had no prior experience with, I think it would have taken longer than 3 hours.

I use HipChat for team chat rooms, so I need to format the data into something HipChat expects before posting it to the HipChat HTTP API.

Here is what I ended up with:

nifi screenshot

Take a good look at the picture. Even without any NiFi experience, it's quite easy to figure out what's going on:

  • NiFi starts a syslog listener.
  • Some attributes are added which are required for HipChat formatting.
  • If it's an error, we add another attribute that will cause the message to be displayed in red. If not, it is displayed in green.
  • The last steps just transform the data to JSON, add the correct MIME type, and do an HTTP POST to the HipChat API server.
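
For readers who would like to see the same logic outside the NiFi canvas, here is a minimal Python sketch of what the flow does to each message. It is not what NiFi runs internally. It assumes each message starts with a standard syslog `<PRI>` header (the severity is PRI modulo 8, and severities 0 to 3 mean error or worse). The payload fields (`color`, `message`, `message_format`) follow the HipChat v2 room notification API as I understand it, so treat them as an assumption. The function names are made up for illustration.

```python
import json
import re

# Matches the "<PRI>" header that starts a syslog message, e.g. "<11>".
PRI_HEADER = re.compile(r"^<(\d{1,3})>")


def severity_of(raw_message):
    """Return the syslog severity (0-7), or None if there is no <PRI> header."""
    match = PRI_HEADER.match(raw_message)
    if match is None:
        return None
    # PRI = facility * 8 + severity, so the severity is the remainder.
    return int(match.group(1)) % 8


def to_hipchat_payload(raw_message):
    """Build the JSON body for one chat notification (assumed field names)."""
    severity = severity_of(raw_message)
    # Severities 0-3 are emergency, alert, critical and error.
    is_error = severity is not None and severity <= 3
    # Strip the <PRI> header so only the readable text is posted.
    text = PRI_HEADER.sub("", raw_message, count=1)
    return json.dumps({
        "color": "red" if is_error else "green",
        "message": text,
        "message_format": "text",
    })


print(to_hipchat_payload("<11>web01 app: disk failure"))
```

The resulting JSON string would then be sent with an HTTP POST and a `Content-Type: application/json` header, which is what the last processors in the flow take care of.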

The only thing left to do was to reconfigure my servers so that syslog messages get forwarded to my NiFi server.
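
Before touching the configuration of real servers, it can help to send a single test message to the listener. Here is a small Python sketch using the standard library's `SysLogHandler`, which sends over UDP by default. The function name is made up, and the host and port in the usage note are placeholders: use whatever your syslog listener in NiFi is configured with.

```python
import logging
import logging.handlers


def send_test_message(host, port, text):
    """Send one error-level syslog message over UDP to host:port."""
    handler = logging.handlers.SysLogHandler(address=(host, port))
    logger = logging.getLogger("nifi-syslog-test")
    logger.setLevel(logging.ERROR)
    logger.propagate = False  # keep the test message out of other handlers
    logger.addHandler(handler)
    try:
        logger.error(text)
    finally:
        # Detach and close so repeated calls do not send duplicates.
        logger.removeHandler(handler)
        handler.close()
```

For example, `send_test_message("nifi.example.com", 5140, "test message from web01")` should make one red message show up in the chat room if the flow is running.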

The output as shown in HipChat:

hipchat screenshot

The formatting could be improved, but it ain't bad for a first attempt :-)

Example 2: Building an HTTP-to-FTP gateway

Here is another example that shows how you can easily build an HTTP-to-FTP gateway with NiFi:

nifi screenshot 2

Once again, the flow is quite easy to follow:

  • NiFi listens for HTTP requests. Files can be uploaded via an HTTP POST request.
  • NiFi uploads the file to the FTP server and sends out an e-mail about the successful upload.
  • If the FTP transfer fails, the file is stored locally for further inspection, and an e-mail is sent out to notify the administrators.
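
The routing in this flow (try the FTP upload, fall back to local storage on failure, send a notification either way) can be sketched in Python as follows. This only illustrates the decision logic and is not how NiFi implements it. The function names, the `upload` and `notify` callables, and the FTP host and credentials are all hypothetical.

```python
import ftplib
import io
import os


def ftp_upload(name, payload, host="ftp.example.com", user="demo", password="demo"):
    """Upload payload (bytes) as `name`. Host and credentials are placeholders."""
    with ftplib.FTP(host, user, password) as ftp:
        ftp.storbinary("STOR " + name, io.BytesIO(payload))


def forward_file(name, payload, upload, failed_dir, notify):
    """Try the upload; on failure keep a local copy. Notify in both cases."""
    try:
        upload(name, payload)
    except ftplib.all_errors as exc:
        # Failure branch: store the file locally for later inspection.
        os.makedirs(failed_dir, exist_ok=True)
        local_path = os.path.join(failed_dir, os.path.basename(name))
        with open(local_path, "wb") as handle:
            handle.write(payload)
        notify("FTP upload of %s failed (%s); saved to %s" % (name, exc, local_path))
        return False
    # Success branch: just report the upload.
    notify("FTP upload of %s succeeded" % name)
    return True
```

In a real script, `notify` would send the e-mail (for example with `smtplib`), and `forward_file(name, payload, ftp_upload, "/var/spool/failed-uploads", notify)` would be called from whatever handles the incoming HTTP POST. The spool directory here is just an example path.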

Time to implement: about 30 minutes. Once again, no coding required.

Batch or real-time? Single-threaded or multi-threaded?

So, is NiFi optimized for real-time processing or batch processing? The answer is simple: it depends on how you configure it. Every box in the diagram is called a "processor", and its throughput can be configured and tuned to your needs:

nifi screenshot 3

Conclusion.

I believe that Apache NiFi is a valuable asset for managing the data flow of your IT environment. I have a simple test to determine whether a tool is worthwhile to me: if I can come up with more than 3 scenarios where the tool can help me, I consider it a winner. Apache NiFi passes that test without any doubt.

While the examples shown here are quite simple, NiFi can handle very complex workflows, allows flows to be arranged in different process groups, and also supports clustering.

Additional information.

I have only scratched the surface of what Apache NiFi can do. There is a great introduction video from OSCON 2015, given by Joe Witt of Hortonworks. I recommend you check it out.