Evaluating 'Vector' - a new observability tool

Vector (Vector.dev) is a new observability tool, marketed as a one-size-fits-all solution for log parsing, data transformation, metrics aggregation and event collection. According to its creators, it is fast, reliable, unified, vendor neutral, customizable and concise. I recently had to decide whether we should migrate our data pipeline to a new stack, and a co-worker recommended this tool, so I decided to evaluate it.

Claims

They state that the improvements over other tools, such as Logstash and Fluentd, are the following:

A richer data model, supporting not only logs but aggregated metrics, fully structured events, etc

  • This is available from the competitors as well - both Fluentd and Logstash support metrics pipelines in different ways.

Programmable transforms written in lua (or eventually wasm) that let you parse, filter, aggregate, and otherwise manipulate your data in arbitrary ways

  • Both Fluentd and Logstash support custom plugins written in Ruby.
  • My opinion is that Ruby is a bit more modern and has a gentler learning curve than Lua.

Uncompromising performance and efficiency that enables a huge variety of deployment strategies

  • This one is a bit vague. Both Fluentd and Logstash have proven to perform well when configured correctly, allowing the processing of millions of records per second. It is true that, being written in Rust, Vector has a smaller memory footprint than Logstash, but so does Fluentd.

Simple to configure

  • This one is partly true. The configuration style is very simple and easy to read, but not so easy to write. The documentation does not help either - I will write more about this later.

Advantages and disadvantages

Advantages

  1. Very fast start-up in a container compared to Logstash - a result of not having to spin up a runtime environment.

  2. Easier to track component-based data flow, compared to the event-based data flow in Logstash. By this I mean that every step in Vector’s configuration file has to specify its input source, which makes it very easy to know where your data is coming from and to follow what is happening to it. In Logstash you work with the event directly, and in a big pipeline it can become quite difficult to know the state of your event.

  3. Easier-to-read configuration file format. While Fluentd and Logstash use custom formats, Vector configuration can be written in TOML or YAML, which are a little more human-readable.
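
The second advantage can be illustrated with a small model. This is a minimal sketch in Python, not Vector's own configuration syntax or API: the component names and the `PIPELINE` structure are hypothetical, and the only property borrowed from the article is that every non-source component declares the components it reads from. Given that property, answering "where does the data in this sink come from?" is a simple walk over the declared inputs.

```python
# Hypothetical topology: component names and kinds are illustrative only.
# Each non-source component lists the components it reads from, which is
# the property the article attributes to Vector's configuration file.
PIPELINE = {
    "app_logs": {"kind": "source", "inputs": []},
    "host_metrics": {"kind": "source", "inputs": []},
    "parse_json": {"kind": "transform", "inputs": ["app_logs"]},
    "drop_debug": {"kind": "transform", "inputs": ["parse_json"]},
    "to_search": {"kind": "sink", "inputs": ["drop_debug"]},
    "to_metrics": {"kind": "sink", "inputs": ["host_metrics", "drop_debug"]},
}


def upstream(component, pipeline):
    """Return every component that feeds `component`, nearest first."""
    seen = []
    queue = list(pipeline[component]["inputs"])
    while queue:
        name = queue.pop(0)
        if name in seen:
            continue  # already visited through another path
        seen.append(name)
        queue.extend(pipeline[name]["inputs"])
    return seen


def sources_of(component, pipeline):
    """Return the sources whose data can reach `component`."""
    return [name for name in upstream(component, pipeline)
            if pipeline[name]["kind"] == "source"]


print(upstream("to_search", PIPELINE))
print(sources_of("to_metrics", PIPELINE))
```

In an event-based pipeline there is no such explicit edge list to walk, which is why the state of an event is harder to reason about as the pipeline grows.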

Disadvantages

  1. Awful documentation - the documentation is structured in a way that is very hard to read and understand. Most of the time, if you copy and paste an example from the documentation, it will not work. Sometimes the example usage shown is incomplete or obsolete. There have been times when I configured a component exactly as written in the documentation, only to be greeted with a warning that its usage is deprecated.

  2. Immature - at the time of writing, Timber.io is preparing the first release of Vector.dev, and the changes they make from version to version can be breaking. Hopefully the public API will be more stable from version 1.0.0, but that is not the case right now.

  3. Difficult setup and debugging of custom scripts. By this I mean that the errors I received while trying to write a Lua plugin were not helpful at all - at this time I cannot tell if this is a result of using Lua or just a problem with their implementation of the script interpreter.

  4. They are pushing their own language for transforms, called “remap”.

  5. They don’t care about the community. I’ve hit difficult issues configuring a pipeline that uses secure connections to different services, and I’ve asked for help several times, only to find out that my posts on GitHub were deleted without notice - very rude and unpleasant.
    There is also this discussion about their initial product, where paying users mention that support is practically non-existent, to the point that they wonder if the service is still alive. At the time of writing, the original domain Timber.io already redirects to their new product’s home page at Vector.dev. Furthermore, their login page shows a message that the service will be discontinued on 23.01.2021. Their last update is from 22.12.2018.

Real-life results

While I am satisfied with how Logstash handles our data needs, I was looking for something that starts faster. Because it runs on the Java runtime, Logstash boots very slowly in a containerized environment such as Docker, and so far I’ve failed to find a way to fix that.
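
The article does not say how start-up time was measured. One possible approach, shown here only as a sketch under assumptions, is to start the container and poll a readiness check until it succeeds. The helper names `time_until_ready` and `port_is_open` are hypothetical, and treating "a TCP port accepts connections" as "ready" is an assumption that may not match either tool's real readiness.

```python
import socket
import time


def port_is_open(host, port):
    """Assumed readiness check: True once a TCP connection succeeds."""
    try:
        with socket.create_connection((host, port), timeout=1.0):
            return True
    except OSError:
        return False


def time_until_ready(is_ready, timeout_s=300.0, poll_interval_s=0.5):
    """Poll `is_ready` until it returns True and return the elapsed seconds.

    Raises TimeoutError if the service is not ready within `timeout_s`.
    """
    started = time.monotonic()  # monotonic clock, safe for measuring durations
    while True:
        if is_ready():
            return time.monotonic() - started
        if time.monotonic() - started >= timeout_s:
            raise TimeoutError(f"not ready after {timeout_s} seconds")
        time.sleep(poll_interval_s)
```

A call such as `time_until_ready(lambda: port_is_open("localhost", 9000))`, issued right after starting the container, would give a comparable number for each tool; the host and port here are placeholders.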

            Start Time    CPU Idle    CPU Load    RAM Idle    RAM Load    Events/s
Logstash    ~200 sec      0%          60-70%      ~650MB      ~1300MB     ~2000
Vector      ~10 sec       0~0.1%      ~90%        ~97MB       ~700MB      ~2000

The table above shows the results of my tests. Vector uses considerably less memory, since it does not rely on a runtime environment the way Logstash relies on Java. While Vector does satisfy my condition of a faster boot time, it uses more CPU under load, so you have to be careful about that.
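
To put the trade-off in numbers, the approximate figures from the table can be turned into ratios. This is a small sketch; the dictionary keys are names chosen here, and using 65% as the midpoint of the 60-70% CPU range for Logstash is an assumption made only to get a single number.

```python
# Approximate figures copied from the results table.
# Logstash CPU under load is reported as 60-70%; the midpoint (65) is an
# assumption made here, not a measured value.
RESULTS = {
    "Logstash": {"start_s": 200, "cpu_load_pct": 65,
                 "ram_idle_mb": 650, "ram_load_mb": 1300,
                 "events_per_s": 2000},
    "Vector": {"start_s": 10, "cpu_load_pct": 90,
               "ram_idle_mb": 97, "ram_load_mb": 700,
               "events_per_s": 2000},
}


def ratio(metric, numerator="Logstash", denominator="Vector"):
    """Return how many times larger `numerator` is than `denominator`."""
    return RESULTS[numerator][metric] / RESULTS[denominator][metric]


for metric in RESULTS["Logstash"]:
    print(f"{metric}: Logstash / Vector = {ratio(metric):.2f}")
```

With these figures, Vector starts about 20 times faster and idles on roughly a seventh of the memory, while its CPU usage under load is about 1.4 times that of Logstash for the same throughput.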

Conclusion

Vector.dev is a promising observability tool developed by Timber.io, and I’m looking forward to seeing its progress. While its purpose is to be a universal tool that replaces your existing pipelines, it’s not quite there yet. The immaturity of its ecosystem and the incomplete and obsolete documentation introduce a steep learning and implementation curve. The development team promises great things for Vector, but in real-life cases it falls a bit short of those promises. Furthermore, it comes from the same developers as Timber.io, who left their paying customers without support or response to chase wild geese. This leads me to think that the project may be discontinued at any time, if the developers decide they have something better to do. If you already have a working data flow that uses one of the other tools, such as Logstash, Fluentd or Fluent Bit, I don’t recommend switching to Vector - the development cost and the uncertainty are not worth it. If you’re considering building a new data pipeline (and have some free time to kill), you can give this tool a try.
