Remote Sensing, Storage and Retrieval

One of the hardest technical problems in large-scale remote sensing projects is collecting high-rate, high-volume information in real time. And that is before we get to the most critical issue of all: processing that data in real time.

So the first question that comes to mind is what kind of architecture to use. How will we handle the high write rate, where do we store all that information, and how do we retrieve it?

After some R&D, we settled on a nice approach: InfluxDB, a time series database built specifically for time series events, with sensor networks as one of its stated use cases. Earlier versions of Influx did OK but could not handle high write rates. The 0.9 release changed that completely; nearly everything was reworked, and it became a lot more powerful for writes and for querying information by time. Two things to keep in mind: as a safe practice, put Kafka in front of Influx if you have extreme write rates, and make sure your APIs cannot request the whole data set by mistake. Influx is a lot faster when you query by time frame, and if you are using Influx you had better be doing that, as that is what it is all about.
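To make the "always query by time frame" advice concrete, here is a minimal Python sketch. It is not our production code. It only builds strings: a point in InfluxDB's line protocol format, and an InfluxQL query that is always bounded by a start and end time. The measurement name "weather", the tag "station", the field "temperature" and the helper names to_line and bounded_query are all made up for illustration, and the exact write format depends on your Influx version, so check it against the docs for the release you run.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def _escape(value, chars):
    """Backslash-escape the given special characters in a name or value."""
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def to_line(measurement, tags, fields, ts):
    """Format one point as: measurement,tag=value field=value timestamp_ns.

    Assumes every field value is numeric and should be stored as a float,
    and that ts is a timezone-aware datetime.
    """
    name = _escape(measurement, ", ")
    tag_part = "".join(
        ",{}={}".format(_escape(k, ",= "), _escape(v, ",= "))
        for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        "{}={}".format(_escape(k, ",= "), float(v))
        for k, v in sorted(fields.items())
    )
    # Integer arithmetic avoids float rounding on nanosecond timestamps.
    ts_ns = ((ts - EPOCH) // timedelta(microseconds=1)) * 1000
    return "{}{} {} {}".format(name, tag_part, field_part, ts_ns)


def bounded_query(measurement, field, start, end, interval="10m"):
    """Build an InfluxQL query that is always restricted to [start, end)."""
    if end <= start:
        raise ValueError("end must be after start")
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start_s = start.astimezone(timezone.utc).strftime(fmt)
    end_s = end.astimezone(timezone.utc).strftime(fmt)
    return (
        'SELECT mean("{}") FROM "{}" '
        "WHERE time >= '{}' AND time < '{}' "
        "GROUP BY time({})"
    ).format(field, measurement, start_s, end_s, interval)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # One reading from a hypothetical station, ready to be batched and written.
    print(to_line("weather", {"station": "ws-01"}, {"temperature": 21.5}, now))
    # The last hour of data, averaged into 10-minute buckets.
    print(bounded_query("weather", "temperature", now - timedelta(hours=1), now))
```

Because bounded_query refuses to run without a start and an end, an API built on top of something like it cannot pull the whole data set by accident. In a real service you would also cap the maximum width of the time window.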

Combining Influx with Grafana gave us a beautiful interface for visualizing our weather stations, so I would suggest you take a look at that. One thing we are now looking into is how to do real-time processing of the information before the data is placed in (or piped into) InfluxDB.