ModMS -- A Weather Model Management System for Real-Time Big Data Management
ArabiaWeather works with large data sets to generate the best and most accurate forecasts for the MENA region. One of the most challenging parts of building a fast, reliable and flexible system to harvest weather models and run mathematical operations on them is finding a place to store the data and retrieve it quickly. At ArabiaWeather we tried many of the databases on the market, from SQL to NoSQL, key-value stores and more. The challenge was to find a database that can store model data, which is four-dimensional and spatiotemporal, and retrieve it quickly. Most databases were either slow to retrieve the data or needed large clusters to handle multiple operations at once, and still ended up being slow. We needed the recent weather model data stored in a centralized database for our internal use, without building clusters to manage data from multiple models. After much trial and error with off-the-shelf tools, we decided to give it a shot ourselves. We set out the requirements: fast indexing, horizontal scalability and an easy-to-use developer interface.
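To make the four-dimensional shape of model data concrete, here is a minimal Python sketch of one common way to address a gridded forecast field: flattening (forecast time, vertical level, latitude, longitude) into a single row-major offset. It is an illustration only and not a description of how ModMS stores or indexes data; the Grid4D class, its field names and the grid dimensions are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Grid4D:
    """Hypothetical shape of one model variable: time x level x lat x lon."""
    n_times: int
    n_levels: int
    n_lats: int
    n_lons: int

    def size(self) -> int:
        """Total number of grid points (records) for this variable."""
        return self.n_times * self.n_levels * self.n_lats * self.n_lons

    def offset(self, t: int, z: int, y: int, x: int) -> int:
        """Row-major offset of grid point (t, z, y, x) in a flat array."""
        limits = (self.n_times, self.n_levels, self.n_lats, self.n_lons)
        for value, limit in zip((t, z, y, x), limits):
            if not 0 <= value < limit:
                raise IndexError(f"index {value} out of range 0..{limit - 1}")
        return ((t * self.n_levels + z) * self.n_lats + y) * self.n_lons + x

    def coords(self, offset: int) -> Tuple[int, int, int, int]:
        """Inverse of offset(): recover (t, z, y, x) from a flat offset."""
        if not 0 <= offset < self.size():
            raise IndexError(f"offset {offset} out of range")
        rest, x = divmod(offset, self.n_lons)
        rest, y = divmod(rest, self.n_lats)
        t, z = divmod(rest, self.n_levels)
        return t, z, y, x


# Assumed toy grid: 4 forecast times, 3 levels, 5 latitudes, 6 longitudes.
grid = Grid4D(n_times=4, n_levels=3, n_lats=5, n_lons=6)
flat = grid.offset(1, 2, 3, 4)
print(flat, grid.coords(flat))
```

With a layout like this, a lookup for one point at one time is a single arithmetic step rather than a search, which is one reason flat array layouts are popular for gridded data.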
From there, a short, very dynamic and agile development effort began to deliver such a system. We wanted to start by indexing GFS, GEM, GFS Marine Information, UKMET, ECMWF and our own in-house WRF models. The goal was to index them quickly and at high volume, and to query them quickly with the ability to run mathematical operations on them. By combining some of the most interesting algorithms used in the weather tech space with a dynamic and agile compiled language (D-Lang), we at ArabiaWeather succeeded in building a horizontally scalable, vertically scalable and very fast management system. In the initial benchmarks, running on a DS2 Azure cloud instance (2 cores, 7 GB RAM and an SSD), the system wrote 1.8 million weather parameter records per second, stored up to 20 billion records in the test, and served 1.6K requests per second at 100 concurrent users. The server uses only one processor to serve the data and one processor to index it.
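As a rough back-of-the-envelope reading of the benchmark figures, the short Python sketch below works out how long it would take to write 20 billion records at 1.8 million records per second, and what average response time 1.6K requests per second at 100 concurrent users implies. It rests on assumptions the benchmark does not state: that the write rate is sustained for the whole load, that "1.6K" means 1,600, and that each of the 100 users always has exactly one request in flight (so Little's law applies). The constant and function names are ours.

```python
# Figures quoted in the benchmark.
WRITE_RATE_PER_S = 1_800_000       # records written per second
TOTAL_RECORDS = 20_000_000_000     # records stored in the test
REQUESTS_PER_S = 1_600             # assumed meaning of "1.6K"
CONCURRENT_USERS = 100


def ingest_hours(total_records: int, rate_per_s: float) -> float:
    """Hours needed to write total_records at a sustained rate."""
    return total_records / rate_per_s / 3600


def mean_latency_ms(concurrency: int, throughput_per_s: float) -> float:
    """Little's law: mean time in system = concurrency / throughput.

    Assumes every user keeps exactly one request in flight at all times.
    """
    return concurrency / throughput_per_s * 1000


hours = ingest_hours(TOTAL_RECORDS, WRITE_RATE_PER_S)
latency = mean_latency_ms(CONCURRENT_USERS, REQUESTS_PER_S)
per_user = REQUESTS_PER_S / CONCURRENT_USERS

print(f"Full load at the quoted write rate: about {hours:.2f} hours")
print(f"Implied mean response time: {latency:.1f} ms")
print(f"Requests per user per second: {per_user:.0f}")
```

Under those assumptions, the full 20 billion records would load in a little over three hours, and each of the 100 users would see roughly 16 responses per second.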
Having achieved this at ArabiaWeather in a short period of time, we now aim to expand the system to index even more models and to go further back in historical data, not by adding hardware overhead but by finding the fastest and most efficient approach possible. The faster we process data, the faster we can provide it to our B2C and B2B clients.