Introduction to Big GeoData: How to make it work

By Oles Laba, N-iX · December 20, 2018

Nowadays, processing large amounts of data is quite routine in the Big Data world. A great number of companies provide common Big Data ETL solutions built on Hadoop utilities, which have become widespread. These usually follow a pipeline of extracting the necessary data (structured or not), transforming it into the form the user needs (in most cases, structured tables), and loading it into storage that is easily accessible to end users. This scenario covers most current projects.

Nevertheless, what should we do if we have specific geodata: coordinates or shapes? Simply loading the transformed data is not enough for the end user; they need something more. That is the crux of the matter.

Here is a brief overview of how to make it work.

Back-End – What is under the hood?

Geo-Storage

Where do we store the geodata? To answer this question, we first need to ask what we want to get in the end. You might say, “Why not use an RDBMS?”, and that is a fair question. The answer can be “yes”, but only under certain conditions: this approach is fine if your data has a predefined schema and is not growing fast. Sounds good. But suppose you want to work directly with geodata and have all the necessary tools for it. Or, even more, suppose a real-time processing pipeline sends us an endless stream of schema-less data. Will an RDBMS still be enough?

The most common way to store and process geodata is a NoSQL store with a geospatial engine built on top: for example, Accumulo or HBase for storage, with GeoMesa running on top of it.

Hello GeoMesa

As mentioned before, GeoMesa is one of the most popular geospatial engines built on top of NoSQL databases, enabling geospatial querying and analytics. GeoMesa includes a large number of functions for working with geodata and supports the common geo-types in the WKT format (see the next chapter). It also ships command-line tools with various commands and provides APIs for Java as well as for Big Data tools such as Spark, which simplifies the developer’s interaction with it.

Geo-Types

The geodata domain has a specific way of representing data: “Well-Known Text” (WKT). WKT is a text markup language for representing geometric objects on a map, spatial reference systems, etc. Briefly, it is a unified way of building Points, Lines, Polygons, Multi-polygons, and so on out of integers, decimals, and strings.

Here is how it looks in examples:

[Image: WKT examples of common geo-types]
These data types are common for all geo engines.
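As a rough illustration of the format (plain Python, no geo libraries; the helper names are ours, not part of any geo engine's API), WKT geometries are just formatted text built from coordinates:

```python
def wkt_point(x, y):
    # POINT (x y): a single coordinate pair
    return f"POINT ({x} {y})"

def wkt_linestring(coords):
    # LINESTRING (x1 y1, x2 y2, ...): an ordered path of points
    body = ", ".join(f"{x} {y}" for x, y in coords)
    return f"LINESTRING ({body})"

def wkt_polygon(ring):
    # POLYGON ((x1 y1, ..., x1 y1)): a closed ring; WKT requires
    # the first and last coordinate to be identical
    if ring[0] != ring[-1]:
        ring = ring + [ring[0]]
    body = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON (({body}))"

print(wkt_point(30, 10))                     # POINT (30 10)
print(wkt_linestring([(30, 10), (10, 30)]))  # LINESTRING (30 10, 10 30)
print(wkt_polygon([(30, 10), (40, 40), (20, 40)]))
```

Strings in exactly this shape are what GeoMesa and GeoServer expect when a geometry is passed around as text.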

Front-End – Visualization engine

GeoServer as a geospatial view engine

So, now we have the data stored in a proper way. However, how can we use it for visualization? The simplest way is GeoServer, a software server designed specifically for creating and viewing maps, editing geospatial data, or simply querying it with one of the standard geospatial languages. Let’s consider it in a broader context.

GeoServer (like GeoMesa) uses CQL (Common Query Language) for querying data. It is easy to pick up for anyone who knows SQL, since the two syntaxes are quite similar, apart from some limitations.
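For illustration, here are a few filters in the shape CQL/ECQL accepts; the attribute names `geom`, `dtg`, and `calls` are hypothetical and would come from your own feature type:

```
calls > 1000
BBOX(geom, -125.0, 24.0, -66.0, 50.0)
INTERSECTS(geom, POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10)))
dtg DURING 2018-01-01T00:00:00Z/2018-02-01T00:00:00Z
```

Note how geometries inside the filter reuse the same WKT notation described above.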

Moreover, GeoServer’s possibilities do not end there. It also implements the standard geospatial protocols, for instance:

  • WMS (Web Map Service) – on a user’s request, it generates map images from geospatial data as a response. For example, let’s look closely at some analytics on cell phone usage in the USA. The next picture is made in GeoServer with the “Layer Preview” tool:

[Image: cell phone usage layer in GeoServer’s Layer Preview]

Going further, we can add style and put it as a separate layer on some map UI:

[Image: styled layer rendered on a map UI]

However, there are cases when the visualization part takes more rendering time than we can afford. In this situation, we might use an “Image Mosaic” layer. Basically, this layer is a mosaic of georeferenced rasters: we generate a GeoTIFF file from our data, with some key value driving the color of the final raster. Here is how it looks in GeoServer:

[Image: Image Mosaic layer in GeoServer]

As in the previous example, we can add style and put it as a separate layer on some map UI:

[Image: styled mosaic layer rendered on a map UI]
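Under the hood, a WMS client fetches such images by issuing HTTP GetMap requests. A minimal sketch in plain Python of how such a request URL is assembled (the host and layer name here are made up):

```python
from urllib.parse import urlencode

def wms_getmap_url(base, layer, bbox, width=768, height=384):
    # Standard WMS 1.1.1 GetMap parameters; the server responds
    # with a rendered map image (PNG in this case)
    params = {
        "service": "WMS",
        "version": "1.1.1",
        "request": "GetMap",
        "layers": layer,
        "bbox": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "width": width,
        "height": height,
        "srs": "EPSG:4326",
        "format": "image/png",
    }
    return base + "?" + urlencode(params)

url = wms_getmap_url("http://localhost:8080/geoserver/wms",
                     "demo:cell_usage", (-125.0, 24.0, -66.0, 50.0))
print(url)
```

GeoServer’s Layer Preview builds essentially the same URL for you; a map UI then tiles such requests across the viewport.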

  • WFS (Web Feature Service) – on a user’s request, it generates geographical features from geospatial data as a response. For example, we can get the response in JSON format:

[Image: example WFS response in JSON format]

Using this JSON, we can render shapes and features in the UI with the help of OpenLayers.

[Image: features rendered on the UI with OpenLayers]
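A WFS JSON response is ordinary GeoJSON, so it can be unpacked with the standard library alone. A sketch with a hand-made FeatureCollection shaped like a typical GetFeature response (the attribute names `state` and `calls` are invented for the example):

```python
import json

# A minimal GeoJSON FeatureCollection, shaped like a WFS
# GetFeature response with outputFormat=application/json
response = json.loads("""
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [-73.97, 40.78]},
     "properties": {"state": "NY", "calls": 1200}},
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [-118.24, 34.05]},
     "properties": {"state": "CA", "calls": 2100}}
  ]
}
""")

# Pull out coordinates and attributes, e.g. to feed a map UI
for feature in response["features"]:
    lon, lat = feature["geometry"]["coordinates"]
    props = feature["properties"]
    print(f"{props['state']}: {props['calls']} calls at ({lon}, {lat})")
```

On the front end, OpenLayers consumes exactly this structure and turns each feature into a vector shape on the map.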

  • WPS (Web Processing Service) – defines how a client can request the execution of a process and how the output of that process is handled.

In case you need something more special and custom, GeoServer also lets you build your own extensions for each of these services.

Conclusion. Putting it all together

Everything mentioned above gives a user a wide range of possibilities for integrating big data into the geospatial world. Moreover, you do not need to worry about configuration details, for example, how the storage distributes the data.

All these technologies can certainly be used separately from each other. However, whenever you want to build an end-to-end application that takes raw data all the way to a final picture in a flexible, easy-to-support environment, this stack is the best choice.

