'Big Data' is a catch-all term for data that won't fit into the usual containers and can't be processed by traditional means. Big Data is commonly described in terms of its volume, the velocity at which it arrives, and its variety (its lack of structure).
For the volume of data, 'big' is a relative term, but size becomes a problem when the data no longer fits on a single server, because relational databases do not scale well across multiple machines.
Data from networked sensors, smartphones, video surveillance, mouse clicks, and similar sources is continuously streamed, so it must be handled as it arrives rather than loaded in batches.
The most difficult aspect of Big Data is its lack of structure, not its size. This lack of structure is a challenge because unstructured and semi-structured data do not fit the fixed schemas that relational databases and traditional analysis tools assume.
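To make the schema problem concrete, here is a small sketch (the record shapes are invented for illustration): three event records with no shared set of fields, so no single relational table layout fits them all, and code must probe each record for the fields it may or may not carry.

```python
import json

# Hypothetical event records: each has a different shape, so there is
# no single fixed relational schema that covers all of them.
raw = [
    '{"type": "click", "x": 10, "y": 20}',
    '{"type": "sensor", "temp_c": 21.5}',
    '{"type": "video", "camera": "lobby", "frames": 120}',
]

events = [json.loads(r) for r in raw]

# With no shared columns, each record must be checked for the fields
# it happens to contain.
temps = [e["temp_c"] for e in events if "temp_c" in e]
print(temps)  # [21.5]
```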
When the volume of data is too large to fit on a single server, processing must be distributed across multiple machines. Functional programming is a good fit for this, because it makes it easier to write correct and efficient distributed code.
This is because functional programming languages support immutable data, pure functions (functions without side effects), and higher-order functions such as map and reduce: computations built from these pieces can be split across machines and run in any order without coordinating shared mutable state.
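A minimal sketch of this idea (a word count, not taken from the notes): the 'map' step is a pure function applied independently to each partition of the data, and the 'reduce' step combines partial results with an associative merge. Because neither step mutates shared state, the partitions could just as well live on different machines.

```python
from functools import reduce
from collections import Counter

def map_partition(lines):
    """Pure function: turn one partition of lines into word counts."""
    return Counter(word for line in lines for word in line.split())

def merge(a, b):
    """Associative merge of two partial results."""
    return a + b

# Each partition could sit on a different machine; since map_partition
# is pure and merge is associative, the work can run in any order.
partitions = [["big data big"], ["data systems"], ["big systems"]]
partials = [map_partition(p) for p in partitions]   # the 'map' step
total = reduce(merge, partials, Counter())          # the 'reduce' step
print(total["big"])  # 3
```

This is the same shape of computation that frameworks like MapReduce and Spark distribute for real.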
Fact-based models represent a dataset using facts stored in a graph.
A graph schema defines the structure of the dataset: the types of nodes, edges, and properties (facts) it may contain. New kinds of information can be added simply by defining new node, edge, and property types in the schema.
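As a sketch of how a graph schema constrains facts (the node and edge types here are invented for illustration): the schema lists allowed node types and which edge types may connect which node types, facts are immutable records, and extending the dataset means adding entries to the schema rather than restructuring existing data.

```python
from dataclasses import dataclass

# Hypothetical schema: allowed node types, plus which edge types may
# connect which node types. Adding a new kind of information just
# means adding entries here.
SCHEMA = {
    "nodes": {"person", "page"},
    "edges": {("person", "visited", "page"),
              ("person", "knows", "person")},
}

@dataclass(frozen=True)  # facts are immutable
class Fact:
    subject: tuple  # (node_type, id)
    edge: str
    obj: tuple      # (node_type, id)

def valid(fact: Fact) -> bool:
    """Check a fact against the graph schema."""
    s_type, o_type = fact.subject[0], fact.obj[0]
    return (s_type in SCHEMA["nodes"]
            and o_type in SCHEMA["nodes"]
            and (s_type, fact.edge, o_type) in SCHEMA["edges"])

f = Fact(("person", "alice"), "visited", ("page", "/home"))
print(valid(f))  # True
```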